Commit Graph

1874 Commits

Author SHA1 Message Date
Peter Krempa
11bdab02c2 maint: include ignore-value in internal.h
The ignore_value macro is used across libvirt. This patch includes it in
the internal header and cleans all other includes.
2012-06-28 16:36:30 +02:00
Viktor Mihajlovski
6a6c347118 S390: Override QEMU_CAPS_NO_ACPI for s390x
Starting a KVM guest on s390 fails immediately. This is because
"qemu --help" reports -no-acpi even for the s390(x) architecture but
-no-acpi isn't supported there.
Workaround is to remove QEMU_CAPS_NO_ACPI from the capability set
after the version/capability extraction.

Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
2012-06-25 16:43:18 +02:00
Osier Yang
968b6c60e9 qemu: Improve error if setmem fails for lacking of balloon support
"cannot set memory of an active domain" is misleading, it sounds
like setting memory of active domain is not supported.
2012-06-25 21:34:22 +08:00
Daniel P. Berrange
d7f9d82753 Include the default listen address in the live guest XML
If no 'listen' attribute or <listen> element is set in the
guest XML, the default driver configured listen address is
used. There is no way to client applications to determine
what this address is though. When starting the guest, we
should update the live XML to include this default listen
address
2012-06-25 13:05:55 +01:00
Gerd Hoffmann
fd4fd420b4 qemu: Add xhci support
qemu 1.1 features a xhci controller,
this patch adds support for it.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2012-06-21 16:33:00 +02:00
Peter Krempa
33dc8cf018 drivers: Implement virListAllDomains for drivers using virDomainObj
This patch adds support for listing all domains into drivers that use
the common virDomainObj implementation: libxl, lxc, openvz, qemu, test,
uml, vmware.

For drivers that don't support managed save images the guests are
treated as if they had none, so filtering guests that do have such an
image on this driver succeeds and produces 0 results.
2012-06-20 13:35:26 +02:00
Dipankar Sarma
d1778b7148 Fix default USB controller for ppc64
Fix the default usb controller for pseries systems if none
specified.

Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com>
2012-06-19 15:41:55 -06:00
Eric Blake
5488612eb0 list: add qemu snapshot list support
The two new functions are very similar to the existing functions;
just a matter of different arguments and a call to a different
helper function.

* src/qemu/qemu_driver.c (qemuDomainSnapshotListNames)
(qemuDomainSnapshotNum, qemuDomainSnapshotListChildrenNames)
(qemuDomainSnapshotNumChildren): Support new flags.
(qemuDomainListAllSnapshots): New functions.
2012-06-19 14:58:45 -06:00
Eric Blake
7e111c6fe6 snapshot: merge domain and snapshot computation
Now that domain listing is a thin wrapper around child listing,
it's easier to have a common entry point.  This restores the
hashForEach optimization lost in the previous patch when there
are no snapshots being filtered out of the entire list.

* src/conf/domain_conf.h (virDomainSnapshotObjListGetNames)
(virDomainSnapshotObjListNum): Add parameter.
(virDomainSnapshotObjListGetNamesFrom)
(virDomainSnapshotObjListNumFrom): Delete.
* src/libvirt_private.syms (domain_conf.h): Drop deleted functions.
* src/conf/domain_conf.c (virDomainSnapshotObjListGetNames):
Merge, and (re)add an optimization.
* src/qemu/qemu_driver.c (qemuDomainUndefineFlags)
(qemuDomainSnapshotListNames, qemuDomainSnapshotNum)
(qemuDomainSnapshotListChildrenNames)
(qemuDomainSnapshotNumChildren): Update callers.
* src/qemu/qemu_migration.c (qemuMigrationIsAllowed): Likewise.
* src/conf/virdomainlist.c (virDomainListPopulate): Likewise.
2012-06-18 15:11:28 -06:00
Eric Blake
06d4a1e429 snapshot: use metaroot node to simplify management
This idea was first suggested by Daniel Veillard here:
https://www.redhat.com/archives/libvir-list/2011-October/msg00353.html

Now that I am about to add more complexity to snapshot listing, it
makes sense to avoid code duplication and special casing for domain
listing (all snapshots) vs. snapshot listing (descendants); adding
a metaroot reduces the number of code lines by having the domain
listing turn into a descendant listing of the metaroot.

Note that this has one minor pessimization - if we are going to list
ALL snapshots without filtering, then virHashForeach is more efficient
than recursing through the child relationships; restoring that minor
optimization will occur in the next patch.

* src/conf/domain_conf.h (_virDomainSnapshotObj)
(_virDomainSnapshotObjList): Repurpose some fields.
(virDomainSnapshotDropParent): Drop unused parameter.
* src/conf/domain_conf.c (virDomainSnapshotObjListGetNames)
(virDomainSnapshotObjListCount): Simplify.
(virDomainSnapshotFindByName, virDomainSnapshotSetRelations)
(virDomainSnapshotDropParent): Match new field semantics.
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateXML)
(qemuDomainSnapshotReparentChildren, qemuDomainSnapshotDelete):
Adjust clients.
2012-06-18 15:11:28 -06:00
Peter Krempa
bc8e15592c conf: Store managed save image existence in virDomainObj
This patch stores existence of the image in the object. At start of the
daemon the state is checked and then updated in key moments in domain
lifecycle.
2012-06-18 21:24:13 +02:00
Michal Privoznik
d97a234c62 qemu_agent: Wait for events instead of agent response
With latest changes to qemu-ga success on some commands is not reported
anymore, e.g. guest-shutdown or guest-suspend-*. However, errors are
still being reported. Therefore, we need to find different source of
indication if operation was successful. Events.
2012-06-16 09:06:57 +02:00
Michal Privoznik
c12d787eb0 qemu_agent: Add some more debug prints
for agent ref count and qemuProcessHandleAgentDestroy
2012-06-16 09:06:57 +02:00
Eric Blake
350583c859 build: hoist qemu dependence on yajl to configure
Commit 6e769eba made it a runtime error if libvirt was compiled
without yajl support but targets a new enough qemu.  But enough
users are hitting this on self-compiled libvirt that it is worth
erroring out at compilation time, rather than an obscure failure
when trying to use the built executable.

* configure.ac: If qemu is requested and -version works, require
yajl when qemu version is new enough.
* src/qemu/qemu_capabilities.c (qemuCapsComputeCmdFlags): Add
comment.
2012-06-15 19:49:00 -06:00
Wen Congyang
cdef31c562 qemu: allow the client to choose the vmcore's format
This patch updates qemu driver to allow the client to choose the
vmcore's format: memory only or including device state.
2012-06-15 20:36:14 +08:00
Wen Congyang
6fe26d89cc qemu: implement qemu's dump-guest-memory
dump-guest-memory is a new dump mechanism, and it can work when the
guest uses host devices. This patch adds a API to use this new
monitor command.
We will always use json mode if qemu's version is >= 0.15, so I
don't implement the API for text mode.
2012-06-15 20:36:14 +08:00
Wen Congyang
5136c5799f qemu: fix potential dead lock
If we lock the qemu_driver, we should call qemuDomainObjBeginJobWithDriver()
not qemuDomainObjBeginJob().
2012-06-15 20:25:35 +08:00
Peter Krempa
e0f0131d33 qemu: Enable disconnecting SPICE clients without changing password
Libvirt updates the configuration of SPICE server only when something
changes. This is unfortunate when the user wants to disconnect a
existing spice session when the connected attribute is already
"disconnect".

This patch modifies the conditions for calling the password updater to
be called when nothing changes, but the connected attribute is already
"disconnect".
2012-06-14 15:14:20 +02:00
Peter Krempa
0f4660c878 qemu: Fix off-by-one error while unescaping monitor strings
While unescaping the commands the commands passed through to the monitor
function qemuMonitorUnescapeArg() initialized lenght of the input string
to strlen()+1 which is fine for alloc but not for iteration of the
string.

This patch fixes the off-by-one error and drops the pointless check for
a single trailing slash that is automaticaly handled by the default
branch of switch.
2012-06-14 10:29:36 +02:00
Daniel P. Berrange
6510c97bf5 Add some missing hook functions
A core use case of the hook scripts is to be able to do things
to a guest's network configuration. It is possible to hook into
the 'start' operation for a QEMU guest which runs just before
the guest is started. The TAP devices will exist at this point,
but the QEMU process will not. It can be desirable to have a
'started' hook too, which runs once QEMU has started.

If libvirtd is restarted it will re-populate firewall rules,
but there is no QEMU hook to trigger for existing domains.
This is solved with a 'reconnect' hook.

Finally, if attaching to an external QEMU process there needs
to be an 'attach' hook script.

This all also applies to the LXC driver

* docs/hooks.html.in: Document new operations
* src/util/hooks.c, src/util/hooks.c: Add 'started', 'reconnect'
  and 'attach' operations for QEMU. Add 'prepare', 'started',
  'release' and 'reconnect' operations for LXC
* src/lxc/lxc_driver.c: Add hooks for 'prepare', 'started',
  'release' and 'reconnect' operations
* src/qemu/qemu_process.c: Add hooks for 'started', 'reconnect'
  and 'reconnect' operations
2012-06-13 18:23:00 +01:00
Michal Privoznik
86032b2276 qemu: Don't overwrite security labels
Currently, if qemuProcessStart fail at some point, e.g. because
domain being started wants a PCI/USB device already assigned to
a different domain, we jump to cleanup label where qemuProcessStop
is performed. This unconditionally calls virSecurityManagerRestoreAllLabel
which is wrong because the other domain is still using those devices.

However, once we successfully label all devices/paths in
qemuProcessStart() from that point on, we have to perform a rollback
on failure - that is - we have to virSecurityManagerRestoreAllLabel.
2012-06-12 11:14:38 +02:00
Michal Privoznik
69dd77149c qemuProcessStop: Switch to flags
Currently, we are passing only one boolean (migrated) so there is
no real profit in this. But it creates starting position for
next patch.
2012-06-12 09:57:02 +02:00
Eric Blake
e3559a6e66 snapshot: implement new APIs for qemu
The two APIs are rather trivial; based on bits and pieces of other
existing APIs.  It leaves the door open for future extension to
qemu to report snapshots without metadata based on reading qcow2
internal snapshot names.

* src/qemu/qemu_driver.c (qemuDomainSnapshotIsCurrent)
(qemuDomainSnapshotHasMetadata): New functions.
2012-06-11 15:23:02 -06:00
Guido Günther
3ac8fb54f4 Only check for cluster fs if we're using a filesystem
otherwise migration fails for e.g. network filesystems like sheepdog
with:

   error: Invalid relative path 'virt-name': Invalid argument

while we should fail with:

    Migration may lead to data corruption if disks use cache != none

References:

    http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=676328
    https://www.redhat.com/archives/libvirt-users/2012-May/msg00088.html
2012-06-08 19:54:11 +02:00
Li Zhang
04a319ba4e Assign correct address type to spapr-vlan and spapr-vty.
For pseries guest, spapr-vlan and spapr-vty is based
on spapr-vio address. According to model of network
device, the address type should be assigned automatically.
For serial device, serial pty device is recognized as
spapr-vty device, which is also on spapr-vio.

So this patch is to correct the address type of
spapr-vlan and spapr-vty, and build correct
command line of spapr-vty.

Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
Reviewed-by:   Michael Ellerman<michaele@au1.ibm.com>
2012-06-07 14:32:27 -06:00
Martin Kletzander
bda2f17d7e qemu: better detection of crashed domains
When libvirtd is started and there is an unusable/not-connectable
leftover from earlier started machine, it's more reasonable to say
that the machine "crashed" if we know it was started with
"-no-shutdown".
This patch fixes that and also changes the other result (when machine
was started without "-no-shutdown") to "unknown", because the previous
"failed" reason means (according to include/libvirt/libvirt.h.in:174),
that the machine failed to start.
2012-06-07 08:43:03 +02:00
Beat Jörg
7508338ff3 Fix for parallel port passthrough for QEMU
I came across a bug that the command line generated for passthrough
of the host parallel port /dev/parport0 by libvirt for QEMU is incorrect.

It currently produces:
-chardev tty,id=charparallel0,path=/dev/parport0
-device isa-parallel,chardev=charparallel0,id=parallel0

The first parameter is "tty". It sould be "parport".

If I launch qemu with -chardev parport,... it works as expected.

I have already filled a bug report (
https://bugzilla.redhat.com/show_bug.cgi?id=823879 ), the topic was
already on the list some months ago:

https://www.redhat.com/archives/libvirt-users/2011-September/msg00095.html

Signed-off-by: Eric Blake <eblake@redhat.com>
2012-06-04 16:46:23 -06:00
Marti Raudsepp
195fa214b6 qemu: move -name arg to be 1st in "ps x" output
Currently, monitoring QEMU virtual machines with standard Unix
sysadmin tools is harder than it has to be. The QEMU command line is
often miles long and mostly redundant, it's hard to tell which process
is which.

This patch reorders the QEMU -name argument to be the first, so it's
immediately visible in "ps x", htop and "atop -c" output.
2012-06-01 15:06:56 -06:00
Laine Stump
6734ce7bc8 qemu: fix netdev alias name assignment wrt type='hostdev'
This patch resolves:

   https://bugzilla.redhat.com/show_bug.cgi?id=827519

The problem is that an interface with type='hostdev' will have an
alias of the form "hostdev%d", while the function that looks through
existing netdevs to determine the name to use for a new addition will
fail if there's an existing entry that does not match the form
"net%d".

This is another of the handful of places that need an exception due to
the hybrid nature of <interface type='hostdev'> (which is not exactly
an <interface> or a <hostdev>, but is both at the same time).
2012-06-01 13:25:56 -04:00
Wen Congyang
b19c236d69 qemu: avoid closing fd more than once
If we migrate to fd, spec->fwdType is not MIGRATION_FWD_DIRECT,
we will close spec->dest.fd.local in qemuMigrationRun(). So we
should set spec->dest.fd.local to -1 in qemuMigrationRun().

Bug present since 0.9.5 (commit 326176179).
2012-05-30 21:41:46 -06:00
Wen Congyang
0a045f01cf avoid closing uninitialized fd
If the system does not support bypass cache, we will close fd,
but it is uninitialized.
2012-05-30 13:55:49 -06:00
Stefan Berger
67dd486f20 leak_fix.diff
==3240== 23 bytes in 1 blocks are definitely lost in loss record 242 of 744
==3240==    at 0x4C2A4CD: malloc (vg_replace_malloc.c:236)
==3240==    by 0x8077537: __vasprintf_chk (vasprintf_chk.c:82)
==3240==    by 0x509C677: virVasprintf (stdio2.h:199)
==3240==    by 0x509C733: virAsprintf (util.c:1912)
==3240==    by 0x1906583A: qemudStartup (qemu_driver.c:679)
==3240==    by 0x511991D: virStateInitialize (libvirt.c:809)
==3240==    by 0x40CD84: daemonRunStateInit (libvirtd.c:751)
==3240==    by 0x5098745: virThreadHelper (threads-pthread.c:161)
==3240==    by 0x7953D8F: start_thread (pthread_create.c:309)
==3240==    by 0x805FF5C: clone (clone.S:115)
2012-05-29 06:25:59 -04:00
Daniel P. Berrange
de9758ae9b Autogenerate augeas test case from default config files
When adding new config file parameters, the corresponding
additions to the augeas lens' are constantly forgotten.
Also there are augeas test cases, these don't catch the
error, since they too are never updated.

To address this, the augeas test cases need to be auto-generated
from the example config files.

* build-aux/augeas-gentest.pl: Helper to generate an
  augeas test file, substituting in elements from the
  example config files
* src/Makefile.am, daemon/Makefile.am: Switch to
  auto-generated augeas test cases
* daemon/test_libvirtd.aug, daemon/test_libvirtd.aug.in,
  src/locking/test_libvirt_sanlock.aug,
  src/locking/test_libvirt_sanlock.aug.in,
  src/lxc/test_libvirtd_lxc.aug,
  src/lxc/test_libvirtd_lxc.aug.in,
  src/qemu/test_libvirtd_qemu.aug,
  src/qemu/test_libvirtd_qemu.aug.in: Remove example
  config file data, replacing with a ::CONFIG:: placeholder

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-05-28 11:07:12 +01:00
Daniel P. Berrange
6c10c04c39 Re-order config options in qemu driver augeas lens
Currently all the config options are listed under a 'vnc_entry'
group. Create a bunch of new groups & move options to the
right place

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-05-28 11:02:10 +01:00
Daniel P. Berrange
a9c779caf3 Fix mistakes in augeas lens
Add nmissing 'host_uuid' entry to libvirtd.conf lens and
rename spice_passwd to spice_password in qemu.conf lens

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-05-28 11:00:01 +01:00
Daniel P. Berrange
c5c3278e9b Standardize whitespace used in example config files
Instead of doing

  # example_config

use

  #example_config

so it is possible to programatically uncomment example config
options, as distinct from their comment/descriptions

Also delete rogue trailing comma not allowed by lens

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-05-28 10:59:13 +01:00
Daniel P. Berrange
517368a377 Remove uid param from directory lookup APIs
Remove the uid param from virGetUserConfigDirectory,
virGetUserCacheDirectory, virGetUserRuntimeDirectory,
and virGetUserDirectory

These functions were universally called with the
results of getuid() or geteuid(). To make it practical
to port to Win32, remove the uid parameter and hardcode
geteuid()

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-05-28 10:55:06 +01:00
Douglas Schilling Landgraf
cdd762e425 qemu augeas: Add spice_tls/spice_tls_x509_cert_dir
If vdsm is installed and configured in Fedora 17, we add the following
items into qemu.conf:

spice_tls=1
spice_tls_x509_cert_dir="/etc/pki/vdsm/libvirt-spice"

However, after this changes, augtool cannot identify qemu.conf anymore.
2012-05-24 21:17:37 -06:00
Daniel P. Berrange
a4e45a06c0 Split QEMU dtrace probes into separate file
When building as driver modules, it is not possible for the QEMU
driver module to reference the DTrace/SystemTAP probes linked into
the main libvirt.so. Thus we need to move the QEMU probes into a
separate file 'libvirt_qemu_probes.d'. Also rename the existing
file from 'probes.d' to 'libvirt_probes.d' while we're at it

* daemon/Makefile.am, src/internal.h: Include libvirt_probes.h
  instead of probes.h
* src/Makefile.am: Add rules for libvirt_qemu_probes.d
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor_json.c,
  src/qemu/qemu_monitor_text.c: Include libvirt_qemu_probes.h
* src/libvirt_probes.d: Rename from probes.d
* src/libvirt_qemu_probes.d: QEMU specific probes formerly
  in probes.d

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-05-24 13:18:01 +01:00
Peter Krempa
db19417fc0 qemu_hotplug: Don't free the PCI device structure after hot-unplug
The pciDevice structure corresponding to the device being hot-unplugged
was freed after it was "stolen" from activeList. The pointer was still
used for eg-inactive list. This patch removes the free of the structure
and frees it only if reset fails on the device.
2012-05-22 18:21:29 +02:00
Daniel P. Berrange
2cb0899eec Fix potential events deadlock when unref'ing virConnectPtr
When the last reference to a virConnectPtr is released by
libvirtd, it was possible for a deadlock to occur in the
virDomainEventState functions. The virDomainEventStatePtr
holds a reference on virConnectPtr for each registered
callback. When removing a callback, the virUnrefConnect
function is run. If this causes the last reference on the
virConnectPtr to be released, then virReleaseConnect can
be run, which in turns calls qemudClose. This function has
a call to virDomainEventStateDeregisterConn which is intended
to remove all callbacks associated with the virConnectPtr
instance. This will try to grab a lock on virDomainEventState
but this lock is already held. Deadlock ensues

Thread 1 (Thread 0x7fcbb526a840 (LWP 23185)):

Since each callback associated with a virConnectPtr holds a
reference on virConnectPtr, it is impossible for the qemudClose
method to be invoked while any callbacks are still registered.
Thus the call to virDomainEventStateDeregisterConn must in fact
be a no-op. Thus it is possible to just remove all trace of
virDomainEventStateDeregisterConn and avoid the deadlock.

* src/conf/domain_event.c, src/conf/domain_event.h,
  src/libvirt_private.syms: Delete virDomainEventStateDeregisterConn
* src/libxl/libxl_driver.c, src/lxc/lxc_driver.c,
  src/qemu/qemu_driver.c, src/uml/uml_driver.c: Remove
  calls to virDomainEventStateDeregisterConn
2012-05-21 18:50:47 +01:00
Hu Tao
fe0aac0503 Adds support to param 'vcpu_time' in qemu_driver.
This involves setting the cpuacct cgroup to a per-vcpu granularity,
as well as summing the each vcpu accounting into a common array.
Now that we are reading more than one cgroup file, we double-check
that cpus weren't hot-plugged between reads to invalidate our
summing.

Signed-off-by: Eric Blake <eblake@redhat.com>
2012-05-18 08:53:49 -06:00
Marc-André Lureau
a7675a6ba5 qemu: honour sound <codec> sub-elements
With ICH6 audio device, allow to specify codecs.
By default, for compatibility reasons, if no codec is specified,
"hda-duplex" will be used.
2012-05-17 11:40:36 -06:00
Marc-André Lureau
0aaebd7abc qemu: test CAPS_HDA_MICRO 2012-05-17 11:12:40 -06:00
Michal Privoznik
9c484e3dc5 qemu: Don't delete USB device on failed qemuPrepareHostdevUSBDevices
If qemuPrepareHostdevUSBDevices fail it will roll back devices added
to the driver list of used devices. However, if it may fail because
the device is being used already. But then again - with roll back.
Therefore don't try to remove a usb device manually if the function
fail. Although, we want to remove the device if any operation
performed afterwards fail.
2012-05-17 13:40:52 +02:00
Michal Privoznik
2f5fdc886e qemu: Rollback on used USB devices
One of our latest USB device handling patches
05abd1507d introduced a regression.
That is, we first create a temporary list of all USB devices that
are to be used by domain just starting up. Then we iterate over and
check if a device from the list is in the global list of currently
assigned devices (activeUsbHostdevs). If not, we add it there and
continue with next iteration then. But if a device from temporary
list is either taken already or adding to the activeUsbHostdevs fails,
we remove all devices in temp list from the activeUsbHostdevs list.
Therefore, if a device is already taken we remove it from
activeUsbHostdevs even if we should not. Thus, next time we allow
the device to be assigned to another domain.
2012-05-16 17:10:28 +02:00
Daniel Walsh
73580c60d1 Pass the virt driver name into security drivers
To allow the security drivers to apply different configuration
information per hypervisor, pass the virtualization driver name
into the security manager constructor.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-05-16 10:05:46 +01:00
Jiri Denemark
63b4243624 qemu: Add support for -no-user-config
Thanks to this new option we are now able to use modern CPU models (such
as Westmere) defined in external configuration file.

The qemu-1.1{,-device} data files for qemuhelptest are filled in with
qemu-1.1-rc2 output for now. I will update those files with real
qemu-1.1 output once it is released.
2012-05-15 20:29:12 +02:00
Daniel P. Berrange
1ebd52cb87 Fix logic for assigning PCI addresses to USB2 companion controllers
Currently each USB2 companion controller gets put on a separate
PCI slot. Not only is this wasteful of PCI slots, but it is not
in compliance with the spec for USB2 controllers. The master
echi1 and all companion controllers should be in the same slot,
with echi1 in function 7, and uhci1-3 in functions 0-2 respectively.

* src/qemu/qemu_command.c: Special case handling of USB2 controllers
  to apply correct pci slot assignment
* tests/qemuxml2argvdata/qemuxml2argv-usb-ich9-ehci-addr.args,
  tests/qemuxml2argvdata/qemuxml2argv-usb-ich9-ehci-addr.xml: Expand
  test to cover automatic slot assignment
2012-05-15 17:07:34 +01:00
Osier Yang
be9f6ecb28 qemu: Set memory policy using cgroup if placement is auto
Like for 'static' placement, when the memory policy mode is
'strict', set the memory policy by writing the advisory nodeset
returned from numad to cgroup file cpuset.mems,
2012-05-15 10:11:14 +08:00
Osier Yang
d1bdeca875 qemu: Use the CPU index in capabilities to map NUMA node to cpu list.
On some of the NUMA platforms, the CPU index in each NUMA node
grows non-consecutive. While on other platforms, it can be inconsecutive,
E.g.

% numactl --hardware
available: 4 nodes (0-3)
node 0 cpus: 0 4 8 12 16 20 24 28
node 0 size: 131058 MB
node 0 free: 86531 MB
node 1 cpus: 1 5 9 13 17 21 25 29
node 1 size: 131072 MB
node 1 free: 127070 MB
node 2 cpus: 2 6 10 14 18 22 26 30
node 2 size: 131072 MB
node 2 free: 127758 MB
node 3 cpus: 3 7 11 15 19 23 27 31
node 3 size: 131072 MB
node 3 free: 127226 MB
node distances:
node   0   1   2   3
  0:  10  20  20  20
  1:  20  10  20  20
  2:  20  20  10  20
  3:  20  20  20  10

This patch is to fix the problem by using the CPU index in
caps->host.numaCell[i]->cpus[i] to set the bitmask instead of
assuming the CPU index of the NUMA nodes are always sequential.
2012-05-15 10:09:43 +08:00
Li Zhang
bb725ac1fa Assign spapr-vio bus address to ibmvscsi controller
For pseries guest, the default controller model is
ibmvscsi controller, this controller only can work
on spapr-vio address.

This patch is to assign spapr-vio address type to
ibmvscsi controller and correct vscsi test case.

Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
2012-05-14 16:47:16 -06:00
Eric Blake
5f89c86004 build: really silence the 32-bit warning
Commit cdce2f42d tried to silence a compiler warning on 32-bit builds,
but the gcc shipped with RHEL 5 is old enough that the type conversion
via multiplication by 1 was insufficient for the task.

* src/qemu/qemu_monitor.c (qemuMonitorBlockJob): Previous attempt
didn't get past all gcc versions.
2012-05-14 09:14:58 -06:00
William Jon McCann
32a9aac2e0 Use XDG Base Directories instead of storing in home directory
As defined in:
http://standards.freedesktop.org/basedir-spec/basedir-spec-latest.html

This offers a number of advantages:
 * Allows sharing a home directory between different machines, or
sessions (eg. using NFS)
 * Cleanly separates cache, runtime (eg. sockets), or app data from
user settings
 * Supports performing smart or selective migration of settings
between different OS versions
 * Supports reseting settings without breaking things
 * Makes it possible to clear cache data to make room when the disk
is filling up
 * Allows us to write a robust and efficient backup solution
 * Allows an admin flexibility to change where data and settings are stored
 * Dramatically reduces the complexity and incoherence of the
system for administrators
2012-05-14 15:15:58 +01:00
Peter Krempa
c833526924 qemu: Don't skip detection of virtual cpu's on non KVM targets
This patch lifts the limit of calling thread detection code only on KVM
guests. With upstream qemu the thread mappings are reported also on
non-KVM machines.

QEMU adopted the thread_id information from the kvm branch.

To remain compatible with older upstream versions of qemu the check is
attempted but the failure to detect threads (or even run the monitor
command - on older versions without SMP support) is treated non-fatal
and the code reports one vCPU with pid of the hypervisor (in same
fashion this was done on non-KVM guests).
2012-05-11 16:40:05 +02:00
Peter Krempa
3163682b58 qemu: Re-detect virtual cpu threads after cpu hot (un)plug.
After a cpu hotplug the qemu driver did not refresh information about
virtual processors used by qemu and their corresponding threads. This
patch forces a re-detection as is done on start of QEMU.

This ensures that correct information is reported by the
virDomainGetVcpus API and "virsh vcpuinfo".

A failure to obtain the thread<->vcpu mapping is treated non-fatal and
the mapping is not updated in a case of failure as not all versions of
QEMU report this in the info cpus command.
2012-05-11 16:40:05 +02:00
Peter Krempa
e99ad93d02 qemu: Refactor qemuDomainSetVcpusFlags
This patch changes a switch statement into ifs when handling live vs.
configuration modifications getting rid of redundant code in case when
both live and persistent configuration gets changed.
2012-05-11 16:40:05 +02:00
Guannan Ren
ab5fb8f34c usb: fix crash when failing to attach a second usb device
when failing to attach another usb device to a domain for some reason
which has one use device attached before, the libvirtd crashed.
The crash is caused by null-pointer dereference error in invoking
usbDeviceListSteal passed in NULL value usb variable.
commit 05abd1507d introduces the bug.
2012-05-11 14:29:15 +08:00
Eric Blake
5c650b98ce qemu: fix build when !HAVE_NUMACTL
Commit 97010eb1f forgot to change the other side of an #ifdef.

* src/qemu/qemu_process.c (qemuProcessInitNumaMemoryPolicy): Add
argument.
2012-05-09 17:59:46 -06:00
Osier Yang
a00efddab6 numad: Divide cur_balloon by 1024 before passing it to numad
Numad expects MB by default.
2012-05-08 16:57:37 -06:00
Osier Yang
97010eb1f1 numad: Set memory policy from numad advisory nodeset
Though numad will manage the memory allocation of task dynamically,
it wants management application (libvirt) to pre-set the memory
policy according to the advisory nodeset returned from querying numad,
(just like pre-bind CPU nodeset for domain process), and thus the
performance could benefit much more from it.

This patch introduces new XML tag 'placement', value 'auto' indicates
whether to set the memory policy with the advisory nodeset from numad,
and its value defaults to the value of <vcpu> placement, or 'static'
if 'nodeset' is specified. Example of the new XML tag's usage:

  <numatune>
    <memory placement='auto' mode='interleave'/>
  </numatune>

Just like what current "numatune" does, the 'auto' numa memory policy
setting uses libnuma's API too.

If <vcpu> "placement" is "auto", and <numatune> is not specified
explicitly, a default <numatume> will be added with "placement"
set as "auto", and "mode" set as "strict".

The following XML can now fully drive numad:

1) <vcpu> placement is 'auto', no <numatune> is specified.

   <vcpu placement='auto'>10</vcpu>

2) <vcpu> placement is 'auto', no 'placement' is specified for
   <numatune>.

   <vcpu placement='auto'>10</vcpu>
   <numatune>
     <memory mode='interleave'/>
   </numatune>

And it's also able to control the CPU placement and memory policy
independently. e.g.

1) <vcpu> placement is 'auto', and <numatune> placement is 'static'

   <vcpu placement='auto'>10</vcpu>
   <numatune>
     <memory mode='strict' nodeset='0-10,^7'/>
   </numatune>

2) <vcpu> placement is 'static', and <numatune> placement is 'auto'

   <vcpu placement='static' cpuset='0-24,^12'>10</vcpu>
   <numatune>
     <memory mode='interleave' placement='auto'/>
   </numatume>

A follow up patch will change the XML formatting codes to always output
'placement' for <vcpu>, even it's 'static'.
2012-05-08 16:57:32 -06:00
Eric Blake
8be304ecb9 snapshot: allow block devices past cgroup
It turns out that when cgroups are enabled, the use of a block device
for a snapshot target was failing with EPERM due to libvirt failing
to add the block device to the cgroup whitelist.  See also
https://bugzilla.redhat.com/show_bug.cgi?id=810200

* src/qemu/qemu_driver.c
(qemuDomainSnapshotCreateSingleDiskActive)
(qemuDomainSnapshotUndoSingleDiskActive): Account for cgroup.
(qemuDomainSnapshotCreateDiskActive): Update caller.
2012-05-08 15:59:58 -06:00
Alon Levy
ba97e4edc6 domain_conf: add "default" to list of valid spice channels
qemu's behavior in this case is to change the spice server behavior to
require secure connection to any channel not otherwise specified as
being in plaintext mode. libvirt doesn't currently allow requesting this
(via plaintext-channel=<channel name>).

RHBZ: 819499

Signed-off-by: Alon Levy <alevy@redhat.com>
2012-05-08 12:14:45 -06:00
Guannan Ren
05abd1507d qemu: call usb search function for hostdev initialization and hotplug
src/qemu/qemu_hostdev.c:
refactor qemuPrepareHostdevUSBDevices function, make it focus on
adding usb device to activeUsbHostdevs after check. After that,
the usb hotplug function qemuDomainAttachHostDevice also could use
it.
expand qemuPrepareHostUSBDevices to perform the usb search,
rollback on failure.

src/qemu/qemu_hotplug.c:
If there are multiple usb devices available with same vendorID and productID,
but with different value of "bus, device", we give an error to let user
use <address> to specify the desired one.
2012-05-07 23:36:25 +08:00
Guannan Ren
9914477efc usb: create functions to search usb device accurately
usbFindDevice():get usb device according to
                idVendor, idProduct, bus, device
                it is the exact match of the four parameters

usbFindDeviceByBus():get usb device according to bus, device
                  it returns only one usb device same as usbFindDevice

usbFindDeviceByVendor():get usb device according to idVendor,idProduct
                     it probably returns multiple usb devices.

usbDeviceSearch(): a helper function to do the actual search
2012-05-07 23:36:22 +08:00
Jiri Denemark
409b5f5495 qemu: Emit compatible XML when migrating a domain
When we added the default USB controller into domain XML, we efficiently
broke migration to older versions of libvirt that didn't support USB
controllers at all (0.9.4 and earlier) even for domains that don't use
anything that the older libvirt can't provide. We still want to present
the default USB controller in any XML seen by a user/app but we can
safely remove it from the domain XML used during migration. If we are
migrating to a new enough libvirt, it will add the controller XML back,
while older libvirt won't be confused with it although it will still
tell qemu to create the controller.

Similar approach can be used in the future whenever we find out we
always enabled some kind of device without properly advertising it in
domain XML.
2012-05-07 14:26:02 +02:00
Jiri Denemark
cd603008b1 qemu: Don't use virDomainDefFormat* directly
Always use appropriate qemuDomain{,Def}Format wrapper since it may do
some additional magic based on the flags.
2012-05-05 00:37:30 +02:00
Eric Blake
13f9a19326 qemu: reject blockiotune if qemu too old
Commit 4c82f09e added a capability check for qemu per-device io
throttling, but only applied it to domain startup.  As mentioned
in the previous commit (98cec05), the user can still get an 'internal
error' message during a hotplug attempt, when the monitor command
doesn't exist.  It is confusing to allow tuning on inactive domains
only to then be rejected when starting the domain.

* src/qemu/qemu_driver.c (qemuDomainSetBlockIoTune): Reject
offline tuning if online can't match it.
2012-05-04 16:13:56 -06:00
Eric Blake
98cec05288 qemu: don't modify domain on failed blockiotune
If you have a qemu build that lacks the blockio tune monitor command,
then this command:

$ virsh blkdeviotune rhel6u2 hda --total_bytes_sec 1000
error: Unable to change block I/O throttle
error: internal error Unexpected error

fails as expected (well, the error message is lousy), but the next
dumpxml shows that the domain was modified anyway.  Worse, that means
if you save the domain then restore it, the restore will likely fail
due to throttling being unsupported, even though no throttling should
even be active because the monitor command failed in the first place.

* src/qemu/qemu_driver.c (qemuDomainSetBlockIoTune): Check for
error before making modification permanent.
2012-05-04 16:13:53 -06:00
Stefan Berger
c0774482ff qemu: fix resource leak
Error: RESOURCE_LEAK:
/libvirt/src/qemu/qemu_driver.c:6968:
alloc_fn: Calling allocation function "calloc".
/libvirt/src/qemu/qemu_driver.c:6968:
var_assign: Assigning: "nodeset" =  storage returned from "calloc(1UL, 1UL)".
/libvirt/src/qemu/qemu_driver.c:6977:
noescape: Variable "nodeset" is not freed or pointed-to in function "virTypedParameterAssign".
/libvirt/src/qemu/qemu_driver.c:6997:
leaked_storage: Variable "nodeset" going out of scope leaks the storage it points to.
2012-05-04 10:42:09 -04:00
Eric Blake
cdce2f42d9 qemu: avoid 32-bit compiler warning
On 32-bit platforms, gcc warns that the comparison between a long
and (ULLONG_MAX/1024/1024) is always false; throwing in a type
conversion shuts up the warning.

* src/qemu/qemu_monitor.c (qemuMonitorBlockJob): Shut gcc up.
2012-05-03 17:04:34 -06:00
Li Zhang
0d631e9182 Correct indent errors in the function qemuDomainNetsRestart
qemuDomainNetsRestart indents with 3 spaces.

This patch is to correct it.

Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
2012-05-03 17:25:40 +08:00
Josh Durgin
b57e01532a qemu: allow snapshotting of sheepdog and rbd disks
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
2012-05-01 08:54:18 -06:00
Josh Durgin
d50cae3335 qemu: change rbd auth_supported separation character to ;
This works with newer qemu that doesn't allow escaping spaces.
It's backwards compatible as well.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
2012-05-01 08:49:24 -06:00
Jiri Denemark
9d2ac5453e qemu: Make sure qemu can access its directory in hugetlbfs
When libvirtd is started, we create "libvirt/qemu" directories under
hugetlbfs mount point. Only the "qemu" subdirectory is chowned to qemu
user and "libvirt" remains owned by root. If umask was too restrictive
when libvirtd started, qemu user may lose access to "qemu"
subdirectory. Let's explicitly grant search permissions to "libvirt"
directory for all users.
2012-04-30 08:17:40 +02:00
Michal Privoznik
378031088f qemu_agent: Report error class at least
Currently, qemu GA is not providing 'desc' field for errors like
we are used to from qemu monitor. Therefore, we fall back to this
general 'unknown error' string. However, GA is reporting 'class' which
is not perfect, but much more helpful than generic error string.
Thus we should fall back to class firstly and if even no class
is presented, then we can fall back to that generic string.

Before this patch:
virsh # dompmsuspend --target mem f16
error: Domain f16 could not be suspended
error: internal error unable to execute QEMU command
'guest-suspend-ram': unknown QEMU command error

After this patch:
virsh # dompmsuspend --target mem f16
error: Domain f16 could not be suspended
error: internal error unable to execute QEMU command
'guest-suspend-ram': The command has not been found
2012-04-28 09:39:46 +02:00
Stefan Berger
59b935f5ae More coverity findings addressed
More bug extermination in the category of:

Error: CHECKED_RETURN:

/libvirt/src/conf/network_conf.c:595:
check_return: Calling function "virAsprintf" without checking return value (as is done elsewhere 515 out of 543 times).

/libvirt/src/qemu/qemu_process.c:2780:
unchecked_value: No check of the return value of "virAsprintf(&msg, "was paused (%s)", virDomainPausedReasonTypeToString(reason))".

/libvirt/tests/commandtest.c:809:
check_return: Calling function "setsid" without checking return value (as is done elsewhere 4 out of 5 times).

/libvirt/tests/commandtest.c:830:
unchecked_value: No check of the return value of "virTestGetDebug()".

/libvirt/tests/commandtest.c:831:
check_return: Calling function "virTestGetVerbose" without checking return value (as is done elsewhere 41 out of 42 times).

/libvirt/tests/commandtest.c:833:
check_return: Calling function "virInitialize" without checking return value (as is done elsewhere 18 out of 21 times).


One note about the error in commandtest line 809: setsid() seems to fail when running the test -- could be removed ?
2012-04-27 17:25:35 -04:00
Eric Blake
2eabac008e blockjob: fix block-stream bandwidth race
With RHEL 6.2, virDomainBlockPull(dom, dev, bandwidth, 0) has a race
with non-zero bandwidth: there is a window between the block_stream
and block_job_set_speed monitor commands where an unlimited amount
of data was let through, defeating the point of a throttle.

This race was first identified in commit a9d3495e, and libvirt was
able to reduce the size of the window for that race.  In the meantime,
the qemu developers decided to fix things properly; per this message:
https://lists.gnu.org/archive/html/qemu-devel/2012-04/msg03793.html
the fix will be in qemu 1.1, and changes block-job-set-speed to use
a different parameter name, as well as adding a new optional parameter
to block-stream, which eliminates the race altogether.

Since our documentation already mentioned that we can refuse a non-zero
bandwidth for some hypervisors, I think the best solution is to do
just that for RHEL 6.2 qemu, so that the race is obvious to the user
(anyone using stock RHEL 6.2 binaries won't have this patch, and anyone
building their own libvirt with this patch for RHEL can also rebuild
qemu to get the modern semantics, so it is no real loss in behavior).

Meanwhile the code must be fixed to honor actual qemu 1.1 naming.
Rename the parameter to 'modern', since the naming difference now
covers more than just 'async' block-job-cancel.  And while at it,
fix an unchecked integer overflow.

* src/qemu/qemu_monitor.h (enum BLOCK_JOB_CMD): Drop unused value,
rename enum to match conventions.
* src/qemu/qemu_monitor.c (qemuMonitorBlockJob): Reflect enum rename.
* src/qemu_qemu_monitor_json.h (qemuMonitorJSONBlockJob): Likewise.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONBlockJob): Likewise,
and support difference between RHEL 6.2 and qemu 1.1 block pull.
* src/qemu/qemu_driver.c (qemuDomainBlockJobImpl): Reject
bandwidth during pull with too-old qemu.
* src/libvirt.c (virDomainBlockPull, virDomainBlockRebase):
Document this.
2012-04-27 13:00:56 -06:00
Jiri Denemark
2d76fea134 qemu: Use common helper when probing qemu capabilities
QEMU binary is called several times when we probe different kinds of
capabilities the binary supports. This patch introduces new common
helper so that all probes use a consistent way of invoking qemu.
2012-04-27 12:09:32 +02:00
Eric Blake
8e532d3403 qemu: improve errors related to offline domains
https://bugzilla.redhat.com/show_bug.cgi?id=816662 pointed out
that attempting 'virsh blockpull' on an offline domain gave a
misleading error message about qemu lacking support for the
operation, even when qemu was specifically updated to support it.
The real problem is that we have several capabilities that are
only determined when starting a domain, and therefore are still
clear when first working with an inactive domain (namely, any
capability set by qemuMonitorJSONCheckCommands).

While this patch was able to hoist an existing check in one of the
three culprits, it had to add redundant checks in the other two
places (because you always have to check for an active domain after
obtaining a VM job lock, but the capability bits were being checked
prior to obtaining the job lock).

Someday it would be nice to patch libvirt to cache the set of
capabilities per qemu binary (as determined by inode and timestamp),
rather than re-probing the binary every time a domain is started,
and to teach the cache how to query the monitor during the one
time the probe is made rather than having to wait until a guest
is started; then, a capability probe would succeed even for offline
guests because it just refers to the cache, and the single check for
an active domain after grabbing the job lock would be sufficient.
But since that will involve a lot more coding, I'm happy to go
with this simpler solution for an immediate solution.

* src/qemu/qemu_driver.c (qemuDomainPMSuspendForDuration)
(qemuDomainSnapshotCreateXML, qemuDomainBlockJobImpl): Check for
offline state before checking an online-only cap.
2012-04-26 16:43:05 -06:00
Jiri Denemark
8ef5f26361 qemu: Avoid bogus error at the end of tunnelled migration
Once qemu monitor reports migration has completed, we just closed our
end of the pipe and let migration tunnel die. This generated bogus error
in case we did so before the thread saw EOF on the pipe and migration
was aborted even though it was in fact successful.

With this patch we first wake up the tunnel thread and once it has read
all data from the pipe and finished the stream we close the
filedescriptor.

A small additional bonus of this patch is that real errors reported
inside qemuMigrationIOFunc are not overwritten by virStreamAbort any
more.
2012-04-26 16:30:23 +02:00
Jiri Denemark
25a63451ad qemu: Fix detection of failed migration
When QEMU reported failed or canceled migration, we correctly detected
it but didn't really consider it as an error condition and migration
protocol just went on. Luckily, some of the subsequent steps eventually
failed end we reported an (unrelated and mostly random) error back to
the caller.
2012-04-26 16:30:23 +02:00
Jiri Denemark
6d64694762 qemu: Preserve original error during migration
In some cases (spotted with broken connection during tunneled migration)
we were overwriting the original error with worse or even misleading
errors generated when we were cleaning up after failed migration.
2012-04-26 16:30:22 +02:00
Peter Krempa
a2ba53cf18 cpu: Improve error reporting on incompatible CPUs
This patch modifies the CPU comparrison function to report the
incompatibilities in more detail to ease identification of problems.

* src/cpu/cpu.h:
    cpuGuestData(): Add argument to return detailed error message.
* src/cpu/cpu.c:
    cpuGuestData(): Add passthrough for error argument.
* src/cpu/cpu_x86.c
    x86FeatureNames(): Add function to convert a CPU definition to flag
                       names.
    x86Compute(): - Add error message parameter
                  - Add macro for reporting detailed error messages.
                  - Improve error reporting.
                  - Simplify calculation of forbidden flags.
    x86DataIteratorInit():
    x86cpuidMatchAny(): Remove functions that are no longer needed.
* src/qemu/qemu_command.c:
    qemuBuildCpuArgStr(): - Modify for new function prototype
                          - Add detailed error reports
                          - Change error code on incompatible processors
                            to VIR_ERR_CONFIG_UNSUPPORTED instead of
                            internal error
* tests/cputest.c:
    cpuTestGuestData(): Modify for new function prototype
2012-04-23 10:59:51 +02:00
Eric Blake
6fb8a64d93 qemu: use consistent error when qemu binary is too old
Most of our errors complaining about an inability to support a
particular action due to qemu limitations used CONFIG_UNSUPPORTED,
but we had a few outliers.  Reported by Jiri Denemark.

* src/qemu/qemu_command.c (qemuBuildDriveDevStr): Prefer
CONFIG_UNSUPPORTED.
* src/qemu/qemu_driver.c (qemuDomainReboot)
(qemuDomainBlockJobImpl): Likewise.
* src/qemu/qemu_hotplug.c (qemuDomainAttachPciControllerDevice):
Likewise.
* src/qemu/qemu_monitor.c (qemuMonitorTransaction)
(qemuMonitorBlockJob, qemuMonitorSystemWakeup): Likewise.
2012-04-17 11:09:44 -06:00
Osier Yang
a4cda054e7 qemu: Split ide-drive into ide-cd and ide-hd
A "ide-drive" device can be either a hard disk or a CD-ROM,
if there is ",media=cdrom" specified for the backend, it's
a CD-ROM, otherwise it's a hard disk.

Upstream qemu splitted "ide-drive" into "ide-hd" and "ide-cd"
since commit 1f56e32, and ",media=cdrom" is not required for
ide-cd anymore. "ide-drive" is still supported for backwards
compatibility, but no doubt we should go foward.
2012-04-17 17:21:48 +08:00
Osier Yang
02e8d0cfdf qemu: Split scsi-disk into into scsi-hd and scsi-cd
A "scsi-disk" device can be either a hard disk or a CD-ROM,
if there is ",media=cdrom" specified for the backend, it's
a CD-ROM, otherwise it's a hard disk.

But upstream qemu splitted "scsi-disk" into "scsi-hd" and
"scsi-cd" since commit b443ae, and ",media=cdrom" is not
required for scsi-cd anymore. "scsi-disk" is still supported
for backwards compatibility, but no doubt we should go
foward.
2012-04-17 17:21:24 +08:00
Jan Kiszka
dde91ab917 Do not enforce source type of console[0]
If console[0] is an alias for serial[0], do not enforce the former to
have a PTY source type. This breaks serial consoles on stdio and makes
no sense.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
2012-04-16 22:24:20 -06:00
Michal Privoznik
63ddc65d63 qemuProcessStart: Switch to flags instead of bunch booleans
Currently, we have 3 boolean arguments we have to pass
to qemuProcessStart(). As libvirt grows it is harder and harder
to remember them and their position. Therefore we should
switch to flags instead.
2012-04-16 17:20:04 +02:00
Osier Yang
6fbd5737e9 qemu: Avoid the memory allocation and freeing 2012-04-16 18:09:10 +08:00
Osier Yang
ccf80e3630 numad: Convert node list to cpumap before setting affinity
Instead of returning a CPUs list, numad returns NUMA node
list instead, this patch is to convert the node list to
cpumap before affinity setting. Otherwise, the domain
processes will be pinned only to CPU[$numa_cell_num],
which will cause significiant performance losses.

Also because numad will balance the affinity dynamically,
reflecting the cpuset from numad back doesn't make much
sense then, and it may just could produce confusion for
the users. Thus the better way is not to reflect it back
to XML. And in this case, it's better to ignore the cpuset
when parsing XML.

The codes to update the cpuset is removed in this patch
incidentally, and there will be a follow up patch to ignore
the manually specified "cpuset" if "placement" is "auto",
and document will be updated too.
2012-04-16 18:09:05 +08:00
Michal Privoznik
354e6d4ed0 qemu: Fix mem leak in qemuProcessInitCpuAffinity
If placement mode is AUTO, on some return paths char *cpumap or
char *nodeset are leaked.
2012-04-13 12:01:53 +02:00
D. Herrendoerfer
997366ca7d qemu,util: fix netlink callback registration for migration
This patch adds a netlink callback when migrating a VEPA enabled
virtual machine.  It fixes a Bug where a VM would not request a port
association when it was cleared by lldpad.

This patch requires the latest git version of lldpad to work.

Signed-off-by: D. Herrendoerfer <d.herrendoerfer@herrendoerfer.name>
2012-04-12 14:32:10 -04:00
Michal Privoznik
b1256816ff qemuOpenFile: Don't force chown on NFS
If dynamic_ownership is off and we are creating a file on NFS
we force chown. This will fail as chown/chmod are not supported
on NFS. However, with no dynamic_ownership we are not required
to do any chown.
2012-04-12 13:53:38 +02:00
Eric Blake
a9d3495e67 blockjob: allow for fast-finishing job
In my testing, I was able to provoke an odd block pull failure:

$ virsh blockpull dom vda --bandwidth 10000
error: Requested operation is not valid: No active operation on device: drive-virtio-disk0

merely by using gdb to artifically wait to do the block job set speed
until after the pull had already finished.  But in reality, that should
be a success, since the pull finished before we had a chance to set
speed.  Furthermore, using a double job lock is not only annoying, but
a bug in itself - if you do parallel virDomainBlockRebase, and hit
the race window just right, the first call grabs the VM job to start
a fast block job, then the second call grabs the VM job to start
a long-running job with unspecified speed, then the first call finally
regrabs the VM job and sets the speed, which ends up running the
second job under the speed from the first call.  By consolidating
things into a single job, we avoid opening that race, as well as reduce
the time between starting the job and changing the speed, for less
likelihood of the speed change happening after block job completion
in the first place.

* src/qemu/qemu_monitor.h (BLOCK_JOB_CMD): Add new mode.
* src/qemu/qemu_driver.c (qemuDomainBlockRebase): Move secondary
job call...
(qemuDomainBlockJobImpl): ...here, for fewer locks.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONBlockJob): Change
return value on new internal mode.
2012-04-11 21:45:43 -06:00
Eric Blake
a91ce852b5 blockjob: wire up qemu async virDomainBlockJobAbort
Without the VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC flag, libvirt will internally
poll using qemu's "query-block-jobs" API and will not return until the
operation has been completed.  API users are advised that this operation
is unbounded and further interaction with the domain during this period
may block.  Future patches may refactor things to allow other queries in
parallel with this polling.  For older qemu, we synthesize the cancellation
event, since qemu won't generate it.

The choice of polling duration copies from the code in qemu_migration.c.

Signed-off-by: Adam Litke <agl@us.ibm.com>
Cc: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2012-04-11 21:22:06 -06:00
Eric Blake
ecb39e9d4b blockjob: optimize JSON event handler lookup
Probably in the noise, but this will let us scale more efficiently
as we learn to recognize even more qemu events.

* src/qemu/qemu_monitor_json.c (eventHandlers): Sort.
(qemuMonitorEventCompare): New helper function.
(qemuMonitorJSONIOProcessEvent): Optimize event lookup.
2012-04-11 20:56:03 -06:00
Eric Blake
2b085f5bc5 blockjob: add qemu capabilities related to block pull jobs
RHEL 6.2 was released with an early version of block jobs, which only
worked on the qed file format, where the commands were spelled with
underscore (contrary to QMP style), and where 'block_job_cancel' was
synchronous and did not trigger an event.

The upcoming qemu 1.1 release has fixed these short-comings [1][2]:
the commands now work on multiple file types, are spelled with dash,
and 'block-job-cancel' is asynchronous and emits an event upon conclusion.

[1]qemu commit 370521a1d6f5537ea7271c119f3fbb7b0fa57063
[2]https://lists.gnu.org/archive/html/qemu-devel/2012-04/msg01248.html

This patch recognizes the new spellings, and fixes virDomainBlockRebase
to give a graceful error when talking to a too-old qemu on a partial
rebase attempt.  Fixes for the new semantics will come later.  This
patch also removes a bogus ATTRIBUTE_NONNULL mistakenly added in
commit 10ec36e2.

* src/qemu/qemu_capabilities.h (QEMU_CAPS_BLOCKJOB_SYNC)
(QEMU_CAPS_BLOCKJOB_ASYNC): New bits.
* src/qemu/qemu_capabilities.c (qemuCaps): Name them.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONCheckCommands): Set
them.
(qemuMonitorJSONBlockJob): Manage both command names.
(qemuMonitorJSONDiskSnapshot): Minor formatting fix.
* src/qemu/qemu_monitor.h (qemuMonitorBlockJob): Alter signature.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONBlockJob): Likewise.
* src/qemu/qemu_monitor.c (qemuMonitorBlockJob): Pass through
capability bit.
* src/qemu/qemu_driver.c (qemuDomainBlockJobImpl): Update callers.
2012-04-11 20:43:53 -06:00
Peter Krempa
3d3de46a67 qemu: Fix deadlock when qemuDomainOpenConsole cleans up a connection
The new safe console handling introduced a possibility to deadlock the
qemu driver when a new console connection forcibly disconnects a
previous console stream that belongs to an already closed connection.

The virStreamFree function calls subsequently a the virReleaseConnect
function that tries to lock the driver while discarding the connection,
but the driver was already locked in qemuDomainOpenConsole.

Backtrace of the deadlocked thread:
0  0x00007f66e5aa7f14 in __lll_lock_wait () from /lib64/libpthread.so.0
1  0x00007f66e5aa3411 in _L_lock_500 () from /lib64/libpthread.so.0
2  0x00007f66e5aa322a in pthread_mutex_lock () from/lib64/libpthread.so.0
3  0x0000000000462bbd in qemudClose ()
4  0x00007f66e6e178eb in virReleaseConnect () from/usr/lib64/libvirt.so.0
5  0x00007f66e6e19c8c in virUnrefStream () from /usr/lib64/libvirt.so.0
6  0x00007f66e6e3d1de in virStreamFree () from /usr/lib64/libvirt.so.0
7  0x00007f66e6e09a5d in virConsoleHashEntryFree () from/usr/lib64/libvirt.so.0
8  0x00007f66e6db7282 in virHashRemoveEntry () from/usr/lib64/libvirt.so.0
9  0x00007f66e6e09c4e in virConsoleOpen () from /usr/lib64/libvirt.so.0
10 0x00000000004526e9 in qemuDomainOpenConsole ()
11 0x00007f66e6e421f1 in virDomainOpenConsole () from/usr/lib64/libvirt.so.0
12 0x00000000004361e4 in remoteDispatchDomainOpenConsoleHelper ()
13 0x00007f66e6e80375 in virNetServerProgramDispatch () from/usr/lib64/libvirt.so.0
14 0x00007f66e6e7ae11 in virNetServerHandleJob () from/usr/lib64/libvirt.so.0
15 0x00007f66e6da897d in virThreadPoolWorker () from/usr/lib64/libvirt.so.0
16 0x00007f66e6da7ff6 in virThreadHelper () from/usr/lib64/libvirt.so.0
17 0x00007f66e5aa0c5c in start_thread () from /lib64/libpthread.so.0
18 0x00007f66e57e7fcd in clone () from /lib64/libc.so.6

* src/qemu/qemu_driver.c: qemuDomainOpenConsole()
        -- unlock the qemu driver right after acquiring the domain
        object
2012-04-11 10:45:53 +02:00
Jiri Denemark
6eede368bc qemu: Warn on possibly incorrect usage of EnterMonitor*
qemuDomainObjEnterMonitor{,WithDriver} should not be called from async
jobs, only EnterMonitorAsync variant is allowed.
2012-04-11 09:57:39 +02:00
Jiri Denemark
08ec1d787f qemu: Track job owner for better debugging
In case an API fails with "cannot acquire state change lock", searching
for the API that possibly forgot to end its job is not always easy.
Let's keep track of the job owner and print it out for easier
identification.
2012-04-11 09:57:39 +02:00
Jiri Denemark
31796e2c1c qemu: Avoid excessive calls to qemuDomainObjSaveJob()
As reported by Daniel Berrangé, we have a huge performance regression
for virDomainGetInfo() due to the change which makes virDomainEndJob()
save the XML status file every time it is called. Previous to that
change, 2000 calls to virDomainGetInfo() took ~2.5 seconds. After that
change, 2000 calls to virDomainGetInfo() take 2 *minutes* 45 secs.

We made the change to be able to recover from libvirtd restart in the
middle of a job. However, only destroy and async jobs are taken care of.
Thus it makes more sense to only save domain state XML when these jobs
are started/stopped.
2012-04-11 09:57:21 +02:00
Daniel P. Berrange
ddf2dfa1f7 Wire up <loader> to set the QEMU BIOS path
* src/qemu/qemu_command.c: Wire up -bios with <loader>
* tests/qemuxml2argvdata/qemuxml2argv-bios.args,
  tests/qemuxml2argvdata/qemuxml2argv-bios.xml: Expand
  existing BIOS test case to cover <loader>
2012-04-10 16:34:39 +01:00
Eric Blake
1413560966 snapshot: fix memory leak on error
Leak introduced in commit 0436d32.  If we allocate an actions array,
but fail early enough to never consume it with the qemu monitor
transaction call, we leaked memory.

But our semantics of making the transaction command free the caller's
memory is awkward; avoiding the memory leak requires making every
intermediate function in the call chain check for error.  It is much
easier to fix things so that the function that allocates also frees,
while the call chain leaves the caller's data intact.  To do that,
I had to hack our JSON data structure to make it easy to protect a
portion of an arbitrary JSON tree from being freed.

* src/util/json.h (virJSONType): Name the enum.
(_virJSONValue): New field.
* src/util/json.c (virJSONValueFree): Use it to protect a portion
of an array.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONTransaction): Avoid
freeing caller's data.
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateDiskActive):
Free actions array on failure.
2012-04-06 08:39:34 -06:00
Michal Privoznik
650da0e99c qemu_ga: Don't overwrite errors on FSThaw
We can tell qemuDomainSnapshotFSThaw if we want it to report errors or
not. However, if we don't want to and an error has been already set by
previous qemuReportError() we must keep copy of that error not just a
pointer to it. Otherwise, it get overwritten if FSThaw reports an error.
2012-04-06 13:42:04 +02:00
Michal Privoznik
ea3bc548ac qemu: Build activeUsbHostdevs list on process reconnect
If the daemon is restarted it will lose list of active
USB devices assigned to active domains. Therefore we need
to rebuild this list on qemuProcessReconnect().
2012-04-04 15:09:41 +02:00
Michal Privoznik
e2f5dd6134 qemu: Delete USB devices used by domain on stop
To prevent assigning one USB device to two domains,
we keep a list of assigned USB devices. On domain
startup - qemuProcessStart() - we insert devices
used by domain into the list but remove them only
on detach-device. Devices are, however, released
on qemuProcessStop() as well.
2012-04-04 15:09:41 +02:00
Michal Privoznik
b2c7b9ee0e qemu: Don't leak temporary list of USB devices
and add debug message when adding USB device
to the list of active devices.
2012-04-04 15:09:41 +02:00
Jiri Denemark
66cab01ae1 qemu: Start nested job in qemuDomainCheckEjectableMedia
Originally, qemuDomainCheckEjectableMedia was entering monitor with qemu
driver lock. Commit 2067e31bf9, which I
made to fix that, revealed another issue we had (but didn't notice it
since the driver was locked): we didn't set nested job when
qemuDomainCheckEjectableMedia is called during migration. Thus the
original fix I made was wrong.
2012-04-02 21:44:27 +02:00
Philipp Hahn
b8bf79aad7 Support clock=variable relative to localtime
Since Xen 3.1 the clock=variable semantic is supported. In addition to
qemu/kvm Xen also knows about a variant where the offset is relative to
'localtime' instead of 'utc'.

Extends the libvirt structure with a flag 'basis' to specify, if the
offset is relative to 'localtime' or 'utc'.

Extends the libvirt structure with a flag 'reset' to force the reset
behaviour of 'localtime' and 'utc'; this is needed for backward
compatibility with previous versions of libvirt, since they report
incorrect XML.

Adapt the only user 'qemu' to the new name.
Extend the RelaxNG schema accordingly.
Document the new 'basis' attribute in the HTML documentation.
Adapt test for the new attribute.

Signed-off-by: Philipp Hahn <hahn@univention.de>
2012-04-02 09:08:31 -06:00
Eric Blake
095b0bc46a qemu: reflect any memory rounding back to xml
If we round up a user's memory request, we should update the XML
to reflect the actual value in use by the VM, rather than giving
an artificially small value back to the user.

* src/qemu/qemu_command.c (qemuBuildNumaArgStr)
(qemuBuildCommandLine): Reflect rounding back to XML.
2012-03-31 09:17:35 -06:00
Hendrik Schwartke
2711ac8716 qemu: support live change of the bridge used by a guest network device
This patch was created to resolve this upstream bug:

  https://bugzilla.redhat.com/show_bug.cgi?id=784767

and is at least a partial solution to this RHEL RFE:

  https://bugzilla.redhat.com/show_bug.cgi?id=805071

Previously the only attribute of a network device that could be
modified by virUpdateDeviceFlags() ("virsh update-device") was the
link state; attempts to change any other attribute would log an error
and fail.

This patch adds recognition of a change in bridge device name, and
supports reconnecting the guest's interface to the new device.
Standard audit logs for detaching and attaching a network device are
also generated. Although the current auditing function doesn't log the
bridge being attached to, this will later be changed in a separate
patch.
2012-03-30 20:14:36 -04:00
Laine Stump
ecde15910a qemu: eliminate nested switch, simplify code
qemuBuildHostNetStr had a switch-within-a-switch where both were
looking at the same variable. This was apparently to take advantage of
code common to three different cases (while also taking care of some
code that was different). However, there were only 2 lines common to
all, one of those can be eliminated by merging it into the
virAsprintfs that are in each case. On top of that, all the extra
empty cases cause Coverity complaints (because they are unreachable),
but absence of the empty cases causes a compile error due to
"enumeration value not handled in switch".

The solution is to just make each toplevel case independent, folding
in the common code to each.
2012-03-30 12:41:18 -04:00
Laine Stump
3269ee657c qemu: set default name for SPICE agent channel when generating command
commit b0e2bb33 set a default value for the SPICE agent channel by
inserting it during parsing of the channel XML. That method of setting
a default is problematic because it makes a format/parse roundtrip
unclean, and experience with setting other values as a side effect of
parsing has led to headaches (e.g. automatically setting a MAC address
in the parser when one isn't specified in the input XML).

This patch does not revert commit b0e2bb33 (it will be reverted in a
separate patch) but adds the alternate implementation of simply
inserting the default value in the appropriate place on the qemu
commandline when no value is provided.
2012-03-30 12:37:52 -04:00
Michal Privoznik
075c8518c6 qemu_agent: Issue guest-sync prior to every command
If we issue guest command and GA is not running, the issuing thread
will block endlessly. We can check for GA presence by issuing
guest-sync with unique ID (timestamp). We don't want to issue real
command as even if GA is not running, once it is started, it process
all commands written to GA socket.
2012-03-30 18:16:17 +02:00
Daniel P. Berrange
ec8cae93db Consistent style for usage of sizeof operator
The code is splattered with a mix of

  sizeof foo
  sizeof (foo)
  sizeof(foo)

Standardize on sizeof(foo) and add a syntax check rule to
enforce it

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-03-30 11:47:24 +01:00
Wen Congyang
ff68d6eeb5 fix a deadlock when qemu cannot start
When qemu cannot start, we may call qemuProcessStop() twice.
We have check whether the vm is running at the beginning of
qemuProcessStop() to avoid libvirt deadlock. We call
qemuProcessStop() with driver and vm locked. It seems that
we can avoid libvirt deadlock. But unfortunately we may
unlock driver and vm in the function qemuProcessKill() while
vm->def->id is not -1. So qemuProcessStop() will be run twice,
and monitor will be freed unexpectedly. So we should set
vm->def->id to -1 at the beginning of qemuProcessStop().
2012-03-30 14:21:49 +08:00
Christian Benvenuti
a02500d010 qemu: Make migration fail when port profile association fails on the dst host
In the current V3 migration protocol, Libvirt does not
check the result of the function

  qemuMigrationVPAssociatePortProfiles

This means that it is possible for a migration to complete
successfully even when the VM loses network connectivity on
the destination host.

With this change libvirt aborts the migration
(during the "finish" step) when the above function fails, that
is to say when at least one of the port profile associations fails.

Signed-off by: Christian Benvenuti <benve@cisco.com>
2012-03-28 10:45:22 -06:00
Eric Blake
a14eda311e snapshot: don't pass NULL to QMP command creation
Commit d42a2ff caused a regression in creating a disk-only snapshot
of a qcow2 disk; by passing the wrong variable to the monitor call,
libvirt ended up creating JSON that looked like "format":null instead
of the intended "format":"qcow2".

To make it easier to diagnose this in the future, make JSON creation
error out if "s:arg" is paired with NULL (it is still possible to
use "n:arg" in the rare cases where qemu will accept a null).

* src/qemu/qemu_driver.c
(qemuDomainSnapshotCreateSingleDiskActive): Pass correct value.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONMakeCommandRaw):
Improve error message.
2012-03-27 09:34:07 -06:00
D. Herrendoerfer
bd6b0a052e qemu,util: on restart of libvirt restart vepa callbacks
When libvirtd is restarted, also restart the netlink event
message callbacks for existing VEPA connections and send
a message to lldpad for these existing links, so it learns
the new libvirtd pid.

Signed-off-by: D. Herrendoerfer <d.herrendoerfer@herrendoerfer.name>
2012-03-27 10:48:39 -04:00
Jiri Denemark
2067e31bf9 qemu: Avoid entering monitor with locked driver
This avoids possible deadlock of the qemu driver in case a domain is
begin migrated (in Begin phase) and unrelated connection to qemu driver
is closed at the right time.

I checked all callers of qemuDomainCheckEjectableMedia() and they are
calling this function with qemu driver locked.
2012-03-27 14:18:12 +02:00
Laine Stump
ecb4d92d57 build: fix "missing initializer" error in qemu_process.c
Found when attempting to build on Fedora 17 alpha with:

   ./autogen.sh --system --enable-compile-warnings=error

(this same build command works without problem on Fedora 16). Since
the consumer of the qemuProcessReconnectData doesn't assume that the
other fields of the struct are initialized (although it uses them
internally), the simpler solution is to just switch to C99-style
struct initialization (which doesn't require specification of all
fields).
2012-03-26 17:08:30 -04:00
Laine Stump
cf57d345b5 build: avoid frame size error when building without -O2
libvirt always adds -Werror-frame-larger-than=4096 to the flags when
it builds. When building on Fedora 17, two functions with multiple
1024 buffers declared inside if {} blocks would generate frame size
errors; apparently the version of gcc on Fedora 16 will merge these
multiple buffers into a single buffer even when optimization is off,
but Fedora 17 won't.

The fix is to declare a single 1024 buffer at the top of the two
offending functions, and reuse the single buffer throughout the
functions.
2012-03-26 17:08:30 -04:00
Martin Kletzander
9943276fd2 Cleanup for a return statement in source files
Return statements with parameter enclosed in parentheses were modified
and parentheses were removed. The whole change was scripted, here is how:

List of files was obtained using this command:
git grep -l -e '\<return\s*([^()]*\(([^()]*)[^()]*\)*)\s*;' |             \
grep -e '\.[ch]$' -e '\.py$'

Found files were modified with this command:
sed -i -e                                                                 \
's_^\(.*\<return\)\s*(\(\([^()]*([^()]*)[^()]*\)*\))\s*\(;.*$\)_\1 \2\4_' \
-e 's_^\(.*\<return\)\s*(\([^()]*\))\s*\(;.*$\)_\1 \2\3_'

Then checked for nonsense.

The whole command looks like this:
git grep -l -e '\<return\s*([^()]*\(([^()]*)[^()]*\)*)\s*;' |             \
grep -e '\.[ch]$' -e '\.py$' | xargs sed -i -e                            \
's_^\(.*\<return\)\s*(\(\([^()]*([^()]*)[^()]*\)*\))\s*\(;.*$\)_\1 \2\4_' \
-e 's_^\(.*\<return\)\s*(\([^()]*\))\s*\(;.*$\)_\1 \2\3_'
2012-03-26 14:45:22 -06:00
Osier Yang
beb76e3742 spec: Add missed dependancy for numad
numad is available since Fedora 17 and RHEL6.X. And it's not supported
on s390[x] and ARM.
2012-03-24 09:35:20 +08:00
Eric Blake
d42a2ffc07 snapshot: improve qemu handling of reused snapshot targets
The oVirt developers have stated that the real reasons they want
to have qemu reuse existing volumes when creating a snapshot are:
1. the management framework is set up so that creation has to be
done from a central node for proper resource tracking, and having
libvirt and/or qemu create things violates the framework, and
2. qemu defaults to creating snapshots with an absolute path to
the backing file, but oVirt wants to manage a backing chain that
uses just relative names, to allow for easier migration of a chain
across storage locations.

When 0.9.10 added VIR_DOMAIN_SNAPSHOT_CREATE_REUSE_EXT (commit
4e9953a4), it only addressed point 1, but libvirt was still using
O_TRUNC which violates point 2.  Meanwhile, the new qemu
'transaction' monitor command includes a new optional mode argument
that will force qemu to reuse the metadata of the file it just
opened (with the burden on the caller to have valid metadata there
in the first place).  So, this tweaks the meaning of the flag to
cover both points as intended for use by oVirt.  It is not strictly
backward-compatible to 0.9.10 behavior, but it can be argued that
the O_TRUNC of 0.9.10 was a bug.

Note that this flag is all-or-nothing, and only selects between
'existing' and the default 'absolute-paths'.  A more flexible
approach that would allow per-disk selections, as well as adding
support for the 'no-backing-file' mode, would be possible by
extending the <domainsnapshot> xml to have a per-disk mode, but
until we have a management application expressing a need for that
additional complexity, it is not worth doing.

* src/libvirt.c (virDomainSnapshotCreateXML): Tweak documentation.
* src/qemu/qemu_monitor.h (qemuMonitorDiskSnapshot): Add
parameters.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONDiskSnapshot):
Likewise.
* src/qemu/qemu_monitor.c (qemuMonitorDiskSnapshot): Pass them
through.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONDiskSnapshot): Use
new monitor command arguments.
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateDiskActive)
(qemuDomainSnapshotCreateSingleDiskActive): Adjust callers.
(qemuDomainSnapshotDiskPrepare): Allow qed, modify rules on reuse.
2012-03-23 16:38:20 -06:00
Eric Blake
0436d328f5 snapshot: wire up qemu transaction command
The hardest part about adding transactions is not using the new
monitor command, but undoing the partial changes we made prior
to a failed transaction.

* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateDiskActive): Use
transaction when available.
(qemuDomainSnapshotUndoSingleDiskActive): New function.
(qemuDomainSnapshotCreateSingleDiskActive): Pass through actions.
(qemuDomainSnapshotCreateXML): Adjust caller.
2012-03-23 16:38:20 -06:00
Eric Blake
64d5e815b7 snapshot: add support for qemu transaction command
QEmu 1.1 is adding a 'transaction' command to the JSON monitor.
Each element of a transaction corresponds to a top-level command,
with the additional guarantee that the transaction flushes all
pending I/O, then guarantees that all actions will be successful
as a group or that failure will roll back the state to what it
was before the monitor command.  The difference between a
top-level command:

{ "execute": "blockdev-snapshot-sync", "arguments":
  { "device": "virtio0", ... } }

and a transaction:

{ "execute": "transaction", "arguments":
  { "actions": [
    { "type": "blockdev-snapshot-sync", "data":
      { "device": "virtio0", ... } } ] } }

is just a couple of changed key names and nesting the shorter
command inside a JSON array to the longer command.  This patch
just adds the framework; the next patch will actually use a
transaction.

* src/qemu/qemu_monitor_json.c (qemuMonitorJSONMakeCommand): Move
guts...
(qemuMonitorJSONMakeCommandRaw): ...into new helper.  Add support
for array element.
(qemuMonitorJSONTransaction): New command.
(qemuMonitorJSONDiskSnapshot): Support use in a transaction.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONDiskSnapshot): Add
argument.
(qemuMonitorJSONTransaction): New declaration.
* src/qemu/qemu_monitor.h (qemuMonitorTransaction): Likewise.
(qemuMonitorDiskSnapshot): Add argument.
* src/qemu/qemu_monitor.c (qemuMonitorTransaction): New wrapper.
(qemuMonitorDiskSnapshot): Pass argument on.
* src/qemu/qemu_driver.c
(qemuDomainSnapshotCreateSingleDiskActive): Update caller.
2012-03-23 16:38:20 -06:00
Eric Blake
4c4cc1b96d snapshot: rudimentary qemu support for atomic disk snapshot
Taking an external snapshot of just one disk is atomic, without having
to pause and resume the VM.  This also paves the way for later patches
to interact with the new qemu 'transaction' monitor command.

The various scenarios when requesting atomic are:
online, 1 disk, old qemu - safe, allowed by this patch
online, more than 1 disk, old qemu - failure, this patch
offline snapshot - safe, once a future patch implements offline disk snapshot
online, 1 or more disks, new qemu - safe, once future patch uses transaction

Taking an online system checkpoint snapshot is atomic, since it is
done via a single 'savevm' monitor command.  Taking an offline system
checkpoint snapshot is atomic, thanks to the previous patch.

* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateXML): Support
new flag for single-disk setups.
(qemuDomainSnapshotDiskPrepare): Check for atomic here.
(qemuDomainSnapshotCreateDiskActive): Skip pausing the VM when
atomic supported.
(qemuDomainSnapshotIsAllowed): Use bool instead of int.
2012-03-23 16:38:20 -06:00
Eric Blake
922d498e1c snapshot: make offline qemu snapshots atomic
Offline internal snapshots can be rolled back with just a little
bit of refactoring, meaning that we are now automatically atomic.

* src/qemu/qemu_domain.c (qemuDomainSnapshotForEachQcow2): Move
guts...
(qemuDomainSnapshotForEachQcow2Raw): ...to new helper, to allow
rollbacks.
2012-03-23 16:38:20 -06:00
Eric Blake
311357d9e3 snapshot: add qemu capability for 'transaction' command
We need a capability bit to gracefully error out if some of the
additions in future patches can't be implemented by the running qemu.

* src/qemu/qemu_capabilities.h (QEMU_CAPS_TRANSACTION): New cap.
* src/qemu/qemu_capabilities.c (qemuCaps): Name it.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONCheckCommands): Set
it.
2012-03-23 16:38:19 -06:00
Osier Yang
7c5a0c94e4 qemu: Update domain status to running while wakeup event is emitted
This introduces a new running reason VIR_DOMAIN_RUNNING_WAKEUP,
and new suspend event type VIR_DOMAIN_EVENT_STARTED_WAKEUP.

While a wakeup event is emitted, the domain which entered into
VIR_DOMAIN_PMSUSPENDED will be transferred to "running"
with reason VIR_DOMAIN_RUNNING_WAKEUP, and a new domain lifecycle
event emitted with type VIR_DOMAIN_EVENT_STARTED_WAKEUP.
2012-03-23 23:12:29 +08:00
Osier Yang
321fa64bf5 qemu: Update domain state to pmsuspended while suspend event occurs 2012-03-23 23:12:26 +08:00
Osier Yang
487c063381 Add support for the suspend event
This patch introduces a new event type for the QMP event
SUSPEND:

    VIR_DOMAIN_EVENT_ID_PMSUSPEND

The event doesn't take any data, but considering there might
be reason for wakeup in future, the callback definition is:

typedef void
(*virConnectDomainEventSuspendCallback)(virConnectPtr conn,
                                        virDomainPtr dom,
                                        int reason,
                                        void *opaque);

"reason" is unused currently, always passes "0".
2012-03-23 23:12:18 +08:00
Osier Yang
57ddcc235a Add support for the wakeup event
This patch introduces a new event type for the QMP event
WAKEUP:

    VIR_DOMAIN_EVENT_ID_PMWAKEUP

The event doesn't take any data, but considering there might
be reason for wakeup in future, the callback definition is:

typedef void
(*virConnectDomainEventWakeupCallback)(virConnectPtr conn,
                                       virDomainPtr dom,
                                       int reason,
                                       void *opaque);

"reason" is unused currently, always passes "0".
2012-03-23 23:12:14 +08:00
Osier Yang
2d19e33f97 qemu: Update tray status while tray moved event is emitted
With this patch, libvirt won't start the guest with the medium
source which already ejected by guest when doing migration, or
saving/restoring.
2012-03-23 23:12:09 +08:00
Osier Yang
7fcf943bcd qemu: Prohibit setting tray status as open for block type disk 2012-03-23 23:12:02 +08:00
Osier Yang
ad7db43913 qemu: Do not start with source for removable disks if tray is open
This is similiar with physical world, one will be surprised if the
box starts with medium exists while the tray is open.

New tests are added, tests disk-{cdrom,floppy}-tray are for the qemu
supports "-device" flag, and disk-{cdrom,floppy}-no-device-cap are
for old qemu, i.e. which doesn't support "-device" flag.
2012-03-23 23:11:54 +08:00
Osier Yang
a26a1969c3 Add support for event tray moved of removable disks
This patch introduces a new event type for the QMP event
DEVICE_TRAY_MOVED, which occurs when the tray of a removable
disk is moved (i.e opened or closed):

    VIR_DOMAIN_EVENT_ID_TRAY_CHANGE

The event's data includes the device alias and the reason
for tray status' changing, which indicates why the tray
status was changed. Thus the callback definition for the event
is:

enum {
    VIR_DOMAIN_EVENT_TRAY_CHANGE_OPEN = 0,
    VIR_DOMAIN_EVENT_TRAY_CHANGE_CLOSE,

\#ifdef VIR_ENUM_SENTINELS
    VIR_DOMAIN_EVENT_TRAY_CHANGE_LAST
\#endif
} virDomainEventTrayChangeReason;

typedef void
(*virConnectDomainEventTrayChangeCallback)(virConnectPtr conn,
                                           virDomainPtr dom,
                                           const char *devAlias,
                                           int reason,
                                           void *opaque);
2012-03-23 23:10:26 +08:00
Daniel P. Berrange
1f66c18f79 Centralize error reporting for URI parsing/formatting problems
Move error reporting out of the callers, into virURIParse
and virURIFormat, to get consistency.

* include/libvirt/virterror.h, src/util/virterror.c: Add VIR_FROM_URI
* src/util/viruri.c, src/util/viruri.h: Add error reporting
* src/esx/esx_driver.c, src/libvirt.c, src/libxl/libxl_driver.c,
  src/lxc/lxc_driver.c, src/openvz/openvz_driver.c,
  src/qemu/qemu_driver.c, src/qemu/qemu_migration.c,
  src/remote/remote_driver.c, src/uml/uml_driver.c,
  src/vbox/vbox_tmpl.c, src/vmx/vmx.c, src/xen/xen_driver.c,
  src/xen/xend_internal.c, tests/viruritest.c: Remove error
  reporting

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-03-23 12:59:21 +00:00
Daniel P. Berrange
c33dae3175 Use virURIFree instead of xmlFreeURI
Since we defined a custom virURIPtr type, we should use a
virURIFree method instead of assuming it will always be
a typedef for xmlURIPtr

* src/util/viruri.c, src/util/viruri.h, src/libvirt_private.syms:
  Add a virURIFree method
* src/datatypes.c, src/esx/esx_driver.c, src/libvirt.c,
  src/qemu/qemu_migration.c, src/vmx/vmx.c, src/xen/xend_internal.c,
  tests/viruritest.c: s/xmlFreeURI/virURIFree/

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-03-23 12:59:20 +00:00
Jiri Denemark
1fdc53c385 qemu: Avoid dangling migration-out job when client dies
When a client which started non-p2p migration dies in a bad time, the
source libvirtd never clears the migration job and almost nothing can be
done with the domain without restarting the daemon. This patch makes use
of connection close callbacks and ensures that migration job is properly
discarded when the client disconnects.
2012-03-21 17:31:09 +01:00
Jiri Denemark
527d867a94 qemu: Make autodestroy utilize connection close callbacks 2012-03-21 17:31:09 +01:00
Jiri Denemark
791273603e qemu: Add connection close callbacks
Add support for registering arbitrary callback to be called for a domain
when a connection gets closed.
2012-03-21 17:31:09 +01:00
Jiri Denemark
4f061ea641 qemu: Avoid dangling migration-in job on shutoff domains
Destination daemon should not rely on the client or source daemon
(depending on the type of migration) to call Finish when migration
fails, because the client may crash before it can do so. The domain
prepared for incoming migration is set to be destroyed (and migration
job cleaned up) when connection with the client closes but this is not
enough. If the associated qemu process crashes after Prepare step and
the domain is cleaned up before the connection gets closed, autodestroy
is not called for the domain and migration jobs remains set. In case the
domain is defined on destination host (i.e., it is not completely
removed once destroyed) we keep the job set for ever. To fix this, we
register a cleanup callback which is responsible to clean migration-in
job when a domain dies anywhere between Prepare and Finish steps. Note
that we can't blindly clean any job when spotting EOF on monitor since
normally an API is running at that time.
2012-03-21 17:31:09 +01:00
Jiri Denemark
bf9f0a9726 qemu: Add support for domain cleanup callbacks
Add support for registering cleanup callbacks to be run when a domain
transitions to shutoff state.
2012-03-21 17:31:08 +01:00
Jiri Denemark
9f71368d06 qemu: Use unlimited speed when migrating to file
This reverts commit 61f2b6ba5f and most of
commit d8916dc8e2, which effectively
brings back commit ef1065cf5a written by
Jim Fehlig:

The qemu migration speed default is 32MiB/s as defined in migration.c

/* Migration speed throttling */
static int64_t max_throttle = (32 << 20);

There's no need to throttle migration when targeting a file, so set
migration speed to unlimited prior to migration, and restore to libvirt
default value after migration.

Default units is MB for migrate_set_speed monitor command, so
(INT64_MAX / (1024 * 1024)) is used for unlimited migration speed.

This was reverted because migration to file could not be canceled and
even monitored since qemu was not processing any monitor commands until
the migration finished. This is now different as we make sure the
file descriptor we pass to qemu is able to properly report EAGAIN.
Recent qemu changes might have helped as well.

I tested managedsave with this patch in and indeed, it is 10x faster
while I can still monitor its progress.
2012-03-21 17:26:20 +01:00
Eric Blake
7c736bab06 snapshot: make quiesce a bit safer
If a guest is paused, we were silently ignoring the quiesce flag,
which results in unclean snapshots, contrary to the intent of the
flag.  Since we can't quiesce without guest agent support, we should
instead fail if the guest is not running.

Meanwhile, if we attempt a quiesce command, but the guest agent
doesn't respond, and we time out, we may have left the command
pending on the guest's queue, and when the guest resumes parsing
commands, it will freeze even though our command is no longer
around to issue a thaw.  To be safe, we must _always_ pair every
quiesce call with a counterpart thaw, even if the quiesce call
failed due to a timeout, so that if a guest wakes up and starts
processing a command backlog, it will not get stuck in a frozen
state.

* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateDiskActive):
Always issue thaw after a quiesce, even if quiesce failed.
(qemuDomainSnapshotFSThaw): Add a parameter.
2012-03-19 10:58:18 -06:00
Daniel P. Berrange
f987d17511 Fix handling of blkio deviceWeight empty string
A common coding pattern for changing blkio parameters is

  1. virDomainGetBlkioParameters

  2. change one or more params

  3. virDomainSetBlkioParameters

For this to work, it must be possible to roundtrip through
the methods without error. Unfortunately virDomainGetBlkioParameters
will return "" for the deviceWeight parameter for guests by default,
which virDomainSetBlkioParameters will then reject as invalid.

This fixes the handling of "" to be a no-op, and also improves the
error message to tell you what was invalid
2012-03-16 15:05:05 +00:00
Michal Privoznik
362c3b33e6 qemuDomainDetachPciDiskDevice: Free allocated cgroup
This function potentially allocates new virCgroup but never
frees it.
2012-03-15 17:10:22 +01:00