Commit Graph

3044 Commits

Author SHA1 Message Date
Shivaprasad G Bhat
81fae6b95c qemu: fix live pinning to memory node on NUMA system
Ever since the subcpusets(vcpu,emulator) were introduced, the parent
cpuset cannot be modified to remove the nodes that are in use by the
subcpusets.
The fix is to break the memory node modification into three steps:
 1. assign new nodes into the parent,
 2. change the nodes in the child nodes,
 3. remove the old nodes on the parent node.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1009880

Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2013-12-04 13:32:07 +01:00
Laine Stump
5e12641e0e qemu: report error on attempt to live change virtio-net queues
This resolves:

  https://bugzilla.redhat.com/show_bug.cgi?id=1029732

The BZ asked for the capability to change the number of queues used by
a virtio-net device while the device is in use. Because the number of
queues can only be set at the time the device is created, that isn't
possible. However, libvirt also shouldn't be silently reporting
success when someone tries to change the number of queues. So this
patch flags that as an error (just as attempts to change any of the
other virtio-specific parameters already do).
2013-12-03 16:50:59 +02:00
Laine Stump
96fddee322 qemu: add "-boot strict" to commandline whenever possible
This resolves:

  https://bugzilla.redhat.com/show_bug.cgi?id=888635

(which was already closed as CANTFIX because the qemu "-boot strict"
commandline option wasn't available at the time).

Problem: you couldn't have a domain that used PXE to boot, but also
had an un-bootable disk device *even if that disk wasn't listed in the
boot order*, because if PXE timed out (e.g. due to the bridge
forwarding delay), the BIOS would move on to the next target, which
would be the unbootable disk device (again - even though it wasn't
given a boot order), and get stuck at a "BOOT DISK FAILURE, PRESS ANY
KEY" message until a user intervened.

The solution available since sometime around QEMU 1.5, is to add
"-boot strict=on" to *every* qemu command. When this is done, if any
devices have a boot order specified, then QEMU will *only* attempt to
boot from those devices that have an explicit boot order, ignoring the
rest.
2013-12-03 11:58:26 +02:00
Laine Stump
47b9aae0ae qemu: default to vfio for nodedev-detach
This patch resolves:

  https://bugzilla.redhat.com/show_bug.cgi?id=1035188

Commit f094aaac48 changed the PCI device assignment in qemu domains
to default to using VFIO rather than legacy KVM device assignment
(when VFIO is available). It didn't change which driver was used by
default for virNodeDeviceDetachFlags(), though, so that API (and the
virsh nodedev-detach command) was still binding to the pci-stub
driver, used by legacy KVM assignment, by default.

This patch publicizes (only within the qemu module, though, so no
additions to the symbol exports are needed) the functions that check
for presence of KVM and VFIO device assignment, then uses those
functions to decide what to do when no driver is specified for
virNodeDeviceDetachFlags(); if the vfio driver is loaded, the device
will be bound to vfio-pci, or if legacy KVM assignment is supported on
this system, the device will be bound to pci-stub; if neither method
is available, the detach will fail.
2013-12-03 11:58:26 +02:00
Peter Krempa
26fb96d8c0 qemu: snapshots: Declare supported and unsupported snapshot configs
Currently the snapshot code did not check if it actually supports
snapshots on various disk backends for domains. To avoid future problems
add checkers that whitelist the supported configurations.
2013-12-03 10:41:05 +01:00
Peter Krempa
bdeb0f0123 qemu: Clear old translated pool source
Clear the old data to avoid leaking it when attempting to re-translate a
pool on the same domain object.
2013-12-03 10:38:40 +01:00
Peter Krempa
0df53f0432 qemu: Refactor disk source string formatting
This patch adds function qemuGetDriveSourceString to produce
qemu-compatible disk source strings that will enable to reuse the code
and refactors building of the qemu commandline of disks to use this new
helper.
2013-12-03 10:36:12 +01:00
Peter Krempa
b384e2b4d7 qemu: Unify formatting of RBD sources 2013-12-03 10:31:19 +01:00
Peter Krempa
d94fd0c9c2 qemu: Split out NBD command generation 2013-12-03 10:28:57 +01:00
Peter Krempa
eaa1539b2f qemu: Migrate sheepdog source generation into common function 2013-12-03 10:27:11 +01:00
Peter Krempa
078a102537 qemu: Use qemuBuildNetworkDriveURI to handle http/ftp and friends
Prepare the function to integrate other protocols and start folding
other network protocols into a common place.
2013-12-03 10:25:11 +01:00
Peter Krempa
927ddae197 qemu: Simplify call pattern of qemuBuildDriveURIString
Automatically assign secret type from the disk source definition and
pull in adding of the comma. Then update callers to keep generated
output the same.
2013-12-03 10:23:16 +01:00
Peter Krempa
a29d33ffcb qemu: Split out formatting of network disk source URI
The snapshot code will need to use qemu-style formatted URIs of network
disks. Split out the code to avoid duplication.
2013-12-03 10:19:30 +01:00
Peter Krempa
e1a4d08baf qemu: Refactor qemuTranslateDiskSourcePool
Before this patch, the translation function still needs a second ugly
helper function to actually format the command line for qemu. But if we
do the right stuff in the translation function, we don't have to bother
with the second function any more.

This patch removes the messy qemuBuildVolumeString function and changes
qemuTranslateDiskSourcePool to set stuff up correctly so that the
regular code paths meant for volumes can be used to format the command
line correctly.

For this purpose a new helper "qemuDiskGetActualType()" is introduced to
return the type of the volume in a pool.

As a part of the refactor the qemuTranslateDiskSourcePool function is
fixed to do decisions based on the pool type instead of the volume type.
This allows to separate pool-type-specific stuff more clearly and will
ease addition of other pool types that will require certain other
operations to get the correct pool source.

The previously fixed tests should make sure that we don't break stuff
that was working before.
2013-12-03 10:16:12 +01:00
Peter Krempa
7e6242e9a7 qemu: snapshot: Add functions similar to disk source pool translation
To avoid future pain, add placeholder functions to get the actual
snapshot disk type.
2013-12-02 14:43:13 +01:00
Peter Krempa
cdf02d6474 qemu: snapshot: Touch up error message 2013-12-02 14:43:07 +01:00
Peter Krempa
d8cf91ae38 qemu: snapshot: Detect internal snapshots also for sheepdog and RBD
When doing an internal snapshot on a VM with sheepdog or RBD disks we
would not set a flag to mark the domain is using internal snapshots and
might end up creating a mixed snapshot. Move the setting of the variable
to avoid this problem.
2013-12-02 14:31:03 +01:00
Bing Bu Cao
8e043864ec qemu: preserve netdev MAC address during 'domxml-to-native'
The virsh command 'domxml-to-native' (virConnectDomainXMLToNative())
converts all network devices to "type='ethernet'" in order to make it
more likely that the generated command could be run directly from a
shell (other libvirt network device types end up referencing file
descriptors for tap devices assumed to have been created by libvirt,
which can't be done in this case).

During this conversion, all of the netdev parameters are cleared out,
then specific items are filled in after changing the type. The MAC
address was not one of these preserved items, and the result was that
mac addresses in the generated commandlines were always
00:00:00:00:00:00.

This patch saves the mac address before the conversion, then
repopulates it afterwards, so the proper mac addresses show up in the
commandline.

Signed-off-by: Bing Bu Cao <mars@linux.vnet.ibm.com>
Signed-off-by: Laine Stump <laine@laine.org>
2013-11-27 14:20:18 +02:00
Eric Blake
ecd881b7a7 storage: add network-dir as new storage volume type
In the 'directory' and 'netfs' storage pools, a user can see
both 'file' and 'dir' storage volume types, to know when they
can descend into a subdirectory.  But in a network-based storage
pool, such as the upcoming 'gluster' pool, we use 'network'
instead of 'file', and did not have any counterpart for a
directory until this patch.  Adding a new volume type
'network-dir' is better than reusing 'dir', because it makes
it clear that the only way to access 'network' volumes within
that container is through the network mounting (leaving 'dir'
for something accessible in the local file system).

* include/libvirt/libvirt.h.in (virStorageVolType): Expand enum.
* docs/formatstorage.html.in: Document it.
* docs/schemasa/storagevol.rng (vol): Allow new value.
* src/conf/storage_conf.c (virStorageVol): Use new value.
* src/qemu/qemu_command.c (qemuBuildVolumeString): Fix client.
* src/qemu/qemu_conf.c (qemuTranslateDiskSourcePool): Likewise.
* tools/virsh-volume.c (vshVolumeTypeToString): Likewise.
* src/storage/storage_backend_fs.c
(virStorageBackendFileSystemVolDelete): Likewise.

Signed-off-by: Eric Blake <eblake@redhat.com>
2013-11-25 12:29:49 -07:00
Shivaprasad G Bhat
ec1c34498b virsh domxml-from-native to treat SCSI as the bus type for pseries by default
The bus type IDE being enum Zero, the bus type on pseries system appears as IDE for all the -hda/-cdrom and for disk drives with if="none" type. Pseries platform needs this to appear as SCSI instead of IDE. The ide being not supported, the explicit requests for ide devices will return an error.

Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
2013-11-25 10:44:46 -05:00
Ján Tomko
9846402116 Don't start a nested job in qemuMigrationPrepareAny
This nested job is canceled by the first ExitMonitor call (even though
it was not created by the corresponding EnterMonitor call), and
again in qemuMigrationPrepareAny if qemuProcessStart failed.
This can lead to a crash if the vm object was disposed of before calling
qemuDomainRemoveInactive:
0  ..62bc in virClassIsDerivedFrom (klass=0xdeadbeef,
   parent=0x7ffce4cdd270) at util/virobject.c:166
1 ..6666 in virObjectIsClass at util/virobject.c:362
2 ..66b4 in virObjectLock at util/virobject.c:314
3 ..477e in virDomainObjListRemove at conf/domain_conf.c:2359
4 ..7a64 in qemuDomainRemoveInactive at qemu/qemu_domain.c:2087
5 ..956c in qemuMigrationPrepareAny at qemu/qemu_migration.c:2469

This was added by commit e4e2822, exposed by 5a4c237 and c7ac251.

https://bugzilla.redhat.com/show_bug.cgi?id=1018267
2013-11-22 16:22:31 +01:00
Eric Farman
881eb78064 qemu: Auto-generate controller for hotplugged hostdev
If a SCSI hostdev is included in an initial domain XML, without a
corresponding controller statement, one is created silently when the
guest is booted.

When hotplugging a SCSI hostdev, a presumption is that the controller
is already present in the domain either from the original XML, or via
an earlier hotplug.

  [root@xxxxxxxx ~]# cat disk.xml
  <hostdev mode='subsystem' type='scsi'>
    <source>
      <adapter name='scsi_host0'/>
      <address bus='0' target='3' unit='1088438288'/>
    </source>
  </hostdev>
  [root@xxxxxxxx ~]# virsh attach-device guest01 disk.xml
  error: Failed to attach device from disk.xml
  error: internal error: unable to execute QEMU command 'device_add': Bus 'scsi0.0' not found

Since the infrastructure is in place, we can also create a controller
silently for use by the hotplugged hostdev device.

Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
2013-11-21 10:38:57 +01:00
Eric Farman
6f22f95f77 qemu: Separate calls based on controller bus type
For systems without a PCI bus, attaching a SCSI controller fails:

  [root@xxxxxxxx ~]# cat controller.xml
  <controller type='scsi' model='virtio-scsi' index='0' />
  [root@xxxxxxxx ~]# virsh attach-device guest01 controller.xml
  error: Failed to attach device from controller.xml
  error: XML error: No PCI buses available

A similar problem occurs with the detach of a controller:

  [root@xxxxxxxx ~]# virsh detach-device guest01 controller.xml
  error: Failed to detach device from controller.xml
  error: operation failed: controller scsi:0 not found

The qemuDomainXXtachPciControllerDevice routines made assumptions
that any caller had a PCI bus.  These routines now selectively calls
PCI functions where necessary, and assigns the device information
type to one appropriate for the bus in use.

Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2013-11-21 10:38:53 +01:00
Eric Farman
271eb0584b qemu: Rename controller hotplug functions to not be PCI-specific
For attach/detach of controller devices, we rename the functions to
remove 'PCI' from their title.  The actual separation of PCI-specific
operations will be handled in the next patch.

Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
2013-11-21 10:05:46 +01:00
Clark Laughlin
c7ccd2c44b qemu: Add support for virt machine type with virtio-mmio devices on armv7
These changes allow the correct virtio-blk-device and virtio-net-device
devices to be used for the 'virt' machine type for armv7 rather than the
PCI virtio devices.

A test case was added to qemuxml2argvtest for this change.

Signed-off-by: Clark Laughlin <clark.laughlin@linaro.org>
2013-11-20 14:31:17 -05:00
Eric Blake
5d509e9ee2 maint: fix comma style issues: qemu
Most of our code base uses space after comma but not before;
fix the remaining uses before adding a syntax check.

* src/qemu/qemu_cgroup.c: Consistently use commas.
* src/qemu/qemu_command.c: Likewise.
* src/qemu/qemu_conf.c: Likewise.
* src/qemu/qemu_driver.c: Likewise.
* src/qemu/qemu_monitor.c: Likewise.

Signed-off-by: Eric Blake <eblake@redhat.com>
2013-11-20 09:14:55 -07:00
Michal Privoznik
730af8f2cd qemuMonitorJSONGetCPUx86Data: Don't fail on ancient qemus
On the domain startup, this function is called to dump some info about
the CPUs. At the beginning of the function we check if we aren't running
older qemu which is not exposing the CPUs via 'qom-list'. However, we
are not checking for even older qemus, which throw 'CommandNotFound'
error.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2013-11-19 16:28:16 +01:00
Eric Blake
4a601c3080 maint: fix comment typos.
* src/qemu/qemu_command.c (qemuBuildVolumeString): Fix typo.
* src/qemu/qemu_monitor.c (qemuMonitorSend): Likewise.

Signed-off-by: Eric Blake <eblake@redhat.com>
2013-11-18 16:31:42 -07:00
Ján Tomko
8c41794af8 Return -1 in virPortAllocatorAcquire if all ports are used
Report the error in virPortAllocatorAcquire instead
of doing it in every caller.

The error contains the port range name instead of the intended
use for the port, e.g.:
Unable to find an unused port in range 'display' (65534-65535)
instead of:
Unable to find an unused port for SPICE

This also adds error reporting when the QEMU driver could not
find an unused port for VNC, VNC WebSockets or NBD migration.
2013-11-18 12:28:07 +01:00
Ján Tomko
d16d90fd40 Add a name to virPortAllocator
This allows its error messages to be more specific.
2013-11-18 12:28:02 +01:00
Ján Tomko
28ea39a004 Don't release spice port twice when no TLS port is available
Introduced by 7b4a630.
2013-11-18 12:26:59 +01:00
Michael Avdienko
d35ae4143d Fix migration with QEMU 1.6
QEMU 1.6.0 introduced new migration status: setup
Libvirt does not expect such string in QMP and refuses to migrate with error
"unexpected migration status in setup"

This patch fixes it.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2013-11-15 13:50:47 +01:00
Jiri Denemark
05e149f94c qemu: Call qemuSetupHostdevCGroup later during hotplug
https://bugzilla.redhat.com/show_bug.cgi?id=1025108

So far qemuSetupHostdevCGroup was called very early during hotplug, even
before we knew the device we were about to hotplug was actually
available. By calling the function later, we make sure QEMU won't be
allowed to access devices used by other domains.

Another important effect of this change is that hopluging USB devices
specified by vendor and product (but not by their USB address) works
again. This was broken since v1.0.5-171-g7d763ac, when the call to
qemuFindHostdevUSBDevice was moved after the call to
qemuSetupHostdevCGroup, which then used an uninitialized USB address.
2013-11-15 13:50:47 +01:00
Michal Privoznik
f417ad07df qemuMonitorIO: Don't use @mon after it's unrefed
https://bugzilla.redhat.com/show_bug.cgi?id=1018267

The aim of virObject refing and urefing is to tell where the object is
to be used and when is no longer needed. Hence any object shouldn't be
used after it has been unrefed, as we might be the last to hold the
reference. The better way is to call virObjectUnref() *after* the last
object usage. In this specific case, the monitor EOF handler was called
after the qemuMonitorIO called virObjectUnref. Not only that @mon was
disposed (which is not used in the handler anyway) but the @mon->vm
which is causing a SIGSEGV:

2013-11-15 10:17:54.425+0000: 20110: error : qemuMonitorIO:688 : internal error: early end of file from monitor: possible problem:
qemu-kvm: -incoming tcp:01.01.01.0:49152: Failed to bind socket: Cannot assign requested address

Program received signal SIGSEGV, Segmentation fault.
qemuProcessHandleMonitorEOF (mon=<optimized out>, vm=0x7fb728004170) at qemu/qemu_process.c:299
299         if (priv->beingDestroyed) {
(gdb) p *priv
Cannot access memory at address 0x0
(gdb) p vm
$1 = (virDomainObj *) 0x7fb728004170
(gdb) p *vm
$2 = {parent = {parent = {magic = 3735928559, refs = 0, klass = 0xdeadbeef}, lock = {lock = {__data = {__lock = 2, __count = 0, __owner = 20110, __nusers = 1, __kind = 0, __spins = 0, __list = {__prev = 0x0,
            __next = 0x0}}, __size = "\002\000\000\000\000\000\000\000\216N\000\000\001", '\000' <repeats 26 times>, __align = 2}}}, pid = 0, state = {state = 0, reason = 0}, autostart = 0, persistent = 0,
  updated = 0, def = 0x0, newDef = 0x0, snapshots = 0x0, current_snapshot = 0x0, hasManagedSave = false, privateData = 0x0, privateDataFreeFunc = 0x0, taint = 304}

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2013-11-15 11:32:35 +01:00
Michal Privoznik
3367c21dad qemuProcessReconnectHelper: Don't create joinable thread
In the qemuProcessReconnectHelper() a new thread that does all the
interesting work is spawned. The rationale is to not block the daemon
startup process in case of unresponsive qemu. However, the thread
handler is a local variable which gets lost once the control goes out of
scope. Hence the thread gets leaked. We can avoid this if the thread
isn't made joinable.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2013-11-14 17:44:20 +01:00
Peter Krempa
84f6960214 qemu: Check for presence of device and properities when getting CPUID
The QOM path in qemu that contains the CPUID registers of a running VM
may not be present (introduced in QEMU 1.5).

Since commit d94b781771 we have a regression with QEMU that don't
support reporting of the CPUID register state via the monitor as the
process startup code expects the path to exist.

This patch adds code that checks with the monitor if the requested path
already exists and uses it only in this case.
2013-11-12 19:36:06 +01:00
Peter Krempa
a6a6f84af9 qemu: Change return type of qemuMonitorGetGuestCPU()
To allow returning more granular errors, change the error type to an
integer.
2013-11-12 19:35:51 +01:00
Daniel P. Berrange
cbb6ec42e2 Don't expose 'none' machine type to capabilities
The 'none' machine type is something only intended for use
by libvirt probing capabilities. It isn't something that
is useful for running real VM instances. As such it should
not be exposed to users in the capabilities.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-11-12 10:47:30 +00:00
Daniel P. Berrange
f41830680e Fix mem leak in virQEMUCapsProbeQMPMachineTypes on OOM
The virQEMUCapsProbeQMPMachineTypes method iterates over machine
types copying them into the qemuCapsPtr object. It only updates
the qemuCaps->nmachinetypes value at the end though. So if OOM
occurs in the middle, the destructor of qemuCapsPtr will not
free the partially initialized machine types.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-11-12 10:47:30 +00:00
Peter Krempa
2b2decbdcc conf: Rename virDomainDiskHostDefFree to virDomainDiskHostDefClear
The function destroys only the contents not the object itself thus it
should be called Clear.
2013-11-12 10:38:34 +01:00
Michal Privoznik
cfc28c66f9 qemuDomainObjStart: Warn on corrupted image
If the managedsave image is corrupted, e.g. the XML part is, we fail to
parse it and throw an error, e.g.:

error: Failed to start domain jms8
error: XML error: missing security model when using multiple labels

This is okay, as we can't really start the machine and avoid undefined
qemu behaviour. On the other hand, the error message doesn't give a
clue to users what should they do. The consensus here would be to thrown
a warning to logs saying "Hey, you've got a corrupted file".

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2013-11-11 17:16:39 +01:00
Eric Blake
d0b2d0177b docs: grammar fixes
Fix some user-visible wording from commits 72aafe9 and 1606d89.

* src/qemu/qemu.conf (migration_address): Better wording.
* include/libvirt/libvirt.h.in (VIR_MIGRATE_PARAM_LISTEN_ADDRESS):
Likewise.

Signed-off-by: Eric Blake <eblake@redhat.com>
2013-11-08 10:37:06 -07:00
Vitor de Lima
efdd591dfb qemu: Fix SCSI hotplug on pseries guests
This patch moves some code in the qemuDomainAttachSCSIDisk
function. The check for the existence of a PCI address assigned
to the SCSI controller was moved in order to be executed only
when needed. The PCI address of a controller is not necessary
if QEMU_CAPS_DEVICE is supported.

This fixes issues with the hotplug of SCSI disks on pseries guests.
2013-11-08 16:25:02 +02:00
Vitor de Lima
54e4d9d081 qemu: assign PCI address to primary video card
When adding support for Q35 guests, the code to assign a PCI address
to the primary video card was moved into Q35 and i440fx(PIIX3)
specific functions, but no fallback was kept for other machine types
that might have a video card.

This patch remedies that by assigning a PCI address to the primary
video card if it does not have any kind of address.  In particular,
this fixes issues with pseries guests.

Signed-off-by: Vitor de Lima <vitor.lima@eldorado.org.br>
Signed-off-by: Laine Stump <laine@laine.org>
2013-11-08 12:48:32 +02:00
Peter Krempa
d94b781771 qemu: process: Validate specific CPUID flags of a guest
When starting a VM the qemu process may filter out some requested
features of a domain as it's not supported either by the host or by
qemu. Libvirt didn't check if this happened which might end up in
changing of the guest ABI when migrating.

The proof of concept implementation adds the check for the recently
introduced kvm_pv_unhalt cpuid feature bit. This feature depends on both
qemu and host kernel support and thus increase the possibility of guest
ABI breakage.
2013-11-08 09:44:42 +01:00
Peter Krempa
e0dc851164 qemu: Add support for paravirtual spinlocks in the guest
The linux kernel recently added support for paravirtual spinlock
handling to avoid performance regressions on overcomitted hosts. This
feature needs to be turned in the hypervisor so that the guest OS is
notified about the possible support.

This patch adds a new feature "paravirt-spinlock" to the XML and
supporting code to enable the "kvm_pv_unhalt" pseudo CPU feature in
qemu.

https://bugzilla.redhat.com/show_bug.cgi?id=1008989
2013-11-08 09:44:42 +01:00
Peter Krempa
de7b5faf43 conf: Refactor storing and usage of feature flags
Currently we were storing domain feature flags in a bit field as the
they were either enabled or disabled. New features such as paravirtual
spinlocks however can be tri-state as the default option may depend on
hypervisor version.

To allow storing tri-state feature state in the same place instead of
having to declare dedicated variables for each feature this patch
refactors the bit field to an array.
2013-11-08 09:44:42 +01:00
Jiri Denemark
3afde0756f qemu: Add monitor APIs to fetch CPUID data from QEMU
The qemu monitor supports retrieval of actual CPUID bits presented to
the guest using QMP monitor. Add APIs to extract these information and
tests for them.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
2013-11-08 09:44:36 +01:00
Michal Privoznik
1f2f879ed1 qemu: Don't access vm->priv on unlocked domain
Since 86d90b3a (yes, my patch; again) we are supporting NBD storage
migration. However, on error recovery path we got the steps reversed.
The correct order is: return NBD port to the virPortAllocator and then
either unlock the vm or remove it from the driver. Not vice versa.

==11192== Invalid write of size 4
==11192==    at 0x11488559: qemuMigrationPrepareAny (qemu_migration.c:2459)
==11192==    by 0x11488EA6: qemuMigrationPrepareDirect (qemu_migration.c:2652)
==11192==    by 0x114D1509: qemuDomainMigratePrepare3Params (qemu_driver.c:10332)
==11192==    by 0x519075D: virDomainMigratePrepare3Params (libvirt.c:7290)
==11192==    by 0x1502DA: remoteDispatchDomainMigratePrepare3Params (remote.c:4798)
==11192==    by 0x12DECA: remoteDispatchDomainMigratePrepare3ParamsHelper (remote_dispatch.h:5741)
==11192==    by 0x5212127: virNetServerProgramDispatchCall (virnetserverprogram.c:435)
==11192==    by 0x5211C86: virNetServerProgramDispatch (virnetserverprogram.c:305)
==11192==    by 0x520A8FD: virNetServerProcessMsg (virnetserver.c:165)
==11192==    by 0x520A9E1: virNetServerHandleJob (virnetserver.c:186)
==11192==    by 0x50DA78F: virThreadPoolWorker (virthreadpool.c:144)
==11192==    by 0x50DA11C: virThreadHelper (virthreadpthread.c:161)
==11192==  Address 0x1368baa0 is 576 bytes inside a block of size 688 free'd
==11192==    at 0x4A07F5C: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==11192==    by 0x5079A2F: virFree (viralloc.c:580)
==11192==    by 0x11456C34: qemuDomainObjPrivateFree (qemu_domain.c:267)
==11192==    by 0x50F41B4: virDomainObjDispose (domain_conf.c:2034)
==11192==    by 0x50C2991: virObjectUnref (virobject.c:262)
==11192==    by 0x50F4CFC: virDomainObjListRemove (domain_conf.c:2361)
==11192==    by 0x1145C125: qemuDomainRemoveInactive (qemu_domain.c:2087)
==11192==    by 0x11488520: qemuMigrationPrepareAny (qemu_migration.c:2456)
==11192==    by 0x11488EA6: qemuMigrationPrepareDirect (qemu_migration.c:2652)
==11192==    by 0x114D1509: qemuDomainMigratePrepare3Params (qemu_driver.c:10332)
==11192==    by 0x519075D: virDomainMigratePrepare3Params (libvirt.c:7290)
==11192==    by 0x1502DA: remoteDispatchDomainMigratePrepare3Params (remote.c:4798)

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2013-11-08 07:34:11 +01:00
Michal Privoznik
5a4c2374a2 qemu: Avoid double free of VM
One of my previous patches (c7ac2519b7) did try to fix the issue when
domain dies too soon during migration. However, this clumsy approach was
missing removal of qemuProcessHandleMonitorDestroy resulting in double
unrefing of mon->vm and hence producing the daemon crash:

==11843== Invalid read of size 4
==11843==    at 0x50C28C5: virObjectUnref (virobject.c:255)
==11843==    by 0x1148F7DB: qemuMonitorDispose (qemu_monitor.c:258)
==11843==    by 0x50C2991: virObjectUnref (virobject.c:262)
==11843==    by 0x50C2D13: virObjectFreeCallback (virobject.c:388)
==11843==    by 0x509C37B: virEventPollCleanupHandles (vireventpoll.c:583)
==11843==    by 0x509C711: virEventPollRunOnce (vireventpoll.c:652)
==11843==    by 0x509A620: virEventRunDefaultImpl (virevent.c:274)
==11843==    by 0x520D21C: virNetServerRun (virnetserver.c:1112)
==11843==    by 0x11F368: main (libvirtd.c:1513)
==11843==  Address 0x13b88864 is 4 bytes inside a block of size 136 free'd
==11843==    at 0x4A07F5C: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==11843==    by 0x5079A2F: virFree (viralloc.c:580)
==11843==    by 0x50C29E3: virObjectUnref (virobject.c:270)
==11843==    by 0x114770E4: qemuProcessHandleMonitorDestroy (qemu_process.c:1103)
==11843==    by 0x1148F7CB: qemuMonitorDispose (qemu_monitor.c:257)
==11843==    by 0x50C2991: virObjectUnref (virobject.c:262)
==11843==    by 0x50C2D13: virObjectFreeCallback (virobject.c:388)
==11843==    by 0x509C37B: virEventPollCleanupHandles (vireventpoll.c:583)
==11843==    by 0x509C711: virEventPollRunOnce (vireventpoll.c:652)
==11843==    by 0x509A620: virEventRunDefaultImpl (virevent.c:274)
==11843==    by 0x520D21C: virNetServerRun (virnetserver.c:1112)
==11843==    by 0x11F368: main (libvirtd.c:1513)

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2013-11-08 07:31:02 +01:00
Michal Privoznik
b2f31af701 qemuMigrationBeginPhase: Check for 'drive-mirror' for NBD
So far we are checking if qemu supports 'nbd-server-start'. This,
however, makes no sense on the source as nbd-server-* is used on the
destination. On the source the 'drive-mirror' is used instead.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2013-11-07 14:15:13 +01:00
Michal Privoznik
9cc8a5af02 qemuMonitorDispose: Reset lastError
Since the 90139a62 commit the error is copied into mon->lastError but
it's never freed from there.

==31989== 395 bytes in 1 blocks are definitely lost in loss record 877 of 978
==31989==    at 0x4A06C2B: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==31989==    by 0x7EAF129: strdup (in /lib64/libc-2.15.so)
==31989==    by 0x50D586C: virStrdup (virstring.c:554)
==31989==    by 0x50976C1: virCopyError (virerror.c:191)
==31989==    by 0x5097A35: virCopyLastError (virerror.c:312)
==31989==    by 0x114909A9: qemuMonitorIO (qemu_monitor.c:690)
==31989==    by 0x509BEDE: virEventPollDispatchHandles (vireventpoll.c:501)
==31989==    by 0x509C701: virEventPollRunOnce (vireventpoll.c:648)
==31989==    by 0x509A620: virEventRunDefaultImpl (virevent.c:274)
==31989==    by 0x520D21C: virNetServerRun (virnetserver.c:1112)
==31989==    by 0x11F368: main (libvirtd.c:1513)

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2013-11-06 19:03:30 +01:00
Zeng Junliang
c92ca769af qemu: clean up migration ports when migration cancelled
If there's a migration cancelled, the bitmap of migration port should be
cleaned up too.

Signed-off-by: Zeng Junliang <zengjunliang@huawei.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2013-11-06 15:08:43 +01:00
Ján Tomko
1569fa14d8 qemu: don't use deprecated -no-kvm-pit-reinjection
Since qemu-kvm 1.1 [1] (since 1.3. in upstream QEMU [2])
'-no-kvm-pit-reinjection' has been deprecated.
Use -global kvm-pit.lost_tick_policy=discard instead.

https://bugzilla.redhat.com/show_bug.cgi?id=978719

[1] http://git.kernel.org/cgit/virt/kvm/qemu-kvm.git/commit/?id=4e4fa39
[2] http://git.qemu.org/?p=qemu.git;a=commitdiff;h=c21fb4f
2013-11-05 16:04:06 +01:00
John Ferlan
5669045580 Resolve Coverity issue regarding not checking return value
Coverity complains that the call to virPCIDeviceDetach() in
qemuPrepareHostdevPCIDevices() doesn't check status return like
other calls.  Seems this just was lurking until a recent change
to this module resulted in Coverity looking harder and finding
the issue.  Introduced by 'a4efb2e33' when function was called
'pciReAttachDevice()'

Just added a ignore_value() since it doesn't appear to matter
if the call fails since we're on a failure path already.
2013-11-05 07:55:54 -05:00
Ján Tomko
3e1e16aa8d Use a port from the migration range for NBD as well
Instead of using a port from the remote display range.

https://bugzilla.redhat.com/show_bug.cgi?id=1025699
2013-11-01 12:07:12 +01:00
Daniel P. Berrange
4b9862775c Improve debugging of QEMU start/stop
Include reference of the VM object pointer and name in debug
logs for QEMU start/stop functions. Also make sure we log the
PID that we started, since it isn't available elsewhere in the
logs.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-10-31 16:56:01 +00:00
Daniel P. Berrange
dddc57a339 Improve debugging of job enter/exit code
In debugging a recent oVirt/libvirt race condition, I was very
frustrated by lack of logging in the job enter/exit code. This
patch adds some key data which would have been useful in by
debugging attempts.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-10-31 16:56:01 +00:00
Daniel P. Berrange
f26701f565 Fix race condition reconnecting to vms & loading configs
The following sequence

 1. Define a persistent QMEU guest
 2. Start the QEMU guest
 3. Stop libvirtd
 4. Kill the QEMU process
 5. Start libvirtd
 6. List persistent guests

At the last step, the previously running persistent guest
will be missing. This is because of a race condition in the
QEMU driver startup code. It does

 1. Load all VM state files
 2. Spawn thread to reconnect to each VM
 3. Load all VM config files

Only at the end of step 3, does the 'virDomainObjPtr' get
marked as "persistent". There is therefore a window where
the thread reconnecting to the VM will remove the persistent
VM from the list.

The easy fix is to simply switch the order of steps 2 & 3.

In addition to this though, we must only attempt to reconnect
to a VM which had a non-zero PID loaded from its state file.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-10-30 11:16:18 +00:00
Daniel P. Berrange
54a2411220 Fix leak of objects when reconnecting to QEMU instances
The 'error' cleanup block in qemuProcessReconnect() had a
'return' statement in the middle of it. This caused a leak
of virConnectPtr & virQEMUDriverConfigPtr instances. This
was identified because netcf recently started checking its
refcount in libvirtd shutdown:

netcfStateCleanup:109 : internal error: Attempt to close netcf state driver with open connections

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-10-30 11:16:17 +00:00
Michael Chapman
0883f3ba04 qemu: fix well-formed migration URI formatting
When adding an automatically allocated port to a well-formed migration
URI, keep it well-formed:

  tcp://1.2.3.4/  ->  tcp://1.2.3.4/:12345   # wrong
  tcp://1.2.3.4/  ->  tcp://1.2.3.4:12345/   # fixed
  tcp://1.2.3.4   ->  tcp://1.2.3.4:12345    # still works
  tcp:1.2.3.4     ->  tcp:1.2.3.4:12345      # still works (old syntax)

Signed-off-by: Michael Chapman <mike@very.puzzling.org>
2013-10-29 08:49:42 -06:00
Giuseppe Scrivano
b51038a4cd capabilities: add baselabel per sec driver/virt type to secmodel
Expand the "secmodel" XML fragment of "host" with a sequence of
baselabel's which describe the default security context used by
libvirt with a specific security model and virtualization type:

<secmodel>
  <model>selinux</model>
  <doi>0</doi>
  <baselabel type='kvm'>system_u:system_r:svirt_t:s0</baselabel>
  <baselabel type='qemu'>system_u:system_r:svirt_tcg_t:s0</baselabel>
</secmodel>
<secmodel>
  <model>dac</model>
  <doi>0</doi>
  <baselabel type='kvm'>107:107</baselabel>
  <baselabel type='qemu'>107:107</baselabel>
</secmodel>

"baselabel" is driver-specific information, e.g. in the DAC security
model, it indicates USER_ID:GROUP_ID.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2013-10-29 07:06:04 -06:00
Laine Stump
69e047ae21 qemu: fix removal of <interface type='hostdev'>
This patch (and the two patches that precede it) resolve:

  https://bugzilla.redhat.com/show_bug.cgi?id=1005682

When libvirt was changed to delay the final cleanup of device removal
until the qemu process had signaled it with a DEVICE_DELETED event for
that device, the hostdev removal function
(qemuDomainRemoveHostDevice()) was written to properly handle the
removal of a hostdev that was actually an SRIOV virtual function
(defined with <interface type='hostdev'>). However, the function used
to search for a device matching the alias name provided in the
DEVICE_DELETED message (virDomainDefFindDevice()) would search through
the list of netdevs before hostdevs, so qemuDomainRemoveHostDevice()
was never called; instead the netdev function,
qemuDomainRemoveNetDevice() (which *doesn't* properly cleanup after
removal of <interface type='hostdev'>), was called.

(As a reminder - each <interface type='hostdev'> results in a
virDomainNetDef which contains a virDomainHostdevDef having a parent
type of VIR_DOMAIN_DEVICE_NET, and parent.data.net pointing back to
the virDomainNetDef; both Defs point to the same device info object
(and the info contains the device's "alias", which is used by qemu to
identify the device). The virDomainHostdevDef is added to the domain's
hostdevs list *and* the virDomainNetDef is added to the domain's nets
list, so searching either list for a particular alias will yield a
positive result.)

This function modifies the qemuDomainRemoveNetDevice() to short
circuit itself and call qemu DomainRemoveHostDevice() instead when the
actual device is a VIR_DOMAIN_NET_TYPE_HOSTDEV (similar logic to what
is done in the higher level qemuDomainDetachNetDevice())

Note that even if virDomainDefFindDevice() changes in the future so
that it finds the hostdev entry first, the current code will continue
to work properly.
2013-10-21 18:09:04 +03:00
Laine Stump
c5561644d8 qemu: move qemuDomainRemoveNetDevice to avoid forward reference
pure code movement to setup for next patch.
2013-10-21 18:07:49 +03:00
Laine Stump
7a600cf77f qemu: simplify calling qemuDomainHostdevNetConfigRestore
This function was called in three places, and in each the call was
qualified by a slightly different conditional. In reality, this
function should only be called for a hostdev if all of the following
are true:

  1) mode='subsystem'
  2) type='pci'
  3) there is a parent device definition which is an <interface>
     (VIR_DOMAIN_DEVICE_NET)

We can simplify the callers and make them more consistent by checking
these conditions at the top ov qemuDomainHostdevNetConfigRestore and
returning 0 if one of them isn't satisfied.

The location of the call to qemuDomainHostdevNetConfigRestore() has
also been changed in the hot-plug case - it is moved into the caller
of its previous location (i.e. from qemuDomainRemovePCIHostDevice() to
qemuDomainRemoveHostDevice()). This was done to be more consistent
about which functions pay attention to whether or not this is one of
the special <interface> hostdevs or just a normal hostdev -
qemuDomainRemoveHostDevice() already contained a call to
networkReleaseActualDevice() and virDomainNetDefFree(), so it makes
sense for it to also handle the resetting of the device's MAC address
and vlan tag (which is what's done by
qemuDomainHostdevNetConfigRestore()).
2013-10-21 18:06:30 +03:00
Daniel P. Berrange
9b0af09240 Remove (nearly) all use of getuid()/getgid()
Most of the usage of getuid()/getgid() is in cases where we are
considering what privileges we have. As such the code should be
using the effective IDs, not real IDs.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-10-21 14:03:52 +01:00
Daniel P. Berrange
9b8f307c6a Make virCommand env handling robust in setuid env
When running setuid, we must be careful about what env vars
we allow commands to inherit from us. Replace the
virCommandAddEnvPass function with two new ones which do
filtering

  virCommandAddEnvPassAllowSUID
  virCommandAddEnvPassBlockSUID

And make virCommandAddEnvPassCommon use the appropriate
ones

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-10-21 14:03:52 +01:00
Michal Privoznik
d9be5a7157 qemu: Fix augeas support for migration ports
Commit e3ef20d7 allows user to configure migration ports range via
qemu.conf. However, it forgot to update augeas definition file and
even the test data was malicious.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2013-10-18 18:30:13 +02:00
Jiri Denemark
e3ef20d7f7 qemu: Make migration port range configurable
https://bugzilla.redhat.com/show_bug.cgi?id=1019053
2013-10-18 16:35:38 +02:00
Wang Yufei
0196845d3a qemu: Avoid assigning unavailable migration ports
https://bugzilla.redhat.com/show_bug.cgi?id=1019053

When we migrate vms concurrently, there's a chance that libvirtd on
destination assigns the same port for different migrations, which will
lead to migration failure during prepare phase on destination. So we use
virPortAllocator here to solve the problem.

Signed-off-by: Wang Yufei <james.wangyufei@huawei.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2013-10-18 16:34:09 +02:00
John Ferlan
0cacffac64 Remove ATTRIBUTE_NONNULL(3) from qemuMonitorJSONDrivePivot
The header definition didn't match the function declaration, so adjusted
header to reflect the definition.

Found during a Coverity build where STATIC_ANALYSIS is enabled resulting
in the internal.h adding __nonnull__ handling to arguments.

Commit '6d264c91' added support for the qemuMonitorJSONDrivePivot() and
commit 'fbc3adc9' added a corresponding test which ended up triggering
the build failure which I didn't notice until today!
2013-10-17 19:36:42 -04:00
Daniel P. Berrange
291a6ef3e4 Add support for enabling SASL for SPICE guests
QEMU has support for SASL auth for SPICE guests, but libvirt
has no way to enable it. Following the example from VNC where
it is globally enabled via qemu.conf

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-10-17 16:02:43 +01:00
Michal Privoznik
ac5f3f292b qemuDomainCleanupRemove: s/memmove/VIR_DELETE_ELEMENT_INPLACE/
The last argument of memmove is the amount of bytes to be moved. The
amount is in Bytes. We are moving some void pointers around. However,
since sizeof(void *) is not Byte on any architecture, we've got the
arithmetic wrong.
2013-10-17 15:24:05 +02:00
Peter Krempa
fe1bf917f9 qemu: command: Fix macro indentation 2013-10-15 16:46:41 +02:00
Ján Tomko
15fac93b95 Convert uuid to a string before printing it
Introduced by 1fa7946.

https://bugzilla.redhat.com/show_bug.cgi?id=1019023
2013-10-15 12:30:21 +02:00
Eric Blake
9a520a591d maint: avoid 'const fooPtr' in qemu
'const fooPtr' is the same as 'foo * const' (the pointer won't
change, but it's contents can).  But in general, if an interface
is trying to be const-correct, it should be using 'const foo *'
(the pointer is to data that can't be changed).

Fix up offenders in src/qemu.

* src/qemu/qemu_bridge_filter.h (networkAllowMacOnPort)
(networkDisallowMacOnPort): Use intended type.
* src/qemu/qemu_bridge_filter.c (networkAllowMacOnPort)
(networkDisallowMacOnPort): Likewise.
* src/qemu/qemu_command.c (qemuBuildTPMBackendStr)
(qemuBuildTPMDevStr, qemuBuildCpuArgStr)
(qemuBuildObsoleteAccelArg, qemuBuildMachineArgStr)
(qemuBuildSmpArgStr, qemuBuildNumaArgStr): Likewise.
* src/qemu/qemu_conf.c (qemuSharedDeviceEntryCopy): Likewise.
* src/qemu/qemu_driver.c (qemuDomainSaveImageStartVM): Likewise.
* src/qemu/qemu_hostdev.c
(qemuDomainHostdevNetConfigVirtPortProfile): Likewise.
* src/qemu/qemu_monitor_json.c
(qemuMonitorJSONAttachCharDevCommand): Likewise.

Signed-off-by: Eric Blake <eblake@redhat.com>
2013-10-14 14:34:38 -06:00
Eric Blake
d24677090f maint: avoid 'const fooPtr' in domain_conf
'const fooPtr' is the same as 'foo * const' (the pointer won't
change, but it's contents can).  But in general, if an interface
is trying to be const-correct, it should be using 'const foo *'
(the pointer is to data that can't be changed).

Fix up offenders in src/conf/domain_conf, and their fallout.

Several things to note: virObjectLock() requires a non-const
argument; if this were C++, we could treat the locking field
as 'mutable' and allow locking an otherwise 'const' object, but
that is a more invasive change, so I instead dropped attempts
to be const-correct on domain lookup.  virXMLPropString and
friends require a non-const xmlNodePtr - this is because libxml2
is not a const-correct library.  We could make the src/util/virxml
wrappers cast away const, but I figured it was easier to not
try to mark xmlNodePtr as const.  Finally, virDomainDeviceDefCopy
was a rather hard conversion - it calls virDomainDeviceDefPostParse,
which in turn in the xen driver was actually modifying the domain
outside of the current device being visited.  We should not be
adding a device on the first per-device callback, but waiting until
after all per-device callbacks are complete.

* src/conf/domain_conf.h (virDomainObjListFindByID)
(virDomainObjListFindByUUID, virDomainObjListFindByName)
(virDomainObjAssignDef, virDomainObjListAdd): Drop attempt at
const.
(virDomainDeviceDefCopy): Use intended type.
(virDomainDeviceDefParse, virDomainDeviceDefPostParseCallback)
(virDomainVideoDefaultType, virDomainVideoDefaultRAM)
(virDomainChrGetDomainPtrs): Make const-correct.
* src/conf/domain_conf.c (virDomainObjListFindByID)
(virDomainObjListFindByUUID, virDomainObjListFindByName)
(virDomainDeviceDefCopy, virDomainObjListAdd)
(virDomainObjAssignDef, virDomainHostdevSubsysUsbDefParseXML)
(virDomainHostdevSubsysPciOrigStatesDefParseXML)
(virDomainHostdevSubsysPciDefParseXML)
(virDomainHostdevSubsysScsiDefParseXML)
(virDomainControllerModelTypeFromString)
(virDomainTPMDefParseXML, virDomainTimerDefParseXML)
(virDomainSoundCodecDefParseXML, virDomainSoundDefParseXML)
(virDomainWatchdogDefParseXML, virDomainRNGDefParseXML)
(virDomainMemballoonDefParseXML, virDomainNVRAMDefParseXML)
(virSysinfoParseXML, virDomainVideoAccelDefParseXML)
(virDomainVideoDefParseXML, virDomainHostdevDefParseXML)
(virDomainRedirdevDefParseXML)
(virDomainRedirFilterUsbDevDefParseXML)
(virDomainRedirFilterDefParseXML, virDomainIdMapEntrySort)
(virDomainIdmapDefParseXML, virDomainVcpuPinDefParseXML)
(virDiskNameToBusDeviceIndex, virDomainDeviceDefCopy)
(virDomainVideoDefaultType, virDomainHostdevAssignAddress)
(virDomainDeviceDefPostParseInternal, virDomainDeviceDefPostParse)
(virDomainChrGetDomainPtrs, virDomainControllerSCSINextUnit)
(virDomainSCSIDriveAddressIsUsed)
(virDomainDriveAddressIsUsedByDisk)
(virDomainDriveAddressIsUsedByHostdev): Fix fallout.
* src/openvz/openvz_driver.c (openvzDomainDeviceDefPostParse):
Likewise.
* src/libxl/libxl_domain.c (libxlDomainDeviceDefPostParse):
Likewise.
* src/qemu/qemu_domain.c (qemuDomainDeviceDefPostParse)
(qemuDomainDefaultNetModel): Likewise.
* src/lxc/lxc_domain.c (virLXCDomainDeviceDefPostParse):
Likewise.
* src/uml/uml_driver.c (umlDomainDeviceDefPostParse): Likewise.
* src/xen/xen_driver.c (xenDomainDeviceDefPostParse): Split...
(xenDomainDefPostParse): ...since per-device callback is not the
time to be adding a device.

Signed-off-by: Eric Blake <eblake@redhat.com>
2013-10-14 14:34:38 -06:00
Eric Blake
f8fa2b3e3a maint: fix awkward typing of virDomainChrGetDomainPtrs
virDomainChrGetDomainPtrs() required 4 levels of pointers (taking
a parameter that will be used as an output variable to return the
address of another variable that contains an array of pointers).
This is rather complex to reason about, especially when outside
of the domain_conf file, no other caller should be modifying
the resulting array of pointers directly.  Changing the public
signature gives something is easier to reason with, and actually
make const-correct; which is important as it was the only function
that was blocking virDomainDeviceDefCopy from treating its source
as const.

* src/conf/domain_conf.h (virDomainChrGetDomainPtrs): Use simpler
types, and make const-correct for external users.
* src/conf/domain_conf.c (virDomainChrGetDomainPtrs): Split...
(virDomainChrGetDomainPtrsInternal): ...into an internal version
that lets us modify terms, vs. external form that is read-only.
(virDomainDeviceDefPostParseInternal, virDomainChrFind)
(virDomainChrInsert): Adjust callers.
* src/qemu/qemu_command.c (qemuGetNextChrDevIndex): Adjust caller.
(qemuDomainDeviceAliasIndex): Make const-correct.

Signed-off-by: Eric Blake <eblake@redhat.com>
2013-10-14 14:34:37 -06:00
Peter Krempa
7df5093f67 qemu: snapshot: Add support for compressing external snapshot memory
The regular save image code has the support to compress images using a
specified algorithm. This was not implemented for external checkpoints
although it shares most of the backend code.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1017227
2013-10-14 15:54:19 +02:00
Peter Krempa
550cae847b qemu: managedsave: Add support for compressing managed save images
The regular save image code has the support to compress images using a
specified algorithm. This was not implemented for managed save although
it shares most of the backend code.
2013-10-14 15:36:57 +02:00
Michal Privoznik
be65186044 qemu: Include listenAddress in debug prints
After my patches, some functions gained one more argument
(@listenAddress) which wasn't included in debug printing of
arguments they were called with. Functions in question are:
qemuMigrationPrepareDirect and qemuMigrationPerform.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2013-10-11 15:19:23 +02:00
Michal Privoznik
c7ac2519b7 qemu_migration: Avoid crashing if domain dies too quickly
I've noticed a SIGSEGV-ing libvirtd on the destination when the qemu
died too quickly = in Prepare phase. What is happening here is:

1) [Thread 3493] We are in qemuMigrationPrepareAny() and calling
qemuProcessStart() which subsequently calls qemuProcessWaitForMonitor()
and qemuConnectMonitor(). So far so good. The qemuMonitorOpen()
succeeds, however switching monitor to QMP mode fails as qemu died
meanwhile. That is qemuMonitorSetCapabilities() returns -1.

2013-10-08 15:54:10.629+0000: 3493: debug : qemuMonitorSetCapabilities:1356 : mon=0x14a53da0
2013-10-08 15:54:10.630+0000: 3493: debug : qemuMonitorJSONCommandWithFd:262 : Send command '{"execute":"qmp_capabilities","id":"libvirt-1"}' for write with FD -1
2013-10-08 15:54:10.630+0000: 3493: debug : virEventPollUpdateHandle:147 : EVENT_POLL_UPDATE_HANDLE: watch=17 events=13
...
2013-10-08 15:54:10.631+0000: 3493: debug : qemuMonitorSend:956 : QEMU_MONITOR_SEND_MSG: mon=0x14a53da0 msg={"execute":"qmp_capabilities","id":"libvirt-1"}
 fd=-1
2013-10-08 15:54:10.631+0000: 3262: debug : virEventPollRunOnce:641 : Poll got 1 event(s)

2) [Thread 3262] The event loop is trying to do the talking to monitor.
However, qemu is dead already, remember?

2013-10-08 15:54:13.436+0000: 3262: error : qemuMonitorIORead:551 : Unable to read from monitor: Connection reset by peer
2013-10-08 15:54:13.516+0000: 3262: debug : virFileClose:90 : Closed fd 25
...
2013-10-08 15:54:13.533+0000: 3493: debug : qemuMonitorSend:968 : Send command resulted in error internal error: early end of file from monitor: possible problem:

3) [Thread 3493] qemuProcessStart() failed. No big deal. Go to the
'endjob' label and subsequently to the 'cleanup'. Since the domain is
not persistent and ret is -1, the qemuDomainRemoveInactive() is called.
This has an (unpleasant) effect of virObjectUnref()-in the @vm object.
Unpleasant because the event loop which is about to trigger EOF callback
still holds a pointer to the @vm (not the reference). See the valgrind
output below.

4) [Thread 3262] So the event loop starts triggering EOF:

2013-10-08 15:54:13.542+0000: 3262: debug : qemuMonitorIO:729 : Triggering EOF callback
2013-10-08 15:54:13.543+0000: 3262: debug : qemuProcessHandleMonitorEOF:294 : Received EOF on 0x14549110 'migt10'

And the monitor is cleaned up. This results in calling
qemuProcessHandleMonitorEOF with the @vm pointer passed. The pointer is
kept in qemuMonitor struct.

==3262== Thread 1:
==3262== Invalid read of size 4
==3262==    at 0x77ECCAA: pthread_mutex_lock (in /lib64/libpthread-2.15.so)
==3262==    by 0x52FAA06: virMutexLock (virthreadpthread.c:85)
==3262==    by 0x52E3891: virObjectLock (virobject.c:320)
==3262==    by 0x11626743: qemuProcessHandleMonitorEOF (qemu_process.c:296)
==3262==    by 0x11642593: qemuMonitorIO (qemu_monitor.c:730)
==3262==    by 0x52BD526: virEventPollDispatchHandles (vireventpoll.c:501)
==3262==    by 0x52BDD49: virEventPollRunOnce (vireventpoll.c:648)
==3262==    by 0x52BBC68: virEventRunDefaultImpl (virevent.c:274)
==3262==    by 0x542D3D9: virNetServerRun (virnetserver.c:1112)
==3262==    by 0x11F368: main (libvirtd.c:1513)
==3262==  Address 0x14549128 is 24 bytes inside a block of size 136 free'd
==3262==    at 0x4C2AF5C: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==3262==    by 0x529B1FF: virFree (viralloc.c:580)
==3262==    by 0x52E3703: virObjectUnref (virobject.c:270)
==3262==    by 0x531557E: virDomainObjListRemove (domain_conf.c:2355)
==3262==    by 0x1160E899: qemuDomainRemoveInactive (qemu_domain.c:2061)
==3262==    by 0x1163A0C6: qemuMigrationPrepareAny (qemu_migration.c:2450)
==3262==    by 0x1163A923: qemuMigrationPrepareDirect (qemu_migration.c:2626)
==3262==    by 0x11682D71: qemuDomainMigratePrepare3Params (qemu_driver.c:10309)
==3262==    by 0x53B0976: virDomainMigratePrepare3Params (libvirt.c:7266)
==3262==    by 0x1502D3: remoteDispatchDomainMigratePrepare3Params (remote.c:4797)
==3262==    by 0x12DECA: remoteDispatchDomainMigratePrepare3ParamsHelper (remote_dispatch.h:5741)
==3262==    by 0x54322EB: virNetServerProgramDispatchCall (virnetserverprogram.c:435)

The mon->vm is set in qemuMonitorOpenInternal() which is the correct
place to increase @vm ref counter. The correct place to decrease the ref
counter is then qemuMonitorDispose().

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2013-10-11 14:55:51 +02:00
Michal Privoznik
1606d89c86 qemu_conf: Introduce "migration_address"
This configuration knob is there to override default listen address for
-incoming for all qemu domains.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2013-10-11 11:11:33 +02:00
Michal Privoznik
c4ac7ef663 qemu: Implement support for VIR_MIGRATE_PARAM_LISTEN_ADDRESS
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2013-10-11 11:08:55 +02:00
Michal Privoznik
7d704812b9 qemu: Introduce qemuDomainDefCheckABIStability
https://bugzilla.redhat.com/show_bug.cgi?id=994364

Whenever we check for ABI stability, we have new xml (e.g. provided by
user, or obtained from snapshot, whatever) which we compare to old xml
and see if ABI won't break. However, if the new xml was produced via
virDomainGetXMLDesc(..., VIR_DOMAIN_XML_MIGRATABLE) it lacks some
devices, e.g. 'pci-root' controller. Hence, the ABI stability check
fails even though it is stable. Moreover, we can't simply fix
virDomainDefCheckABIStability because removing the correct devices is
task for the driver. For instance, qemu driver wants to remove the usb
controller too, while LXC driver doesn't. That's why we need special
qemu wrapper over virDomainDefCheckABIStability which removes the
correct devices from domain XML, produces MIGRATABLE xml and calls the
check ABI stability function.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2013-10-11 10:31:35 +02:00
Michal Privoznik
9c228e0817 qemu: Init @pcidevs in qemuPrepareHostdevPCIDevices
At the beginning of the function qemuPrepareHostdevPCICheckSupport() is
called. After that @pcidevs is initialized. However, if the very first
command fails, we go to 'cleanup' label where virObjectUnref(pcidevs) is
called. Obviously, it is called before @pcidevs was able to get
initialized. Compiler warns about it:

  CC       qemu/libvirt_driver_qemu_impl_la-qemu_hostdev.lo
qemu/qemu_hostdev.c: In function 'qemuPrepareHostdevPCIDevices':
qemu/qemu_hostdev.c:824:19: error: 'pcidevs' may be used uninitialized in this function [-Werror=maybe-uninitialized]
     virObjectUnref(pcidevs);
                   ^
cc1: all warnings being treated as errors
2013-10-10 12:32:49 +02:00
Peter Krempa
f094aaac48 qemu: Prefer VFIO for PCI device passthrough
Prefer using VFIO (if available) to the legacy KVM device passthrough.

With this patch a PCI passthrough device without the driver configured
will be started with VFIO if it's available on the host. If not legacy
KVM passthrough is checked and error is reported if it's not available.
2013-10-10 12:00:56 +02:00
Peter Krempa
467b561ac2 qemu: hostdev: Add checks if PCI passthrough is available in the host
Add code to check availability of PCI passhthrough using VFIO and the
legacy KVM passthrough and use it when starting VMs and hotplugging
devices to live machine.
2013-10-10 10:35:01 +02:00
Peter Krempa
f24150b1f5 qemu: hostdev: Fix function spacing and header formatting 2013-10-10 10:32:07 +02:00
Peter Krempa
a863b89010 qemu: refactor qemuCompressProgramAvailable() 2013-10-09 18:26:48 +02:00
Peter Krempa
f2b0a5336e qemu: Fix coding style in qemuDomainSaveFlags()
Avoid mixed brace style in an if statement and fix formatting of error
messages.
2013-10-09 18:26:48 +02:00
Peter Krempa
9d13298901 qemu: hostdev: Refactor PCI passhrough handling
To simplify future patches dealing with this code, simplify and refactor
some conditions to switch statements.
2013-10-08 15:24:27 +02:00
Daniel P. Berrange
999d72fbd5 Remove use of virConnectPtr from all remaining nwfilter code
The virConnectPtr is passed around loads of nwfilter code in
order to provide it as a parameter to the callback registered
by the virt drivers. None of the virt drivers use this param
though, so it serves no purpose.

Avoiding the need to pass a virConnectPtr means that the
nwfilterStateReload method no longer needs to open a bogus
QEMU driver connection. This addresses a race condition that
can lead to a crash on startup.

The nwfilter driver starts before the QEMU driver and registers
some callbacks with DBus to detect firewalld reload. If the
firewalld reload happens while the QEMU driver is still starting
up though, the nwfilterStateReload method will open a connection
to the partially initialized QEMU driver and cause a crash.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-10-07 14:19:10 +01:00
Peter Krempa
f8e2da01be qemu: Use maximum guest memory size when getting NUMA placement advice
When starting the VM the guest balloon driver is not loaded at that
time. We need to ask numad for placement of the complete VM.
2013-10-04 14:57:54 +02:00
Cole Robinson
670e86bfd7 qemu: snapshot: Break out redefine preparation to shared function 2013-10-03 17:31:55 -04:00
Cole Robinson
56ff156d15 qemu: snapshots: Simplify REDEFINE flag check
Makes things more readable IMO
2013-10-03 16:52:54 -04:00
Laine Stump
9881bfed25 qemu: check actual netdev type rather than config netdev type during init
This resolves:

   https://bugzilla.redhat.com/show_bug.cgi?id=1012824
   https://bugzilla.redhat.com/show_bug.cgi?id=1012834

Note that a similar problem was reported in:

   https://bugzilla.redhat.com/show_bug.cgi?id=827519

but the fix only worked for <interface type='hostdev'>, *not* for
<interface type='network'> where the network itself was a pool of
hostdevs.

The symptom in both cases was this error message:

   internal error: Unable to determine device index for network device

In both cases the cause was lack of proper handling for netdevs
(<interface>) of type='hostdev' when scanning the netdev list looking
for alias names in qemuAssignDeviceNetAlias() - those that aren't
type='hostdev' have an alias of the form "net%d", while those that are
hostdev use "hostdev%d". This special handling was completely lacking
prior to the fix for Bug 827519 which was:

When searching for the highest alias index, libvirt looks at the alias
for each netdev and if it is type='hostdev' it ignores the entry. If
the type is not hostdev, then it expects the "net%d" form; if it
doesn't find that, it fails and logs the above error message.

That fix works except in the case of <interface type='network'> where
the network uses hostdev (i.e. the network is a pool of VFs to be
assigned to the guests via PCI passthrough). In this case, the check
for type='hostdev' would fail because it was done as:

     def->net[i]->type == VIR_DOMAIN_NET_TYPE_HOSTDEV

(which compares what was written in the config) when it actually
should have been:

    virDomainNetGetActualType(def->net[i]) == VIR_DOMAIN_NET_TYPE_HOSTDEV

(which compares the type of netdev that was actually allocated from
the network at runtime).

Of course the latter wouldn't be of any use if the netdevs of
type='network' hadn't already acquired their actual network connection
yet, but manual examination of the code showed that this is never the
case.

While looking through qemu_command.c, two other places were found to
directly compare the net[i]->type field rather than getting actualType:

* qemuAssignDeviceAliases() - in this case, the incorrect comparison
  would cause us to create a "net%d" alias for a netdev with
  type='network' but actualType='hostdev'. This alias would be
  subsequently overwritten by the proper "hostdev%d" form, so
  everything would operate properly, but a string would be
  leaked. This patch also fixes this problem.

* qemuAssignDevicePCISlots() - would defer assigning a PCI address to
  a netdev if it was type='hostdev', but not for type='network +
  actualType='hostdev'. In this case, the actual device usually hasn't
  been acquired yet anyway, and even in the case that it has, there is
  no practical difference between assigning a PCI address while
  traversing the netdev list or while traversing the hostdev
  list. Because changing it would be an effective NOP (but potentially
  cause some unexpected regression), this usage was left unchanged.
2013-10-03 11:06:45 -04:00
Michal Privoznik
3e8343e151 qemuMonitorJSONSendKey: Avoid double free
After successful @cmd construction the memory where @keys points to is
part of @cmd. Avoid double freeing it.
2013-10-03 08:57:57 +02:00
Michal Privoznik
ec07a9e84b qemuMonitorJSONGetVirtType: Fix error message
When querying for kvm, we try to find 'enabled' field. Hence the error
message should report we haven't found 'enabled' and not 'running'
(which is not even in the reply). Probably a typo or copy-paste error.
2013-10-03 08:57:50 +02:00
Michal Privoznik
9fa10d3901 qemu_hotplug: Allow QoS update in qemuDomainChangeNet
The qemuDomainChangeNet() is called when 'virsh update-device' is
invoked on a NIC. Currently, we fail to update the QoS even though
we have routines for that.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2013-10-02 10:48:03 +02:00
Cole Robinson
a924d9d083 qemu: cgroup: Fix crash if starting nographics guest
We can dereference graphics[0] even if guest has no graphics device
configured. I screwed this up in a216e64872

https://bugzilla.redhat.com/show_bug.cgi?id=1014088
2013-10-01 11:22:18 -04:00
Michal Privoznik
64f1e1688d qemu_capabilities: Introduce virQEMUCapsInitQMPMonitor
This basically covers the talking-to-monitor part of
virQEMUCapsInitQMP.  The patch itself has no real value,
but it creates an entity to be tested in the next patches.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2013-10-01 10:48:47 +02:00
Peter Krempa
59e21e973f qemu: process: Silence coverity warning when rewinding log file
The change in ef29de14c3 that introduced
better error logging from qemu introduced a warning from coverity about
unused return value from lseek. Silence this warning and fix typo in the
corresponding error message.

Reported by: John Ferlan
2013-09-30 13:43:32 +02:00
Jiri Denemark
9e03f313b8 qemu: Free all driver data in qemuStateCleanup
https://bugzilla.redhat.com/show_bug.cgi?id=1011330 (case A)

While activeScsiHostdevs and webSocketPorts were allocated in
qemuStateInitialize, they were not freed in qemuStateCleanup.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2013-09-27 15:57:14 +02:00
Jiri Denemark
833cdab6d2 qemu: Don't leak reference to virQEMUDriverConfigPtr
https://bugzilla.redhat.com/show_bug.cgi?id=1011330 (case D)

qemuProcessStart created two references to virQEMUDriverConfigPtr before
calling fork():

    cfg = virQEMUDriverGetConfig(driver);
    ...
    hookData.cfg = virObjectRef(cfg);

However, the child only unreferenced hookData.cfg and the parent only
removed the cfg reference. That said, we don't need to increment the
reference counter when assigning cfg to hookData. Both the child and the
parent will correctly remove the reference on cfg (the child will do
that through hookData).

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2013-09-27 15:57:14 +02:00
Chen Hanxiao
21813c9fb5 qemu: virDomainControllerFind may return 0 if controller found
The return value of virDomainControllerFind >=0 means that
the specific controller was found.
But some functions invoke it and treat 0 as not found.
This patch fix these incorrect invocation.

Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com>
2013-09-26 15:13:36 +02:00
Daniel P. Berrange
145de7b8f3 Fix leak of command line args in qemuParseCommandLine
If qemuParseCommandLine finds an arg it does not understand
it adds it to the QEMU passthrough custom arg list. If the
qemuParseCommandLine method hits an error for any reason
though, it just does 'VIR_FREE(cmd)' on the custom arg list.
This means all actual args / env vars are leaked. Introduce
a qemuDomainCmdlineDefFree method to be used for cleanup.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-09-25 15:49:28 +01:00
Daniel P. Berrange
94e6b94ab7 Fix leak in qemuParseCommandLine on OOM
If the call to virDomainControllerInsert fails in
qemuParseCommandLine, the controller struct is leaked.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-09-25 15:49:28 +01:00
Daniel P. Berrange
b391b19144 Fix leak in qemuStringToArgvEnv upon OOM
The 'qemuStringToArgvEnv' method splits up a string of command
line env/args to an 'arglist' array. It then copies env vars
to a 'progenv' array and args to a 'progargv' array. When
copyin the env vars, it NULL-ifies the element in 'arglist'
that is copied.

Upon OOM the 'virStringListFree' is called on progenv and
arglist. Unfortunately, because the elements in 'arglist'
related to env vars have been set to NULL, the call to
virStringListFree(arglist) doesn't free anything, even
though some non-NULL args vars still exist later in the
array.

To fix this leak, stop NULL-ifying the 'arglist' elements,
and change the cleanup code to only free elements in the
'arglist' array, not 'progenv'.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-09-25 15:49:28 +01:00
Daniel P. Berrange
6bb7f19eb1 Fix missing jump to error cleanup in qemuParseCommandLineDisk
In a number of places in qemuParseCommandLineDisk, an error
is reported, but no 'goto error' jump is used. This causes
failure to report OOM conditions to the caller.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-09-25 15:49:27 +01:00
Daniel P. Berrange
fbf82783e8 Fix leak in qemuParseCommandLineDisk on OOM
If OOM occurs in qemuParseCommandLineDisk some intermediate
variables will be leaked when parsing Sheepdog or RBD disks.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-09-25 15:49:27 +01:00
Daniel P. Berrange
86139a408d Fix leak on OOM in qemuBuildCommandLine dealing with sound card
The qemuBuildCommandLine code for parsing sound cards will leak
an intermediate variable if an OOM occurs. Move the free'ing of
the variable earlier to avoid the leak.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-09-25 15:49:27 +01:00
Daniel P. Berrange
a72d25f40f Fix failure to honour OOM status in qemuParseNBDString
In qemuParseNBDString, if the virURIParse fails, the
error is not reported to the caller. Instead execution
falls through to the non-URI codepath causing memory
leaks later on.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-09-25 15:49:13 +01:00
Daniel P. Berrange
d7e9f9f7e8 Avoid leak in qemuParseRBDString on failure of qemuAddRBDHost
If qemuAddRBDHost fails due to parsing problems or OOM, then
qemuParseRBDString cleanup is skipped causing a memory leak.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-09-25 15:49:13 +01:00
Daniel P. Berrange
e7b7a2019d Fix leak of address string in qemuDomainPCIAddressGetNextSlot
qemuDomainPCIAddressGetNextSlot has a loop for finding
compatible PCI buses. In the loop body it creates a
PCI address string, but never frees this. This causes
a leak if the loop executes more than one iteration,
or if a call in the loop body fails.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-09-25 15:49:12 +01:00
Laine Stump
386ebb47a5 qemu: prefer to put a Q35 machine's dmi-to-pci-bridge at 00:1E.0
This resolves one of the issues listed in:

   https://bugzilla.redhat.com/show_bug.cgi?id=1003983

00:1E.0 is the location of this controller on at least some actual Q35
hardware, so we try to replicate the placement. The bridge should work
just as well in any other location though, so if 00:1E.0 isn't
available, just allow it to be auto-assigned anywhere appropriate.
2013-09-25 10:39:23 -04:00
Laine Stump
c484fe16cb qemu: turn if into switch in qemuDomainValidateDevicePCISlotsQ35
This will make it simpler to add checks for other types of
controllers.

This is a prerequisite for patches to resolve:

   https://bugzilla.redhat.com/show_bug.cgi?id=1003983
2013-09-25 10:38:50 -04:00
Laine Stump
b83d26f6c4 qemu: support ich9-intel-hda audio device
This resolves one of the issues in:

   https://bugzilla.redhat.com/show_bug.cgi?id=1003983

This device is identical to qemu's "intel-hda" device (known as "ich6"
in libvirt), but has a different PCI device ID (which matches the ID
of the hda audio built into the ich9 chipset, of course). It's not
supported in earlier versions of qemu, so it requires a capability
bit.
2013-09-25 10:38:02 -04:00
Laine Stump
8e0dab3a8e qemu: replace multiple strcmps with a switch on an enum
I'm not sure why this code was written to compare the strings that it
had just retrieved from an enum->string conversion, rather than just
look at the original enum values, but this yields the same results,
and is much more efficient (especially as you add more devices).

This is a prerequisite for patches to resolve:

   https://bugzilla.redhat.com/show_bug.cgi?id=1003983
2013-09-25 10:37:33 -04:00
Laine Stump
07af519298 qemu: allow some PCI devices to be attached to PCIe slots
Part of the resolution to:

   https://bugzilla.redhat.com/show_bug.cgi?id=1003983

Although most devices available in qemu area defined as PCI devices,
and strictly speaking should only be attached via a PCI slot, in
practice qemu allows them to be attached to a PCIe slot and sometimes
this makes sense.

For example, The UHCI and EHCI USB controllers are usually attached
directly to the PCIe "root complex" (i.e. PCIe slots) on real
hardware, so that should be possible for a Q35-based qemu virtual
machine as well.

We still want to prefer a standard PCI slot when auto-assigning
addresses, though, and in general to disallow attaching PCI devices
via PCIe slots.

This patch makes that possible by adding a new
QEMU_PCI_CONNECT_TYPE_EITHER_IF_CONFIG flag. Three things are done
with this flag:

1) It is set for the "pcie-root" controller

2) qemuCollectPCIAddress() now has a set of nested switches that set
this "EITHER" flag for devices that we want to allow connecting to
pcie-root when specifically requested in the config.

3) qemuDomainPCIAddressFlagsCompatible() adds this new flag to the
"flagsMatchMask" if the address being checked came from config rather
than being newly auto-allocated by libvirt (this knowledge is
conveniently already available in the "fromConfig" arg).

Now any device having the EITHER flag set can be connected to
pcie-root if explicitly requested, but auto-allocated addresses for
those devices will still be standard PCI slots instead.

This patch only loosens the restrictions on devices that have been
specifically requested, but the setup is such that it should be fairly
easy to add new devices.
2013-09-25 10:36:45 -04:00
Laine Stump
fbd9be484c qemu: eliminate redundant if clauses in qemuCollectPCIAddress
Replace them with switch cases. This will make it more efficient when
we add exceptions for more controller types, and other device types.

This is a prerequisite for patches to resolve:

   https://bugzilla.redhat.com/show_bug.cgi?id=1003983
2013-09-25 10:35:49 -04:00
Peter Krempa
ef29de14c3 qemu: Wire up better early error reporting
The previous patches added infrastructure to report better errors from
monitor in some cases. This patch finalizes this "feature" by enabling
this enhanced error reporting on early phases of VM startup. In these
phases the possibility of qemu producing a useful error message is
really high compared to running it during the whole life cycle. After
the start up is complete, the feature is disabled to provide the usual
error messages so that users are not confused by possibly irrelevant
messages that may be in the domain log.

The original motivation to do this enhancement is to capture errors when
using VFIO device passthrough, where qemu reports errors after the
monitor is initialized and the existing error catching code couldn't
catch this producing a unhelpful message:

 # virsh start test
 error: Failed to start domain test
 error: Unable to read from monitor: Connection reset by peer

With this change, the message is changed to:

 # virsh start test
 error: Failed to start domain test
 error: internal error: early end of file from monitor: possible problem:
 qemu-system-x86_64: -device vfio-pci,host=00:1a.0,id=hostdev0,bus=pci.0,addr=0x5: vfio: error, group 8 is not viable, please ensure all devices within the iommu_group are bound to their vfio bus driver.
 qemu-system-x86_64: -device vfio-pci,host=00:1a.0,id=hostdev0,bus=pci.0,addr=0x5: vfio: failed to get group 8
 qemu-system-x86_64: -device vfio-pci,host=00:1a.0,id=hostdev0,bus=pci.0,addr=0x5: Device 'vfio-pci' could not be initialized
2013-09-25 13:50:57 +02:00
Peter Krempa
90139a6236 qemu: monitor: Produce better errors on monitor hangup
Change the monitor error code to add the ability to access the qemu log
file using a file descriptor so that we can dig in it for a more useful
error message. The error is now logged on monitor hangups and overwrites
a possible lesser error. A hangup on the monitor usualy means that qemu
has crashed and there's a significant chance it produced a useful error
message.

The functionality will be latent until the next patch.
2013-09-25 13:50:56 +02:00
Peter Krempa
8519e9ecdc qemu: monitor: Add infrastructure to access VM logs for better err msgs
Early VM startup errors usually produce a better error message in the
machine log file. Currently we were accessing it only when the process
exited during certain phases of startup. This will help adding a more
comprehensive error extraction for early qemu startup phases.

This patch adds infrastructure to keep a file descriptor for the machine
log file that will be used in case an error happens.
2013-09-25 13:50:56 +02:00
Peter Krempa
310651a5e3 qemu_process: Make qemuProcessReadLog() more versatile and reusable
Teach the function to skip character device definitions printed by qemu
at startup in addition to libvirt log messages and make it usable from
outside of qemu_process.c. Also add documentation about the func.
2013-09-25 13:50:56 +02:00
Daniel P. Berrange
cba4868ad8 Check return value of virDomainControllerInsert when parsing QEMU args
The parsing of '-usb' did not check for failure of the
virDomainControllerInsert method. As a result on OOM, the
parser mistakenly attached USB disks to the IDE controller.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-09-24 16:58:32 +01:00
Daniel P. Berrange
b81f30566b Honour error returned by virBitmapFormat
The code formatting NUMA args was ignoring the return value
of virBitmapFormat, so on OOM, it would silently drop the
NUMA cpumask arg.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-09-24 16:58:27 +01:00
Daniel P. Berrange
a4b0c75ce8 Add missing check for OOM when building boot menu args
When building boot menu args, if OOM occurred the CLI args
would end up containing  'order=(null)' due to a missing
call to 'virBufferError'.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-09-24 16:58:23 +01:00
Daniel P. Berrange
5dd3b5e32a Fix missing OOM check in qemuParseCommandLine when splitting strings
The qemuParseCommandLine method did not check the return value of
virStringSplit to see if OOM had occurred. This lead to dereference
of a NULL pointer on OOM.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-09-24 10:52:26 +01:00
Daniel P. Berrange
5923ea67b1 Fix error checking of qemuParseKeywords return status
Most callers of qemuParseKeywords were assigning its return
value to a 'size_t' variable. Then then also checked '< 0'
for error condition, but this will never be true with the
unsigned size_t variable. Rather than using 'ssize_t', change
qemuParseKeywords so that the element count is returned via
an output parameter, leaving the return value solely as an
error indicator.

This avoids a crash accessing beyond the end of an error
upon OOM.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-09-24 10:52:26 +01:00
Daniel P. Berrange
150c1db52b Fix allocation of arglist in qemuStringToArgvEnv
In

  commit 41b5505679
  Author: Eric Blake <eblake@redhat.com>
  Date:   Wed Aug 28 15:01:23 2013 -0600

    qemu: simplify list cleanup

The qemuStringToArgvEnv method was changed to use virStringFreeList
to free the 'arglist' array. This method assumes the string list
array is NULL terminated, however, qemuStringToArgvEnv was not
ensuring this when populating 'arglist'. This caused an out of
bounds access by virStringFreeList when OOM occured in the initial
loop of qemuStringToArgvEnv

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-09-24 10:52:26 +01:00
Daniel P. Berrange
0bea528a33 Fix crash on OOM in qemuAddRBDHost
When parsing the RBD hosts, it increments the 'nhosts' counter
before increasing the 'hosts' array allocation. If an OOM then
occurs when increasing the array allocation, the cleanup block
will attempt to access beyond the end of the array. Switch
to using VIR_EXPAND_N instead of VIR_REALLOC_N to protect against
this mistake

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-09-24 10:52:26 +01:00
Daniel P. Berrange
ba19783d9b Fix crash on OOM in qemuDomainCCWAddressSetCreate()
If OOM occurs in qemuDomainCCWAddressSetCreate, it jumps to
a cleanup block and frees the partially initialized object.
It then mistakenly returns the address of the just free'd
pointer instead of NULL.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-09-24 10:52:21 +01:00
Giuseppe Scrivano
cbcecd7ab1 virConnectGetCPUModelNames: add the support for qemu
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2013-09-23 15:52:14 -06:00
Martin Kletzander
484cc3217b qemu: Fix seamless SPICE migration
Since the wait is done during migration (still inside
QEMU_ASYNC_JOB_MIGRATION_OUT), the code should enter the monitor as such
in order to prohibit all other jobs from interfering in the meantime.
This patch fixes bug #1009886 in which qemuDomainGetBlockInfo was
waiting on the monitor condition and after GetSpiceMigrationStatus
mangled its internal data, the daemon crashed.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1009886
2013-09-20 17:11:10 +02:00
Laine Stump
30bb4c4b54 qemu: use "ide" as device name for implicit SATA controller on Q35
This resolves https://bugzilla.redhat.com/show_bug.cgi?id=1008903

The Q35 machinetype has an implicit SATA controller at 00:1F.2 which
isn't given the "expected" id of ahci0 by qemu when it's created. The
original suggested solution to this problem was to not specify any
controller for the disks that use the default controller and just
specify "unit=n" instead; qemu should then use the first IDE or SATA
controller for the disk.

Unfortunately, this "solution" is ignorant of the fact that in the
case of SATA disks, the "unit" attribute in the disk XML is actually
*not* being used for the unit, but is instead used to specify the
"bus" number; each SATA controller has 6 buses, and each bus only
allows a single unit. This makes it nonsensical to specify unit='n'
where n is anything other than 0. It also means that the only way to
connect more than a single device to the implicit SATA controller is
to explicitly give the bus names, which happen to be "ide.$n", where
$n can be replaced by the disk's "unit" number.
2013-09-20 07:03:23 -04:00
Jiri Denemark
13e9bad55a qemu: Avoid dangling job in qemuDomainSetBlockIoTune
virDomainSetBlockIoTuneEnsureACL was incorrectly called after we already
started a job. As a result of this, the job was not cleaned up when an
access driver had forbidden the action.
2013-09-18 10:37:48 +02:00
Aline Manera
8ffe1d0c46 Add tftp protocol support for cdrom disk
qemu/KVM also supports a tftp URL while specifying the cdrom ISO image.

The xml should be as following:

    <disk type='network' device='cdrom'>
      <source protocol='tftp' name='/url/path'>
        <host name='host.name' port='69'/>
      </source>
    </disk>

Signed-off-by: Aline Manera <alinefm@br.ibm.com>
2013-09-17 14:45:02 +01:00
Aline Manera
0f24393e60 Add ftps protocol support for cdrom disk
The ftps protocol is another protocol supported by qemu/KVM while specifying
the cdrom ISO image.

The xml should be as following:

    <disk type='network' device='cdrom'>
      <source protocol='ftps' name='/url/path'>
        <host name='host.name' port='990'/>
      </source>
    </disk>

Signed-off-by: Aline Manera <alinefm@br.ibm.com>
2013-09-17 14:45:02 +01:00
Aline Manera
d9dd981801 Add https protocol support for cdrom disk
The https protocol is also accepted by qemu/KVM when specifying the cdrom ISO
image.

The xml should be as following:

    <disk type='network' device='cdrom'>
      <source protocol='https' name='/url/path'>
        <host name='host.name' port='443'/>
      </source>
    </disk>

Signed-off-by: Aline Manera <alinefm@br.ibm.com>
2013-09-17 14:45:02 +01:00
Peter Krempa
044e3e7524 qemu: Fix memleak after commit 59898a88ce
If the ABI compatibility check with the "migratable" user XML is
successful, we would leak the originally parsed XML from the user that
would not be used in this case.

Reported by Ján Tomko.
2013-09-17 12:04:57 +02:00
Peter Krempa
f87a7c67de qemu: Factor out body of qemuDomainSetMetadata for universal use
The function implemented common behavior that can be reused for other
hypervisor drivers that use the virDomainObj data structures. Factor out
the core into a separate helper func.
2013-09-17 09:42:49 +02:00
Peter Krempa
99c51af2ee qemu: Factor out body of qemuDomainGetMetadata for universal use
The function implemented common behavior that can be reused for other
hypervisor drivers that use the virDomainObj data structures. Factor out
the core into a separate helper func.
2013-09-17 09:42:49 +02:00
Peter Krempa
1b7bfa65e3 qemu: Use "migratable" XML definition when doing external checkpoints
In the original implementation of external checkpoints I've mistakenly
used the live definition to be stored in the save image. The normal
approach is to use the "migratable" definition. This was discovered when
commit 07966f6a8b changed the behavior to
use a converted XML from the user to do the compatibility check to fix
problem when using the regular machine saving.

As the previous patch added a compatibility layer, we can now change the
type of the XML in the image.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1008340
2013-09-17 09:42:43 +02:00
Peter Krempa
59898a88ce qemu: Fix checking of ABI stability when restoring external checkpoints
External checkpoints have a bug in the implementation where they use the
normal definition instead of the "migratable" one. This causes errors
when the snapshot is being reverted using the workaround method via
qemuDomainRestoreFlags() with a custom XML. This issue was introduced
when commit 07966f6a8b changed the code to
compare "migratable" XMLs from the user as we should have used
migratable in the image too.

This patch adds a compatibility layer, so that fixing the snapshot code
won't make existing snapshots fail to load.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1008340
2013-09-17 09:42:43 +02:00
Ján Tomko
102eb00c28 Always free network and graphics cookies
qemuMigrationEatCookie has flags to control if these should
be parsed, but it does not fill mig->flags. These cookies might
get leaked if these flags are not set by qemuMigrationBakeCookie.

42 (32 direct, 10 indirect) bytes in 1 blocks are definitely lost in
loss record 361 of 662
==123== by 0x1BA33FCA: qemuMigrationEatCookie (qemu_migration.c:678)
==123== by 0x1BA34A1E: qemuMigrationRun (qemu_migration.c:3108)
==123== by 0x1BA3622B: doNativeMigrate (qemu_migration.c:3343)
==123== by 0x1BA3B408: qemuMigrationPerform (qemu_migration.c:4138)
2013-09-16 19:26:21 +02:00
Peter Krempa
d79fe8b50b cgroup: Move [qemu|lxc]GetCpuBWStatus to vicgroup.c and refactor it
The function existed in two identical instances in lxc and qemu. Move it
to vircgroup.c and simplify it. Refactor the callers too.
2013-09-16 11:32:49 +02:00
Peter Krempa
4baa8d7637 cleanup: Kill usage of access(PATH, F_OK) in favor of virFileExists()
Semantics of the libvirt helper are more clear. This change also allows
to clean up some pieces of code.
2013-09-16 10:37:39 +02:00
Peter Krempa
53c39f5837 qemu: Fix checking of guest ABI compatibility when reverting snapshots
When reverting a live internal snapshot with a live guest the ABI
compatiblity check was comparing a "migratable" definition with a normal
one. This resulted in the check failing with:

revert requires force: Target device address type none does not match source pci

This patch generates a "migratable" definition from the actual one to
check against the definition from the snapshot to avoid this problem.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1006886
2013-09-12 15:11:38 +02:00
Eric Blake
6cd1548258 qemu: endjob returns a bool
Osier Yang pointed out that ever since commit 31cb030, the
signature of qemuDomainObjEndJob was changed to return a bool.
While comparison against 0 or > 0 still gives the right results,
it looks fishy; we also had one place that was comparing < 0
which is effectively dead code.

* src/qemu/qemu_migration.c (qemuMigrationPrepareAny): Fix dead
code bug.
(qemuMigrationBegin): Use more canonical form of bool check.
* src/qemu/qemu_driver.c (qemuAutostartDomain)
(qemuDomainCreateXML, qemuDomainSuspend, qemuDomainResume)
(qemuDomainShutdownFlags, qemuDomainReboot, qemuDomainReset)
(qemuDomainDestroyFlags, qemuDomainSetMemoryFlags)
(qemuDomainSetMemoryStatsPeriod, qemuDomainInjectNMI)
(qemuDomainSendKey, qemuDomainGetInfo, qemuDomainScreenshot)
(qemuDomainSetVcpusFlags, qemuDomainGetVcpusFlags)
(qemuDomainRestoreFlags, qemuDomainGetXMLDesc)
(qemuDomainCreateWithFlags, qemuDomainAttachDeviceFlags)
(qemuDomainUpdateDeviceFlags, qemuDomainDetachDeviceFlags)
(qemuDomainBlockResize, qemuDomainBlockStats)
(qemuDomainBlockStatsFlags, qemuDomainMemoryStats)
(qemuDomainMemoryPeek, qemuDomainGetBlockInfo)
(qemuDomainAbortJob, qemuDomainMigrateSetMaxDowntime)
(qemuDomainMigrateGetCompressionCache)
(qemuDomainMigrateSetCompressionCache)
(qemuDomainMigrateSetMaxSpeed)
(qemuDomainSnapshotCreateActiveInternal)
(qemuDomainRevertToSnapshot, qemuDomainSnapshotDelete)
(qemuDomainQemuMonitorCommand, qemuDomainQemuAttach)
(qemuDomainBlockJobImpl, qemuDomainBlockCopy)
(qemuDomainBlockCommit, qemuDomainOpenGraphics)
(qemuDomainGetBlockIoTune, qemuDomainGetDiskErrors)
(qemuDomainPMSuspendForDuration, qemuDomainPMWakeup)
(qemuDomainQemuAgentCommand, qemuDomainFSTrim): Likewise.

Signed-off-by: Eric Blake <eblake@redhat.com>
2013-09-09 13:07:29 -06:00