Commit Graph

260 Commits

Author SHA1 Message Date
Marc Hartmayer
e22de286b1 qemu: Fix deadlock across fork() in QEMU driver
The functions in virCommand() after fork() must be careful with regard
to accessing any mutexes that may have been locked by other threads in
the parent process. It is possible that another thread in the parent
process holds the lock for the virQEMUDriver while fork() is called.
This leads to a deadlock in the child process when
'virQEMUDriverGetConfig(driver)' is called and therefore the handshake
never completes between the child and the parent process. Ultimately
the virDomainObjectPtr will never be unlocked.

It gets much worse if the other thread of the parent process, that
holds the lock for the virQEMUDriver, tries to lock the already locked
virDomainObject. This leads to a completely unresponsive libvirtd.

It's possible to reproduce this case with calling 'virsh start XXX'
and 'virsh managedsave XXX' in a tight loop for multiple domains.

This commit fixes the deadlock in the same way as it is described in
commit 61b52d2e38.

Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
2017-02-21 15:47:32 +01:00
Michal Privoznik
1bb787fdc9 qemuDomainGetHostdevPath: Report /dev/vfio/vfio less frequently
So far, qemuDomainGetHostdevPath has no knowledge of the reasong
it is called and thus reports /dev/vfio/vfio for every VFIO
backed device. This is suboptimal, as we want it to:

a) report /dev/vfio/vfio on every addition or domain startup
b) report /dev/vfio/vfio only on last VFIO device being unplugged

If a domain is being stopped then namespace and CGroup die with
it so no need to worry about that. I mean, even when a domain
that's exiting has more than one VFIO devices assigned to it,
this function does not clean /dev/vfio/vfio in CGroup nor in the
namespace. But that doesn't matter.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-02-20 07:21:59 +01:00
Michal Privoznik
b8e659aa98 qemuDomainGetHostdevPath: Create /dev/vfio/vfio iff needed
So far, we are allowing /dev/vfio/vfio in the devices cgroup
unconditionally (and creating it in the namespace too). Even if
domain has no hostdev assignment configured. This is potential
security hole. Therefore, when starting the domain (or
hotplugging a hostdev) create & allow /dev/vfio/vfio too (if
needed).

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-02-20 07:21:58 +01:00
Michal Privoznik
9d92f533f8 qemuSetupHostdevCgroup: Use qemuDomainGetHostdevPath
Since these two functions are nearly identical (with
qemuSetupHostdevCgroup actually calling virCgroupAllowDevicePath)
we can have one function call the other and thus de-duplicate
some code.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-02-20 07:21:58 +01:00
Michal Privoznik
b57bd206b9 qemu_conf: Check for namespaces availability more wisely
The bare fact that mnt namespace is available is not enough for
us to allow/enable qemu namespaces feature. There are other
requirements: we must copy all the ACL & SELinux labels otherwise
we might grant access that is administratively forbidden or vice
versa.
At the same time, the check for namespace prerequisites is moved
from domain startup time to qemu.conf parser as it doesn't make
much sense to allow users to start misconfigured libvirt just to
find out they can't start a single domain.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-15 12:43:23 +01:00
Michal Privoznik
18ce9d139d qemuDomainNamespace{Setup,Teardown}Disk: Don't pass pointer to full disk
These functions do not need to see the whole virDomainDiskDef.
Moreover, they are going to be called from places where we don't
have access to the full disk definition. Sticking with
virStorageSource is more than enough.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-08 15:56:05 +01:00
Martin Kletzander
8336cbca21 qemu: Fix indentation in qemu_domain.h for RNG Namespaces
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2017-01-31 16:13:32 +01:00
Michal Privoznik
b020cf73fe domain_conf: Introduce <mtu/> to <interface/>
So far we allow to set MTU for libvirt networks. However, not all
domain interfaces have to be plugged into a libvirt network and
even if they are, they might want to have a different MTU (e.g.
for testing purposes).

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-26 09:59:56 +01:00
Michal Privoznik
406e390962 qemu: Drop qemuDomainDeleteNamespace
After previous commits, this function is no longer needed.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-10 13:04:57 +01:00
John Ferlan
7f7d990483 qemu: Don't assume secret provided for LUKS encryption
https://bugzilla.redhat.com/show_bug.cgi?id=1405269

If a secret was not provided for what was determined to be a LUKS
encrypted disk (during virStorageFileGetMetadata processing when
called from qemuDomainDetermineDiskChain as a result of hotplug
attach qemuDomainAttachDeviceDiskLive), then do not attempt to
look it up (avoiding a libvirtd crash) and do not alter the format
to "luks" when adding the disk; otherwise, the device_add would
fail with a message such as:

   "unable to execute QEMU command 'device_add': Property 'scsi-hd.drive'
    can't find value 'drive-scsi0-0-0-0'"

because of assumptions that when the format=luks that libvirt would have
provided the secret to decrypt the volume.

Access to unlock the volume will thus be left to the application.
2017-01-03 12:59:18 -05:00
Michal Privoznik
f95c5c48d4 qemu: Manage /dev entry on RNG hotplug
When attaching a device to a domain that's using separate mount
namespace we must maintain /dev entries in order for qemu process
to see them.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
f5fdf23a68 qemu: Manage /dev entry on chardev hotplug
When attaching a device to a domain that's using separate mount
namespace we must maintain /dev entries in order for qemu process
to see them.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
6e57492839 qemu: Manage /dev entry on hostdev hotplug
When attaching a device to a domain that's using separate mount
namespace we must maintain /dev entries in order for qemu process
to see them.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
81df21507b qemu: Manage /dev entry on disk hotplug
When attaching a device to a domain that's using separate mount
namespace we must maintain /dev entries in order for qemu process
to see them.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
bb4e529664 qemu: Spawn qemu under mount namespace
Prime time. When it comes to spawning qemu process and
relabelling all the devices it's going to touch, there's inherent
race with other applications in the system (e.g. udev). Instead
of trying convincing udev to not touch libvirt managed devices,
we can create a separate mount namespace for the qemu, and mount
our own /dev there. Of course this puts more work onto us as we
have to maintain /dev files on each domain start and device
hot(un-)plug. On the other hand, this enhances security also.

From technical POV, on domain startup process the parent
(libvirtd) creates:

  /var/lib/libvirt/qemu/$domain.dev
  /var/lib/libvirt/qemu/$domain.devpts

The child (which is going to be qemu eventually) calls unshare()
to create new mount namespace. From now on anything that child
does is invisible to the parent. Child then mounts tmpfs on
$domain.dev (so that it still sees original /dev from the host)
and creates some devices (as explained in one of the previous
patches). The devices have to be created exactly as they are in
the host (including perms, seclabels, ACLs, ...). After that it
moves $domain.dev mount to /dev.

What's the $domain.devpts mount there for then you ask? QEMU can
create PTYs for some chardevs. And historically we exposed the
host ends in our domain XML allowing users to connect to them.
Therefore we must preserve devpts mount to be shared with the
host's one.

To make this patch as small as possible, creating of devices
configured for domain in question is implemented in next patches.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Nikolay Shirokovskiy
aaf2992d90 qemu: agent: fix unsafe agent access
qemuDomainObjExitAgent is unsafe.

First it accesses domain object without domain lock.
Second it uses outdated logic that goes back to commit 79533da1 of
year 2009 when code was quite different. (unref function
instead of unreferencing only unlocked and disposed object
in case of last reference and leaved unlocking to the caller otherwise).
Nowadays this logic may lead to disposing locked object
i guess.

Another problem is that the callers of qemuDomainObjEnterAgent
use domain object again (namely priv->agent) without domain lock.

This patch address these two problems.

qemuDomainGetAgent is dropped as unused.
2016-11-23 11:31:28 +03:00
Nikolay Shirokovskiy
3c1c56781d qemu: drop write-only agentStart 2016-11-23 11:31:14 +03:00
Peter Krempa
3f71c79768 qemu: monitor: Extract qemu cpu id along with other data
Storing of the ID will allow simpler extraction of data present only in
query-cpus without the need to call qemuMonitorGetCPUInfo in statistics
paths.
2016-11-21 17:19:48 +01:00
Laine Stump
50adb8a660 qemu: new functions qemuDomainMachineHasPCI[e]Root()
These functions provide a simple one line method of learning if the
current domain has a pci-root or pcie-root bus.
2016-11-14 14:03:09 -05:00
Peter Krempa
93d9ff3da0 qemu: process: detect if dimm aliases are broken on reconnect
Detect on reconnect to a running qemu VM whether the alias of a
hotpluggable memory device (dimm) does not match the dimm slot number
where it's connected to. This is necessary as qemu is actually
considering the alias as machine ABI used to connect the backend object
to the dimm device.

This will require us to keep them consistent so that we can reliably
restore them on migration. In some situations it was currently possible
to create a mismatched configuration and qemu would refuse to restore
the migration stream.

To avoid breaking existing VMs we'll need to keep the old algorithm
though.
2016-11-10 17:36:55 +01:00
John Ferlan
daf5c651f0 qemu: Add a secret object to/for a char source dev
Add the secret object so the 'passwordid=' can be added if the command line
if there's a secret defined in/on the host for TCP chardev TLS objects.

Preparation for the secret involves adding the secinfo to the char source
device prior to command line processing. There are multiple possibilities
for TCP chardev source backend usage.

Add test for at least a serial chardev as an example.
2016-10-26 07:18:25 -04:00
Viktor Mihajlovski
08f22976b1 qemu: Add domain support for VCPU halted state
Adding a field to the domain's private vcpu object to hold the halted
state information.
Adding two functions in support of the halted state:
- qemuDomainGetVcpuHalted: retrieve the halted state from a
  private vcpu object
- qemuDomainRefreshVcpuHalted: obtain the per-vcpu halted states
  via qemu monitor and store the results in the private vcpu objects

Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Hao QingFeng <haoqf@linux.vnet.ibm.com>
Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
2016-10-24 18:52:36 -04:00
Pavel Hrdina
0298531b29 domain: Add optional 'tls' attribute for TCP chardev
Add an optional "tls='yes|no'" attribute for a TCP chardev.

For QEMU, this will allow for disabling the host config setting of the
'chardev_tls' for a domain chardev channel by setting the value to "no" or
to attempt to use a host TLS environment when setting the value to "yes"
when the host config 'chardev_tls' setting is disabled, but a TLS environment
is configured via either the host config 'chardev_tls_x509_cert_dir' or
'default_tls_x509_cert_dir'

Signed-off-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2016-10-24 16:05:33 +02:00
John Ferlan
7bd8312e7f conf: Move the privateData from virDomainChrDef to virDomainChrSourceDef
Commit id '5f2a132786' should have placed the data in the host source
def structure since that's also used by smartcard, redirdev, and rng in
order to provide a backend tcp channel.  The data in the private structure
will be necessary in order to provide the secret properly.

This also renames the previous names from "Chardev" to "ChrSource" for
the private data structures and API's
2016-10-21 16:42:59 -04:00
John Ferlan
5f2a132786 qemu: Introduce qemuDomainChardevPrivatePtr
Modeled after the qemuDomainHostdevPrivatePtr (commit id '27726d8c'),
create a privateData pointer in the _virDomainChardevDef to allow storage
of private data for a hypervisor in order to at least temporarily store
secret data for usage during qemuBuildCommandLine.

NB: Since the qemu_parse_command (qemuParseCommandLine) code is not
expecting to restore the secret data, there's no need to add code
code to handle this new structure there.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2016-10-19 15:40:29 -04:00
Pavel Hrdina
4c029e8cfa qemu_command: properly detect which model to use for video device
This improves commit 706b5b6277 in a way that we check qemu capabilities
instead of what architecture we are running on to detect whether we can
use *virtio-vga* model or not.  This is not a case only for arm/aarch64.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2016-10-12 17:46:48 +02:00
Peter Krempa
63aac8c299 qemu: domain: Add macro to simplify access to vm private data
Sometimes adding a separate variable to access vm->privateData is not
necessary. Add a macro that will do the typecasting rather than having
to add a temp variable to force the compiler to typecast it.
2016-09-21 16:32:36 +02:00
Martin Kletzander
0f61d7b5f2 qemu: Abstract shmem socket path preparation
Put it into qemuDomainPrepareShmemChardev() so it can be used later.
Also don't fill in the path unless the server option is enabled.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2016-09-20 15:42:43 +02:00
Jiri Denemark
56258a388f qemu: Don't use query-migrate on destination
When migration fails, we need to poke QEMU monitor to check for a reason
of the failure. We did this using query-migrate QMP command, which is
not supposed to return any meaningful result on the destination side.
Thus if the monitor was still functional when we detected the migration
failure, parsing the answer from query-migrate always failed with the
following error message:

    "info migration reply was missing return status"

This irrelevant message was then used as the reason for the migration
failure replacing any message we might have had.

Let's use harmless query-status for poking the monitor to make sure we
only get an error if the monitor connection is broken.

https://bugzilla.redhat.com/show_bug.cgi?id=1374613

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-09-12 15:56:10 +02:00
Peter Krempa
20ef1232ec qemu: process: Copy final vcpu order information into the vcpu definition
The vcpu order information is extracted only for hotpluggable entities,
while vcpu definitions belonging to the same hotpluggable entity need
to all share the order information.

We also can't overwrite it right away in the vcpu info detection code as
the order is necessary to add the hotpluggable vcpus enabled on boot in
the correct order.

The helper will store the order information in places where we are
certain that it's necessary.
2016-08-24 15:44:47 -04:00
Peter Krempa
48e3d42889 qemu: migration: Prepare for non-contiguous vcpu configurations
Introduce a new migration cookie flag that will be used for any
configurations that are not compatible with libvirt that would not
support the specific vcpu hotplug approach. This will make sure that old
libvirt does not fail to reproduce the configuration correctly.
2016-08-24 15:44:47 -04:00
Peter Krempa
133be0a9e2 qemu: domain: Prepare for VCPUs vanishing while libvirt is not running
Similarly to devices the guest may allow unplug of the VCPU if libvirt
is down. To avoid problems, refresh the vcpu state on reconnect. Don't
mess with the vcpu state otherwise.
2016-08-24 15:44:47 -04:00
Peter Krempa
6b4a23ff6c qemu: domain: Extract cpu-hotplug related data
Now that the monitor code gathers all the data we can extract it to
relevant places either in the definition or the private data of a vcpu.

As only thread id is broken for TCG guests we may extract the rest of
the data and just skip assigning of the thread id. In case where qemu
would allow cpu hotplug in TCG mode this will make it work eventually.
2016-08-24 15:44:47 -04:00
Peter Krempa
2bdc300a34 qemu: domain: Improve vCPU data checking in qemuDomainRefreshVcpu
Validate the presence of the thread id according to state of the vCPU
rather than just checking the vCPU count. Additionally put the new
validation code into a separate function so that the information
retrieval can be split from the validation.
2016-08-04 08:08:31 +02:00
Peter Krempa
8f56b5baaf qemu: domain: Rename qemuDomainDetectVcpuPids to qemuDomainRefreshVcpuInfo
The function will eventually do more useful stuff than just detection of
thread ids.
2016-08-04 08:03:58 +02:00
Martin Kletzander
a2b97a8d91 qemu: Fix support for startupPolicy with volume/pool disks
Until now we simply errored out when the translation from pool+volume
failed.  However, we should instead check whether that disk is needed or
not since there is an option for that.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1168453

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2016-08-02 13:21:01 +02:00
Tomasz Flendrich
1aa5e66cf3 qemu: remove ccwaddrs caching
Dropping the caching of ccw address set.
The cached set is not required anymore, because the set is now being
recalculated from the domain definition on demand, so the cache
can be deleted.
2016-07-26 13:04:46 +02:00
Tomasz Flendrich
19a148b7c8 qemu: remove vioserialaddrs caching
Dropping the caching of virtio serial address set.
The cached set is not required anymore, because the set is now being
recalculated from the domain definition on demand, so the cache
can be deleted.

Credit goes to Cole Robinson.
2016-07-26 13:04:46 +02:00
Ján Tomko
ddd31fd7dc Reserve existing USB addresses
Check if they fit on the USB controllers the domain has,
and error out if two devices try to use the same address.
2016-07-21 08:30:26 +02:00
Peter Krempa
5184f398b4 qemu: Store vCPU thread ids in vcpu private data objects
Rather than storing them in an external array store them directly.
2016-07-11 10:44:09 +02:00
Peter Krempa
2540c93203 qemu: domain: Add vcpu private data structure
Members will be added in follow-up patches.
2016-07-11 10:44:09 +02:00
John Ferlan
60c40ce3be qemu: Introduce helper qemuDomainSecretDiskCapable
Introduce a helper to help determine if a disk src could be possibly used
for a disk secret... Going to need this for hot unplug.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2016-07-01 15:46:57 -04:00
Andrea Bolognani
177ecaa598 qemu: Introduce qemuDomainMachineIsPSeries()
This new function checks for both the architecture and the
machine type, so we can use it instead of writing the same
checks over and over again.
2016-06-24 10:17:59 +02:00
John Ferlan
dc428f450a qemu: Add new secret info type
Add 'encinfo' to the extended disk structure. This will contain the
encryption secret (if present).

Signed-off-by: John Ferlan <jferlan@redhat.com>
2016-06-23 12:30:28 -04:00
Martin Kletzander
f670008b58 qemu: Move channel path generation out of command creation
Put it into separate function called qemuDomainPrepareChannel() and call
it from the new qemuProcessPrepareDomain().

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2016-06-09 13:23:15 +02:00
Peter Krempa
91a6eacc8f qemu: domain: Implement helper for one-shot log entries to the VM log file
Along with the virtlogd addition of the log file appending API implement
a helper for logging one-shot entries to the log file including the
fallback approach of using direct file access.

This will be used for noting the shutdown of the qemu proces and
possibly other actions such as VM migration and other critical VM
lifecycle events.
2016-06-07 18:10:29 +02:00
John Ferlan
fb06350021 qemu: Remove unused persistentAddrs
Based on some digital archaeology performed by jtomko, it's been determined
that the persistentAddrs variable is no longer necessary...

The variable was added by:
commit 141dea6bc7
CommitDate: 2010-02-12 17:25:52 +0000
    Add persistence of PCI addresses to QEMU

Where it was set to 0 on domain startup if qemu did not support the
QEMUD_CMD_FLAG_DEVICE capability, to clear the addresses at shutdown,
because QEMU might make up different ones next time.

As of commit f5dd58a608
CommitDate: 2012-07-11 11:19:05 +0200
    qemu: Extended qemuDomainAssignAddresses to be callable from
    everywhere.

this was broken, when the persistentAddrs = 0 assignment was moved
inside qemuDomainAssignPCIAddresses and while it pretends to check
for !QEMU_CAPS_DEVICE, its parent qemuDomainAssignAddresses is only
called if QEMU_CAPS_DEVICE is present.
2016-05-25 06:02:42 -04:00
Peter Krempa
894dc85fd1 qemu: process: Fix and improve disk data extraction
Extract information for all disks and update tray state and source only
for removable drives. Additionally store whether a drive is removable
and whether it has a tray.
2016-05-25 10:15:54 +02:00
Peter Krempa
f1690dc3d7 qemu: Extract more information about qemu drives
Extract whether a given drive has a tray and whether there is no image
inserted.

Negative logic for the image insertion is chosen so that the flag is set
only if we are certain of the fact.
2016-05-25 10:15:54 +02:00
Peter Krempa
5f963d89b1 qemu: Move struct qemuDomainDiskInfo to qemu_domain.h 2016-05-25 10:15:54 +02:00