Introduce a new function to prepare domain disks, which will also do the
volume source to actual disk source translation.
The 'pretend' condition is not transferred to the new location since it
does not help in writing tests and no tests abuse it.
qemu 2.7.0 introduces multiqueue virtio-blk (commit 2f27059).
This patch introduces a new attribute "queues". An example of
the XML:

<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2' queues='4'/>
  ...
</disk>
The corresponding QEMU command line:
-device virtio-blk-pci,scsi=off,num-queues=4,id=virtio-disk0
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Introduce a function to set up any TLS needs for a disk source.
If there's a configuration or other error setting up the disk source
for TLS, then cause the domain startup to fail.
For VxHS, follow the chardevTLS model: if src->haveTLS hasn't been
configured, take the system/global cfg->haveTLS setting for the
storage source *and* mark that we've done so via the tlsFromConfig
field in the storage source.
Next, if we are using TLS, generate an alias into the virStorageSource
'tlsAlias' field; it will be used to create the TLS object and will be
added to the disk object in order to link the two together for QEMU.
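A rough sketch of the intended flow (simplified; the helper and field
names follow the description above, details of the real implementation
may differ):

static int
qemuDomainPrepareDiskSourceTLS(virStorageSourcePtr src,
                               const char *srcAlias,
                               virQEMUDriverConfigPtr cfg)
{
    if (src->protocol == VIR_STORAGE_NET_PROTOCOL_VXHS) {
        /* Not explicitly configured: inherit the system/global
         * setting and remember that we did so. */
        if (src->haveTLS == VIR_TRISTATE_BOOL_ABSENT) {
            src->haveTLS = cfg->haveTLS ? VIR_TRISTATE_BOOL_YES
                                        : VIR_TRISTATE_BOOL_NO;
            src->tlsFromConfig = true;
        }

        /* Generate the alias that ties the TLS object to the disk. */
        if (src->haveTLS == VIR_TRISTATE_BOOL_YES &&
            !(src->tlsAlias = qemuAliasTLSObjFromSrcAlias(srcAlias)))
            return -1;  /* domain startup fails */
    }

    return 0;
}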
Signed-off-by: John Ferlan <jferlan@redhat.com>
VM private data is cleared when the VM is turned off and also when the
VM object is being freed. Some of the clearing code was duplicated.
Extract it to a separate function.
This also removes the now unnecessary function
qemuDomainClearPrivatePaths.
Since commit v2.2.0-199-g7ce711a30e libvirt stores an updated guest CPU
in the domain's live definition and there's no need to update it every
time we want to format the definition. The commit itself tried to address
this in qemuDomainFormatXML, but forgot to fix qemuDomainDefFormatLive.
Not to mention that masking a previously set flag is only acceptable if
the flag was set by a public API user; internally, libvirt should never
have set the flag in the first place.
https://bugzilla.redhat.com/show_bug.cgi?id=1485022
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Block job QMP commands with underscores rather than dashes were never
released in upstream qemu (they were added, but modified in the same
release [1]), yet a certain distro managed to backport the intermediate
version.
The change also slightly modified the semantics of the abort command,
which left us with a lot of code that was only ever needed for certain
downstream distros.
Clean the upstream code from the legacy cruft and support only the
upstream implementations.
[1] See qemu commit v1.0-2176-gdb58f9c060
Reviewed-by: Eric Blake <eblake@redhat.com>
There is no need to pass a @driver parameter since all that's done with
it is dereferencing the @cfg, especially since the only caller can just
pass an already referenced @cfg.
Also, it looks like commit id '0298531b' at one time used a different
name for the API, so I took the liberty of fixing the comments too
since I was already updating them for the @cfg variable.
In case of a real migration (as opposed to migrating to a file on save,
dump, etc.) the migration info is not complete at the time qemu finishes
the migration in normal (non-postcopy) mode. We still need to update
disk stats, downtime info, etc. Thus let's not expose this job status
as completed.
To achieve this, let's set the status to 'qemu completed' after
qemu reports the migration is finished. This is not visible as a
completed job to clients; the cookie code in the confirm phase will
finally turn the job into a completed one. As there is nothing more
to do when migrating to a file, the status is set to 'completed'
as before in that case.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
When getting job info, fetch the mirror stats from qemu if the mirror
has not reached the ready phase yet. Otherwise, the mirror stats are
already saved in the current job.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Instead of checking stats.status, let's set the status to 'migrating'
as soon as the migrate command is sent (waiting for completion
is a good place too).
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Let's introduce a QEMU_DOMAIN_JOB_STATUS_POSTCOPY state for
job.current->status instead of checking job.current->stats.status. The
latter can change when fetching migration statistics. Moving the state
tracking away from that variable and leaving it only as a store for
statistics seems more manageable.
This patch removes all state-checking usage of stats except for
qemuDomainGetJobStatsInternal; that place will be handled separately.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
This patch simply switches the code from using VIR_DOMAIN_JOB_* to the
newly introduced QEMU_DOMAIN_JOB_STATUS_*. Later this gives us the
freedom to introduce states for the postcopy and mirroring phases, as
sketched below.
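An illustrative sketch of the resulting driver-private enum (only the
names mentioned in this series are certain; the rest mirror the public
VIR_DOMAIN_JOB_* values):

typedef enum {
    QEMU_DOMAIN_JOB_STATUS_NONE = 0,
    QEMU_DOMAIN_JOB_STATUS_ACTIVE,
    QEMU_DOMAIN_JOB_STATUS_MIGRATING,
    /* qemu reported completion but the job is not yet
     * visible to clients as completed */
    QEMU_DOMAIN_JOB_STATUS_QEMU_COMPLETED,
    QEMU_DOMAIN_JOB_STATUS_POSTCOPY,
    QEMU_DOMAIN_JOB_STATUS_COMPLETED,
    QEMU_DOMAIN_JOB_STATUS_FAILED,
    QEMU_DOMAIN_JOB_STATUS_CANCELED,
} qemuDomainJobStatus;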
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
In some places we either already have a synchronous job or have just
released it. Also, some APIs might want to use this code without
having to release their job. Therefore, the job acquisition code is
moved out into qemuDomainRemoveInactiveJob so that
qemuDomainRemoveInactive does just what it promises.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
It is more related to a domain since we might use it even when there is
no systemd, and it does not use any dbus/systemd functions. In order
not to use code from conf/ in util/, pass machineName to the cgroups
code as a parameter. That also fixes a leak of machineName in the lxc
driver and cleans up and de-duplicates some code.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
vcpu properties gathered from query-hotpluggable-cpus need to be passed
back to qemu. As qemu did not use the node-id property until now and
libvirt forgot to pass it back properly (it was parsed but not passed
around), we did not honor it.
This patch adds node-id to the structures where it was missing and
passes it around as necessary.
The test data was generated with a VM with the following config:

<numa>
  <cell id='0' cpus='0,2,4,6' memory='512000' unit='KiB'/>
  <cell id='1' cpus='1,3,5,7' memory='512000' unit='KiB'/>
</numa>
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1452053
In the QEMU driver we can use virtlogd as the stdio handler for the
source backend of char devices if the current QEMU is new enough and
it's enabled in qemu.conf. We should store this information when
starting a guest because the config option may change while the guest
is running.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
When making ABI stability checks for an active domain, we need to make
sure we use the same migratable definition which virDomainGetXMLDesc
with the MIGRATABLE flag provides, otherwise the ABI check will fail.
This is implemented in the new qemuDomainCheckABIStability which takes a
domain object and generates the right migratable definition from it.
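A rough sketch of what the new function does (simplified, error
reporting omitted; not the literal implementation):

static bool
qemuDomainCheckABIStability(virQEMUDriverPtr driver,
                            virDomainObjPtr vm,
                            virDomainDefPtr dst)
{
    virDomainDefPtr migratableDef = NULL;
    bool ret = false;

    /* generate the same migratable definition that
     * virDomainGetXMLDesc() with the MIGRATABLE flag uses */
    if (!(migratableDef = qemuDomainDefCopy(driver, vm->def,
                                            VIR_DOMAIN_XML_MIGRATABLE)))
        goto cleanup;

    ret = virDomainDefCheckABIStability(migratableDef, dst);

 cleanup:
    virDomainDefFree(migratableDef);
    return ret;
}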
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
If QEMU is new enough and we have the live updated CPU definition in
either save or migration cookie, we can use it to enforce ABI. The
original guest CPU from domain XML will be stored in private data.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Since the domain XML saved in a snapshot or saved image uses the
original guest CPU definition but we still want to enforce ABI when
restoring the domain if libvirt and QEMU are new enough, we save the
live updated CPU definition in a save cookie.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The destination host may not be able to start a domain using the live
updated CPU definition because either libvirt or QEMU may not be new
enough. Thus we need to send the original guest CPU definition.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
When starting a domain we update the guest CPU definition to match what
QEMU actually provided (since it is allowed to add or remove some
features unless check='full' is specified). Let's store the original CPU
in domain private data so that we can use it to provide a backward
compatible domain XML.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
This patch implements a new save cookie object and callbacks for the
qemu driver. The actual useful content will be added to the object later.
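A sketch of the (for now empty) cookie object (names illustrative):

typedef struct _qemuDomainSaveCookie qemuDomainSaveCookie;
typedef qemuDomainSaveCookie *qemuDomainSaveCookiePtr;

struct _qemuDomainSaveCookie {
    virObject parent;

    /* the actual useful content (e.g. the live CPU definition from
     * the follow-up patches) will be added here later */
};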
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
This will be used later when the save cookie becomes part of the
snapshot XML via new driver-specific parser/formatter functions.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1450349
The problem is that qemu fails to load the guest memory image if these
attributes change on migration/restore from an image.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Not all async jobs are visible via virDomainGetJobStats (either they are
too fast or getting the stats is not allowed during the job), but
forcing all of them to advertise the operation is easier than hunting
the jobs for which fetching statistics is allowed. And we won't need to
think about this when we add support for getting stats for more jobs.
https://bugzilla.redhat.com/show_bug.cgi?id=1441563
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Introduce new wrapper functions without *Machine* in the function
name that take the whole virDomainDef structure as an argument and
call the existing functions with *Machine* in the name, as illustrated
below.
Change the arguments of the existing functions to *machine* and *arch*
because they don't need the whole virDomainDef structure and can then
be used in places where we don't have a virDomainDef.
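For instance (a hypothetical pair of functions, shown only to
illustrate the pattern):

/* existing function, now taking machine and arch instead of the def */
bool qemuDomainMachineIsQ35(const char *machine, virArch arch);

/* new wrapper without *Machine* in the name */
bool
qemuDomainIsQ35(const virDomainDef *def)
{
    return qemuDomainMachineIsQ35(def->os.machine, def->os.arch);
}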
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
This way qemuDomainLogContextRef() and qemuDomainLogContextFree() are
no longer needed. The name qemuDomainLogContextFree() was also
somewhat misleading. Additionally, it's easier to turn
qemuDomainLogContext into a self-locking object.
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Instead of having a separate function, we can simply return
zero from the existing qemuDomainGetMemLockLimitBytes() to
signal the caller that the memory locking limit doesn't need
to be set for the guest.
Having a single function instead of two makes it less likely
that we will use the wrong value, which is exactly what
happened when we started applying the limit that was meant
for VFIO-using guests to <memoryBacking><locked>-using
guests.
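The calling pattern then becomes (a sketch; variable names are
illustrative):

unsigned long long bytes = qemuDomainGetMemLockLimitBytes(vm->def);

/* 0 now means "no memory locking limit needs to be set" */
if (bytes > 0 &&
    virProcessSetMaxMemLock(vm->pid, bytes) < 0)
    goto cleanup;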
So far, official support covers x86_64 guests only, so unless a
device API other than vfio-pci is available, let's only turn on
support for PCI address assignment. Once a different device API is
introduced, we can easily enable another address type.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
To allow matching the node names gathered via 'query-named-block-nodes'
we need to query and then use the top level nodes from 'query-block'.
Add the data to the structure returned by qemuMonitorGetBlockInfo.
The code is currently simple, but if we later add node names, it will be
necessary to generate the names based on the node name. Add a helper so
that there's a central point to fix once we add self-generated node
names.
If the migration flags indicate this migration will be using TLS,
then during the prepare phase, once the target domain has been started,
set up the destination by adding the TLS objects needed to perform the
migration.
This will create at least an "-object tls-creds-x509,endpoint=server,..."
for TLS credentials and potentially an "-object secret,..." to handle the
passphrase response to access the TLS credentials. The alias/id used for
the TLS objects will contain "libvirt_migrate".
Once the objects are created, the code will set the "tls-creds" and
"tls-hostname" migration parameters to signify usage of TLS.
During the Finish phase we'll be sure to attempt to clear the
migration parameters and delete those objects (whether or not they
were created). We'll also perform the same reset during recovery
if we've reached FINISH3.
If the migration isn't using TLS, then be sure to check if the
migration parameters exist and clear them if so.
https://bugzilla.redhat.com/show_bug.cgi?id=1430634
If a qemu process has died, we get EOF on its monitor. At this
point, since the qemu process was the only one running in the
namespace, the kernel has already cleaned the namespace up. Any
attempt of ours to enter it has to fail.
This really happened in the bug linked above. We tried to attach a
disk to qemu and while we were in the monitor talking to qemu, it
just died. Therefore our code tried to do some rollback (e.g. deny
the device in cgroups again, restore labels, etc.). However, during
the rollback (esp. when restoring labels) we still thought that the
domain had a namespace, so we used the secdriver's transactions.
This failed as there was no namespace to enter.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The functions in virCommand() after fork() must be careful with regard
to accessing any mutexes that may have been locked by other threads in
the parent process. It is possible that another thread in the parent
process holds the lock for the virQEMUDriver while fork() is called.
This leads to a deadlock in the child process when
'virQEMUDriverGetConfig(driver)' is called and therefore the handshake
never completes between the child and the parent process. Ultimately
the virDomainObjectPtr will never be unlocked.
It gets much worse if the other thread of the parent process, that
holds the lock for the virQEMUDriver, tries to lock the already locked
virDomainObject. This leads to a completely unresponsive libvirtd.
It's possible to reproduce this case by calling 'virsh start XXX'
and 'virsh managedsave XXX' in a tight loop for multiple domains.
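The hazard is the classic fork()+mutex interaction; a minimal
standalone demonstration (not libvirt code) looks like this:

#include <pthread.h>
#include <sys/wait.h>
#include <unistd.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *
worker(void *opaque)
{
    (void)opaque;
    pthread_mutex_lock(&lock);   /* another thread takes the lock ... */
    sleep(10);                   /* ... and holds it across the fork() */
    pthread_mutex_unlock(&lock);
    return NULL;
}

int
main(void)
{
    pthread_t thr;

    pthread_create(&thr, NULL, worker, NULL);
    sleep(1);                    /* make sure the worker owns the lock */

    if (fork() == 0) {
        /* child: the mutex was copied in its locked state, but the
         * thread that could unlock it does not exist here, so this
         * blocks forever */
        pthread_mutex_lock(&lock);
        _exit(0);
    }

    pthread_join(thr, NULL);
    wait(NULL);
    return 0;
}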
This commit fixes the deadlock in the same way as it is described in
commit 61b52d2e3813cc8c9ff3ab67f232bd0c65f7318d.
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
So far, qemuDomainGetHostdevPath has no knowledge of the reason
it is called and thus reports /dev/vfio/vfio for every VFIO-backed
device. This is suboptimal, as we want it to:
a) report /dev/vfio/vfio on every addition or domain startup
b) report /dev/vfio/vfio only on last VFIO device being unplugged
If a domain is being stopped, its namespace and CGroup die with
it, so there is no need to worry about that case: even when an
exiting domain has more than one VFIO device assigned to it,
this function does not clean up /dev/vfio/vfio in the CGroup or in
the namespace, but that doesn't matter.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
So far, we have been allowing /dev/vfio/vfio in the devices cgroup
unconditionally (and creating it in the namespace too), even if the
domain has no hostdev assignment configured. This is a potential
security hole. Therefore, when starting the domain (or hotplugging
a hostdev), create & allow /dev/vfio/vfio too, but only if it is
actually needed, as sketched below.
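In the cgroup setup the idea is roughly this (helper name hypothetical,
signature simplified):

/* allow the VFIO control device only when the domain actually
 * has a VFIO-backed hostdev */
if (qemuDomainNeedsVFIO(vm->def) &&
    virCgroupAllowDevicePath(priv->cgroup, "/dev/vfio/vfio",
                             VIR_CGROUP_DEVICE_RW) < 0)
    goto cleanup;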
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Since these two functions are nearly identical (with
qemuSetupHostdevCgroup actually calling virCgroupAllowDevicePath)
we can have one function call the other and thus de-duplicate
some code.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
The bare fact that the mnt namespace is available is not enough for
us to allow/enable the qemu namespaces feature. There are other
requirements: we must copy all the ACL & SELinux labels, otherwise
we might grant access that is administratively forbidden, or vice
versa.
At the same time, the check for namespace prerequisites is moved
from domain startup time to the qemu.conf parser, as it doesn't make
much sense to let users start a misconfigured libvirt just to
find out they can't start a single domain.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
These functions do not need to see the whole virDomainDiskDef.
Moreover, they are going to be called from places where we don't
have access to the full disk definition. Sticking with
virStorageSource is more than enough.
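A sketch of the signature change pattern (the function names here are
hypothetical):

/* before: needed the whole disk definition */
int qemuSetupDiskCgroup(virDomainObjPtr vm, virDomainDiskDefPtr disk);

/* after: the storage source alone is enough */
int qemuSetupImageCgroup(virDomainObjPtr vm, virStorageSourcePtr src);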
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
So far we have allowed setting the MTU for libvirt networks. However,
not all domain interfaces have to be plugged into a libvirt network,
and even if they are, they might want to have a different MTU (e.g.
for testing purposes).
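For example, an interface could then carry its own MTU element, e.g.
<mtu size='9000'/> (the size value is illustrative), regardless of the
network it is connected to.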
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1405269
If a secret was not provided for what was determined to be a LUKS
encrypted disk (during virStorageFileGetMetadata processing when
called from qemuDomainDetermineDiskChain as a result of a hotplug
attach via qemuDomainAttachDeviceDiskLive), then do not attempt to
look it up (avoiding a libvirtd crash) and do not alter the format
to "luks" when adding the disk; otherwise, the device_add would
fail with a message such as:
"unable to execute QEMU command 'device_add': Property 'scsi-hd.drive'
can't find value 'drive-scsi0-0-0-0'"
because of the assumption that when format=luks, libvirt would have
provided the secret to decrypt the volume.
Access to unlock the volume will thus be left to the application.