For example if disk is not in the group and we want to move it
there then it makes sense to specify only the group name in API call.
Currently the destination group iotune settings will be overwritten
with the disk settings which I would say is not what one would expect.
Thus let's get defaults from the group we are moving to.
And if we are moving the brand new group then is makes sense to
copy the current disk iotune settings to the group.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
virDomainSetBlockIoTune not simply sets the iotune params given in API
but use current settings for all the omitted params. Unfortunately
it uses current settings for active config when setting inactive
params. Let's fix it.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Currently it is possible to start a domain which have disks
in same iotune group and at the same time having different iotune
params. Both params set are passed to qemu in command line and the one
that is passed later down command line is get actually set.
Let's prohibit such configurations.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Currently upon successfull call to qemu's implementation of
virDomainSetBlockIoTune iotune settings are changed only for the
disk given in API if the disk is in iotune group while we need
to change the settings for all disks in the group.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
A recent commit added an error check for too-nested backing chains
followed by a return, even though errors above jump to cleanup.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Fixes: b168fa88b85dec181882816ab65a59a6c4500667
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Similarly to 510d154a0b41aa70aadabc0918d16dee22882394 we need to prevent
doing too deeply nested backing chains and reject them with a sane error
message.
Add a loop to go through the snapshots prior to attempting actually
creating them to prevent some possible inconsistent scenarios.
We don't need to do it when reusing backing chains as we'll be
re-detecting the backing chain in that case anyways.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Don't adopt the backing store data when reusing images provided by the
user. This will force a backing chain re-probe as users might have
passed in something unexpected in the overlay where our view of the
backing chain would not correspond.
This is done only for inactive snapshots as there we have way less
verification.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
This enables support for running QEMU embedded to the calling
application process using a URI:
qemu:///embed?root=/some/path
Note that it is important to keep the path reasonably short to
avoid risk of hitting the limit on UNIX socket path names
which is 108 characters.
When using the embedded mode with a root=/var/tmp/embed, the
driver will use the following paths:
logDir: /var/tmp/embed/log/qemu
swtpmLogDir: /var/tmp/embed/log/swtpm
configBaseDir: /var/tmp/embed/etc/qemu
stateDir: /var/tmp/embed/run/qemu
swtpmStateDir: /var/tmp/embed/run/swtpm
cacheDir: /var/tmp/embed/cache/qemu
libDir: /var/tmp/embed/lib/qemu
swtpmStorageDir: /var/tmp/embed/lib/swtpm
defaultTLSx509certdir: /var/tmp/embed/etc/pki/qemu
These are identical whether the embedded driver is privileged
or unprivileged.
This compares with the system instance which uses
logDir: /var/log/libvirt/qemu
swtpmLogDir: /var/log/swtpm/libvirt/qemu
configBaseDir: /etc/libvirt/qemu
stateDir: /run/libvirt/qemu
swtpmStateDir: /run/libvirt/qemu/swtpm
cacheDir: /var/cache/libvirt/qemu
libDir: /var/lib/libvirt/qemu
swtpmStorageDir: /var/lib/libvirt/swtpm
defaultTLSx509certdir: /etc/pki/qemu
At this time all features present in the QEMU driver are available when
running in embedded mode, availability matching whether the embedded
driver is privileged or unprivileged.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The intent here is to allow the virt drivers to be run directly embedded
in an arbitrary process without interfering with libvirtd. To achieve
this they need to store all their configuration & state in a separate
directory tree from the main system or session libvirtd instances.
This can be useful for doing testing of the virt drivers in "make check"
without interfering with the user's own libvirtd instances.
It can also be used for applications using KVM/QEMU as a piece of
infrastructure to build an service, rather than for general purpose
OS hosting. A long standing example is libguestfs, which would prefer
if its temporary VMs did show up in the main libvirtd VM list, because
this confuses apps such as OpenStack Nova. A more recent example would
be Kata which is using KVM as a technology to build containers.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
When using blockdev configurations the 'device' argument of
'blockdev-commit' must correspond to the topmost node in the block node
graph. Libvirt didn't do this properly in case when 'copy_on_read'
option was enabled on the disk.
Use qemuDomainDiskGetTopNodename to fix it when calling block-commit.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
When using blockdev configurations the 'device' argument of
'blockdev-mirror' must correspond to the topmost node in the block node
graph. Libvirt didn't do this properly in case when 'copy_on_read'
option was enabled on the disk.
Use qemuDomainDiskGetTopNodename to fix it for the blockdev-mirror calls
in qemuDomainBlockCopy and the non-shared-storage migration.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
If a mirror job fails to start in -blockdev mode we'd not unplug the
backing files we added first because the code on the error path checked
the wrong value. 'rc' is used as status of the code which added the
images, but the state of the 'block(dev)-mirror' call is stored in 'ret'
at that point.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
QEMU driver has two functions: qemuGetDHCPInterfaces() and
qemuARPGetInterfaces() that are being used inside only one single
function. They can be turned into generic functions that other drivers
can use. This commit move both from QEMU driver tree to domain conf
tree.
Signed-off-by: Julio Faracco <jcfaracco@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This function grabs an agent job but ends a monitor job.
End the agent job instead.
https://bugzilla.redhat.com/show_bug.cgi?id=1792723
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reported-by: Dan Zheng <dzheng@redhat.com>
Fixes: e005c95f56fee9ed780be7f8db103d690bd34cbd
gmtime_r/localtime_r are mostly used in combination with
strftime to format timestamps in libvirt. This can all
be replaced with GDateTime resulting in simpler code
that is also more portable.
There is some boundary condition problem in parsing POSIX
timezone offsets in GLib which tickles our test suite.
The test suite is hacked to avoid the problem. The upsteam
GLib bug report is
https://gitlab.gnome.org/GNOME/glib/issues/1999
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
G_STATIC_ASSERT() is a drop-in functional equivalent of
the GNULIB verify() macro.
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Switch from old VIR_ allocation APIs to glib equivalents.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
In order to avoid holding an agent job and a normal job at the same
time, we want to avoid accessing the domain's definition while holding
the agent job. To achieve this, qemuAgentGetFSInfo() only returns the
raw information from the agent query to the caller. The caller can then
release the agent job and then proceed to look up the disk alias from
the vm definition. This necessitates moving a few helper functions to
qemu_driver.c and exposing the agent data structure (qemuAgentFSInfo) in
the header.
In addition, because the agent function no longer returns the looked-up
disk alias, we can't test the alias within qemuagenttest. Instead we
simply test that we parse and return the raw agent data correctly.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
When resuming a domain from a save file, we read the domain XML
from the file, add it onto our internal list of domains, start
the qemu process, let it load the incoming migration stream and
resume its vCPUs afterwards. If anything goes wrong, the domain
object is removed from the list of domains and error is returned
to the caller. However, the qemu process might be left behind -
if resuming vCPUs fails (e.g. because qemu is unable to acquire
write lock on a disk) then due to a bug the qemu process is not
killed but the domain object is removed from the list.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1718707
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
We have to keep the default - querying the agent if no flag is
set.
Signed-off-by: Julio Faracco <jcfaracco@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Internal snapshots of a non-running domain do not carry any memory state
and restoring such a snapshot will not replace existing saved memory
state. This allows a scenario, where a user first suspends a domain into
managedsave, restores a non-running snapshot and then resumes the domain
from managedsave. After that, the guest system will run with its
previous memory state atop a different disk state. The most obvious
possible fallout from this is extensive file system corruption. Swap
content and RAID bitmaps might also be off.
This has been discussed[1] and fixed[2] from the end-user perspective for
virt-manager.
This patch marks the restore operation as risky at the libvirt level,
requiring the user to remove the saved memory state first or force the
operation.
[1] https://www.redhat.com/archives/virt-tools-list/2019-November/msg00011.html
[2] https://www.redhat.com/archives/virt-tools-list/2019-December/msg00049.html
Signed-off-by: Michael Weiser <michael.weiser@gmx.de>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
These functions are meant to replace verbose check for the old
style of specifying UEFI with a simple function call.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
The underlying resctrl monitoring is actually using 64 bit counters,
not the 32bit one. Correct this by using 64bit data type for reading
hardware value.
To keep the interface consistent, the result of CPU last level cache
that occupied by vcpu processors of specific restrl monitor group is
still reported with a truncated 32bit data type. because, in silicon
world, CPU cache size will never exceed 4GB.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com>
When cancelling the blockjobs as part of failed backup job startup
recover we didn't pass in the correct async job type. Luckily the block
job handler and cancellation code paths use no block job at all
currently so those were correct.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
When freeing qemu driver struct members, we forgot to free
@hostcpu and @hostnuma members.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
This function is supposed to clean up virQEMUDriver structure and
free individual members. However, it's doing that in random order
which makes it hard to track which members are being freed and
which are not. Do the free in reverse order than the structure
definition - assuming that the most important members (like
mutex) are declared first and freed last.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Prior to commit 55ce6564634 (first in libvirt 4.6.0), the XML sent to
virDomainAttachDeviceFlags() was parsed only once, and the results of
that parse were inserted into both the live object of the running
domain and into the persistent config. Thus, if MAC address was
omitted from in XML for a network device (<interface>), both the live
and config object would have the same MAC address.
Commit 55ce6564634 changed the code to parse the incoming XML twice -
once for live and once for config. This does eliminate the problem of
PCI (/scsi/sata) address conflicts caused by allocating an address
based on existing devices in live object, but then inserting the
result into the config (which may already have a device using that
address), BUT it also means that when the MAC address of a network
device hasn't been specified in the XML, each copy will get a
different auto-generated MAC address.
This results in the MAC address of the device changing the next time
the domain is shutdown and restarted, which creates havoc with the
guest OS's network config.
There have been several discussions about this in the last > 1 year,
attempting to find the ideal solution to this problem that makes MAC
addresses consistent and accounts for all sorts of corner cases with
PCI/scsi/sata addresses. All of these discussions fizzled out because
every proposal was either too difficult to implement or failed to fix
some esoteric case someone thought up.
So, in the interest of solving the MAC address problem while not
making the "other address" situation any worse than before, this patch
simply adds a qemuDomainAttachDeviceLiveAndConfigHomogenize() function
that (for now) copies the MAC address from the config object to the
live object (if the original xml had <mac address='blah'/> then this
will be an effective NOP (as the macs already match)).
Any downstream libvirt containing upstream commit
55ce6564634 should have this patch as well.
https://bugzilla.redhat.com/1783411
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
If we use glib alloc functions, we can drop the 'cleanup' label
and @rv variable and also simplify the code a bit.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Some variables are not used outside of the for() loop. Move their
declaration to clean up the code a bit.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
When using the monolithic daemon, then dom->conn has all driver
tables filled in properly and thus it's safe to call an API other
than virDomain*(). However, when using split daemons then
dom->conn has only hypervisor driver table set
(dom->conn->driver) and the rest is NULL. Therefore, if we want
to call a non-domain API (virNetworkLookupByName() in this case),
we have obtain the cached connection object accessible via
virGetConnectNetwork().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
If we place qemuDomainInterfaceAddresses() a few lines below the
two functions its using then we can drop forward declarations of
those functions.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
To simplify implementation, some restrictions are added. For
instance, an NVMe disk can't go to any bus but virtio and has to
be type of 'disk' and can't have startupPolicy set.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
As of commit 2a00ef6e71f30241f9ca6288da984d75f3cef957 which
was released in v5.2.0, we require YAJL to build the QEMU driver.
Remove the checks from code that requires the QEMU driver
or checks that also check for WITH_QEMU.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Re-create any active persistent bitmap in the snapshot overlay image so
that tracking for a checkpoint is persisted. While this basically
duplicates data in the allocation map it's currently the only possible
way as qemu can't mirror the allocation map into a dirty bitmap if we'd
ever want to do a backup.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
qemuDomainSnapshotDiskPrepareOne is already called for each disk which
is member of the snapshot so we don't need to iterate through the
snapshot list again to generate members of the 'transaction' command for
each snapshot.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
This function will be removed in a future commit because it allows the
caller to acquire both monitor and agent jobs at the same time. Holding
both job types creates a vulnerability to denial of service from a
malicious guest agent.
qemuDomainSetVcpusFlags() always passes NONE for either the monitor job
or the agent job (and thus is not vulnerable to the DoS), so we can
simply replace this function with the functions for acquiring the
appropriate type of job.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
We have to assume that the guest agent may be malicious so we don't want
to allow any agent queries to block any other libvirt API. By holding
a monitor job while we're querying the agent, we open ourselves up to a
DoS.
Split the function so that the portion issuing the agent command only
holds an agent job and the portion issuing the monitor command holds
only a monitor job.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
We have to assume that the guest agent may be malicious so we don't want
to allow any agent queries to block any other libvirt API. By holding a
monitor job while we're querying the agent, we open ourselves up to a
DoS.
So split the function up a bit to only hold the monitor job while
querying qemu for whether the domain supports suspend. Then acquire only
an agent job while issuing the agent suspend command.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
We have to assume that the guest agent may be malicious so we don't want
to allow any agent queries to block any other libvirt API. By holding
a monitor job while we're querying the agent, we open ourselves up to a
DoS.
Split the function so that we only hold the appropriate type of job
while rebooting.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
We have to assume that the guest agent may be malicious so we don't want
to allow any agent queries to block any other libvirt API. By holding
a monitor job while we're querying the agent, we open ourselves up to a
DoS. So split the function into separate parts: one that does the agent
shutdown and one that does the monitor shutdown. Each part holds only a
job of the appropriate type.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
With all plumbing in place, we can now enable the new functionality.
Signed-off-by: Pavel Mores <pmores@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>