The handler finalizing the active layer block commit doesn't actually
reopen the file for active layer block commit, so the comment and check
are invalid.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
To remove dependecy of `qemuDomainJob` on job specific
paramters, a `privateData` pointer is introduced.
To handle it, structure of callback functions is
also introduced.
Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Use variable autocleanup and remove the now obsolete 'cleanup'
label.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Use g_autoptr() on pointers and remove the unneeded 'cleanup'
label.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Use g_autofree and remove the unneeded 'cleanup' label.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Currently, when restoring from a domain the path that the domain
restores from is labelled under qemuSecuritySetAllLabel() (and after
v6.3.0-rc1~108 even outside transactions). While this grants QEMU
the access, it has a flaw, because once the domain is restored, up
and running then qemuSecurityDomainRestorePathLabel() is called,
which is not real counterpart. In case of DAC driver the
SetAllLabel() does nothing with the restore path but
RestorePathLabel() does - it chown()-s the file back and since there
is no original label remembered, the file is chown()-ed to
root:root. While the apparent solution is to have DAC driver set the
label (and thus remember the original one) in SetAllLabel(), we can
do better.
Turns out, we are opening the file ourselves (because it may live on
a root squashed NFS) and then are just passing the FD to QEMU. But
this means, that we don't have to chown() the file at all, we need
to set SELinux labels and/or add the path to AppArmor profile.
And since we want to restore labels right after QEMU is done loading
the migration stream (we don't want to wait until
qemuSecurityRestoreAllLabel()), the best way to approach this is to
have separate APIs for labelling and restoring label on the restore
file.
I will investigate whether AppArmor can use the SavedStateLabel()
API instead of passing the restore path to SetAllLabel().
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1851016
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
The output of vcpupin and emulatorpin for a domain with vcpu
placement='static' is based on a default bitmap that contains
all possible CPUs in the host, regardless of the CPUs being offline
or not. E.g. for a Linux host with this CPU setup (from lscpu):
On-line CPU(s) list: 0,8,16,24,32,40,(...),184
Off-line CPU(s) list: 1-7,9-15,17-23,25-31,(...),185-191
And a domain with this configuration:
<vcpu placement='static'>1</vcpu>
'virsh vcpupin' will return the following:
$ sudo ./run tools/virsh vcpupin vcpupin_test
VCPU CPU Affinity
----------------------
0 0-191
This is benign by its own, but can make the user believe that all
CPUs from the 0-191 range are eligible for pinning. Which can lead
to situations like this:
$ sudo ./run tools/virsh vcpupin vcpupin_test 0 1
error: Invalid value '1' for 'cpuset.cpus': Invalid argument
This is exarcebated by the fact that 'virsh vcpuinfo' considers only
available host CPUs in the 'CPU Affinity' field:
$ sudo ./run tools/virsh vcpuinfo vcpupin_test
(...)
CPU Affinity: y-------y-------y-------(...)
This patch changes the default bitmap of vcpupin and emulatorpin, in
the case of domains with static vcpu placement, to all available CPUs
instead of all possible CPUs. Aside from making it consistent with
the behavior of 'vcpuinfo', users will now have one less incentive to
try to pin a vcpu in an offline CPU.
https://bugzilla.redhat.com/show_bug.cgi?id=1434276
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Firstly, SEV is present only on AMD, so we can safely assume x86.
Secondly, the problem with looking up capabilities in the cache by arch
is that it's using virHashSearch with a callback to find the right
capabilities and get the binary name from it as well, but since the
cache is empty, it will return NULL and we won't get the corresponding
binary name out of the lookup either. Then, during the cache validation
we try to create a new cache entry for the emulator, but since we don't
have the binary name, nothing gets created.
Therefore, virQEMUCapsCacheLookupDefault is used to fix this issue,
because it doesn't rely on the capabilities cache to construct the
emulator binary name.
https://bugzilla.redhat.com/show_bug.cgi?id=1852311
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reuse qemuBlockGetBitmapMergeActions which allows the removal of the
ad-hoc implementation of bitmap merging for block copy.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
New semantics of the bitmap handling don't need this. Remove the field
and all uses of it including the status XML.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reuse qemuBlockGetBitmapMergeActions which allows removing the ad-hoc
implementation of bitmap merging for block commit. The new approach is
way simpler and more robust and also allows us to get rid of the
disabling of bitmaps done prior to the start as we actually do want to
update the bitmaps in the base.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
In a few cases we might set seclabels on a path outside of
namespaces. For instance, when restoring a domain from a file,
the file is opened, relabelled and only then the namespace is
created and the FD is passed to QEMU (see v6.3.0-rc1~108 for more
info). Therefore, when restoring the label on the restore file,
we must ignore domain namespaces and restore the label directly
in the host.
This bug demonstrates itself when restoring a domain from a block
device. We don't create the block device inside the domain
namespace and thus the following error is reported at the end of
(otherwise successful) restore:
error : virProcessRunInFork:1236 : internal error: child reported (status=125): unable to stat: /dev/sda: No such file or directory
error : virProcessRunInFork:1240 : unable to stat: /dev/sda: No such file or directory
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
The function calls virSecurityManagerDomainRestorePathLabel()
after all.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
There are two places within qemu driver that misuse
qemuSecuritySetSavedStateLabel() to set seclabels on tempfiles
that are not state files: qemuDomainScreenshot() and
qemuDomainMemoryPeek(). They are doing so because of lack of
qemuSecurityDomainSetPathLabel() at the time of their
introduction.
In all three secdrivers (well, four if you count NOP driver) the
implementation of .domainSetSavedStateLabel and
.domainSetPathLabel callbacks is the same anyway.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Libvirt allows the user to define an incomplete NUMA topology, where
the sum of all CPUs in each cell is less than the total of VCPUs.
What ends up happening is that QEMU allocates the non-enumerated CPUs
in the first NUMA node. This behavior is being flagged as 'to be
deprecated' at least since QEMU commit ec78f8114bc4 ("numa: use
possible_cpus for not mapped CPUs check").
In [1], Maxiwell suggested that we forbid the user to define such
topologies. In his review [2], Peter Krempa pointed out that we can't
break existing guests, and suggested that Libvirt should emulate the
QEMU behavior of putting the remaining vCPUs in the first NUMA node
in these cases.
This patch implements Peter Krempa's suggestion. Since we're going
to most likely end up with disjointed NUMA configuration in node 0
after the auto-fill, we're making auto-fill dependent on QEMU_CAPS_NUMA.
A following patch will update the documentation not just to inform
about the auto-fill mechanic with incomplete NUMA topologies, but also
to discourage the user to create such topologies in the future. This
approach also makes Libvirt independent of whether QEMU changes
its current behavior since we're either auto-filling the CPUs in
node 0 or the user (hopefully) is aware that incomplete topologies,
although supported in Libvirt, are to be avoided.
[1] https://www.redhat.com/archives/libvir-list/2019-June/msg00224.html
[2] https://www.redhat.com/archives/libvir-list/2019-June/msg00263.html
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The same functionality can be achieved using migrate-set-parameters QMP
command with xbzrle-cache-size parameter.
https://bugzilla.redhat.com/show_bug.cgi?id=1845012
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The same functionality can be achieved using query-migrate-parameters
QMP command and checking the xbzrle-cache-size parameter.
https://bugzilla.redhat.com/show_bug.cgi?id=1829544
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The same functionality can be achieved using migrate-set-parameters QMP
command with downtime-limit parameter.
https://bugzilla.redhat.com/show_bug.cgi?id=1829543
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The same functionality can be achieved using migrate-set-parameters QMP
command with max-bandwidth parameter.
https://bugzilla.redhat.com/show_bug.cgi?id=1829545
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This function handles the change of NUMA nodeset for a given
guest, setting CpusetMems for the emulator, vcpus and IOThread
sub-groups. It doesn't set the same nodeset to the root cgroup
though. This means that cpuset.mems of the root cgroup ends up
holding the new nodeset and the old nodeset as well. For
a guest with placement=strict, nodeset='0', doing
virsh numatune <vm> 0 8 --live
Will make cpuset.mems of emulator, vcpus and iothread to be
"8", but cpuset.mems of the root cgroup will be "0,8".
This means that any new tasks that ends up landing in the
root cgroup, aside from the emulator/vcpus/iothread sub-groups,
will be split between the old nodeset and the new nodeset,
which is not what we want.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Commit b50a8354f6d added call to qemuDomainDiskBlockJobIsSupported prior
to filling the 'disk' variable resulting in a crash when attempting a
block commit.
https://gitlab.com/libvirt/libvirt/-/issues/31
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
qemuxml2argv test suite is way more comprehensive than the hotplug
suite. Since we share the code paths for monitor and command line
hotplug we can easily test the properties of devices against the QAPI
schema.
To achieve this we'll need to skip the JSON->commandline conversion for
the test run so that we can analyze the pure properties. This patch adds
flags for the comand line generator and hook them into the
JSON->commandline convertor for -netdev. An upcoming patch will make use
of this new infrastructure.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
'blockdev-mirror' requires the write permission internally to do the
copy. This means that we have to force the image to be read-write for
the duration of the copy and can fix it after the copy is done.
https://bugzilla.redhat.com/show_bug.cgi?id=1832204
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
With -blockdev or when reusing externally created images and thus
without the need for formatting the image we actually can support
snapshots of read-only disks. Arguably it's not very useful so they are
not done by default but users of libvirt such as oVirt are actually
using this.
https://bugzilla.redhat.com/show_bug.cgi?id=1832204
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
SD cards need to be instantiated via -drive if=sd. This means that all
cases where we use the blockdev path need to be special-cased for SD
cards.
Note that at this point QEMU_CAPS_BLOCKDEV is still cleared if the VM
config has a SD card.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Use the drive alias for all cases when we can't generate qomName. This
is meant to handle disks on 'sd' bus which are instantiated via -drive
if=sd as there isn't any specific QOM name for them.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
We still have to use -drive to instantiate sd disks. Combining that with
the new logic for blockjobs would be very complicated and not worth it
given that 'sd' cards work only on few rarely used machine types of
non-common architectures and libvirt didn't implement support for 'sd'
bus controllers. This will allow us to use -blockdev for other kinds on
such machines while sacrificing block jobs.
Note: this is currently no-op as we mask-out the QEMU_CAPS_BLOCKDEV
capability if any of the disks has bus='sd'.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Make sure that we don't try to reload node names with -blockdev. If
something doesn't have a node name the update will not make the
situation better.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Remove the function and passing of 'def' through the callers.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Instead of the following pattern:
type ret;
...
ret = func();
return ret;
we can use:
return func()
directly.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
In the past we added 1024 bytes of padding to saved state images so that
users can run "virsh managedsave-edit $GUEST" and make XML changes which
increase the size of the XML document. This padding was accidentally
lost a while back
commit 6b9b21db7079888a05d192b079e68290bdf14a76
Author: Peter Krempa <pkrempa@redhat.com>
Date: Wed Feb 17 13:10:11 2016 +0100
qemu: Remove unnecessary calculations in qemuDomainSaveMemory
The original 1024 bytes was unreasonably stingy when we consider that
the QEMU state is typically going to be many 100's of MB in size. Thus
this adds 64 KB of padding after the XML which should cope with any
plausible modifications a user will want to make.
https://bugzilla.redhat.com/show_bug.cgi?id=1229255
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
qemuDomainSupportsCheckpointsBlockjobs checks if the
QEMU_CAPS_INCREMENTAL_BACKUP capability is supported to do the
interlocking. Capabilities are not present when the VM isn't running
though which would create false errors.
Move the checks after the liveness check in block job implementations.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Pavel Mores <pmores@redhat.com>
In order to add a string to qemuDomainJobInfo we must ensure that it's
freed and copied properly. Add helpers to copy and free the structure
and adjust the code to use them properly for the new semantics.
Additionally also allocation is changed to g_new0 as it includes the
type and thus it's very easy to grep for all the allocations of a given
type.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Return error codes directly and fix weird reporting of errors via
temporary variable.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Use VIR_AUTOCLOSE to declare it and remove all internal closing of the
filedescriptor. This will allow getting rid of 'error' completely.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
In an attempt to simplify qemuDomainSaveImageOpen we need to add
automatic pointer clearing for virQEMUSaveData.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Commit 21ad56e932 introduced a regression where a VM with a corrupted
save image file would fail to start on the first attempt. This was
caused by returning a wrong return code as 'fd' was abused to also hold
the return code.
Since it's easy to miss this nuance, introduce a 'ret' variable for the
return code and return it' value in the error section.
https://bugzilla.redhat.com/show_bug.cgi?id=1791522
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Pavel Mores <pmores@redhat.com>
When preparing to do a blockcopy, the mirror image is modified so
that QEMU can access it. For instance, the mirror has seclabels
set, if it is a NVMe disk it is detached from the host and so on.
And usually, the restore is done upon successful finish of the
blockcopy operation. But, if something fails then we need to
explicitly revoke the access to the mirror image (and thus
reattach NVMe disk back to the host).
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1822538
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Pavel Mores <pmores@redhat.com>
Replace vm->def->disks[i] with dom_disk variable which is
initialized to the same disk.
Signed-off-by: Yi Li <yili@winhong.com>
Reviewed-by: Pavel Mores <pmores@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
So far, libvirt generates the following path for automatic dumps:
$autoDumpPath/$id-$shortName-$timestamp
where $autoDumpPath is where libvirt stores dumps of guests (e.g.
/var/lib/libvirt/qemu/dump), $id is domain ID and $shortName is
shortened version of domain name. So for instance, the generated
path may look something like this:
/var/lib/libvirt/qemu/dump/1-QEMUGuest-2020-03-25-10:40:50
While in case of embed driver the following path would be
generated by default:
$root/lib/libvirt/qemu/dump/1-QEMUGuest-2020-03-25-10:40:50
which is not clashing with other embed drivers, we allow users to
override the default and have all embed drivers use the same
prefix. This can create clashing paths. Fortunately, we can reuse
the approach for machined name generation
(v6.1.0-178-gc9bd08ee35) and include part of hash of the root in
the generated path.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
So far, libvirt generates the following path for hugepages:
$mnt/libvirt/qemu/$id-$shortName
where $mnt is the mount point of hugetlbfs corresponding to
hugepages of desired size (e.g. /dev/hugepages), $id is domain ID
and $shortName is shortened version of domain name. So for
instance, the generated path may look something like this:
/dev/hugepages/libvirt/qemu/1-QEMUGuest
But this won't work with embed driver really, because if there
are two instances of embed driver, and they both want to start a
domain with the same name and with hugepages, both drivers will
generate the same path which is not desired. Fortunately, we can
reuse the approach for machined name generation
(v6.1.0-178-gc9bd08ee35) and include part of hash of the root in
the generated path.
Note, the important change is in qemuGetBaseHugepagePath(). The
rest is needed to pass driver around.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Initially introduced in v3.10.0-rc1~172.
When generating a path for memory-backend-file or -mem-path, qemu
driver will use the following pattern:
$memoryBackingDir/libvirt/qemu/$id-$shortName
where $memoryBackingDir defaults to /var/lib/libvirt/qemu/ram but
can be overridden in qemu.conf. Anyway, the "/libvirt/qemu/" part
looks redundant, because it's already contained in the default,
or creates unnecessary nesting if overridden in qemu.conf.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Introduced in v1.2.17-rc1~121, the assumption was that the
driver->privileged is immutable at the time but it might change
in the future. Well, it did not ever since. It is still immutable
variable. Drop the needless accessor then.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
virStorageFileSupportsSecurityDriver ends up initializing the storage
file backend which after the recent changes to the daemon architecture
may end up dlopening of the backend modules.
Since this is required only for remote storage we can optimize the call
by moving the check whether the backend is supported to the branch which
deals with remote storage.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Mores <pmores@redhat.com>