Similar to the qemu.conf knob 'deprecation_behavior' add a per-VM knob
in the QEMU namespace:
<qemu:deprecation behavior='...'/>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Just like VFIO devices, vDPA devices may need to have all guest memory
pages locked/pinned in order to operate properly. In the case of VFIO
devices (including mdev and NVME, which also use VFIO) libvirt
automatically increases the locked memory limit when one of those
devices is present. This patch modifies that code to also increase the
limit if there are any vDPA devices present.
Resolves: https://bugzilla.redhat.com/1939776
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
This function is a specialized version of
qemuDomainGetMemLockLimitBytes() for PPC64. Simplifying it in the same
manner as the previous patch has the nice side effect of accounting
for the possibility of an mdev device
(I don't know if mdev devices are supported on PPC, but even if not
then a) the additional check for mdev devices gained by using
qemuDomainNeedsVFIO() in place of open coding will be an effective
NOP, and b) if mdev devices are supported on PPC64 in the future, this
function will be prepared for it).
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
This function goes through a loop checking if each hostdev is a VFIO
or mdev device, and then later it calls virDomainDefHasNVMEDisk(). The
function qemuDomainNeedsVFIO() does exactly the same thing, so let's
just call that instead.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
virQEMUCapsGet checks for qemuCaps itself, no need to do it explicitly.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Attempting to set the memlock limit might fail if we're running
in a containerized environment where CAP_SYS_RESOURCE is not
available, and if the limit is already high enough there's no
point in trying to raise it anyway.
https://bugzilla.redhat.com/show_bug.cgi?id=1916346
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Store the current memory locking limit and the desired one
separately, which will help with later changes.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Now that we've implemented a fallback for the function that
obtains the information from /proc, there is no reason we would
get a failure unless there's something seriously wrong with the
environment we're running in, in which case we're better off
reporting the issue to the user rather than pretending
everything is fine.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
qemuBackupBegin can take a full backup of the disks (excluding any
operations with bitmaps) without the need to wait for the
blockdev-reopen support in qemu.
Add a check that no checkpoint creation is required and the disk backup
mode isn't VIR_DOMAIN_BACKUP_DISK_BACKUP_MODE_INCREMENTAL.
Call to virDomainBackupAlignDisks is moved earlier as it initializes the
disk backup mode if not present in user config.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
In short, virXXXPtr type is going away. With big bang. And to
help us rewrite the code with a sed script, it's better if each
variable is declared on its own line.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Currently the QEMU driver secretly sets the QEMU_AUDIO_DRV env variable
- VNC - set to "none", unless passthrough of host env variable is set
- SPICE - always set to "spice"
- SDL - always passthrough host env
- No graphics - set to "none", unless passthrough of host env variable is set
The setting of the QEMU_AUDIO_DRV env variable is done in the code which
configures graphics.
If no <audio> element is present, we now auto-populate <audio> elements
to reflect this historical default config. This avoids need to set audio
env when processing graphics.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
That's more consistent with our usual naming convention.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The current code is written under the assumption that, for all
limits except the core size, asking for the limit to be set to
zero is a no-op, and so the operation is performed
unconditionally.
While this is the behavior we want for the QEMU driver, the
virCommand and virProcess facilities are generic, and should not
implement this kind of policy: asking for a limit to be set to
zero should result in that limit being set to zero every single
time.
Add some checks in the QEMU driver, effectively moving the
policy where it belongs.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Generated using the following spatch:
@@
expression path;
@@
- virFileMakePath(path)
+ g_mkdir_with_parents(path, 0777)
However, 14 occurrences were not replaced, e.g. in
virHostdevManagerNew(). I don't really understand why.
Fixed by hand afterwards.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Add status XML infrastructure for storing a list of block dirty bitmaps
which are temporarily used when migrating a VM with
VIR_MIGRATE_NON_SHARED_DISK for cleanup after a libvirtd restart during
migration.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
These messages are only valid while the domain is running.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
To make it easier to split out the parsing/formatting of the <teaming>
element into separate functions (so we can more easily add the
<teaming> element to <hostdev>, change its virDomainNetDef so that it
points to a virDomainNetTeamingInfo rather than containing one.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The conversion removes the use of virStringListAdd/virStringListRemove
which try to add dynamic properties to a string list which is really
inefficient.
Storing the dbus VMState ids in a GSList is pretty straightforward and
the slightly increased complexity of the code will be paid back by
removing the string list helpers later.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The "max" model can be treated the same way as "host" model in general.
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
For hardware virtualization this is functionally identical to the
existing host-passthrough mode so the same caveats apply.
For emulated guest this exposes the maximum featureset supported by
the emulator. Note that despite being emulated this is not guaranteed
to be migration safe, especially if different emulator software versions
are used on each host.
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Implements QEMU support for vhost-user-blk together with live
hotplug/unplug.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Phase out use of VIR_DISPOSE_N from the qemu driver. Use memset in the
appropriate cases.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
When virRandomBytes fails we don't get any random bytes and even if we
did they don't have to be treated as secret as they weren't used in any
way.
Add a temporary variable with automatic freeing for the secret buffer
and assign it only on success.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Up until now we had a runtime code and XML related code in the same
source file inside util directory.
This patch takes the runtime part and extracts it into the new
storage_file directory.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The virtio-pmem is a virtio variant of NVDIMM and just like
NVDIMM virtio-pmem also allows accessing host pages bypassing
guest page cache. The difference is that if a regular file is
used to back guest's NVDIMM (model='nvdimm') the persistence of
guest writes might not be guaranteed while with virtio-pmem it
is.
To express this new model at domain XML level, I've chosen the
following:
<memory model='virtio-pmem' access='shared'>
<source>
<path>/tmp/virtio_pmem</path>
</source>
<target>
<size unit='KiB'>524288</size>
</target>
<address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
</memory>
Another difference between NVDIMM and virtio-pmem is that while
the former supports NUMA node locality the latter doesn't. And
also, the latter goes onto PCI bus and not into a DIMM module.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
qemu's qcow2 driver allows control of the metadata cache of qcow2 driver
by the 'cache-size' property. Wire it up to the recently introduced
elements.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Implement support for the 'nfs' native protocol driver in the qemu
driver.
QEMU accepts numeric UID/GID for 'nfs' protocol file driver thus libvirt
needs to perform the lookup prior to passing it to qemu.
Signed-off-by: Ryan Gahagan <rgahagan@cs.utexas.edu>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The next objective is to move virDomainDeviceDefValidate() to
domain_validate.c. First let's move all the static helpers.
The net device validation functions are used across multiple
drivers, so let's move them separately first.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Commit 926563dc3a6 which refactored the function call deleting the
snapshot's on disk state introduced a logic bug, which skips over the
deletion of libvirt metadata after the disk state deletion is done.
To fix it we must not return early.
Resolves: https://gitlab.com/libvirt/libvirt/-/issues/109
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
We already calculated the guest area, which is what is subject
to minimum size requirements, a few lines earlier.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Nodename may be asociated to a disk backup job, add support to looking
up in that chain too. This is specifically useful for the
BLOCK_WRITE_THRESHOLD event which can be registered for any nodename.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Nodename may be asociated to a disk backup job, add support to looking
up in that chain too. This is specifically useful for the
BLOCK_WRITE_THRESHOLD event which can be registered for any nodename.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
'virStorageFileChainLookup' reports an error when the lookup of the
backing chain entry is unsuccessful. Since we possibly use it multiple
times when looking up backing for 'disk->mirror' the function can report
error which won't be actually reported.
Replace the call to virStorageFileChainLookup by lookup in the chain by
index.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Use dummy variable to fill 'src' so that access to it doesn't need to be
conditionalized and use temporary variable for 'disk' rather than
dereferencing the array multiple times.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
In v6.0.0-rc1~439 (and friends) we tried to cache NUMA
capabilities because we assumed they are immutable. And to some
extent they are (NUMA hotplug is not a thing, is it). However,
our capabilities contain also some runtime info that can change,
e.g. hugepages pool allocation sizes or total amount of memory
per node (host side memory hotplug might change the value).
Because of the caching we might not be reporting the correct
runtime info in 'virsh capabilities'.
The NUMA caps are used in three places:
1) 'virsh capabilities'
2) domain startup, when parsing numad reply
3) parsing domain private data XML
In cases 2) and 3) we need NUMA caps to construct list of
physical CPUs that belong to NUMA nodes from numad reply. And
while this may seem static, it's not really because of possible
CPU hotplug on physical host.
There are two possible approaches:
1) build a validation mechanism that would invalidate the
cached NUMA caps, or
2) drop the caching and construct NUMA caps from scratch on
each use.
In this commit, the latter approach is implemented, because it's
easier.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1819058
Fixes: 1a1d848694f6c2f1d98a371124928375bc3bb4a3
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Since the function is now only used in qemu_domain.c, move it from
domain_conf.c and rename it.
This reverts the work done in commit ace5931553c87beebb6b3cd994061742b3f88238
(conf, qemu: move qemuDomainNVDimmAlignSizePseries to domain_conf.c).
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
qemuDomainAlignMemorySizes() has an operation order problem. We are
calculating 'initialmem' without aligning the memory modules first.
Since we're aligning the dimms afterwards this can create inconsistencies
in the end result. x86 has alignment of 1-2MiB and it's not severely
impacted by it, but pSeries works with 256MiB alignment and the difference
is noticeable.
This is the case of the existing 'memory-hotplug-ppc64-nonuma' test.
The test consists of a 2GiB (aligned value) guest with 2 ~520MiB dimms,
both unaligned. 'initialmem' is calculated by taking total_mem and
subtracting the dimms size (via virDomainDefGetMemoryInitial()), which
wil give us 2GiB - 520MiB - 520MiB, ending up with a little more than
an 1GiB of 'initialmem'. Note that this value is now unaligned, and
will be aligned up via VIR_ROUND_UP(), and we'll end up with 'initialmem'
of 1GiB + 256MiB. Given that the dimms are aligned later on, the end
result for QEMU is that the guest will have a 'mem' size of 1310720k,
plus the two 512 MiB dimms, exceeding in 256MiB the desired 2GiB
memory and currentMemory specified in the XML.
Existing guests can't be fixed without breaking ABI, but we have
code already in place to align pSeries NVDIMM modules for new guests.
Let's extend it to align all pSeries mem modules.
A new test, 'memory-hotplug-ppc64-nonuma-abi-update', a copy of the
existing 'memory-hotplug-ppc64-nonuma', was added to demonstrate the
result for new pSeries guests. For the same unaligned XML mentioned
above, after applying this patch:
- starting QEMU mem size without PARSE_ABI_UPDATE:
-m size=1310720k,slots=16,maxmem=4194304k \ (no changes)
- starting QEMU mem size with PARSE_ABI_UPDATE:
-m size=1048576k,slots=16,maxmem=4194304k \ (size fixed)
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
A previous patch removed the pSeries NVDIMM align that wasn't
being done properly. This patch reintroduces it in the right
fashion, making it reliant on VIR_DOMAIN_DEF_PARSE_ABI_UPDATE.
This makes it complying with the intended design defined by
commit c7d7ba85a624.
Since the PARSE_ABI_UPDATE is more restrictive than checking for
!migrate && !snapshot, like is being currently done with
qemuDomainAlignMemorySizes(), this means that we'll align the
pSeries NVDIMMs in two places - in post parse time for new
guests, and in qemuDomainAlignMemorySizes() for all guests
that aren't migrating or in a snapshot.
Another difference is that the logic is now in the QEMU driver
instead of domain_conf.c. This was necessary because all
considerations made about the PARSE_ABI_UPDATE flag were done
under QEMU. Given that no other driver supports ppc64 there is no
impact in this change.
A new test was added to exercise what we're doing. It consists
of a a copy of the existing 'memory-hotplug-nvdimm-ppc64' xml2xml
test, called with the PARSE_ABI_UPDATE flag. As intended, we're
not changing QEMU command line or any XML without the flag,
while the pseries NVDIMM memory is being aligned when the
flag is used.
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>