QEMU version 3.1 introduced PV_SEND_IPI CPUID feature bit under
commit 7f710c32bb8 (target-i386: adds PV_SEND_IPI CPUID feature bit).
This patch adds a new KVM feature 'pv-ipi' to disable this feature
(enabled by default). Newer CPU platform (Ex, AMD Zen2) supports
hardware accelation for IPI in guest, to use this feature to get
better performance in some scenarios. Detailed about the discussion:
https://lkml.org/lkml/2021/10/20/423
To disable kvm-pv-ipi and have libvirt add "-cpu host,kvm-pv-ipi=off"
to the QEMU command line, the following XML code needs to be added to the
guest's domain description:
<features>
<kvm>
<pv-ipi state='off'/>
</kvm>
</features>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This reverts commit 7300ccc9b3eddb38306868534e7fc2d505a0a13c.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Ani Sinha <ani@anisinha.ca>
This commit extends libvirt XML configuration to support luks2 encryption format.
This means that <encryption format="luks2" engine="librbd"> becomes valid.
Currently librbd is the only engine that supports this new format.
Signed-off-by: Or Ozeri <oro@il.ibm.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
rbd encryption is new in qemu 6.1.0.
This commit adds a new encryption engine property which
allows the user to use this new encryption engine.
Signed-off-by: Or Ozeri <oro@il.ibm.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This commit extends libvirt XML configuration to support a custom encryption engine.
This means that <encryption format="luks" engine="qemu"> becomes valid.
The only engine for now is qemu. However, a new engine (librbd) will be added in an upcoming commit.
If no engine is specified, qemu will be used (assuming qemu driver is used).
Signed-off-by: Or Ozeri <oro@il.ibm.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
In a few places we declare a variable (which is optionally
followed by a code not touching it) then set the variable to a
value and return the variable immediately. It's obvious that the
variable is needless and the value can be returned directly
instead.
This patch was generated using this semantic patch:
@@
type T;
identifier ret;
expression E;
@@
- T ret;
... when != ret
when strict
- ret = E;
- return ret;
+ return E;
After that I fixed couple of formatting issues because coccinelle
formatted some lines differently than our coding style.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
* XML serialization and deserialization of PCI VPD;
* PCI VPD capability flags added and used in relevant places;
* XML to XML tests for the added capability.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Dmitrii Shcherbakov <dmitrii.shcherbakov@canonical.com>
This will make it possible to limit changes to a single spot
later on, and is also just an overall nicer way to create and
destroy objects.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
There doesn't seem to be a reason for IOMMUs not to be handled
by this function.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
When parsing of the node device XML fails we'd still call the post-parse
and validation callbacks which makes no sense. Additionally the
callbacks were expecting a non-NULL pointer which leads to a crash.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2014139
Fixes: d5ae634ba28
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Use 'virXMLPropEnum' to parse it and fix all switch statements which
didn't include the VIR_DOMAIN_SMARTCARD_TYPE_LAST case.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Introduce infrastructure to format 'drive' addresses via the standard
helper rather than hand-rolled generators used inline.
The code needs to know the disk bus to format the correct address which
is passed in via an internal field in virDomainDeviceDriveAddress.
The field types according to QEMU are as following:
'ide-hd' for VIR_DOMAIN_DISK_BUS_IDE and VIR_DOMAIN_DISK_BUS_SATA
unit=<uint32> - (default: 4294967295)
'floppy' for VIR_DOMAIN_DISK_BUS_FDC
unit=<uint32> - (default: 4294967295)
'scsi-hd' for VIR_DOMAIN_DISK_BUS_SCSI
channel=<uint32> - (default: 0)
scsi-id=<uint32> - (default: 4294967295)
lun=<uint32> - (default: 4294967295)
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
'effectiveBootIndex' is a copy of 'bootIndex' if '<boot order=' was
present and left unassigned if not. This allows hypervisor drivers to
reinterpret <os><boot> without being visible in the XML.
QEMU driver had a internal implementation for disks, which is now
replaced. Additionally this will simplify a refactor of network boot
assignment.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This change introduces a new libvirt sub-element <pci> under
<features> that can be used to configure all pci related features.
Currently the only sub-sub element supported by this sub-element is
'acpi-bridge-hotplug' as shown below:
<features>
<pci>
<acpi-bridge-hotplug state='on|off'/>
</pci>
</features>
The above option is only available for the QEMU driver, for x86 guests
only. It is a global option, affecting all PCI bridge controllers on
the guest.
The 'acpi-bridge-hotplug' option enables or disables ACPI hotplug
support for cold-plugged pci bridges. Examples of bridges include the
PCI-PCI bridge (pci-bridge controller) for pc (i440fx) machinetypes,
or PCIe-PCI bridges and pcie-root-port controllers for q35
machinetypes.
For pc machinetypes in x86, this option has been available in QEMU
since version 2.1. Please see the following changes in qemu repo:
9e047b982452c6 ("piix4: add acpi pci hotplug support")
133a2da488062e ("pc: acpi: generate AML only for PCI0 devices if PCI
bridge hotplug is disabled")
For q35 machinetypes, this was introduced in QEMU 6.1 with the
following changes in qemu repo:
(a) c0e427d6eb5fef ("hw/acpi/ich9: Enable ACPI PCI hot-plug")
(b) 17858a16950860 ("hw/acpi/ich9: Set ACPI PCI hot-plug as default on
Q35")
The reasons for enabling ACPI based hotplug for PCIe (q35) based
machines (as opposed to native hotplug) are outlined in (b). There are
use cases where users would still want to use native
hotplug. Therefore, this config option enables users to choose either
ACPI based hotplug or native hotplug for bridges (for example for pcie
root port controller in q35 machines).
Qemu capability validation checks have also been added along with
related unit tests to exercise the new conf option.
Signed-off-by: Ani Sinha <ani@anisinha.ca>
Reviewed-by: Laine Stump <laine@redhat.com>
As advertised in previous commit, this event is delivered to us
when virtio-mem module changes the allocation inside the guest.
It comes with one attribute - size - which holds the new size of
the virtio-mem (well, allocated size), in bytes.
Mind you, this is not necessarily the same number as 'requested
size'. It almost certainly will be when sizing the memory up, but
it might not be when sizing the memory down - the guest kernel
might be unable to free some blocks.
This current size is reported in the domain XML as an output
element only.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This function will be needed in the next commit where we will
want to find virtio-mem given its alias by QEMU on the monitor.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The virtio-mem has another property that isn't exposed yet:
current size exposed to the guest. Please note, that this is
different to <requested/> because esp. on sizing the memory
down guest may refuse to release some blocks. Therefore, let's
have another size to report in the XML. But because of its
nature, the <current/> won't be parsed and is report only (for
live XMLs).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
As advertised in one of previous commits, we want to be able to
change 'requested-size' attribute of virtio-mem on the fly. This
commit does exactly that. Changing anything else is checked for
and forbidden.
Once guest has changed the allocation, QEMU emits an event which
we will use to track the allocation. In the next commit.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The virtio-mem is paravirtualized mechanism of adding/removing
memory to/from a VM. A virtio-mem-pci device is split into blocks
of equal size which are then exposed (all or only a requested
portion of them) to the guest kernel to use as regular memory.
Therefore, the device has two important attributes:
1) block-size, which defines the size of a block
2) requested-size, which defines how much memory (in bytes)
is the device requested to expose to the guest.
The 'block-size' is configured on command line and immutable
throughout device's lifetime. The 'requested-size' can be set on
the command line too, but also is adjustable via monitor. In
fact, that is how management software places its requests to
change the memory allocation. If it wants to give more memory to
the guest it changes 'requested-size' to a bigger value, and if it
wants to shrink guest memory it changes the 'requested-size' to a
smaller value. Note, value of zero means that guest should
release all memory offered by the device. Of course, guest has to
cooperate. Therefore, there is a third attribute 'size' which is
read only and reflects how much memory the guest still has. This
can be different to 'requested-size', obviously. Because of name
clash, I've named it 'current' and it is dealt with in future
commits (it is a runtime information anyway).
In the backend, memory for virtio-mem is backed by usual objects:
memory-backend-{ram,file,memfd} and their size puts the cap on
the amount of memory that a virtio-mem device can offer to a
guest. But we are already able to express this info using <size/>
under <target/>.
Therefore, we need only two more elements to cover 'block-size'
and 'requested-size' attributes. This is the XML I've came up
with:
<memory model='virtio-mem'>
<source>
<nodemask>1-3</nodemask>
<pagesize unit='KiB'>2048</pagesize>
</source>
<target>
<size unit='KiB'>2097152</size>
<node>0</node>
<block unit='KiB'>2048</block>
<requested unit='KiB'>1048576</requested>
</target>
<address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
</memory>
I hope by now it is obvious that:
1) 'requested-size' must be an integer multiple of
'block-size', and
2) virtio-mem-pci device goes onto PCI bus and thus needs PCI
address.
Then there is a limitation that the minimal 'block-size' is
transparent huge page size (I'll leave this without explanation).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
When parsing CPU topology, which is described in <topology/>
attributes we can use virXMLPropUInt() instead of virXPathUInt()
as the former results in shorter code.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
There is no need to use virXPathULong() and a temporary UL
variable if we can use virXPathUInt() directly.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The option "queue-size" for virtio-blk was added in qemu-2.12.0, and
default value increased from qemu-5.0.0.
However, increasing this value may lead to drop of random access
performance.
Signed-off-by: Hiroki Narukawa <hnarukaw@yahoo-corp.jp>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
virtio-blk num-queue is visible to guest OS, so this must be kept while
live migration.
Signed-off-by: Hiroki Narukawa <hnarukaw@yahoo-corp.jp>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The ACPI index of a device in a running guest can't be modified, and
libvirt doesn't actually attempt to modify it, but it was possible for
a user to request such a modification, and libvirt wouldn't complain,
thus misleading the user into thinking that it had actually been changed.
Resolves: https://bugzilla.redhat.com/1998920
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The next patch will add another check similar to the existing check
for a change in alias name. This patch reformats the code in
preparation so that the next patch's purpose will be clear.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The array of virtual functions @vfs in
virNodeDeviceGetPCISRIOVCaps() is allocated twice: the first time
during its declaration and the second time inside
virPCIGetVirtualFunctions() which leads to a memleak:
==16691== 1,128 bytes in 47 blocks are definitely lost in loss record 1,771 of 1,803
==16691== at 0x4844CC1: calloc (vg_replace_malloc.c:1117)
==16691== by 0x4E50070: g_malloc0 (in /usr/lib64/libglib-2.0.so.0.6800.3)
==16691== by 0x4A7B034: virNodeDeviceGetPCISRIOVCaps (node_device_conf.c:2649)
==16691== by 0x4A7B5E2: virNodeDeviceGetPCIDynamicCaps (node_device_conf.c:2762)
==16691== by 0xA7F6E18: udevProcessPCI (node_device_udev.c:418)
Fixes: c97518d9b833a607f29b9bb02e3fbe74c011c088
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
We don't need to propagate all public flags, only the information
about the presence of the validation one, which can differ from
function to function. This patch makes it easier and more
readable in case of a future additions of validation flags.
This change was suggested by Daniel.
Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The disk target is mandatory and used as a designator in error messages
of other validation steps, so we must validate it first.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The code rejecting a XML when the disk target is missing was moved to
the validation code which goes after post parse. One of the cases in the
disk post parse code didn't check whether 'disk->dst' is set which at
that point isn't guaranteed.
Fixes: 61fd7174c2afbe128ef1896198919429bcaca3d7
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2001627
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The validation infrastructure doesn't modify the definition and
additionally it makes sense to run the global code first as it's
validating certain corner cases.
The changed error messages from qemuxml2argvtest show that this is
indeed the proper ordering as all changed messages are actually better
describing the error.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
LUN disks are supported only by VMX and QEMU drivers and the VMX
implementation is a subset of qemu's implementation, thus we can move
the qemu-specific validator to the global validation code providing that
we allow the format to be 'none' (qemu driver always sets 'raw' if it's
not set) and allow disk type 'volume' as a source (qemu always
translates the source, and VMX doesn't implement 'volume' at all).
Moving the code to the global validation allows us to stop calling it
from the qemu specific validation and also deduplicates the checks.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
We need to validate the XML against schema if option '--validate'
was passed to the virsh command. This patch also includes
propagation of flags into the virNWFilterBindingDefParse().
Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
We need to validate the XML against schema if option '--validate'
was passed to the virsh command. This patch also includes
propagation of flags into the virNetworkPortDefParse().
Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
We need to validate the XML against schema if option '--validate'
was passed to the virsh command. This patch also includes
propagation of flags into the virStoragePoolDefParse() function.
Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This patch also includes propagation of flags into the
virNetworkDefParse().
Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
New clang has a false-positive about value of 'olddisks' being unused
after being set. This is clearly wrong because we want to use
'g_autofree' to clear it later.
While I'm against modifying good code for the sake of bad static
analysis in this case it's not obvious that we depend on the lifetime of
'olddisks' being needed until the end of the function as we store
pointers into it into the hash table and later copy them out.
Rewrite the code by assigning to 'olddisks' earlier and then using
'olddisks' in the loop, so it's clear where the lifetime of the objects
ends, and this should also silence the warning.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This is just a small helper that will be used later.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
We need to validate the XML against schema if option '--validate'
was passed to the virsh command. This patch also includes
propagation of flags into the virSecretDefParse() function.
Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
We need to know if validation flag is present in order to
validate given XML against schema in virXMLParse().
Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
This patch also includes propagation of flags into the
virNWFilterDefParse().
Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>