<interface type='vhostuser'><backend type='passt'/> needs to run the
passt command just as is done for interface type='user', but then add
vhostuser bits to the qemu commandline/monitor command.
There are some changes to the parsing/validation along with changes to
the vhostuser codepath do do the extra stuff for passt. I tried
keeping them separated into different patches, but then the unit test
failed in a strange way deep down in the bowels of the commandline
generation, so this patch both 1) makes the final changes to
parsing/formatting and 2) adds passt stuff at appropriate places for
vhostuser (as well as making a couple of things *not* happen when the
passt backend is chosen). The result is that you can now have:
<interface type='vhostuser'>
<backend type='passt'/>
...
</interface>
Then as long as you also have the following as a subelement of
<domain>:
<memoryBacking>
<access mode='shared'/>
</memoryBacking>
your passt interfaces will benefit from the greatly improved
efficiency of a vhost-user data path, and all without requiring
special privileges or capabilities *anywhere* (i.e. it works for
unprivileged libvirt (qemu:///session) as well as privileged libvirt).
Resolves: https://issues.redhat.com/browse/RHEL-69455
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
When passt is used with vhostuser, the vhostuser code that builds the
qemu commandline will need to have the same socket path that is given
to the passt command, so this patch makes it visible outside of
qemu_passt.c.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
qemuProcessPrepareDomain()'s comments say that it should be the only
place to change the "live XML" of a domain (i.e. the public parts of
the virDomainDef object that is shown in the domain's status
XML), and that seems like a reasonable idea (although there aren't
many users of it to date).
qemuProcessPrepareDomainNetwork() is called by the aforementioned
qemuProcessPrepareDomain() - this patch changes the "if (type ==
HOSTDEV)" in that function to a "switch(type)" so it's simpler to add
DomainDef modifications for various other types of virDomainNetDef,
and also so that anyone who adds a new interface type is forced to
look at the code and decide if anything needs to be done here for the
new type.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
For some reason, when vhostuser interface support was added in 2014,
the parser required that the XML for the <interface> have a <source>
element with type, mode, and path, all 3 also required. This in spite
of the fact that 'unix' is the only possible valid setting for type,
and 95% of the time the mode is set to 'client' (as I understand from
comments in the code, normally a guest will use mode='client' to
connect to an existing socket that is precreated (by OVS?), and the
only use for mode='server' is for test setups where one guest is setup
with a listening vhostuser socket (i.e. 'server') and another guest
connects to that socket (i.e. 'client')). (or maybe one guest connects
to OVS in server mode, and all the others connect in client mode, not
sure - I don't claim to be an expert on vhost-user.)
So from the point of view of existing vhost-user functionality, it
seems reasonable to make 'type' and 'mode' optional, and by default
fill in the vhostuser part of the NetDef as if they were 'unix' and
'client'.
In theory, the <source> element itself is also not *directly* required
after this patch, however, the path attribute of <source> *is*
required (for now), so effectively the <source> element is still
required.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Since vhostuser is only used/supported by the QEMU driver, and all the
rest of the vhostuser-specific validation is done in QEMU's
validation, lets move the final check (to see if they've tried to
enable auto-reconnect when this interface is on the server side of the
vhostuser socket) to the QEMU validate.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Both vdpa and vhostuser require that the guest device be virtio, and
for interface type='vdpa', we already set <model type='virtio'/> if it
is unspecified in the input XML, so let's be just as courteous for
interface type='vhostuser'.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Both vhostuser and vdpa interface types must use the virtio model in
the guest (because part of the functionality is implemented in the
guest virtio driver). Due to ["because that's the way it happened"]
this has been validated for vhostuser in the hypervisor-agnostic
validate function, but for vdpa it has been done in the QEMU-specific
validate. Since these interface models are only supported by QEMU
anyway, validate for both of them in the QEMU validation function.
Take advantage of this change to switch to using
virDomainNetIsVirtioModel(net) instead of "net->model ==
VIR_DOMAIN_NET_MODEL_VIRTIO" (the former also matches
...VIRTIO_TRANSITIONAL and ...VIRTIO_NON_TRANSITIONAL, so is more
correct).
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Because all the checks for VIR_DOMAIN_NET_TYPE_VDPA were inside an
else-if clause that was immediately followed by another else-if clause
that forbid setting guestIP.ips or guestIP.routes, we've been allowing
users to set guestIP.* for vdpa interfaces (but then not doing
validation of the attributes that should have been done if we *did*
support setting IPs for vdpa (but we don't anyway, so 🤷.)
This can be fixed by turning the vdpa else-if clause into a top-level
if - this way vdpa interfaces will hit the "else if
(net->guestIP.nips)" clause and reject guest-side IP address setting.
Also, since there are currently *no* interface types for QEMU that
support adding guest-side routes, we put that check by itself (I think
it may be possible to set some guest routes for passt interfaces, but
we don't do that)
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
We haven't checked for memalloc failure in many years, and that was
the only reason this function would have ever failed.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
When a daemon (like libvirtd, virtqemud, etc.) is started as an
unprivileged user (which is exactly how KubeVirt does it), then
it tries to register on both session and system DBus-es so that
it can shut itself down (e.g. when system is powering off or user
logs out). It's worth noting that this is just opportunistic and
if no DBus is available then no error is reported.
Or at least that's what we thought. Because the way our
virGDBusGetSessionBus() and virGDBusGetSystemBus() are written an
error is actually reported every time the daemon starts.
Use virGDBusHasSessionBus() and virGDBusHasSystemBus() to check
if corresponding bus is available.
Resolves: https://issues.redhat.com/browse/RHEL-79088
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
This is just like virGDBusHasSystemBus() except it checks for the
session bus instead of the system one.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
This allows a user specified delay between autostart of each VM, giving
parity with the equivalent feature of libvirt-guests.
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This delay can reduce the CPU/IO load storm when autostarting many
guests.
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This eliminates some duplicated code patterns aross drivers.
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
There's a common pattern for autostart of iterating over VMs, acquiring
a lock and ref count, then checking the autostart & is-active flags.
Wrap this all up into a helper method.
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Switch to the 'notify-reload' service type and send notifications to
systemd when reloading configuration.
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
We have a way to notify systemd when we're done starting the daemon.
Systemd supports many more notifications, however, and many of them
are quite relevant to our needs:
https://www.freedesktop.org/software/systemd/man/latest/sd_notify.html
This renames the existing notification API to better reflect its
semantics, and adds new APIs for reporting
* Initiation of config file reload
* Initiation of daemon shutdown process
* Adhoc progress status messages
* Request to extend service shutdown timeout
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
A connection object is not required because autostarted domains are
never marked for autodestroy.
The comment about needing a connection for the network driver is
obsolete since we can auto-open a connection on demand.
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This allows for passinga NULL connection object in cases where
domain autodestroy is not required.
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
On incoming migration qemu doesn't activate the block graph nodes right
away. This is to properly facilitate locking of the images.
The block nodes are normally re-activated when starting the CPUs after
migration, but in cases (e.g. when a paused VM was migrated) when the VM
is left paused the block nodes are not re-activated by qemu.
This means that blockjobs which would want to write to an existing
backing chain member would fail. Generally read-only jobs would succeed
with older qemu's but this was not intended.
Instead with new qemu you'll always get an error if attempting to access
a inactive node:
error: internal error: unable to execute QEMU command 'blockdev-mirror': Inactive 'libvirt-1-storage' can't be a backing child of active '#block052'
This is the case for explicit blockjobs (virsh blockcopy) but also for
non shared-storage migration (virsh migrate --copy-storage-all).
Since qemu now provides 'blockdev-set-active' QMP command which can
on-demand re-activate the nodes we can re-activate them in similar cases
as when we'd be starting vCPUs if the VM weren't left paused.
The only exception is on the source in case of a failed post-copy
migration as the VM already ran on destination so it won't ever run on
the source even when recovered.
Resolves: https://issues.redhat.com/browse/RHEL-78398
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
The command will be used to re-activate block nodes after migration when
we're leaving the VM paused so that blockjobs can be used.
As the 'node-name' field is optional the 'qemumonitorjsontest' case
tests both variants.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
The flag signals presence of the 'blockdev-set-active' QMP command.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Notable changes:
- 'blockdev-set-active' QMP command and the corresponding 'active'
flag for instantiating blockdev backends added
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
The query language allows querying whether a member is optional by using
the '*' "operator" but the dumper script didn't output those query
strings.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
'qemuDomainSupportsCheckpointsBlockjobs()' should really be used only
with active VMs based on the scope of interlocking it does.
This means that the inactive snapshot code path needs to do the
interlocking based on what's supported:
- external snapshot support was not implemented yet
(bitmaps need to be propagated to the new overlay image)
- internal snapshot support can be deferred to qemu
Move the check inside qemuSnapshotPrepare() which has knowledge about
the snapshot type and implement an explicit check for the inactive case.
See: https://gitlab.com/libvirt/libvirt/-/issues/739
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The commit in question made an incorrect change that resulted in getting
O_RDONLY FD instead of O_RDWR preventing any writes to happen with the
following error:
virQEMUSaveDataWrite:176 : failed to write header to domain save file '/path/to/save.img': Bad file descriptor
Pass 'bypass_cache' as proper bool as the original code did.
Fixes: 517248e2394476a3105ff5866b0b718fc6583073
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This replaces Fedora 39 with Fedora 41, updates the FreeBSD
Cirrus CI image names, and tweaks some package names
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
When processing the PCI devices we can only read the configs for each of
them if running as privileged. That information is saved in the driver
state as a boolean introduced in commit 643c74abff01. However since
that version it is only written to once during nodeStateInitialize() and
only read from that point (apart from some commits around v3.9.0 release
when it was not even set, but that was fixed before v3.10.0). And it is
only read once, just to store that boolean in a temporary variable which
is also used in only one condition.
Rewrite this without locking and save few lines of code.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Add reporting an internal error when the string to type conversion of
devtype fails as this indicates a serious problem since devtype was used
to get into this method during the udev event handling.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The call to 'qemuBlockStorageSourceNeedsFormatLayer()' bases the
decision also on the state of the passed FD, so we must initialize the
passthrough data via 'qemuDomainPrepareStorageSourceFDs()' before the
aforementioned call.
In the test change it's visible that we didn't add the necessary 'raw'
driver which allows the 'protocol' blockdev to be opened in 'rw' mode so
that qemu picks the proper file descriptior while keeping the device
read-only.
Resolves: https://issues.redhat.com/browse/RHEL-37519
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Add few examples of fd groups with the 'writable' flag set, when passing
a single FD. Notably as a top level image of a readonly disk (even when
that doesn't make much sense) and also as a base image of a chain.
Note that this documents a status quo of a bug fixed in upcoming patch.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Pass also the 'writable' state to the fake passed FDs so that we can
test it.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
This API only queries the XML settings and not the running threads
themselves. In order to avoid confusion, change the wording slightly.
Resolves: https://issues.redhat.com/browse/RHEL-72052
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Recently, I was part of a discussion where it was suspected that
libvirt does not pick up correct FW for SEV-SNP guests. Update
our test to demonstrate it does.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Similarly to commit 1af45804 we should be safer by waiting for the whole
sysfs tree is created for the device.
Signed-off-by: Guoyi Tu <tugy@chinatelecom.cn>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Users can choose to copy a disk into a destination where they want to
use persistent reservations. Start the daemon if the configuration
requires it.
Resolves: https://issues.redhat.com/browse/RHEL-7342
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Check if the persistent reservations manager daemon is still needed
after a disk (sub)-chain was dropped after a blockjob.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Export qemuHotplugAttachManagedPR/qemuHotplugRemoveManagedPR for reuse
in blockjob code.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Consider also the destination of a block-copy job.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Add data based on 'v9.2.0-1537-gd922088eb4'
Notable changes:
- '10.0' machine types added
- 'hub' chardev backend added
- 'cpr' migrate channel added
- 'nsamples' field for 'dbus' audio backend now reported
- 'ClearwaterForest-v1' cpu model added
- 'SierraForest-v2' cpu model added
- 'ivshmem-flat' device added
- new qom objects:
- 'virtio-mem-system-reset'
- 'vmclock'
- default value of 'rombar' changed from 1 to -1 for all devices
- 'intel-iommu' device:
- default value of 'aw-bit' changed from '39' to '48'
- 'fs1gp' boolean added
- 'x-flts' boolean added
- 'virtio-balloon-pci'/'virtio-mem-pci':
- 'ioeventfd' added
- 'vectors' added
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Update the data after the release.
Notable changes:
- the 6.2 machine types became deprecated
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Attempting to take an internal snapshot of a freshly defined VM with
qcow2 backed NVRAM results in failure as the NVRAM image doesn't get
populated until the VM is started for the first time.
Fix this by invoking qemuPrepareNVRAM() when qcow2 nvram is defined.
Resolves: https://issues.redhat.com/browse/RHEL-73315
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Export qemuPrepareNVRAM so that it doesn't require the VM object. The
snapshot code needs in the corner case of creating a snapshot of a
freshly defined VM ensure that the nvram image exists in order to
snapshot it.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>