Commit Graph

35453 Commits

Author SHA1 Message Date
Boris Fiuczynski
e67bca23e4 nodedev: add an active config to mdev
The configuration of a defined mdev can be modified after the mdev is
started. The defined configuration and the active configuration can
therefore run out of sync. Handle this by storing the modifiable data
which is the mdev type and attributes in two separate active and
defined configurations. mdevctl supports with callout scripts to do an
attribute retrieval of started mdevs which is already implemented in
libvirt.

Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-02-26 10:59:47 +01:00
Boris Fiuczynski
c65c078655 node_device: remove unnecessary checks in virNodeDeviceDefFormat
virBufferEscapeString already contains the null check.

Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-02-26 10:55:55 +01:00
Boris Fiuczynski
c877908e14 node_device: refactor mdev attributes handling
Refactor attribute handling code into methods for easier reuse.

Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-02-26 10:55:17 +01:00
Boris Fiuczynski
47e57159b3 virmdev: prepare type and attributes for dual state
Create a new structure holding type and attributes as these are
modifiable in a persistent mdev configuration and run out of sync with
the active mdev configuration.

Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-02-26 10:50:43 +01:00
Vincent Vanlaer
93d67c58c2 daemon: fix wrong request count for sparse stream
Similar to when actual data is being written to the stream, it is
necessary to acknowledge handling of the client request when a hole is
encountered. This is done later in daemonStreamHandleWrite by sending a
fake zero-length reply if the status variable is set to
VIR_STREAM_CONTINUE. It seems that setting status from the message
header was missed for holes in the introduction of the sparse stream
feature.

Signed-off-by: Vincent Vanlaer <libvirt-e6954efa@volkihar.be>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-02-26 10:03:34 +01:00
Michal Privoznik
4545f313c2 domain_validate: Account for NVDIMM label size properly when checking for memory conflicts
As of v9.8.0-rc1~7 we check whether two <memory/> devices don't
overlap (since we allow setting where a <memory/> device should
be mapped to). We do this pretty straightforward, by comparing
start and end address of each <memory/> device combination.
But since only the start address is given (an exposed in the
XML), the end address is computed trivially as:

  start + mem->size * 1024

And for majority of memory device types this works. Except for
NVDIMMs. For them the <memory/> device consists of two separate
regions: 1) actual memory device, and 2) label.

Label is where NVDIMM stores some additional information like
namespaces partition and so on. But it's not mapped into the
guest the same way as actual memory device. In fact, mem->size is
a sum of both actual memory device and label sizes. And to make
things a bit worse, both sizes are subject to alignment (either
the alignsize value specified in XML, or system page size if not
specified in XML).

Therefore, to get the size of actual memory device we need to
take mem->size and substract label size rounded up to alignment.

If we don't do this we report there's an overlap between two
NVDIMMs even when in reality there's none.

Fixes: 3fd64fb0e2
Fixes: 91f9a9fb4f
Resolves: https://issues.redhat.com/browse/RHEL-4452?focusedId=23805174#comment-23805174
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2024-02-22 10:53:56 +01:00
Adam Julis
969353f978 virfile: Switch to virReportSystemError after failed VIR_CLOSE()
VIR_CLOSE() sets errno on failure so it's better to use
virReportSystemError() than plain virReportError() as the former
reports errno value too.

Signed-off-by: Adam Julis <ajulis@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-02-22 10:23:20 +01:00
ray
04397de2a1 qemu: Fix guest-sync response time in qga command
The current implementation sets the guest-sync timeout to the
smaller value between the default value (QEMU_AGENT_WAIT_TIME)
and agent->timeout, without considering the timeout passed
via the qga command.

This patch enhances the guest-sync timeout logic to use the
minimum value among the default value, agent->timeout, and
the timeout passed via the qga command.

Resolves: https://gitlab.com/libvirt/libvirt/-/issues/590
Signed-off-by: ray <honglei.wang@smartx.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-02-22 09:51:23 +01:00
Peter Krempa
737e3daf5a qemuMigrationDstPrepareStorage: Annotate that existance of 'volume' disks is checked elswhere
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-02-21 14:15:49 +01:00
Peter Krempa
eeb1cecf1e qemuMigrationDstPrepareStorage: Move assumption that 'network' disks always exist
Move the assumption from the code pre-creating the storage to
qemuMigrationDstPrepareStorage where it's checked for other cases.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-02-21 14:15:49 +01:00
Peter Krempa
360fd479ae qemuMigrationDstPrepareStorage: Reject migration into 'dir' and 'vhost-user' types
Migrating into a 'directory' won't ever work as we ask qemu to emulate a
fat filesystem, so restoring of the files won't be possible. Same for
'vhost-user' disks which don't support blockjobs as there's no block
backend used in qemu.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-02-21 14:15:49 +01:00
Peter Krempa
d6ba6cbaa4 qemuMigrationDstPrepareStorage: Rework storage existence check
Check the existance of storage per-type rather than trying to come up
with a common "path".

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-02-21 14:15:49 +01:00
Peter Krempa
1e12394d3b qemuMigrationDstPrepareStorage: Move block device specific logic
Now that we have a switch statement, the code adding the 'slice' for
block devices of non-equal sizes can be moved to appropriate location.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-02-21 14:15:48 +01:00
Peter Krempa
e42a3935ac qemuMigrationDstPrecreateDisk: Refactor cleanup
Automatically free helper variables, remove the 'cleanup' label and
use virBufferCurrentContent() to take the XML from the buffer rather
than extracting it to a separate variable.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-02-21 14:15:48 +01:00
Peter Krempa
00c0a94ab5 qemuMigrationDstPrepareStorage: Properly consider path for 'vdpa' devices
Allow storage migration of VDPA devices by properly checking that they
exist on the destionation. Pre-creation is not supported but if the
device exists the migration should be able to succeed.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-02-21 14:15:48 +01:00
Peter Krempa
e158b523b8 qemuMigrationDstPrepareStorage: Use 'switch' statement to include all storage types
Decrease the likelyhood that addition of a new storage type will be
forgotten.

This patch also unifies the type check to consult the 'actual' type of
the storage in both cases as the NVMe check looked for the XML declared
type while virStorageSourceIsLocalStorage() looks for the
actual/translated type.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-02-21 14:15:48 +01:00
Tim Wiederhake
064e77aa0a cpu_map: Rewrite feature sync script
Previously, the script would only detect differences between
libvirt's and qemu's list of x86 features, adding those features
to libvirt was a manual and error prone procedure.

Replace with a script that can generate libvirt's feature list
directly from qemu source code.

Usage: sync_qemu_features_i386.py [--output OUTPUT] [qemu]

If not specified otherwise, "output" defaults to x86_features.xml
in the same directory as sync_qemu_features_i386.py. If a checkout
of the qemu source code resides next to the libvirt directory, it
will be found automatically and need not be specified.

Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-02-20 17:29:27 +01:00
Tim Wiederhake
836644ba3d cpu_map: Format comments
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-02-20 17:29:27 +01:00
Tim Wiederhake
3ba88f543c cpu_map: Format register values
Use "0x%08x" as format for all values:

    sed \
        -e "s/'0x\(..\)'/'0x000000\\1'/g" \
        -e "s/'0x\(...\)'/'0x00000\\1'/g"

Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-02-20 17:29:27 +01:00
Tim Wiederhake
986be35f2e cpu_map: Sort cpu features
Some feature words were not sorted correctly.

Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-02-20 17:29:27 +01:00
Laine Stump
41fe852487 Set stubDriverName from hostdev driver model attribute during pci device setup
commit v9.10.0-129-g8b93d78c83 (first appearing in libvirt-10.0.0) was
supposed to allow forcing a PCI hostdev to be bound to a particular
driver by adding <driver model='blah'/> to the XML for the
device. Unfortunately, a single line was missed during the final
changes to the patch prior to pushing, and the result was that the
driver model could be set to *anything* and it would be accepted but
just ignored.

This patch adds the missing line, which will set the stubDriverName
field of the virPCIDevice object from the hostdev object as the
virPCIDevice is being created. This ends up being used by
virPCIDeviceBindToStub() as the driver that it binds the device to.

Fixes: 8b93d78c83
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-02-20 11:19:44 -05:00
Jiri Denemark
2a6799fd43 build: Add userfaultfd_sysctl build option
This option controls whether the sysctl config for enabling unprivileged
userfaultfd will be installed.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-02-13 17:44:26 +01:00
Jiri Denemark
66643931e7 qemu: Add support for /dev/userfaultfd
/dev/userfaultfd device is preferred over userfaultfd syscall for
post-copy migrations. Unless qemu driver is configured to disable mount
namespace or to forbid access to /dev/userfaultfd in cgroup_device_acl,
we will copy it to the limited /dev filesystem QEMU will have access to
and label it appropriately. So in the default configuration post-copy
migration will be allowed even without enabling
vm.unprivileged_userfaultfd sysctl.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-02-13 17:44:26 +01:00
Timothée Ravier
a2c3e390f7 qemu: Add sysusers config file for qemu & kvm user/groups
Install a systemd sysusers config file for the qemu & kvm user/groups.

We can not use the sysusers_create_compat macro in the RPM specfile to
create those users as we want to keep the specfile standalone and not
relying on additionnal files.

Update the specfile to make the commands closer to what is generated by
the current macro.

See: https://src.fedoraproject.org/rpms/libvirt/pull-request/22
See: https://gitlab.com/libvirt/libvirt/-/merge_requests/319
See: https://bugzilla.redhat.com/show_bug.cgi?id=2095429
See: https://docs.fedoraproject.org/en-US/packaging-guidelines/UsersAndGroups/

Based on previous work by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Timothée Ravier <tim@siosm.fr>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-02-13 16:59:57 +01:00
Michal Privoznik
d96a414c03 secret_conf: Modernize XML parsing & formatting
Our virSecret XML is still parsed and formatted using old way
(e.g. virXPathString() + virXXXTypeFromString() combo, or
formatting elements using plain virBufferAsprintf() instead of
virXMLFormatElement()). Modernize the code as it'll make it
easier for future expansion.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-02-13 16:11:21 +01:00
Michal Privoznik
bad17c4d88 virSecretDef: Convert 'usage_type' field to proper enum type
Convert the field and adjust the XML parsers to use
virXMLPropEnum().

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-02-13 16:11:17 +01:00
Michal Privoznik
6db5362a30 secret_conf: Simplify calling of virSecretDefParseUsage()
The virSecretDefParseUsage() function is called conditionally.
Call it unconditionally and keep pointer to the <usage/> node as
it'll come handy soon.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-02-13 16:11:12 +01:00
Michal Privoznik
63a416f3a1 viraccessdriverpolkit: Add missing vtpm case
When adding vtpm virSecret usage type (in v5.6.0-rc1~61) we
forgot to update polkit access check. This limited user's ability
to match secrets in their rules. Add missing case into switch in
virAccessDriverPolkitCheckSecret().

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-02-13 16:11:00 +01:00
Jonathon Jongsma
94365a4871 qemu: handle adding/removing nbdkit-backed disk sources
Previously we were only starting or stopping nbdkit when the guest was
started or stopped or when hotplugging/unplugging a disk. But when doing
block operations, the disk backing store sources can also be be added or
removed independently of the disk device. When this happens the nbdkit
backend was not being handled properly. For example, when doing a
blockcopy from a nbdkit-backed disk to a new disk and pivoting to that
new location, the nbdkit process did not get cleaned up properly. Add
some functionality to qemuDomainStorageSourceAccessModify() to handle
this scenario.

Since we're now starting nbdkit from the ChainAccessAllow/Revoke()
functions, we no longer need to explicitly start nbdkit in hotplug code
paths because the hotplug functions already call these allow/revoke
functions and will start/stop nbdkit if necessary.

Add a check to qemuNbdkitProcessStart() to report an error if we
are trying to start nbdkit for a disk source that already has a running
nbdkit process. This shouldn't happen, and if it does it indicates an
error in another part of our code.

Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-02-12 16:13:17 -06:00
Jonathon Jongsma
4495ec7d9b qemu: roll back if not all nbdkit backends are successful
When starting nbdkit processes for the backing store of a disk, we were
returning an error if any backing store failed, but we were not cleaning
up processes that succeeded higher in the chain. Make sure that if we
return a failure status from qemuNbdkitStartStorageSource() that we roll
back any processes that had been started.

Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-02-12 16:13:17 -06:00
Jonathon Jongsma
7a03785d88 qemu: add a 'chain' parameter to nbdkit start/stop
This will allow us to start or stop nbdkit for just a single disk source
or for every source in the backing chain. This will be used in following
patches.

Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-02-12 16:13:17 -06:00
Jiri Denemark
43c0325b10 network: Make virtual domains resolvable from the host
This patch adds a new attribute "register" to the <domain> element. If
set to "yes", the DNS server created for the virtual network is
registered with systemd-resolved as a name server for the associated
domain. The names known to the dnsmasq process serving DNS and DHCP
requests for the virtual network will then be resolvable from the host
by appending the domain name to them.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-02-07 14:19:42 +01:00
Jiri Denemark
6b5c0ea45a util: Introduce virSystemdResolvedRegisterNameServer
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-02-07 14:19:42 +01:00
Jiri Denemark
55996822b1 util: Introduce virSocketAddrBytes
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-02-07 14:19:42 +01:00
Jiri Denemark
6a5f21632f util: Introduce virSystemdHasResolved
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-02-07 14:19:42 +01:00
Jiri Denemark
3869c2a281 util: Unify virSystemdHas{Machined,Logind}
When checking for machined we do not really care whether systemd itself
is running, we just need machined to be either running or socket
activated by systemd. That is, exactly the same we do for logind.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-02-07 14:19:42 +01:00
Michal Privoznik
f2a5494fbd qemu_monitor: Simplify qemuMonitorIOWriteWithFD()
After previous cleanups, qemuMonitorIOWriteWithFD() is but a thin wrapper
over virSocketSendMsgWithFDs(). Replace the body of the former
with a call to the latter.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-02-07 12:35:02 +01:00
Michal Privoznik
91f4ebbac8 virsocket: Simplify virSocketSendFD()
After previous cleanups, virSocketSendFD() is but a thin wrapper
over virSocketSendMsgWithFDs(). Replace the body of the former
with a call to the latter.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-02-07 12:34:54 +01:00
Michal Privoznik
495a826dbf virSocketSendMsgWithFDs: Introduce @payload_len argument
Instead of using strlen() to calculate length of payload we're
sending, let caller specify the size: they may want to send just
a portion of a buffer (even though the only current user
doesn't).

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-02-07 11:01:30 +01:00
Michal Privoznik
e82f99283d virSocketSendMsgWithFDs: Don't report errors, just set errno
Currently, virSocketSendMsgWithFDs() reports two errors:

1) if CMSG_FIRSTHDR() fails,
2) if sendmsg() fails.

Well, the latter sets an errno, so caller can just use
virReportSystemError(). And the former - it is very unlikely to
fail because memory for whole control message was allocated just
a few lines above.

The motivation is to unify behavior of virSocketSendMsgWithFDs()
and virSocketSendFD() because the latter is just a subset of the
former (will be addressed later).

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-02-07 10:57:00 +01:00
Peter Krempa
6eaf3614b6 qemuBlockStorageSourceNeedsFormatLayer: Stop formatting 'raw' driver when not needed
The 'raw' driver without any special configuration is not needed and
creates overhead in qemu.

Stop using the 'raw' format driver in cases when it's not needed. A
special case when it is needed is for FD passed images with only a
single writable FD passed, where we need an overlay driver to properly
reflect the 'read-only' flag.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-02-02 16:03:08 +01:00
Peter Krempa
abdcb46012 qemu: monitor: Use 'backing-mask-protocol' for blockjobs when available
Store whether qemu supports the appropriate option for block-stream and
block-commit commands and always use it if available.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-02-02 16:03:08 +01:00
Peter Krempa
c38b4e34c8 qemu: capabilities: Introduce QEMU_CAPS_BLOCKJOB_BACKING_MASK_PROTOCOL
The capability is asserted when both block-stream and block-commit QMP
commands support the 'backing-mask-protocol' argument.

The argument causes qemu to record 'raw' as the backing file format in
case when a protocol node is used directly. This is needed to preserve
compatibility of images after a block-commit or block-pull libvirt
operation with older libvirt versions in case when we'll want to remove
the unneded 'raw' format drivers from the block graph.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-02-02 16:03:08 +01:00
Michal Privoznik
6f55137a1c virsocket: Drop unused #include and #define
Inside of virsocket.c there is an include of poll.h and
PKT_TIMEOUT_MS macro definition. Neither of these is really
needed and in fact it's a leftover after I reworked one of
previously merged commits during review.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-02-02 16:01:06 +01:00
Praveen K Paladugu
6316f26cd2 ch: Enable ETHERNET Network mode support
enable VIR_DOMAIN_NET_TYPE_ETHERNET network support for ch guests.

Tested with following interface config:

    <interface type='ethernet'>
      <target dev='chtap0' managed="yes"/>
      <model type='virtio'/>
      <driver queues='2'/>
    <interface>

Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-02-02 10:58:35 +01:00
Praveen K Paladugu
05d46e4e08 ch: Introduce version based cap for network support
This capability checks if ch can receive multiple fds along with net-add
api. This capability is required to enable multiple queues for
domain/guest interfaces.

Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-02-02 10:58:31 +01:00
Praveen K Paladugu
e7daa49a15 util: Add util methods required by ch networking
virSocketSendMsgWithFDs method send fds along with payload using
SCM_RIGHTS. virSocketRecv method polls, receives and sends the response
to callers.

These methods are required to add network suppport in ch driver.

Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-02-02 10:58:29 +01:00
Praveen K Paladugu
4bfd513d92 hypervisor: Move domain interface mgmt methods
Move domain interface management methods from qemu to hypervisor. This
refactoring allows the domain management methods to be shared between CH and
qemu drivers.

This commit does not introduce any functional changes.

Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-02-02 10:58:26 +01:00
Praveen K Paladugu
a22d7fde17 conf: Drop unused parameter
Drop unused parameter from virDomainNetReleaseActualDevice method.

Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-02-02 10:58:21 +01:00
Jiri Denemark
a6b6107656 conf: Fix error message in virNetworkForwardDefParseXML
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-02-01 12:21:27 +01:00
Peter Krempa
b8d9419e12 util: json: Remove 'virJSONValueObjectReplaceValue'
The helper was used only in 'qemucapabilitiesnumbering' test which was
removed.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2024-02-01 10:39:40 +01:00
Andrea Bolognani
ac29f9396c qemu: Use virDomainControllerDefNew() more
Instead of open-coding a partial version of it.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
2024-02-01 10:37:26 +01:00
Andrea Bolognani
518e70158b qemu: Handle MODEL_SCSI_{AUTO,DEFAULT} appropriately
The qemuDomainGetSCSIControllerModel() function, which is
responsible for choosing a model for a SCSI controller that
didn't have one provided by the user, considers values >0 to
mean "model has been set".

Since MODEL_SCSI_AUTO == 0, this means that such a value is
considered the same as MODEL_SCSI_DEFAULT (-1). This makes
sense, as not specifying a model name or explicitly asking for
one to be automatically chosen intuitively should result in
the same behavior.

Specifically, there is no case in which a value of
MODEL_SCSI_AUTO or MODEL_SCSI_DEFAULT is encountered after the
initial controller creation: it is either replaced with an
actual model, or an error is raised.

Despite this, there are a few places in the QEMU driver where
we incorrectly treat these values as if they were actual
model names. To reduce confusion, make sure that no longer
happens.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
2024-02-01 10:37:22 +01:00
Peter Krempa
f85a382a0e virPCIVPDParse: Do reasonable error reporting
Remove the wannabe error reporting via 'VIR_DEBUG/VIR_INFO' in favor of
proper errors.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-31 17:24:07 +01:00
Peter Krempa
dfc85658bd virPCIVPDParseVPDLargeResourceFields: Report proper errors
The code abused 'VIR_INFO' as an attempt at error reporting. Rework the
code to return the usual 0/-1 and raise proper errors.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-31 17:24:07 +01:00
Peter Krempa
a352bcf1c6 virPCIVPDParseVPDLargeResourceFields: Refactor return logic
Rewrite the conditions after exiting the parser so that they are easier
to understand. This partially decreases the granularity of "error"
messages as they are not strictly necessary albeit for debugging.

As it was already observed in this code the logic itself often does
something else than the comment claims, thus the code logic is
preserved.

Changes:
 - any case when not all data was processed is aggregated together and
   gets a common "error" message
 - absence of 'checksum' field is checked separately
 - helper variables are removed as they are no longer needed

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-31 17:24:07 +01:00
Peter Krempa
378b82dac2 virPCIVPDParseVPDLargeResourceFields: Refactor processing of read data
Use a 'switch' statement instead of a bunch of if/elseif statements.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-31 17:24:07 +01:00
Peter Krempa
f1deac9635 virPCIVPDParseVPDLargeResourceFields: Remove impossible 'default' switch case
The 'fieldFormat' variable is guaranteed to have only the proper enum
values by virPCIVPDResourceGetFieldValueFormat.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-31 17:24:07 +01:00
Peter Krempa
037803a949 virPCIVPDParseVPDLargeResourceFields: Merge logic conditions
Merge the pre-checks with the 'switch' statement which is operating on
the same values to simplify further refactoring.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-31 17:24:07 +01:00
Peter Krempa
aa5e3cc449 virPCIVPDParseVPDLargeResourceString: Properly report errors
Replace VIR_INFO being used as form of error reporting with proper
virReportError and the usual return values.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-31 17:24:07 +01:00
Peter Krempa
c15a495902 virPCIVPDReadVPDBytes: Refactor error handling
Each caller was checking that the function read as many bytes as it
expected. Move the check inside virPCIVPDReadVPDBytes and make it report
a proper error rather than just a combination of VIR_DEBUG inside the
function and a random VIR_INFO in the caller.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-31 17:24:07 +01:00
Peter Krempa
e1dc851e7c virPCIDeviceGetVPD: Handle errors in callers
Until now 'virPCIDeviceGetVPD' couldn't reallistically raise an error,
but that will change. Handle the errors by either resetting it if we'd
be ignoring it or forward it.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-31 17:24:07 +01:00
Peter Krempa
bac86dd36e virPCIDeviceGetVPD: Fix multiple error handling bugs
- fix passing of 'errno' to 'virReportSystemError'

 The 'open' syscall returns '-1' and sets 'errno' on failure. The code
 passed '-fd' as 'errno' rather than errno itself, thus always reporting
 EPERM.

- don't overwrite errors when closing FD

 Use VIR_AUTOCLOSE to avoid overwriting the errors from virPCIVPDParse.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-31 17:24:07 +01:00
Peter Krempa
3ca1079318 virPCIDeviceHasVPD: Refactor "debug" messages
A checker function should not raise VIR_INFO or VIR_WARN messages
especially if they contain information useful only for debugging.

Turn the message into a VIR_DEBUG with universal meaning.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-31 17:24:07 +01:00
Peter Krempa
9aa303a948 util: virpcivpd: Remove return value from virPCIVPDResourceUpdateKeyword
The function always succeeded and after the removal of programing error
checks doesn't even have a 'return false' case. Additionally one of the
tests in 'virpcivpdtest' tested that this function never failed on wrong
data. Embrace this logic and remove the return value and adjust logging
to VIR_DEBUG level to avoid spamming logs.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-31 17:24:07 +01:00
Peter Krempa
dd36db2607 virNodeDeviceCapVPDParseXML: Fix error reporting
Don't overwrite already reported errors and improve parsing of
attributes.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-31 17:24:07 +01:00
Peter Krempa
ea8d864d9e conf: node_device: Refactor 'virNodeDeviceCapVPDParseCustomFields' to fix error reporting
The errors raised in virNodeDeviceCapVPDParseCustomFields were actually
ignored by continuing the parse rather than raised.

Rather than just replace 'continue' by 'return -1' this patch refactors
the whole parser to simplify it as well as report reasonable errors.

Parsing of individual fields is done without XPath and is extracted into
a common helper.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-31 17:24:07 +01:00
Peter Krempa
dd328cd48a util: virPCIVPDResourceUpdateKeyword: Remove impossible checks
All callers satisfy these checks as they are just for programming
errors.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-31 17:24:07 +01:00
Peter Krempa
fb69acf5c2 conf: virNodeDeviceCapVPDParse*: Remove pointless NULL checks
The function are never called with NULL argument so the checks can be
removed.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-31 17:24:07 +01:00
Peter Krempa
d36da8ea4a util: virpcivpd: Remove return value from virPCIVPDResourceCustomUpsertValue
None of the callers pass NULL, so the NULL check is pointless. Remove it
an remove the return value.

The function is exported only for use in 'virpcivpdtest' thus marking
the arguments as NONNULL is unnecessary.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-31 17:24:07 +01:00
Peter Krempa
ab3f4d1b0b virPCIVPDResourceGetKeywordPrefix: Fix logging
Use VIR_DEBUG instead of VIR_INFO as that's more appropriate and report
relevant information for debugging.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-31 17:24:07 +01:00
Peter Krempa
810a3ca980 util: virpcivpd: Unexport 'virPCIVPDParseVPDLargeResourceString'
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-31 17:24:07 +01:00
Peter Krempa
d395d7a20f util: pcivpd: Unexport virPCIVPDParseVPDLargeResourceFields
The function is not used in other files.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-31 17:24:07 +01:00
Peter Krempa
4d229ef440 util: virpcivpd: Unexport 'virPCIVPDReadVPDBytes'
The function is no longer used outside of virpcivpd.c

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-31 17:24:07 +01:00
Peter Krempa
a9f76d6ab7 Don't overwrite error message from 'virXPathNodeSet'
'virXPathNodeSet' returns -1 only when 'ctxt' or 'xpath' are NULL or
when the 'xpath' string is invalid. Both are programming errors. It
doesn't make sense for the code to overwrite the error message for
anything supposedly more relevant.

The majority of calls to 'virXPathNodeSet' already didn't do this, so
this patch fixes the rest to prevent it from spreading again.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-31 17:24:07 +01:00
Peter Krempa
edaa1112ff schema: nodedev: Adjust allowed characters in 'vpdFieldValueFormat'
The check in 'virPCIVPDResourceIsValidTextValue' allows any printable
characters, thus the XML schema should do the same.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-31 17:24:07 +01:00
Peter Krempa
2ccac1e42f virNodeDeviceCapVPDFormat: Properly escape system-originated strings
Similarly to previous commit other specific fields which come from the
system data and aren't sanitized enough to be safe for XML were also
formatted via virBufferAsprintf.

Other static and safe strings used virBufferEscapeString instead of
virBufferAddLit.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-31 17:24:07 +01:00
Peter Krempa
5373b8c02c virNodeDeviceCapVPDFormatCustom*: Escape unsanitized strings
The custom field data is taken from PCI device data which can contain
any printable characters, and thus must be escaped when putting into
XML.

Originally, based on the comment and XML schema which was fixed in
previous commits the idea seemed to be that the parser would validate
that only characters which don't break the XML would be present but that
didn't seem to materialize.

Switch to proper escaping of the XML.

Fixes: 3954378d06
Resolves: https://issues.redhat.com/browse/RHEL-22314
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-31 17:24:06 +01:00
Peter Krempa
eb3844009d util: pcivpd: Refactor virPCIVPDResourceIsValidTextValue
The function is never called with NULL argument. Remove the check and
refactor the rest including the debug statement.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-31 17:24:06 +01:00
Peter Krempa
42df6cc1b4 virPCIVPDResourceIsValidTextValue: Adjust comment to reflect actual code
The function does not reject '&', '<', '>' contrary to what it actually
states. Move and adjust the comment.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-31 17:24:06 +01:00
Peter Krempa
36e11cca83 qemuMigrationDstStartNBDServer: Refactor cleanup
There's nothing under the 'cleanup:' label thus the whole code can be
simplified.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2024-01-31 15:25:54 +01:00
Peter Krempa
43f027b57c qemu: migration: Properly handle reservation of manually specified NBD port
Originally the migration code didn't register the NBD disk port with the
port allocator when it was manually specified. Later when commit
e74d627bb3 refactored the code and started registering it, the
old logic which was clearing 'priv->nbdPort' in case when it was manually
specified was not removed.

This caused following problems:
 - the port was not released after successful migration
 - the port was released even when it was not allocated on failures
   regarding the NBD server start
 - the port was not released on other failures of the migration after
   NBD server startup

To address this we remove the assumption that 'priv->nbdPort' is used
only for auto-allocated port and fill it only once the port is
allocated and make the caller of qemuMigrationDstStartNBDServer
responsible for releasing it.

Fixes: e74d627bb3
Resolves: https://issues.redhat.com/browse/RHEL-21543
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2024-01-31 15:25:54 +01:00
Peter Krempa
19eaa85438 util: virtportallocator: Add VIR_DEBUG statements for port allocations and release
Add a few debug statements to be able to trace lifetime of a
reserved/allocated port.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2024-01-31 15:25:53 +01:00
Peter Krempa
c697aff8a1 remoteDispatchAuthPolkit: Fix lock ordering deadlock if client closes connection during auth
Locks in following text:
A: virNetServer
B: virNetServerClient
C: daemonClientPrivate

'virNetServerSetClientAuthenticated' locks A then B

'remoteDispatchAuthPolkit' calls 'virNetServerSetClientAuthenticated'
while holding C.

If a client closes its connection 'virNetServerProcessClients' with the
lock A and B locked will call 'virNetServerClientCloseLocked' which will
try to dispose of the 'client' private data by:

  ref(b);
  unlock(b);
  remoteClientFreePrivateCallbacks();
  lock(b);
  unref(b);

Unfortunately remoteClientFreePrivateCallbacks() tries lock C.

Thus the locks are held in the following order:

 polkit auth: C -> A
 connection close: A -> C

causing a textbook-example deadlock. To resolve it we can simply drop
lock 'C' before calling 'virNetServerSetClientAuthenticated' as the lock
is not needed any more.

Resolves: https://issues.redhat.com/browse/RHEL-20337
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2024-01-31 15:25:53 +01:00
Stefano Brivio
f95675fdbb apparmor: Add user session path for PID and socket files used by passt
Commit 7a39b04d68 ("apparmor: Enable passt support") grants
passt(1) read-write access to /{,var/}run/libvirt/qemu/passt/* if
started by the libvirt daemon. That's the path where passt creates
PID and socket files only if the guest is started by the root user.

If the guest is started by another user, though, the path is more
commonly /var/run/user/$UID/libvirt/qemu/run/passt: add it as
read-write location. Otherwise, passt won't be able to start, as
reported by Andreas.

While at it, replace /{,var/}run/ in the existing rule by its
corresponding tunable variable, @{run}.

Fixes: 7a39b04d68 ("apparmor: Enable passt support")
Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1061678
Reported-by: Andreas B. Mundt <andi@debian.org>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Jim Fehlig <jfehlig@suse.com>
2024-01-31 11:25:32 +01:00
Pavel Hrdina
189fdeff10 qemu_snapshot: create: don't require disk-only flag for offline external snapshot
Historically creating offline external snapshot required disk-only flag
as well. Now when user requests new snapshot for offline VM and at least
one disk is specified to use external snapshot we will no longer require
disk-only flag as all other not specified disk will use external
snapshots as well.

Resolves: https://issues.redhat.com/browse/RHEL-22797
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-30 13:54:42 +01:00
Pavel Hrdina
faa2e3bb54 qemu_snapshot: create: refactor external snapshot detection
Introduce new function qemuSnapshotCreateUseExternal() that will return
true if we will use external snapshots as default location.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-30 13:54:32 +01:00
Pavel Hrdina
7143c4e1f9 qemu_snapshot: fix detection if non-leaf snapshot isn't in active chain
The condition was completely wrong. As per the comment for function
virDomainMomentIsAncestor() it checks that the first argument is
descendant of the second argument.

Consider the following snapshot tree for VM:

  s1
    |
    +- s2
    |   |
    |   +- s3
    |
    +- s4
        |
        +- s5 (current)

When deleting s2 with the original code we checked if
virDomainMomentIsAncestor(s2, s5) which would return false basically for
any snapshot as s5 is leaf snapshot so no children.

When deleting s2 with fixed code we check if
virDomainMomentIsAncestor(s5, s2) which still returns false but when
deleting s4 it will correctly return true.

Before this fix it fails with the following error:

    error: Failed to delete snapshot s2
    error: invalid argument: could not find base disk source in disk source chain

After the fix it fails with correct error:

    error: Failed to delete snapshot s2
    error: unsupported configuration: deletion of non-leaf external snapshot that is not in active chain is not supported

Resolves: https://issues.redhat.com/browse/RHEL-23212
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-30 13:32:04 +01:00
Purna Pavan Chandra Aekkaladevi
18f2bf0a43 ch_driver: fix condition in virCHDomainRemoveInactive()
Rectify the condition to remove a domain only if it is not persistent.

Signed-off-by: Purna Pavan Chandra Aekkaladevi <paekkaladevi@linux.microsoft.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-01-30 12:31:20 +01:00
Andrea Bolognani
19dc73d16e qemu: Move qemuDomainGetSCSIControllerModel()
It has nothing to do with assigning addresses, so it makes more
sense to have it in qemu_domain.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-30 10:58:13 +01:00
Andrea Bolognani
89a8862d42 qemu: Add missing error handling
qemuDomainGetSCSIControllerModel() can return -1 on failure,
but qemuDomainFindOrCreateSCSIDiskController() didn't implement
any handling for this scenario.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-30 10:58:13 +01:00
Andrea Bolognani
27cb524a9f qemu: Drop qemuDomainSetSCSIControllerModel()
It only has a single caller.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-30 10:58:13 +01:00
Andrea Bolognani
7d6ec89243 qemu: Drop qemuDomainFindSCSIControllerModel()
It only has a single caller.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-30 10:58:13 +01:00
Andrea Bolognani
d0087e65d8 qemu: Clean up qemuDomainDefaultNetModel()
Group things together where it makes sense, avoid unnecessary
uses of 'else if', plus other tweaks.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-30 10:58:13 +01:00
Andrea Bolognani
d583ff601f qemu: Default to no USB and no memballoon for new architectures
The current defaults, that can be altered on a per-architecture
basis, are derived from the historical x86 behavior.

Every time support for a new architecture is added to libvirt,
care must be taken to override these default: if that doesn't
happen, guests will end up with additional hardware, which is
something that's generally undesirable.

Turn things around, and require architectures to explicitly
ask for the devices to be created by default instead. The
behavior for existing architectures is preserved.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-30 10:58:13 +01:00
Andrea Bolognani
95e54bce7d qemu: Fix a few comments
They reference functions that have since been renamed.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-30 10:58:13 +01:00
Michal Privoznik
dab99eedcd qemu_command: Generate cmd line for virtio-mem dynamicMemslots
This is pretty straightforward.

Resolves: https://issues.redhat.com/browse/RHEL-15316
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-30 10:44:36 +01:00
Michal Privoznik
6be07af817 qemu_validate: Check capability for virtio-mem dynamicMemslots
The QEMU_CAPS_DEVICE_VIRTIO_MEM_PCI_DYNAMIC_MEMSLOTS reflects
whether QEMU is capable of .dynamic-memslots for virtio-mem.
Use it when validating domain configuration.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-30 10:44:36 +01:00
Michal Privoznik
497cab753b qemu_capabilities: Add QEMU_CAPS_DEVICE_VIRTIO_MEM_PCI_DYNAMIC_MEMSLOTS capability
Starting from v8.2.0-rc0~74^2~2 QEMU has .dynamic-memslots
attribute for virtio-mem-pci device. Introduce a capability which
reflects that.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-30 10:44:36 +01:00
Michal Privoznik
5325820585 conf: Introduce dynamicMemslots attribute for virtio-mem
Introduced in v8.2.0-rc0~74^2~2, QEMU now allows setting
.dynamic-memslots attribute for virtio-mem-pci devices. When
turned on, it allows memory exposed to guest to be split into
multiple memslots and thus smaller memory footprint (see the
original commit for detailed explanation).

Therefore, introduce new <target/> attribute which will control
that QEMU knob.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-30 10:44:36 +01:00
Michal Privoznik
3a3f73ea9f remote_driver: Restore special behavior of remoteDomainGetBlockIoTune()
In v9.10.0-rc1~103 the remote driver was switched to g_auto() for
client RPC return parameters. But whilst doing so a small bug
slipped in: previously, when virDomainGetBlockIoTune() was called
with *nparams == 0, the function set *nparams to the number of
supported params and zero was returned (so that client can
allocate memory and call the API second time). IOW - the usual,
old style of APIs where we didn't want to allocate memory on
caller's behalf. But because of this bug, a negative one is
returned instead.

Fixes: 501825011c
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-29 12:50:26 +01:00
Or Ozeri
66aee6e5c2 qemu: block: fix error when blockcopy target is librbd encrypted
Encryption secrets are considered a format dependency, even
when being used by the storage node itself, as in the case of
using encryption engine=librbd.
Currently, the storage node is created (blockdev-add) before
creating the format dependencies (including encryption secrets).
As a result, when trying to perform a blockcopy when the target
disk uses librbd encryption, an error of this form is returned:

  "error: internal error: unable to execute QEMU command 'blockdev-add': No secret with id 'libvirt-5-format-encryption-secret0'"

To overcome this error, we change the order of commands so that
format dependencies are created BEFORE creating the storage node.

Signed-off-by: Or Ozeri <oro@il.ibm.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
2024-01-26 09:52:19 +01:00
Michal Privoznik
ccfc5c1e16 qemu_hotplug: Don't lose 'created' flag in qemuDomainChangeNet()
After v9.1.0-rc1~116 we track whether it's us who created a
macvtap or not. But when updating a vNIC its definition might be
replaced with a new one (though, ifname is not allowed to
change), e.g. to reflect new QoS, link state, etc.

Now, the fact whether we created macvtap for given vNIC is stored
in net->privateData->created. And replacing definition is done by
simply freeing the old definition and making the pointer point to
the new one. But this does not preserve the 'created' flag, which
in turn means when a domain is shutting off, the macvtap is not
removed (see loop inside of qemuProcessStop()).

Copy this flag into new definition and leave a note in
_qemuDomainNetworkPrivate struct.

Fixes: 61d1b9e659
Resolves: https://issues.redhat.com/browse/RHEL-22714
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-25 16:41:12 +01:00
Michal Privoznik
b49fb57395 vmx: Separate disk target name generation into a function
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-25 16:26:45 +01:00
Michal Privoznik
555b9c5827 vmx: Accept empty fileName for cdrom-image
Turns out, there are two ways to specify an empty CD-ROM drive in
a .vmx file:

  1) .fileName = "emptyBackingString"
  2) .fileName = ""

While we do parse 1) successfully, the code does not accept 2)
and an error is reported. Modify the code to treat both cases the
same.

Resolves: https://issues.redhat.com/browse/RHEL-19380
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-25 16:26:45 +01:00
Michal Privoznik
bee5301afa qemu_process: Skip over non-virtio non-TAP NIC models when refreshing rx-filter
After guest is started, or we are reconnecting to already running
one (after daemon restart), qemuProcessRefreshRxFilters() is
called to refresh rx-filters (basically MAC addresses of guest
NICs) as they might have changed while we were not running (for
the case when reconnecting to an already running guest), or we
need to enable them by running a command (for freshly started
guest - see processNicRxFilterChangedEvent()).

Now, our XML parser allowed trustGuestRxFilters attribute for all
types and models of <interface/> while in reality, only virtio
model AND TUN/TAP based types can see MAC address changes. For
other combinations, QEMU reports an error.

This all means that when the daemon is restarted and it
reconnects to a guest with, well invalid configuration, or when
such guest is restored from a saved image, or migrated then we
issue the monitor command, to which QEMU replies with an error
which is then propagated to users:

  error: internal error: unable to execute QEMU command 'query-rx-filter': invalid net client name: hostdev0

While on one hand users should fix their configuration (and after
v10.0.0-rc1~123 they can do that even on live domains), libvirt
can also has some logic built in that prevent issuing the command
in the first place (for obviously wrong cases).

Fixes: 060d4c83ef
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-25 15:55:33 +01:00
Jiri Denemark
dcfe548cb0 build: Make daemons depend on generated *_protocol.[ch]
This should fix build failures when a daemon code is compiled before the
included *_protocol.h headers are ready, such as:

    FAILED: src/virtqemud.p/remote_remote_daemon_config.c.o
    ../src/remote/remote_daemon_config.c: In function ‘daemonConfigNew’:
    ../src/remote/remote_daemon_config.c:111:30: error:
        ‘REMOTE_AUTH_POLKIT’ undeclared (first use in this function)
      111 |         data->auth_unix_rw = REMOTE_AUTH_POLKIT;
          |                              ^~~~~~~~~~~~~~~~~~
    ../src/remote/remote_daemon_config.c:111:30: note: each undeclared
        identifier is reported only once for each function it appears in
    ../src/remote/remote_daemon_config.c:115:30: error:
        ‘REMOTE_AUTH_NONE’ undeclared (first use in this function)
      115 |         data->auth_unix_rw = REMOTE_AUTH_NONE;
          |                              ^~~~~~~~~~~~~~~~
    ../src/remote/remote_daemon_config.c: In function
        ‘daemonConfigLoadOptions’:
    ../src/remote/remote_daemon_config.c:252:31: error:
        ‘REMOTE_AUTH_POLKIT’ undeclared (first use in this function)
      252 |     if (data->auth_unix_rw == REMOTE_AUTH_POLKIT) {
          |                               ^~~~~~~~~~~~~~~~~~

or

    FAILED: src/virtqemud.p/remote_remote_daemon_dispatch.c.o
    In file included from ../src/remote/remote_daemon.h:28,
                     from ../src/remote/remote_daemon_dispatch.c:26:
    src/remote/lxc_protocol.h:13:5: error:
        unknown type name ‘remote_nonnull_domain’
       13 |     remote_nonnull_domain dom;
          |     ^~~~~~~~~~~~~~~~~~~~~
    In file included from ../src/remote/remote_daemon.h:29,
                     from ../src/remote/remote_daemon_dispatch.c:26:
    src/remote/qemu_protocol.h:13:5: error:
        unknown type name ‘remote_nonnull_domain’
       13 |     remote_nonnull_domain dom;
          |     ^~~~~~~~~~~~~~~~~~~~~
    src/remote/qemu_protocol.h:14:5: error:
        unknown type name ‘remote_nonnull_string’
       14 |     remote_nonnull_string cmd;
          |     ^~~~~~~~~~~~~~~~~~~~~
    ...

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-01-25 10:36:05 +01:00
Jonathon Jongsma
95c843eae3 qemu: Fix bug in nbdkit-backed backing chains
When trying to start nbdkit-backed disks in backing chains, we were
accidentally always checking the private data of the top of the chain
instead of using the loop variable.

Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
2024-01-24 07:45:34 -06:00
Alexandra Diupina
ab9da5c4ab conf: make virNetDevVPortProfileFormat() void
Since commit 4af3cbafdd the function always returns 0, so it is
possible to make this function void and remove return value checks.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Signed-off-by: Alexandra Diupina <adiupina@astralinux.ru>
2024-01-24 10:06:09 +01:00
Michal Privoznik
91f9a9fb4f domain_validate: Check for domain address conflicts fully
Current implementation of virDomainMemoryDefCheckConflict() does
only a one way comparison, i.e. if there's a memory device within
def->mems[] which address falls in [mem->address, mem->address +
mem->size] range (mem is basically an iterator within
def->mems[]). And for static XML this works just fine. Problem is
with hot/cold plugging of a memory device. Then mem points to
freshly parsed memory device and these half checks are
insufficient. Not only we must check whether an existing memory
device doesn't clash with freshly parsed memory device, but also
whether freshly parsed memory device does not fall into range of
already existing memory device.

Resolves: https://issues.redhat.com/browse/RHEL-4452
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2024-01-23 17:32:10 +01:00
Biswapriyo Nath
642af05e3e meson: drop explicit python interpreter
meson wraps python scripts already on win32, so we end up with these
failing commands:

[185/868] Generating src/rpc/virnetprotocol.h with a custom command
FAILED: src/rpc/virnetprotocol.h
"sh" "libvirt/scripts/meson-python.sh" "F:/msys64/ucrt64/bin/python3.EXE" "F:/msys64/ucrt64/bin/python.exe" "libvirt/scripts/rpcgen/main.py" "--mode=header" "../src/rpc/virnetprotocol.x" "src/rpc/virnetprotocol.h"
SyntaxError: Non-UTF-8 code starting with '\x90' in file F:/msys64/ucrt64/bin/python.exe on line 1, but no encoding declared; see https://peps.python.org/pep-0263/ for details

The issue was introduced in a62486b95f commit.
These changes are similar as e06beacec2 commit.

Signed-off-by: Biswapriyo Nath <nathbappai@gmail.com>
2024-01-22 09:14:29 +00:00
Andrea Bolognani
5df470f47d qemu: Improve qemuDomainSupportsPCIMultibus()
Rewrite the function so that it's more compact and easier to
extend as new architectures, which will likely come with
multibus support right out the gate, are introduced.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-17 19:28:50 +01:00
Andrea Bolognani
176c3b105e qemu: Move qemuDomainSupportsPCIMultibus()
It belongs next to qemuDomainSupportsPCI().

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-17 19:28:41 +01:00
Andrea Bolognani
11a861e9e9 qemu: Improve qemuDomainSupportsPCI()
The way the function is currently written sort of obscures this
fact, but ultimately we already unconditionally assume PCI
support on most architectures.

Arm and RISC-V need some additional checks to maintain
compatibility with existing configurations but for all future
architectures, such as the upcoming LoongArch64, we expect PCI
support to come out of the box.

Last but not least, the functions is made const-correct.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-17 19:28:29 +01:00
Andrea Bolognani
e622233eda qemu: Retire QEMU_CAPS_OBJECT_GPEX
It's no longer used anywhere.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-17 19:28:19 +01:00
Andrea Bolognani
ac48405fa7 qemu: Stop checking QEMU_CAPS_OBJECT_GPEX
For all versions of QEMU that we support, the virt machine type
has a hard dependency on this device, so we can stop checking
whether the capability is present and just use it unconditionally.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-17 19:28:07 +01:00
Andrea Bolognani
f39f15313a qemu: Fix handling of user aliases for default PHB
The bus name for the default PHB is always "pci.0".

Fixes: 937f319536
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-17 19:18:20 +01:00
Peter Krempa
55839c154d virDomainDefAddConsoleCompat: Fix numbering of console targets after modification
The XML parser for consoles sets the 'port=' attribute of '<target' to
be always the index of the console.

Thus when the "really crazy backcompat stuff for consoles" function
modifies the order of consoles by inserting the default one for a serial
port it must re-number the ports to ensure that the value will not
change on subsequent parse.

This luckily didn't cause any visible changes to the VM as the port
number isn't used for anything.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-17 17:31:12 +01:00
Peter Krempa
f51c6b5b02 qemuDomainAssignPCIAddresses: Assign extension addresses when auto-assigning PCI address
Assigning a PCI address needs to also assign any extension addresses
right away. Otherwise they'd be assigned only after subsequent
format->parse cycle and thus be potentially missing on first run after
defining the VM and thus could change.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-17 17:31:12 +01:00
Peter Krempa
7dd3d77940 qemu: Move 'shmem' device size validation to qemu_validate
The 'size' of a 'shmem' device is parsed and formatted as a "scaled"
value, stored in bytes, but the formatting scale is mebibytes. This
precission loss combined with the fact that the value was validated only
when starting and the size is formatted only when non-zero meant that
on first parse a value < 1 MiB would be accepted, but would be formatted
to the XML as 0 MiB as it was non-zero but truncated and a subsequent
parse would parse of such XML would parse it as 0 bytes, which in turn
would be interpreted as 'default' size.

Fix the issue by moving the validator, which ensures that the number is
a power of two and more than 1 MiB to the validator code so that it'll
be rejected at XML parsing time.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-17 17:31:12 +01:00
Peter Krempa
deb3c834e5 virDomainAssignControllerIndexes: Ensure controller ordering after assigning indexes
Similarly to auto-adding of controllers, the assignment of indexes can
cause them to be considered in different ordering according to the logic
in 'virDomainControllerInsert' than they currently are.

To prevent changes in commandline between first run after defining a VM
xml and any subsequent run or restart of the daemon, we need to reorder
them when assigning the index.

The simplest method is to assign indexes and then create a new list of
controllers and re-instert them.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-17 17:31:12 +01:00
Peter Krempa
4bc82cd7eb conf: domain: Insert auto-added controllers in same order as in XML parser
'virDomainDefAddController' which is used in code-paths which auto-add
controllers to the definition such as 'virDomainDefMaybeAddController',
'virDomainDefAddUSBController', 'qemuDomainDefAddDefaultDevices' was
adding the controller at the end of the list. However that is not how
the XML parser would order the controller in the list as it uses
virDomainControllerInsert grouping them by type and additional
properties.

This would cause that auto-added controllers would re-order:
 - between first and any subsequent run of the VM (even on commandline)
 - after a libvirtd/virtqemud restart
 - after any update of the definition based on the 'define' operation
   (e.g. virsh edit)

To ensure that the ordering of controllers is identical always use
virDomainControllerInsert.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-17 17:31:12 +01:00
Peter Krempa
5fb20c9902 virDomainDefMaybeAddVirtioSerialController: Reformat hard to read linebreaks
Format the code the usual way despite having more than 80 columns so
that it's easier to read.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-17 17:31:12 +01:00
Peter Krempa
5969ad097d tests: Rename 'qemuxml2argvtest' to 'qemuxmlconftest'
Since this tests inactive/config XML files rename it accordingly.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-17 17:31:12 +01:00
Peter Krempa
eb994dee5b qemuxml2xmltest: Remove pointless inactive->active testing
'virDomainDefFormatInternalSetRootName' which is the top level XML
formatter function has the following condition as the very first thing:

     if (def->id == -1)
         flags |= VIR_DOMAIN_DEF_FORMAT_INACTIVE;

This makes it pointless to separately do inactive->active and
inactive->inactive XML -> XML testing as both will be in the end treated
as inactive->inactive.

This patch adds a warning to virDomainDefFormatInternalSetRootName and
removes the second pointless invocation of the test from
qemuxml2xmtest.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-17 17:31:12 +01:00
Andrea Bolognani
763381df53 qemu: Make monitor aware of CPU clusters
This makes it so libvirt can obtain accurate information about
guest CPUs from QEMU, and should make it possible to correctly
perform operations such as CPU hotplug.

Of course this is mostly moot at the moment: only aarch64 can use
CPU clusters, and CPU hotplug is not yet implemented on that
architecture.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-15 14:56:36 +01:00
Andrea Bolognani
655459420a qemu: Use CPU clusters for guests
https://issues.redhat.com/browse/RHEL-7043

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-15 14:56:35 +01:00
Andrea Bolognani
beb27dc61e qemu: Introduce QEMU_CAPS_SMP_CLUSTERS
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-15 14:56:35 +01:00
Andrea Bolognani
ef5c397584 conf: Allow specifying CPU clusters
The default number of CPU clusters is 1, and values other than
that one are currently rejected by all hypervisor drivers.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-15 14:56:35 +01:00
Andrea Bolognani
5fc56aefb6 conf: Report CPU clusters in capabilities XML
For machines that don't expose useful information through sysfs,
the dummy ID 0 is used.

https://issues.redhat.com/browse/RHEL-7043

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-15 14:56:35 +01:00
Shaleen Bathla
9ef6fee129 conf: domain_conf: cleanup def in case of errors
Just like in rest of the function virDomainFSDefParseXML,
use goto error so that def will be cleaned up in error cases.

Signed-off-by: Shaleen Bathla <shaleen.bathla@oracle.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
2024-01-11 16:53:31 -06:00
Sergio Durigan Junior
6fa82fd8e2 apparmor: Allow access to /sys/devices/system/node/*/cpumap for libnuma
A QEMU change (10218ae6d006f76410804cc4dc690085b3d008b5) introduced
some libnuma calls that require read access to
/sys/devices/system/node/*/cpumap, which currently is forbidden by the
standard apparmor profile.

This commit allows read-only access to the file specified above.

Closes: https://gitlab.com/libvirt/libvirt/-/issues/515

Signed-off-by: Sergio Durigan Junior <sergio.durigan@canonical.com>
Reviewed-by: Jim Fehlig <jfehlig@suse.com>
2024-01-11 15:15:23 -07:00
Michal Privoznik
9cbda34e98 qemu: Be less aggressive when dropping channel source paths
In v9.7.0-rc1~130 I've shortened the path that's generated for
<channel/> source. With that, I had to adjust regex that matches
all versions of paths we have ever generated so that we can drop
them (see comment around qemuDomainChrDefDropDefaultPath()). But
as it is usually the case with regexes - they are write only. And
while I attempted to make one portion of the path optional
("/target/") I accidentally made regex accept more, which
resulted in libvirt dropping the user provided path and
generating our own instead.

Fixes: d3759d3674
Resolves: https://issues.redhat.com/browse/RHEL-20807
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2024-01-10 08:44:32 +01:00
Daniel P. Berrangé
7cb03e6a28 rpc: fix race in waking up client event loop
The first thread to issue a client RPC request will own the event
loop execution, sitting in the virNetClientIOEventLoop function.

It releases the client lock while running:

   virNetClientUnlock()
   g_main_loop_run()
   virNetClientLock()

If a second thread arrives with an RPC request, it will queue it
for the first thread to process. To inform the first thread that
there's a new request it calls g_main_loop_quit() to break it out
of the main loop.

This works if the first thread is in g_main_loop_run() at that
time. There is a small window of opportunity, however, where
the first thread has released the client lock, but not yet got
into g_main_loop_run(). If that happens, the wakeup from the
second thread is lost.

This patch deals with that by changing the way the wakeup is
performed. Instead of directly calling g_main_loop_quit(), the
second thread creates an idle source to run the quit function
from within the first thread. This guarantees that the first
thread will see the wakeup.

Tested by: Fima Shevrin <efim.shevrin@virtuozzo.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2024-01-09 11:58:41 +00:00
Daniel P. Berrangé
024d6dc263 qemu: tighten semantics of 'size' when resizing block devices
When VIR_DOMAIN_BLOCK_RESIZE_CAPACITY is set, the 'size' parameter
is currently ignored. Since applications must none the less pass a
value for this parameter, it is preferrable to declare some explicit
semantics for it.

This declare that the parameter must be 0, or the exact size of the
underlying block device. The latter gives the management application
the ability to sanity check that the block device size matches what
they think it should be.

Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2024-01-09 11:57:13 +00:00
Jiri Denemark
8d693d79c4 qemu: Enable postcopy-preempt migration capability
During post-copy migration (once it actually switches to post-copy mode)
dirty memory pages are continued to be migrated iteratively, while the
destination can explicitly request a specific page to be migrated before
the iterative process gets to it (which happens when a guest wants to
read a page that was not migrated yet). Without the postcopy-preempt
capability enabled such pages need to wait until all other pages already
queued are transferred. Enabling this capability will instruct the
hypervisor to create a separate migration channel for explicitly
requested pages so that they can preempt the queue.

The only requirement for the feature to work is running a migration over
a protocol that supports multiple connections. In other words, we can't
pre-create the connection and pass its file descriptor to QEMU (i.e.,
using MIGRATION_DEST_CONNECT_SOCKET), but we have to let QEMU open the
connections itself (using MIGRATION_DEST_SOCKET). This change is applied
to all post-copy migrations even if postcopy-preempt is not supported to
avoid making the code even uglier than it is now. There's no real
difference between the two methods with modern QEMU (which can properly
report connection failures) anyway.

This capability is enabled for all post-copy migration as long as the
capability is supported on both sides of migration.

https://issues.redhat.com/browse/RHEL-7100

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-08 22:41:23 +01:00
Jiri Denemark
61e34b0856 qemu: Add support for optional migration capabilities
We enable various migration capabilities according to the flags passed
to a migration API. Missing support for such capabilities results in an
error because they are required by the corresponding flag. This patch
adds support for additional optional capability we may want to enable
for a given API flag in case it is supported. This is useful for
capabilities which are not critical for the flags to be supported, but
they can make things work better in some way.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-08 22:39:56 +01:00
Jiri Denemark
efc26a665d qemu: Rename remoteCaps parameter in qemuMigrationParamsCheck
The migration cookie contains two bitmaps of migration capabilities:
supported and automatic. qemuMigrationParamsCheck expects the letter so
lets make it more obvious by renaming the parameter as remoteAuto.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-08 22:38:53 +01:00
Jiri Denemark
c941106f7c qemu: Use C99 initializers for qemuMigrationParamsFlagMap
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-08 22:38:36 +01:00
Jiri Denemark
ff128d3761 qemu: Document qemuMigrationParamsFlagMapItem fields
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-08 22:38:24 +01:00
Peter Krempa
397218c433 qemu: Implement support for configuring iothread to virtqueue mapping for disks
Add validation and formatting of the commandline arguments for
'iothread-vq-mapping' parameter. The validation logic mirrors what qemu
allows.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-08 09:27:31 +01:00
Peter Krempa
0cb7b1b2c3 conf: Add possibility to configure multiple iothreads per disk
Introduce a new <iothreads> sub-element of disk's <driver> which will
allow configuring multiple iothreads and also map them to specific
virt-queues of virtio devices.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-08 09:27:31 +01:00
Peter Krempa
ee7121ab8e qemu: capabilities: Introduce QEMU_CAPS_VIRTIO_BLK_IOTHREAD_MAPPING
The capability represents the support for mapping virtqueues to
iothreads for the 'virtio-blk' device.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-08 09:27:31 +01:00
Peter Krempa
08a7fc834c util: xml: Return GPtrArray from virXMLNodeGetSubelement
Rework the helper to use a GPtrArray structure to simplify callers.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-08 09:27:31 +01:00
Laine Stump
82e2fac297 qemu: automatically bind to a vfio variant driver, if available
Rather than always binding to the vfio-pci driver, use the new
function virPCIDeviceFindBestVFIOVariant() to see if the running
kernel has a VFIO variant driver available that is a better match for
the device, and if one is found, use that instead.

virPCIDeviceFindBestVFIOVariant() function reads the modalias file for
the given device from sysfs, then looks through
/lib/modules/${kernel_release}/modules.alias for the vfio_pci alias
that matches with the least number of wildcard ('*') fields.

The appropriate "VFIO variant" driver for a device will be the PCI
driver implemented by the discovered module - these drivers are
compatible with (and provide the entire API of) the standard vfio-pci
driver, but have additional device-specific APIs that can be useful
for, e.g., saving/restoring state for migration.

If a specific driver is named (using <driver model='blah'/> in the
device XML), that will still be used rather than searching
modules.alias; this makes it possible to force binding of vfio-pci if
there is an issue with the auto-selected variant driver.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-08 01:00:11 -05:00
Laine Stump
8b93d78c83 conf: support manually specifying VFIO variant driver in <hostdev> XML
This patch makes it possible to manually specify which VFIO variant
driver to use for PCI hostdev device assignment, so that, e.g. you
could force use of a VFIO "variant" driver, with e.g.

  <driver model='mlx5_vfio_pci'/>

or alternately to force use of the generic vfio-pci driver with

  <driver model='vfio-pci'/>

when libvirt would have normally (after applying a subsequent patch)
found a "better match" for a device in the active kernel's
modules.alias file. (The main potential use of this manual override
would probably be to work around a bug in a new VFIO variant driver by
temporarily not using that driver).

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-08 01:00:08 -05:00
Laine Stump
9363c1cb69 xen: explicitly set hostdev driver.name at runtime, not in postparse
Xen only supports a single type of PCI hostdev assignment, so it is
superfluous to have <driver name='xen'/> peppered throughout the
config. It *is* necessary to have the driver type explicitly set in
the hostdev object before calling into the hypervisor-agnostic "hostdev
manager" though (otherwise the hostdev manager doesn't know whether it
should do Xen-specific setup, or VFIO-specific setup).

Historically, the Xen driver has checked for "default" driver name
(i.e. not set in the XML), and set it to "xen', during the XML
postparse, thus guaranteeing that it will be set by the time the
object is sent to the hostdev manager at runtime, but also setting it
so early that a simple round-trip of parse-format results in the XML
always containing an explicit <driver name='xen'/>, even if that
wasn't specified in the original XML.

The QEMU driver *doesn't* set driver.name during postparse though;
instead, it waits until domain startup time (or device attach time for
hotplug), and sets the driver.name then. The result is that a
parse-format round trip of the XML in the QEMU driver *doesn't* add in
the <driver name='vfio'/>.

This patch modifies the Xen driver to behave similarly to the QEMU
driver - the PostParse just checks for a driver.name that isn't
supported by the Xen driver, and any explicit setting to "xen" is
deferred until domain runtime rather than during the postparse, thus
Xen domain XML also doesn't get extraneous <driver name='xen'/>.

This delayed setting of driver.name of course results in slightly
different xml2xml parse-format results, so the unit test data is
modified accordingly.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-07 23:59:00 -05:00
Laine Stump
b9a1e7c436 conf: replace virHostdevIsVFIODevice with virHostdevIsPCIDevice
virHostdevIsVFIODevice() and virDomainDefHasVFIOHostdev() are only ever
called from the QEMU driver, and in the case of the QEMU driver, any
PCI hostdev by definition uses VFIO, so really all these callers only
need to know if the device is a PCI hostdev.

(It turned out that the less specific virHostdevIsPCIDevice() already
existed in hypervisor/virhostdev.c, so I had to remove one of them;
since conf is a lower level directory than hypervisor, and the
function is called from conf, keeping the copy in hypervisor would
have required moving its caller (virDomainDefHasPCIHostdev()) into
hypervisor as well, so I just removed the copy in hypervisor.)

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-07 23:58:44 -05:00
Laine Stump
bb1acb9ca2 conf: use new common parser/formatter for hostdev driver in network XML
Now if a new attribute is added to <driver>, we only need to update
the formatting/parsing in one place.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-07 23:57:09 -05:00
Laine Stump
195522ae87 conf: split out hostdev <driver> parse/format to their own functions
This is done so that we can re-use the same parser/formatter for
<network> and <networkport>

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-07 23:57:09 -05:00