Commit Graph

35846 Commits

Author SHA1 Message Date
Ján Tomko
8d3b239737 qemu: virtiofs: cache: use 'never' instead of 'none'
The new option style renamed one of the cache modes.

https://issues.redhat.com/browse/RHEL-50329

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-25 13:41:46 +02:00
Boris Fiuczynski
e62c26a20d qemu: add a monitor to /proc/$pid when killing times out
In cases when a QEMU process takes longer than the time sigterm and
sigkill are issued to kill the process do not simply fail and leave the
VM in state VIR_DOMAIN_SHUTDOWN until the daemon stops. Instead set up
an fd on /proc/$pid and get notified when the QEMU process finally has
terminated to cleanup the VM state.

Resolves: https://issues.redhat.com/browse/RHEL-28819
Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-24 13:16:02 +02:00
Kristina Hanicova
e5eb64e9fd qemu_hotplug: Do not allow absent values in rom settings
If there are absent values in an already existing element
specifying rom settings, we simply use the old ones. This
behaviour is not desired, as users might think that deleting the
element from XML would delete the setting (because the hotplug
succeeds) - which does not happen. Because of that, we should not
accept an interface without elements that cannot be changed.

Therefore, we should not allow absent values for already existing
rom setting during hotplug.

Resolves: https://issues.redhat.com/browse/RHEL-7109
Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-24 13:07:20 +02:00
Adam Julis
b53e9f834b virtiofs: rename member to 'openfiles' for clarity
New element 'openfiles' had confusing name. Since the patch with
this new element wasn't propagate yet, old name ('rlimit_nofile')
was changed.

...
<binary>
  <openfiles max='122333'/>
</binary>
...

Signed-off-by: Adam Julis <ajulis@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-24 12:48:16 +02:00
Miroslav Los
c019350a76 security: AppArmor allow write when os loader readonly=no
Since libvirt commit 3ef9b51b10,
the pflash storage for the os loader file follows its read-only flag,
and qemu tries to open the file for writing if set so.

This patches virt-aa-helper to generate the VM's AppArmor rules
that allow this, using the same domain definition flag and default.

Signed-off-by: Miroslav Los <mirlos@cisco.com>
Tested-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2024-07-19 16:03:05 +02:00
Andrea Bolognani
47d34ffb26 qemu: ROM firmware images are always readonly
By definition. Accordingly, filter them out when looking for
a read/write image.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-07-19 15:18:39 +02:00
Andrea Bolognani
f13b3f8098 qemu: Filter firmware images by type
If the configuration explicitly requests a specific type of
firmware image, be it pflash or ROM, we should ignore all images
that are not of that type.

If no specific type has been requested, of course, any type is
considered a match and the selection will be based upon the
other attributes.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-07-19 15:18:38 +02:00
Adam Julis
ea6c3ea2d5 qemu: virtiofs: format --rlimit-nofile
Resolves: https://gitlab.com/libvirt/libvirt/-/issues/485
Signed-off-by: Adam Julis <ajulis@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-17 13:18:11 +02:00
Adam Julis
562fc02ac1 conf: virtiofs: add rlimit_nofile element
Add an element to configure the rlimit nofile size:

...
<binary>
  <rlimit_nofile size='122333'/>
</binary>
...

Non-positive values are forbidden in 'domaincommon.rng'. Added separate
test file, created by modifying the 'vhost-user-fs-fd-memory.xml'.

Signed-off-by: Adam Julis <ajulis@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-17 13:17:13 +02:00
Martin Kletzander
239669049d vmx: Be even more lax when trying to comprehend serial ports
So much can happen in the fileName field of the VMX that the easiest
thing is to silently report a serial type="null".

This effectively reverts commits de81bdb8d4cd and 62c53db0421a, but
keeps the test files to show the fix is still in place.

There is one instance where an error gets reset, but since that is a
rare case on its own and on top of that does not happen in any of our
long-running daemons with a logfile that might get monitored it should
be fine to leave it there.

Resolves: https://issues.redhat.com/browse/RHEL-32182

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-16 12:41:37 +02:00
Andrea Bolognani
1ac1e4dae0 cpu_map: Add pauth Arm CPU feature
This CPU feature can be used to explicitly enable or disable
support for pointer authentication. By default, it will be
enabled if the host supports it.

https://issues.redhat.com/browse/RHEL-7044

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-15 13:08:11 +02:00
Jiri Denemark
bec903cae8 qemu: Don't leave beingDestroyed=true on inactive domain
Recent commit v10.4.0-87-gd9935a5c4f made a reasonable change to only
reset beingDestroyed back to false when vm->def->id is reset to make
sure other code can detect a domain is (about to become) inactive. It
even added a comment saying any caller of qemuProcessBeginStopJob is
supposed to call qemuProcessStop to clear beingDestroyed. But not every
caller really does so because they first call qemuProcessBeginStopJob
and then check whether a domain is still running. If not the
qemuProcessStop call is skipped leaving beingDestroyed=true. In case of
a persistent domain this may block incoming migrations of such domain as
the migration code would think the domain died unexpectedly (even though
it's still running).

The qemuProcessBeginStopJob function is a wrapper around
virDomainObjBeginJob, but virDomainObjEndJob was used directly for
cleanup. This patch introduces a new qemuProcessEndStopJob wrapper
around virDomainObjEndJob to properly undo everything
qemuProcessBeginStopJob did.

https://issues.redhat.com/browse/RHEL-43309

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-07-12 11:27:03 +02:00
Ján Tomko
d94b31a68a qemu: migration: allow migration for virtiofs
Allow migration if the "migrate-precopy" capability is present or
libvirt is not the one running the virtiofs daemon.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-10 12:32:23 +02:00
Ján Tomko
8dc04cafec qemu: do not use deprecated options for new virtiofsd
Use the to-be-introduced virtiofsd capability to mark whether
new options are safe to use.

Depends on:
https://gitlab.com/virtio-fs/virtiofsd/-/merge_requests/231

https://issues.redhat.com/browse/RHEL-7108

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-10 12:32:23 +02:00
Ján Tomko
730eaafaac qemu: fill capabilities for virtiofsd
Run the daemon with --print-capabilities first, to see what it supports.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-10 12:32:23 +02:00
Kshitij Jha
6d3955acf1 Include support for Vfio stats during Migration
As of now, libvirt supports few essential stats as
part of virDomainGetJobStats for Live Migration such
as memory transferred, dirty rate, number of iteration
etc. Currently it does not have support for the vfio
stats returned via QEMU. This patch adds support for that.

Signed-off-by: Kshitij Jha <kshitij.jha@nutanix.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-10 12:28:55 +02:00
Adam Julis
7a9e9dfb18 network: allow "modify" option for DNS-Txt records
The "modify" command allows to replace an existing record (its
text value). The primary key is the name of the record. If
duplicity or missing record detected, throw error.

Tests in networkxml2xmlupdatetest.c contain replacements of an
existing DNS-text record and failure due to non-existing record.

Resolves: https://gitlab.com/libvirt/libvirt/-/issues/639
Signed-off-by: Adam Julis <ajulis@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-10 10:29:00 +02:00
Adam Julis
cf934c87cc network: allow "modify" option for DNS-Srv records
The "modify" command allows to replace an existing Srv record
(some of its elements respectively: port, priority and weight).
The primary key used to choose the modify record is the remaining
parameters, only one of them is required. Not using some of these
parameters may cause duplicate records and error message. This
logic is there because of the previous implementation (Add and
Delete options) in the function.

Tests in networkxml2xmlupdatetest.c contain replacements of an
existing DNS-Srv record and failure due to non-existing record.

Resolves: https://gitlab.com/libvirt/libvirt/-/issues/639
Signed-off-by: Adam Julis <ajulis@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-10 10:28:58 +02:00
Adam Julis
09a5d8165c network: allow "modify" option for DNS hostname
The "modify" command allows you to replace an existing record
(its hostname, sub-elements). IP address acts as the primary key.
If it is not found, the attempt ends with an error message. If
the XML contains a duplicate address, it will select the last
one.

Tests in networkxml2xmlupdatetest.c contain replacements of an
existing DNS-Host record and failure due to non-existing record.

Resolves: https://gitlab.com/libvirt/libvirt/-/issues/639
Signed-off-by: Adam Julis <ajulis@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-10 10:28:51 +02:00
Adam Julis
619a915862 domain_conf: comment not match the code below
The outdated comment refers to a non-existent member in the
virDomainObj structure.

Signed-off-by: Adam Julis <ajulis@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-10 10:00:29 +02:00
Michal Privoznik
b5c54df901 virt-aa-helper: Drop needless comments
When generating paths for a domain specific AppArmor profile each
path undergoes a validation where it's matched against an array
of well known prefixes (among other things). Now, for
OVMF/AAVMF/... images we have a list and some entries have
comments to which type of image the entry belongs to. For
instance:

  "/usr/share/OVMF/",                  /* for OVMF images */
  "/usr/share/AAVMF/",                 /* for AAVMF images */

But these comments are pretty useless. The path itself already
gives away the image type. Drop them.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jim Fehlig <jfehlig@suse.com>
2024-07-10 09:25:32 +02:00
hongmianquan
0d3e962d47 security_manager: Remove redundant qemuSecurityGetNested() call
This commit removes the redundant call to qemuSecurityGetNested() in
qemuStateInitialize(). In qemuSecurityGetModel(), the first security manager
in the stack is already used by default, so this change helps to
simplify the code.

Signed-off-by: hongmianquan <hongmianquan@bytedance.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-09 13:24:57 +02:00
hongmianquan
790b4d8067 security_manager: Ensure top lock is acquired before nested locks
Fix libvirtd hang since fork() was called while another thread had
security manager locked.

We have the stack security driver, which internally manages other security drivers,
just call them "top" and "nested".

We call virSecurityStackPreFork() to lock the top one, and it also locks
and then unlocks the nested drivers prior to fork. Then in qemuSecurityPostFork(),
it unlocks the top one, but not the nested ones. Thus, if one of the nested
drivers ("dac" or "selinux") is still locked, it will cause a deadlock. If we always
surround nested locks with top lock, it is always secure. Because we have got top lock
before fork child libvirtd.

However, it is not always the case in the current code, We discovered this case:
the nested list obtained through the qemuSecurityGetNested() will be locked directly
for subsequent use, such as in virQEMUDriverCreateCapabilities(), where the nested list
is locked using qemuSecurityGetDOI, but the top one is not locked beforehand.

The problem stack is as follows:

libvirtd thread1          libvirtd thread2          child libvirtd
        |                           |                       |
        |                           |                       |
virsh capabilities      qemuProcessLanuch                   |
        |                           |                       |
        |                       lock top                    |
        |                           |                       |
    lock nested                     |                       |
        |                           |                       |
        |                           fork------------------->|(nested lock held by thread1)
        |                           |                       |
        |                           |                       |
    unlock nested               unlock top              unlock top
                                                            |
                                                            |
                                                qemuSecuritySetSocketLabel
                                                            |
                                                            |
                                                    lock nested (deadlock)

In this commit, we ensure that the top lock is acquired before the nested lock,
so during fork, it's not possible for another task to acquire the nested lock.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1303031

Signed-off-by: hongmianquan <hongmianquan@bytedance.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-09 13:22:26 +02:00
Miroslav Los via Devel
8515a178f8 qemuDomainChangeNet: check virtio options for non-virtio models
In a domain created with an interface with a <driver> subelement,
the device contains a non-NULL virDomainVirtioOptions struct, even
for non-virtio NIC models. The subelement need not be present again
after libvirt restarts, or when the interface is passed to clients.

When clients such as virsh domif-setlink put back the modified
interface XML, the new device's virtio attribute is NULL. This may
fail the equality checks for virtio options in qemuDomainChangeNet,
depending on whether libvird was restarted since define or not.

This patch modifies the check for non-virtio models, to ignore olddev
value of virtio (assumed valid), and to allow either NULL or a struct
with all values ABSENT in the new virtio options.

Signed-off-by: Miroslav Los <mirlos@cisco.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-09 13:20:05 +02:00
Martin Kletzander
db622081e0 vmx: Do not require all ID data for VMWare Distributed Switch
Similarly to commit 2482801608 we can safely ignore connectionId,
portId and portgroupId in both XML and VMX as they are only a blind
pass-through between XML and VMX and an ethernet without such parameters
was spotted in the wild.  On top of that even our documentation says the
whole VMWare Distrubuted Switch configuration is a best-effort.

Resolves: https://issues.redhat.com/browse/RHEL-46099

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-07-08 15:18:22 +02:00
Michal Privoznik
893800be49 virt-aa-helper: Allow RO access to /usr/share/edk2-ovmf
When binary version of edk2 is distributed, the files reside
under /usr/share/edk2-ovmf as can be seen from Gentoo's ebuild
[1]. Allow virt-aa-helper to generate paths under that dir.

1: https://gitweb.gentoo.org/repo/gentoo.git/tree/sys-firmware/edk2-ovmf-bin/edk2-ovmf-bin-202202.ebuild
Resolves: https://bugs.gentoo.org/911786
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2024-07-07 07:24:56 +02:00
Daniel P. Berrangé
e40a533118 qemu: set swtpm log level parameter
This wires up the emulator 'debug' parameter to control the
/usr/bin/swtpm 'level' parameter for logging.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2024-07-05 14:43:15 +01:00
Daniel P. Berrangé
5c77ecd5f3 conf: add support for 'debug' parameter on TPM emulator
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2024-07-05 14:43:15 +01:00
John Levon
9559130693 test_driver: support VIR_DOMAIN_AFFECT_LIVE in testUpdateDeviceFlags()
Pick up some more of the qemu_driver.c code so this function supports
both CONFIG and LIVE updates.

Note that qemuDomainUpdateDeviceFlags() passed vm->def to
virDomainDeviceDefParse() for the VIR_DOMAIN_AFFECT_CONFIG case, which
is technically incorrect; in the test driver code we'll fix this.

Signed-off-by: John Levon <john.levon@nutanix.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-04 15:29:33 +02:00
Rayhan Faizel
1ebb892472 conf: Fix out-of-bounds write during cleanup of virDomainNumaDefNodeDistanceParseXML
mem_nodes[i].ndistances is written outside the loop causing an out-of-bounds
write leading to heap corruption.

While we are at it, the entire cleanup portion can be removed as it can be
handled in virDomainNumaFree. One instance of VIR_FREE is also removed and
replaced with g_autofree.

This patch also adds a testcase which would be picked up by ASAN, if this
portion regresses.

Fixes: 742494eed8
Signed-off-by: Rayhan Faizel <rayhan.faizel@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-04 14:58:15 +02:00
Tim Wiederhake
f67b12ba35 cpu_map: Ignore feature "kvm-asyncpf-vmexit"
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-03 13:36:24 +02:00
Tim Wiederhake
9c46fb8d3d cpu_map: Add missing feature "vmx-nested-exception"
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-03 13:36:20 +02:00
Tim Wiederhake
7e395b4ef0 cpu_map: Add missing feature "rfds-clear"
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-03 13:36:18 +02:00
Tim Wiederhake
3ff2d2d502 cpu_map: Add missing feature "rfds-no"
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-03 13:36:16 +02:00
Tim Wiederhake
aba89e2f98 cpu_map: Add missing feature "succor"
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-03 13:36:13 +02:00
Tim Wiederhake
62dc5d44a7 cpu_map: Add missing feature "overflow-recov"
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-03 13:36:11 +02:00
Tim Wiederhake
bcb4b246a9 cpu_map: Add missing feature "lam"
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-03 13:36:09 +02:00
Tim Wiederhake
4b556699c6 cpu_map: Add missing feature "wrmsrns"
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-03 13:36:06 +02:00
Tim Wiederhake
261fe98dee cpu_map: Add missing feature "lkgs"
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-03 13:36:04 +02:00
Tim Wiederhake
4d981bdb2c cpu_map: Add missing feature "fred"
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-03 13:35:36 +02:00
Adam Julis
c3302ceb1d qemuDomainChangeNet: forbid changing portgroup
Changing the postgroup attribute caused unexpected behavior.
Although it can be implemented, it has a non-trivial solution.
No requirement or use has yet been found for implementing this
feature, so it has been disabled for hot-plug.

Resolves: https://issues.redhat.com/browse/RHEL-7299
Signed-off-by: Adam Julis <ajulis@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-03 09:59:10 +02:00
Rayhan Faizel
70e826ec6a conf: Fix rawio/sgio checks for non-scsi hostdev devices
The current hostdev parsing logic sets rawio or sgio even if the hostdev type
is not 'scsi'. The rawio field in virDomainHostdevSubsysSCSI overlaps with
wwpn field in virDomainHostdevSubsysSCSIVHost, consequently setting a bogus
pointer value such as 0x1 or 0x2 from virDomainHostdevSubsysSCSIVHost's
point of view. This leads to a segmentation fault when it attempts to free
wwpn.

While setting sgio does not appear to crash, it shares the same flawed logic
as setting rawio.

Instead, we ensure these are set only after the hostdev type check succeeds.
This patch also adds two test cases to exercise both scenarios.

Fixes: bdb95b520c
Signed-off-by: Rayhan Faizel <rayhan.faizel@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-03 09:54:43 +02:00
John Levon
738b201aad test_driver: add testUpdateDeviceFlags implementation
Add basic coverage of device update; for now, only support disk updates
until other types are needed or tested.

Signed-off-by: John Levon <john.levon@nutanix.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-02 16:06:19 +02:00
Michal Privoznik
cf7d495324 qemu: Drop _virQEMUDriver::hostFips
The 'hostFips' member of _virQEMUDriver struct is not used
really, due to previous cleanups. Drop it.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-07-02 09:14:24 +02:00
Michal Privoznik
ce48d584cc qemu_capabilities: Retire QEMU_CAPS_VXHS
The support for VXHS device was removed in QEMU commit
v5.1.0-rc1~16^2~10. Since we require QEMU-5.2.0 at least there's
no QEMU that has the device and thus the corresponding capability
can be retired.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-07-02 09:14:23 +02:00
Michal Privoznik
295eb1b3d8 qemu_capabilities: Retire QEMU_CAPS_ENABLE_FIPS
The capability is no longer used. Retire it.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-07-02 09:14:22 +02:00
Michal Privoznik
8cf81de8bf qemu_capabilities: Drop version check for QEMU_CAPS_ENABLE_FIPS and QEMU_CAPS_NETDEV_USER
Now that the minimal required version of QEMU is 5.2.0 the
conditional setting of QEMU_CAPS_ENABLE_FIPS and
QEMU_CAPS_NETDEV_USER is effectively a dead code. Drop it.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-07-02 09:14:20 +02:00
Michal Privoznik
073bf16784 qemu_capabilities: Require QEMU-5.2.0 or newer
According to repology.org and/or distro repos these are the version of QEMU:

     CentOS Stream 9: qemu-kvm-9.0.0
           Debian 11: qemu-5.2.0
           Fedora 39: qemu-8.3.1
  openSUSE Leap 15.3: qemu-5.2.0
              RHEL-8: qemu-6.2.0
        Ubuntu 22.04: qemu-6.2.0

Since the minimal version is 5.2.0 we can bump from 4.2.0 to
5.2.0.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-02 09:14:18 +02:00
Michal Privoznik
8f34fd0c4c qemu_domain: Set 'passt' net backend if 'default' is unsupported
It may happen that QEMU is compiled without SLIRP but with
support for passt. In such case it is acceptable to alter user
provided configuration and switch backend to passt as it offers
all the features as SLIRP.

Resolves: https://issues.redhat.com/browse/RHEL-45518
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-07-01 12:40:06 +02:00
Michal Privoznik
bd6060d1c3 qemu_validate: Use domaincaps to validate supported net backend type
Now that the logic for detecting supported net backend types has
been moved to domain capabilities generation, we can just use it
when validating net backend type. Just like we do for device
models and so on.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-07-01 12:39:10 +02:00
Michal Privoznik
751a327423 conf: Accept 'default' backend type for <interface type='user'/>
After previous commits, domain capabilities XML reports basically
two possible values for backend type: 'default' and 'passt'.
Despite its misleading name, 'default' really means 'use
hypervisor's builtin SLIRP'. Since it's reported in domain
capabilities as a value accepted, make our parser and XML schema
accept it.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-07-01 12:38:21 +02:00
Michal Privoznik
6a0f45a9e0 qemu_capabilities: Fill supported net backend types
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-07-01 12:37:27 +02:00
Michal Privoznik
2d3a42cb7c domain_capabilities: Introduce netdev capabilities
If mgmt apps on top of libvirt want to make a decision on the
backend type for <interface type='user'/> (e.g. whether past is
supported) we currently offer them no way to learn this fact.
Domain capabilities were invented exactly for this reason. Report
supported net backend types there.

Now, because of backwards compatibility, specifying no backend
type (which translates to VIR_DOMAIN_NET_BACKEND_DEFAULT) means
"use hyperviosr's builtin SLIRP". That behaviour can not be
changed. But it may happen that the hypervisor has no support for
SLIRP. So we have to report it.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-07-01 12:36:28 +02:00
Michal Privoznik
73fc20e262 qemu_validate: Validate net backends against QEMU caps
Now that we have a capability for each domain net backend we can
start validating user's selection against QEMU capabilities.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-07-01 12:33:14 +02:00
Michal Privoznik
e28bc15f09 qemu_capabilities: Introduce QEMU_CAPS_NETDEV_USER
Since -netdev user can be disabled during QEMU compilation, we
can't blindly expect it to just be there. We need a capability
that tracks its presence.

For qemu-4.2.0 we are not able to detect the capability so do the
next best thing - assume the capability is there. This is
consistent with our current behaviour where we blindly assume the
capability, anyway.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-07-01 12:32:16 +02:00
Michal Privoznik
e42f9e40b9 libvirt_private.syms: Export virDomainNetBackendType enum handlers
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-07-01 12:28:03 +02:00
Pavel Hrdina
67fdc636bf vircgroup: fix g_variant_new_parsed format string causing abort
The original code was incorrect and never tested because at the time of
implementing it the cgroup file `io.weight` was not available.

Resolves: https://issues.redhat.com/browse/RHEL-45185
Introduced-by: 9c1693eff4
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-06-28 16:51:33 +02:00
Jon Kohler
76e2dae01a qemu: fix switchover-ack regression for old qemu
When enabling switchover-ack on qemu from libvirt, the .party value
was set to both source and target; however, qemuMigrationParamsCheck()
only takes that into account to validate that the remote side of the
migration supports the flag if it is marked optional or auto/always on.

In the case of switchover-ack, when enabled on only the dst and not
the src, the migration will fail if the src qemu does not support
switchover-ack, as the dst qemu will issue a switchover-ack msg:
qemu/migration/savevm.c ->
  loadvm_process_command ->
    migrate_send_rp_switchover_ack(mis) ->
      migrate_send_rp_message(mis, MIG_RP_MSG_SWITCHOVER_ACK, 0, NULL)

Since the src qemu doesn't understand messages with header_type ==
MIG_RP_MSG_SWITCHOVER_ACK, qemu will kill the migration with error:
  qemu-kvm: RP: Received invalid message 0x0007 length 0x0000
  qemu-kvm: Unable to write to socket: Bad file descriptor

Looking at the original commit [1] for optional migration capabilities,
it seems that the spirit of optional handling was to enhance a given
existing capability where possible. Given that switchover-ack
exclusively depends on return-path, adding it as optional to that cap
feels right.

[1] 61e34b0856 ("qemu: Add support for optional migration capabilities")

Fixes: 1cc7737f69 ("qemu: add support for qemu switchover-ack")

Signed-off-by: Jon Kohler <jon@nutanix.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Avihai Horon <avihaih@nvidia.com>
Cc: Jiri Denemark <jdenemar@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: YangHang Liu <yanghliu@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-06-28 08:50:12 +02:00
Michal Privoznik
ea73fcb3e3 remote_daemon_dispatch: Unref sasl session when closing client connection
In ideal world, where clients close connection gracefully their
SASL session is freed in virNetServerClientDispose() as it's
stored in client->sasl. Unfortunately, if client connection is
closed prematurely (e.g. the moment virsh asks for credentials),
the _virNetServerClient member is never set and corresponding
SASL session is never freed. The handler is still stored in
client private data, so free it in remoteClientCloseFunc().

  20,862 (288 direct, 20,574 indirect) bytes in 3 blocks are definitely lost in loss record 1,763 of 1,772
     at 0x50390C4: g_type_create_instance (in /usr/lib64/libgobject-2.0.so.0.7800.6)
     by 0x501BDAF: g_object_new_internal.part.0 (in /usr/lib64/libgobject-2.0.so.0.7800.6)
     by 0x501D43D: g_object_new_with_properties (in /usr/lib64/libgobject-2.0.so.0.7800.6)
     by 0x501E318: g_object_new (in /usr/lib64/libgobject-2.0.so.0.7800.6)
     by 0x49BAA63: virObjectNew (virobject.c:252)
     by 0x49BABC6: virObjectLockableNew (virobject.c:274)
     by 0x4B0526C: virNetSASLSessionNewServer (virnetsaslcontext.c:230)
     by 0x18EEFC: remoteDispatchAuthSaslInit (remote_daemon_dispatch.c:3696)
     by 0x15E128: remoteDispatchAuthSaslInitHelper (remote_daemon_dispatch_stubs.h:74)
     by 0x4B0FA5E: virNetServerProgramDispatchCall (virnetserverprogram.c:423)
     by 0x4B0F591: virNetServerProgramDispatch (virnetserverprogram.c:299)
     by 0x4B18AE3: virNetServerProcessMsg (virnetserver.c:135)

Resolves: https://issues.redhat.com/browse/RHEL-22574
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-27 17:02:51 +02:00
Michal Privoznik
fbe97ee17d qemu_validate: Use domaincaps to validate supported launchSecurity type
Now that the logic for detecting supported launchSecurity types
has been moved to domain capabilities generation, we can just use
it when validating launchSecurity type. Just like we do for
device models and so on.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-25 14:46:08 +02:00
Michal Privoznik
66df7992d8 qemu: Fill launchSecurity in domaincaps
The inspiration for these rules comes from
qemuValidateDomainDef().

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-25 14:46:05 +02:00
Michal Privoznik
d460e17282 domcaps: Report launchSecurity
In order to learn what types of <launchSecurity/> are supported
users can turn to domain capabilities and find <sev/> and
<s390-pv/> elements. While these may expose some additional info
on individual launchSecurity types, we are lacking clean
enumeration (like we do for say device models). And given that
SEV and SEV SNP share the same basis (info found under <sev/> is
applicable to SEV SNP too) we have no other way to report SEV SNP
support.

Therefore, report supported launchSecurity types in domain
capabilities.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-25 14:46:03 +02:00
Michal Privoznik
d00816209e qemu_capabilities: Probe SEV capabilities even for QEMU_CAPS_SEV_SNP_GUEST
While it's very unlikely to have QEMU that supports SEV-SNP but
doesn't support plain SEV, for completeness sake we ought to
query SEV capabilities if QEMU supports either. And similarly to
QEMU_CAPS_SEV_GUEST we need to clear the capability if talking to
QEMU proves SEV is not really supported.

This in turn removes the 'sev-snp-guest' capability from one of
our test cases as Peter's machine he uses to refresh capabilities
is not SEV capable. But that's okay. It's consistent with
'sev-guest' capability.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-25 14:46:00 +02:00
Michal Privoznik
3a6ca064ca libvirt_private.syms: Export virDomainLaunchSecurity enum handlers
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-25 14:45:54 +02:00
Rayhan Faizel
9b0606ef8e qemu_block: Validate number of hosts for iSCSI disk device
An iSCSI device with zero hosts will result in a segmentation fault. This patch
adds a check for the number of hosts, which must be one in the case of iSCSI.

Minimal reproducing XML:

<domain type='qemu'>
    <name>MyGuest</name>
    <uuid>4dea22b3-1d52-d8f3-2516-782e98ab3fa0</uuid>
    <os>
        <type arch='x86_64'>hvm</type>
    </os>
    <memory>4096</memory>
    <devices>
        <disk type='network'>
            <source name='dummy' protocol='iscsi'/>
            <target dev='vda'/>
        </disk>
    </devices>
</domain>

Signed-off-by: Rayhan Faizel <rayhan.faizel@gmail.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-06-25 10:05:49 +02:00
Jon Kohler
1cc7737f69 qemu: add support for qemu switchover-ack
Add plumbing for QEMU's switchover-ack migration capability, which
helps lower the downtime during VFIO migrations. This capability is
enabled by default as long as both the source and destination support
it.

Note: switchover-ack depends on the return path capability, so this may
not be used when VIR_MIGRATE_TUNNELLED flag is set.

Extensive details about the qemu switchover-ack implementation are
available in the qemu series v6 cover letter [1] where the highlight is
the extreme reduction in guest visible downtime. In addition to the
original test results below, I saw a roughly ~20% reduction in downtime
for VFIO VGPU devices at minimum.

  === Test results ===

  The below table shows the downtime of two identical migrations. In the
  first migration swithcover ack is disabled and in the second it is
  enabled. The migrated VM is assigned with a mlx5 VFIO device which has
  300MB of device data to be migrated.

  +----------------------+-----------------------+----------+
  |    Switchover ack    | VFIO device data size | Downtime |
  +----------------------+-----------------------+----------+
  |       Disabled       |         300MB         |  1900ms  |
  |       Enabled        |         300MB         |  420ms   |
  +----------------------+-----------------------+----------+

  Switchover ack gives a roughly 4.5 times improvement in downtime.
  The 1480ms difference is time that is used for resource allocation for
  the VFIO device in the destination. Without switchover ack, this time is
  spent when the source VM is stopped and thus the downtime is much
  higher. With switchover ack, the time is spent when the source VM is
  still running.

[1] https://patchwork.kernel.org/project/qemu-devel/cover/20230621111201.29729-1-avihaih@nvidia.com/

Signed-off-by: Jon Kohler <jon@nutanix.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Avihai Horon <avihaih@nvidia.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: YangHang Liu <yanghliu@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-06-25 09:51:00 +02:00
Jiri Denemark
e622970c87 qemu: Fix migration with disabled vmx-* CPU features
When starting a domain on a host which lacks a vmx-* CPU feature which
is expected to be enabled by the CPU model specified in the domain XML,
libvirt properly marks such feature as disabled in the active domain
XML. But migrating the domain to a similar host which lacks the same
vmx-* feature will fail with libvirt reporting the feature as missing.
This is because of a bug in the hack ensuring backward compatibility
libvirt running on the destination thinks the missing feature is
expected to be enabled.

https://issues.redhat.com/browse/RHEL-40899

Fixes: v10.1.0-85-g5fbfa5ab8a
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-06-25 09:41:16 +02:00
Jonathon Jongsma
af437d2d64 qemu: Don't specify vfio-pci.ramfb when ramfb is false
Commit 7c8e606b64 attempted to fix
the specification of the ramfb property for vfio-pci devices, but it
failed when ramfb is explicitly set to 'off'. This is because only the
'vfio-pci-nohotplug' device supports the 'ramfb' property. Since we use
the base 'vfio-pci' device unless ramfb is enabled, attempting to set
the 'ramfb' parameter to 'off' this will result in an error like the
following:

  error: internal error: QEMU unexpectedly closed the monitor
  (vm='rhel'): 2024-06-06T04:43:22.896795Z qemu-kvm: -device
  {"driver":"vfio-pci","host":"0000:b1:00.4","id":"hostdev0","display":"on
  ","ramfb":false,"bus":"pci.7","addr":"0x0"}: Property 'vfio-pci.ramfb'
  not found.

This also more closely matches what is done for mdev devices.

Resolves: https://issues.redhat.com/browse/RHEL-28808

Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-24 08:55:50 -05:00
Adam Julis
3a9095976e qemuDomainDiskChangeSupported: Fill in missing check
The attribute 'discard_no_unref' of <disk/> is not allowed to be
changed while the virtual machine is running.

Resolves: https://issues.redhat.com/browse/RHEL-37542
Signed-off-by: Adam Julis <ajulis@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-24 11:14:56 +02:00
Laine Stump
43a0881274 network: allow for forward dev to be a transient interface
A user reported that if they set <forward mode='nat|route' dev='blah'>
starting the network would fail if the device 'blah' didn't already
exist.

This is caused by using "iif" and "oif" in nftables rules to check for
the forwarding device - these two commands work by saving the named
interface's ifindex (an unsigned integer) when the rule is added, and
comparing it to the ifindex associated with the packet's path at
runtime. This works great if the interface both 1) exists when the
rule is added, and 2) is never deleted and re-created after the rule
is added (since it would end up with a different ifindex).

When checking for the network's bridge device, it is okay for us to
use "iif" and "oif", because the bridge device is created before the
firewall rules are added, and will continue to exist until just after
the firewall rules are deleted when the network is shutdown.

But since the forward device might be deleted/re-added during the
lifetime of the network's firewall rules, we must instead us "oifname"
and "iifname" - these are much less efficient than "Xif" because they
do a string compare of the interface's name rather than just comparing
two integers (ifindex), but they don't require the interface to exist
when the rule is added, and they can properly cope with the named
interface being deleted and re-added later.

Fixes: a4f38f6ffe
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-21 06:52:57 -04:00
Michal Privoznik
da082e5927 domain_validate: Add missing 'break' in virDomainDefLaunchSecurityValidate()
A few commits ago (v10.4.0-101-gc65eba1f57) I've introduced
virDomainDefLaunchSecurityValidate() and a switch() statement in
it. Some cases are empty but are lacking 'break' statement which
is not valid. Provide missing 'break' statement.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-21 10:37:35 +02:00
Michal Privoznik
58b5219961 qemu_firmware: Pick the right firmware for SEV-SNP guests
The firmware descriptors have 'amd-sev-snp` feature which
describes whether firmware is suitable for SEV-SNP guests.
Provide necessary implementation to detect the feature and pick
the right firmware if guest is SEV-SNP enabled.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-21 09:59:04 +02:00
Michal Privoznik
a1d850b300 qemu: Build cmd line for SEV-SNP
Pretty straightforward as qemu has 'sev-snp-guest' object which
attributes maps pretty much 1:1 to our XML model. Except for
@vcek where QEMU has 'vcek-disabled`, an inverted boolean, while
we model it as virTristateBool. But that's easy to map too.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-21 09:58:10 +02:00
Michal Privoznik
c65eba1f57 conf: Introduce SEV-SNP support
SEV-SNP is an enhancement of SEV/SEV-ES and thus it shares some
fields with it. Nevertheless, on XML level, it's yet another type
of <launchSecurity/>.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-21 09:56:57 +02:00
Michal Privoznik
1abcba9d4d qemu_capabilities: Introduce QEMU_CAPS_SEV_SNP_GUEST
This capability tracks sev-snp-guest object availability.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-21 09:56:18 +02:00
Michal Privoznik
be26d0ebbe qemu: Report snp-policy in virDomainGetLaunchSecurityInfo()
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-21 09:36:04 +02:00
Michal Privoznik
914b986275 qemu_monitor: Allow querying SEV-SNP state in 'query-sev'
In QEMU commit v9.0.0-1155-g59d3740cb4 the return type of
'query-sev' monitor command changed to accommodate SEV-SNP. Even
though we currently support launching plain SNP guests, this will
soon change.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-21 09:35:32 +02:00
Michal Privoznik
7d16c296e3 src: Convert some _virDomainSecDef::sectype checks to switch()
In a few instances there is a plain if() check for
_virDomainSecDef::sectype. While this works perfectly for now,
soon there'll be another type and we can utilize compiler to
identify all the places that need adaptation. Switch those if()
statements to switch().

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-21 09:32:09 +02:00
Michal Privoznik
a44a43361f Drop needless typecast to virDomainLaunchSecurity
The sectype member of _virDomainSecDef struct is already declared
as of virDomainLaunchSecurity type. There's no need to typecast
it to the very same type when passing it to switch().

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-21 09:31:33 +02:00
Michal Privoznik
faa3548ed5 conf: Separate SEV formatting into a function
To avoid convolution of switch() inside of virDomainSecDefFormat() even
more (as new sectypes are added), move formatting into a separate
function.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-21 09:30:24 +02:00
Michal Privoznik
d2cad18ca3 conf: Move some members of virDomainSEVDef into virDomainSEVCommonDef
Some parts of SEV are to be shared with SEV SNP. In order to
reuse XML parsing / formatting code cleanly, let's move those
common bits into a new struct (virDomainSEVCommonDef) and adjust
rest of the code.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-21 09:28:54 +02:00
Michal Privoznik
66efdfabd9 qemu_monitor_json: Report error in error paths in SEV related code
While working on qemuMonitorJSONGetSEVMeasurement() and
qemuMonitorJSONGetSEVInfo() I've noticed that if these functions
fail, they do so without appropriate error set. Fill in error
reporting.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-21 09:25:32 +02:00
Peter Krempa
e6b94cba7e qemu: migration: Preserve error across qemuDomainSetMaxMemLock() on error paths
When a VM terminates itself while it's being migrated in running state
libvirt would report wrong error:

 error: cannot get locked memory limit of process 2502057: No such file or directory

rather than the proper error:

 error: operation failed: domain is not running

Remember the error on error paths in qemuMigrationSrcConfirmPhase and
qemuMigrationSrcPerformPhase.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-20 09:58:52 +02:00
Peter Krempa
e00a58c10a qemuMigrationSrcRun: Re-check whether VM is active before accessing job data
'qemuProcessStop()' clears the 'current' job data. While the code under
the 'error' label in 'qemuMigrationSrcRun()' does check that the VM is
active before accessing the job, it also invokes multiple helper
functions to clean up the migration including
'qemuMigrationSrcNBDCopyCancel()' which calls 'qemuDomainObjWait()'
invalidating the result of the liveness check as it unlocks the VM.

Duplicate the liveness check and explain why. The rest of the code e.g.
accessing the monitor is safe as 'qemuDomainEnterMonitorAsync()'
performs a liveness check. The cleanup path just ignores the return
values of those functions.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-20 09:58:52 +02:00
Peter Krempa
9243e87820 qemu: migration: Inline 'qemuMigrationDstFinishResume()'
The function is a pointless wrapper on top of
qemuMigrationDstWaitForCompletion.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-20 09:58:52 +02:00
Peter Krempa
a52e125d56 qemu: migration: Properly check for live VM after qemuDomainObjWait()
Similarly to the one change in commit 4d1a1fdffd
we should be checking that the VM is not being yet destroyed if we've
invoked qemuDomainObjWait().

Use the new helper qemuDomainObjIsActive().

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-20 09:58:52 +02:00
Peter Krempa
9eb33b7f03 qemu: domain: Introduce qemuDomainObjIsActive helper
The helper checks whether VM is active including the internal qemu
state. This helper will become useful in situations when an async job
is in use as VIR_JOB_DESTROY can run along async jobs thus both checks
are necessary.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-20 09:58:52 +02:00
Peter Krempa
d9935a5c4f qemu: process: Ensure that 'beingDestroyed' gets cleared only after VM id is reset
Prevent the possibility that a VM could be considered as alive while
inside qemuProcessStop.

A recently fixed bug which unlocked the domain object while inside
qemuProcessStop showed that there's possibility to confuse the state of
the VM to be considered active while 'qemuProcessStop' is processing
shutdown of the VM. Ensure that this doesn't happen by clearing the
'beingDestroyed' flag only after the VM id is cleared.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-20 09:58:52 +02:00
Peter Krempa
3865410e7f qemuProcessStop: Move code not depending on 'vm->def->id' after reset of the ID
There are few function calls done while cleaning up a stopped VM which
do require the old VM id, to e.g. clean up paths containing the 'short'
domain name in the path.

Anything else, which doesn't strictly require it can be moved after
clearing the 'id' in order to decrease likelyhood of potential bugs.

This patch moves all the code which does not require the 'id' (except
for the log entry and closing the monitor socket) after the statement
clearing the id and adds a comment explaining that anything in the
section must not unlock the VM object.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-20 09:58:52 +02:00
Peter Krempa
d29e0f3d4a qemuProcessStop: Prevent crash when qemuDomainObjStopWorker() unlocks the VM
'qemuDomainObjStopWorker()' which is meant to dispose of the event loop
thread for the monitor unlocks the VM object while disposing the thread
to prevent possible deadlocks with events waiting on the monitor thread.

Unfortunately 'qemuDomainObjStopWorker()' is called *before* the VM is
marked as inactive by clearing 'vm->def->id', but at the same time it's
no longer marked as 'beingDestroyed' when we're inside
'qemuProcessStop()'.

If 'vm' would be kept locked this wouldn't be a problem. Same way it's
not a problem for anything that uses non-ASYNC VM jobs, or when the
monitor is accessed in an async job, as the 'destroy' job interlocks
with those.

It is a problem for code inside an async job which uses
'qemuDomainObjWait()' though. The API contract of qemuDomainObjWait()
ensures the caller that the VM on successful return from it, but in this
specific reason it's not the case, as both 'beingDestroyed' is already
false, and 'vm->def->id' is not yet cleared.

To fix the issue move the 'qemuDomainObjStopWorker()' call *after*
clearing 'vm->def->id' and also add a note stating what the function is
doing.

Fixes: 860a999802
Closes: https://gitlab.com/libvirt/libvirt/-/issues/640
Reported-by: luzhipeng <luzhipeng@cestc.cn>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-20 09:58:21 +02:00
Peter Krempa
da8d97e4e2 qemuDomainObjWait: Add documentation
Document why this function exists and meaning of return values.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-20 09:52:55 +02:00
Peter Krempa
f9ad21996d qemuDomainDeviceBackendChardevForeach: Fix typo in comment
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-20 09:52:54 +02:00
Peter Krempa
b4423a753b qemuDomainDiskPrivateDispose: Prevent dangling 'disk' pointer in blockjob data
Clear the 'disk' member of 'blockjob' as we're freeing the disk object
at this point. While this should not normally happen it was observed
when other bug allowed the VM to be cleared while other threads didn't
yet finish.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-20 09:52:54 +02:00
Peter Krempa
737f897c29 qemuBlockJobProcessEventConcludedBackup: Handle potentially NULL 'job->disk'
Similarly to other blockjob handlers, if there's no disk associated with
the blockjob the handler needs to behave correctly. This is needed as
the disk might have been de-associated on unplug or other operations.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-20 09:52:54 +02:00
Boris Fiuczynski
09cc83dcf6 nodedev: add ccw device state and remove fencing
Instead of fencing offline ccw devices add the state to the ccw
capability.

Resolves: https://issues.redhat.com/browse/RHEL-39497
Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-20 09:38:46 +02:00
Boris Fiuczynski
69d8a327f1 nodedev: prevent invalid DASD node object creation
Prevent the creation of a new DASD node object when the device does not
exist.

Resolves: https://issues.redhat.com/browse/RHEL-39497
Reviewed-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-20 09:34:54 +02:00
Boris Fiuczynski
e9c23d906f nodedev: improve DASD detection
In newer DASD driver versions the ID_TYPE tag is supported. This tag is
missing after a system reboot but when the ccw device is set offline and
online the tag is included. To fix this version independently we need to
check if devices detected as type disk is actually a DASD to maintain
the node object consistency and not end up with multiple node objects
for DASDs.

Resolves: https://issues.redhat.com/browse/RHEL-39497
Reviewed-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-20 09:34:19 +02:00
Boris Fiuczynski
4062440b4b nodedev: refactor storage type fixup
Refactor the storage type fixup into a reusable method.

Reviewed-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-20 09:33:32 +02:00
Michal Privoznik
43d2edc08f virnetworkobj: Free fwRemoval before setting another one in virNetworkObjSetFwRemoval()
The virNetworkObjSetFwRemoval() function is called at least two
times when there's a network running and network driver
initializes:

1) when loading state XMLs:
  #0  virNetworkObjSetFwRemoval (obj=0x7fffd4028250, fwRemoval=0x7fffd4020ad0) at ../src/conf/virnetworkobj.c:258
  #1  0x00007ffff7a69c68 in virNetworkLoadState (...) at ../src/conf/virnetworkobj.c:952
  #2  0x00007ffff7a6a35d in virNetworkObjLoadAllState (...) at ../src/conf/virnetworkobj.c:1072
  #3  0x00007ffff7f9625f in networkStateInitialize (...) at ../src/network/bridge_driver.c:624

2) when firewall rules are being reloaded:
  #0  virNetworkObjSetFwRemoval (obj=0x7fffd4028250, fwRemoval=0x7fffd402e5b0) at ../src/conf/virnetworkobj.c:258
  #1  0x00007ffff7f997b4 in networkReloadFirewallRulesHelper (obj=0x7fffd4028250, opaque=0x0) at ../src/network/bridge_driver.c:1703
  #2  0x00007ffff7a6b09b in virNetworkObjListForEachHelper (payload=0x7fffd4028250, ...) at ../src/conf/virnetworkobj.c:1414
  #3  0x00007ffff79287b6 in virHashForEachSafe (...) at ../src/util/virhash.c:387
  #4  0x00007ffff7a6b119 in virNetworkObjListForEach (...) at ../src/conf/virnetworkobj.c:1441
  #5  0x00007ffff7f99978 in networkReloadFirewallRules (...) at ../src/network/bridge_driver.c:1742
  #6  0x00007ffff7f962f2 in networkStateInitialize (...) at ../src/network/bridge_driver.c:645

Since virNetworkObjSetFwRemoval() does not free the object stored
in the first call, the second call just overwrites the stored
pointer leading to a memory leak:

  5,530 (48 direct, 5,482 indirect) bytes in 1 blocks are definitely lost in loss record 1,863 of 1,880
     at 0x4848C43: calloc (vg_replace_malloc.c:1595)
     by 0x4F1E979: g_malloc0 (in /usr/lib64/libglib-2.0.so.0.7800.6)
     by 0x4976E32: virFirewallNew (virfirewall.c:118)
     by 0x4979BA9: virFirewallParseXML (virfirewall.c:1071)
     by 0x4ABEB1E: virNetworkLoadState (virnetworkobj.c:938)
     by 0x4ABF35C: virNetworkObjLoadAllState (virnetworkobj.c:1072)
     by 0x4E9A25E: networkStateInitialize (bridge_driver.c:624)
     by 0x4CB1FA6: virStateInitialize (libvirt.c:665)
     by 0x15A6C6: daemonRunStateInit (remote_daemon.c:611)
     by 0x49E69F0: virThreadHelper (virthread.c:256)
     by 0x532B428: start_thread (in /lib64/libc.so.6)
     by 0x5397373: clone (in /lib64/libc.so.6)

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-19 16:31:23 +02:00
Michal Privoznik
be1e745cd2 virfirewall: Fir a memleak in virFirewallParseXML()
As a part of parsing XML, virFirewallParseXML() calls
virXMLNodeContentString() and then passes the return value
further. But virXMLNodeContentString() is documented so that it's
the caller's responsibility to free the returned string, which
virFirewallParseXML() never does. This leads to a memory leak:

  14,300 bytes in 220 blocks are definitely lost in loss record 1,879 of 1,891
     at 0x4841858: malloc (vg_replace_malloc.c:442)
     by 0x5491E3C: xmlBufCreateSize (in /usr/lib64/libxml2.so.2.12.6)
     by 0x54C2401: xmlNodeGetContent (in /usr/lib64/libxml2.so.2.12.6)
     by 0x49F7791: virXMLNodeContentString (virxml.c:354)
     by 0x4979F25: virFirewallParseXML (virfirewall.c:1134)
     by 0x4ABEB1E: virNetworkLoadState (virnetworkobj.c:938)
     by 0x4ABF35C: virNetworkObjLoadAllState (virnetworkobj.c:1072)
     by 0x4E9A25E: networkStateInitialize (bridge_driver.c:624)
     by 0x4CB1FA6: virStateInitialize (libvirt.c:665)
     by 0x15A6C6: daemonRunStateInit (remote_daemon.c:611)
     by 0x49E69F0: virThreadHelper (virthread.c:256)
     by 0x532B428: start_thread (in /lib64/libc.so.6)

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-19 16:31:23 +02:00
Martin Kletzander
025925a901 vmx: Accept more serial variations
Commit 23c4794488 added parsing of serial ports connected to vspc, but
the VM can also have a network serial port with an empty filename or no
filename at all.  Parse these the same way, as a <serial type='null'>.

Resolves: https://issues.redhat.com/browse/RHEL-32182

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-06-19 14:28:38 +02:00
Adam Julis
503a4e6a79 conf: Drop unused virDomainDiskFindByBusAndDst() declaration
Remove unused declaration of the virDomainDiskFindByBusAndDst()
function. Removed in v5.9.0-rc1~91 and then mistakenly
re-introduced in v5.9.0-rc1~65.

Signed-off-by: Adam Julis <ajulis@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-19 13:55:53 +02:00
Swapnil Ingle
c772f1982d Pass shutoff reason to release hook
Sometimes in release hook it is useful to know if the VM shutdown was graceful
or not. This is especially useful to do cleanup based on the VM shutdown failure
reason in release hook. This patch proposes to use the last argument 'extra'
to pass VM shutoff reason in the call to release hook.
Making this change for Qemu and LXC.

Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-19 12:15:26 +02:00
Marc Hartmayer
2b199ad3f1 node_device_udev: remove incorrect G_GNUC_UNUSED
Signed-off-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
2024-06-18 09:00:37 -05:00
Marc Hartmayer
65214fcebd node_device_udev: Pass the udevEventData via parameter and use refcounting
Instead of accessing the global `driver` object pass the `udevEventData` as
parameter to the thread handler and watch callback. This has the advantage that:
1. proper refcounting
2. easier to read and test

Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Signed-off-by: Marc Hartmayer <mhartmay@linux.ibm.com>
2024-06-18 09:00:36 -05:00
Marc Hartmayer
0f8717b1c7 node_device_udev: Add support for g_autoptr to udevEventData
Use this feature in `udevEventDataNew`.

Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Signed-off-by: Marc Hartmayer <mhartmay@linux.ibm.com>
2024-06-18 09:00:34 -05:00
Marc Hartmayer
140cdf7f9a node_device_udev: Make the code easier to read
There is only one case where force is true, therefore let's inline that case.

Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Signed-off-by: Marc Hartmayer <mhartmay@linux.ibm.com>
2024-06-18 09:00:33 -05:00
Marc Hartmayer
b56458d443 node_device_udev: Use a worker pool for processing events and emitting nodedev event
Use a worker pool for processing the events (e.g. udev, mdevctl config changes)
and the initialization instead of a separate initThread and a mdevctl-thread.
This has the large advantage that we can leverage the job API and now this
thread pool is responsible to do all the "costly-work" and emitting the libvirt
nodedev events.

Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Signed-off-by: Marc Hartmayer <mhartmay@linux.ibm.com>
2024-06-18 09:00:32 -05:00
Marc Hartmayer
01ab7047e9 node_device_udev: Pass the driver state as parameter in preparation for the next commit
It's better practice for all functions called by the threads to pass the driver
via parameter and not global variables. Easier to test and cleaner.

Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Signed-off-by: Marc Hartmayer <mhartmay@linux.ibm.com>
2024-06-18 09:00:30 -05:00
Marc Hartmayer
2c3e4a0f6e node_device_udev: nodeStateShutdownPrepare: Disconnect the signals explicitly
The documentation of gobject signals reads:

"If you are connecting handlers to signals and using a GObject instance as your
signal handler user data, you should remember to pair calls to
g_signal_connect() with calls to g_signal_handler_disconnect() or
g_signal_handlers_disconnect_by_func(). While signal handlers are automatically
disconnected when the object emitting the signal is finalised..." [1]

This means that the signal handlers are automatically disconnected as soon as
the `priv->mdevCtlMonitors` are finalised/released by `udevEventDataDispose`.
But this also means that it's possible that new work is tried to be scheduled
for the workerpool by the `mdevctlEventHandleCallback` (main thread context)
even if the workerpool has already been stopped by `nodeStateShutdownWait`. To
fully understand this, it's important to know that the main loop of the main
thread is still running for some time even after `nodeStateShutdownPrepare` has
been called. Let's avoid this situation by explicitly disconnect the signals
during `nodeStateShutdownPrepare`, which is called in the main thread, so that
no new work is attempted to be scheduled for the worker pool.

[1] https://docs.gtk.org/gobject/signals.html#memory-management-of-signal-handlers

Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Signed-off-by: Marc Hartmayer <mhartmay@linux.ibm.com>
2024-06-18 09:00:29 -05:00
Marc Hartmayer
e89d39f5b8 node_device_udev: Introduce and use stateShutdownPrepare and stateShutdownWait
Introduce and use the driver functions for the node state shutdown preparation
and wait. As they're also called in the error/cleanup path of
`nodeStateInitialize`, they must be written in a way, that they can safely be
executed even if not everything is initialized.

In the next commit, these functions will be extended.

Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Signed-off-by: Marc Hartmayer <mhartmay@linux.ibm.com>
2024-06-18 09:00:28 -05:00
Marc Hartmayer
6e727d8bdc node_device_udev: Fix leak of mdevctlLock, udevThreadCond, and mdevCtlMonitors
Even if `priv->udev_monitor` was never initialized, the mdevctlLock, udevThread
were. Therefore let's match the order of releasing the resources the order of
allocating the resources in `nodeStateInitialize`.

In addition, use `g_steal_pointer` in `g_list_free_full`.

Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Signed-off-by: Marc Hartmayer <mhartmay@linux.ibm.com>
2024-06-18 09:00:26 -05:00
Marc Hartmayer
4daa362706 node_device_udev: Move responsibility to release (init|udev)Thread to udevEventDataDispose
Everything is released in `udevEventDataDispose` except for the threads, change
this as this makes the code easier to read as it can be simplified a little.

Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Signed-off-by: Marc Hartmayer <mhartmay@linux.ibm.com>
2024-06-18 09:00:24 -05:00
Marc Hartmayer
d7c8908be8 node_device_udev: Inline udevRemoveOneDevice
Inline `udevRemoveOneDevice` as it's used only once.

Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Signed-off-by: Marc Hartmayer <mhartmay@linux.ibm.com>
2024-06-18 09:00:23 -05:00
Marc Hartmayer
e6b70ae0c3 node_device_udev: Add prefix udev for udev related data
The new names make it easier to understand the purpose of the data.

Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Signed-off-by: Marc Hartmayer <mhartmay@linux.ibm.com>
2024-06-18 09:00:20 -05:00
Marc Hartmayer
f51d729dd0 node_device_udev: Take lock if driver->privateData is modified
Since @driver->privateData is modified take the lock.

Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Signed-off-by: Marc Hartmayer <mhartmay@linux.ibm.com>
2024-06-18 09:00:19 -05:00
Marc Hartmayer
c320c37917 node_device_udev: Don't take mdevctlLock for mdevctl list and add comments about locking
Commit a99d876a0f ("node_device: Use automatic mutex management") replaced the
locking mechanism and accidentally removed the comment with the reason why the
lock is taken. The reason was to "ensure only a single thread can query mdevctl
at a time", but this reason is no longer valid or maybe it never was. Therefore,
let's remove this lock and add a comment to `mdevCtl` what it protects.

Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Signed-off-by: Marc Hartmayer <mhartmay@linux.ibm.com>
2024-06-18 09:00:18 -05:00
Marc Hartmayer
1606d7ec99 node_device_udev: Test for mdevctlTimeout != -1
It is done a little differently everywhere in libvirt, but most common is to
test for != -1.

Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Signed-off-by: Marc Hartmayer <mhartmay@linux.ibm.com>
2024-06-18 09:00:16 -05:00
Marc Hartmayer
b13ddadc51 node_device_udev: Remove the timeout if the data is disposed
Remove the timeout when the udevEventData is disposed, analogous to priv->watch.

Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Signed-off-by: Marc Hartmayer <mhartmay@linux.ibm.com>
2024-06-18 09:00:15 -05:00
Boris Fiuczynski
0f87a53a0a nodedev: reset active config data on udev remove event
When a mdev device is destroyed or stopped the udev remove event
handling needs to reset the active config data of the node object
representing a persisted mdev.

Reviewed-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
2024-06-18 09:00:11 -05:00
Boris Fiuczynski
23df65de2f nodedev: immediate update of active config on udev events
When an udev add, change or remove event occurs the mdev active config data
requires an update via mdevctl as the udev does not contain all config data.
This update needs to occur immediately and to be finished before the libvirt
nodedev event is issued to keep the API usage reliable.

After this change, scheduleMdevctlUpdate call is already called in
`udevAddOneDevice` and can therefore be removed in `udevHandleOneDevice`.

Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Signed-off-by: Marc Hartmayer <mhartmay@linux.ibm.com>
2024-06-18 09:00:05 -05:00
Marc Hartmayer
30354f5b1f node_device_udev: Set @def to NULL
@def is owned by @obj after adding it the node device object list. As soon as
the @obj lock has been released, another thread could free @obj and therefore
@def. If now someone accesses @def this would lead to a heap-use-after-free and
therefore most likely to a segmentation fault, therefore set @def to NULL after
the ownership has moved.

While at it, add comments to other code places why @def is set to NULL.

Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Signed-off-by: Marc Hartmayer <mhartmay@linux.ibm.com>
2024-06-18 08:59:46 -05:00
Boris Fiuczynski
7ccf76ea34 nodedev: fix mdev add udev event data handling
Two situations will trigger an udev add event:
 1) the mdev is created when started (transient) or
 2) the mdev was defined and is started
In case 1 there is no node object existing and no config data is copied.
In case 2 copying the active config data of an existing node object will
only copy invalid data. Instead copying the defined config data will
store valid data into the newly added node object.

Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Marc Hartmayer <mhartmay@linux.ibm.com>
2024-06-18 08:58:10 -05:00
Adam Julis
e145d182a6 qemu: implement iommu coldplug/unplug
Resolves: https://issues.redhat.com/browse/RHEL-23833
Signed-off-by: Adam Julis <ajulis@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-18 12:17:50 +02:00
Adam Julis
8ce138632c syms: Properly export virDomainIOMMUDefFree()
While the function is exported via header, the symbol itself was not.

Signed-off-by: Adam Julis <ajulis@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-18 12:17:50 +02:00
Adam Julis
59f6e226bb qemu_driver: add validation of potential dependencies on cold plug
Although virDomainDeviceDefValidate() is called as a part of
parsing device XML routine, it validates only that single device.
The virDomainDefValidate() function performs a more comprehensive
check. It should detect errors resulting from dependencies
between devices, or a device and some other part of XML config.
Therefore, a call to virDomainDefValidate() is added at the end
of qemuDomainAttachDeviceConfig().

Signed-off-by: Adam Julis <ajulis@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-18 08:46:28 +02:00
Daniel P. Berrangé
3fff8c91b0 network: introduce a "none" firewall backend type
There are two scenarios identified after the recent firewall backend
selection was introduced, which result in libvirtd failing to startup
due to an inability to find either iptables/nftables

 - On Linux if running unprivileged with $PATH lacking the dir
   containing iptables/nftables
 - On non-Linux where iptables/nftables never existed

In the former case, it is preferrable to restore the behaviour whereby
the driver starts successfully. Users will get an error reported when
attempting to start any virtual network, due to the lack of permissions
needed to create bridge devices. This makes the missing firewall backend
irrelevant.

In the latter case, the network driver calls the 'nop' platform
implementation which does not attempt to implement any firewall logic,
just allowing the network to start without firewall rules.

To solve this are number of changes are required

 * Introduce VIR_FIREWALL_BACKEND_NONE, which does nothing except
   report a fatal error from virFirewallApply(). This code path
   is unreachable, since we'll never create a virFirewall
   object with with VIR_FIREWALL_BACKEND_NONE, so the error reporting
   is just a sanity check.

 * Ignore the compile time backend defaults and assume use of
   the 'none' backend if running unprivileged.

   This fixes the first regression, avoiding the failure to start
   libvirtd on Linux in unprivileged context, instead allowing use
   of the driver and expecting a permission denied when creating a
   bridge.

 * Reject the use of compile time backend defaults no non-Linux
   and hardcode the 'none' backend. The non-Linux platforms have
   no firewall implementation at all currently, so there's no
   reason to permit the use of 'firewall_backend_priority'
   meson option.

   This fixes the second regression, avoiding the failure to start
   libvirtd on non-Linux hosts due to non-existant Linux binaries.

 * Change the Linux platform backend to raise an error if the
   firewall backend is 'none'. Again this code path is unreachable
   by default since we'll fail to create the bridge before getting
   here, but if someone modified network.conf to request the 'none'
   backend, this will stop further progress.

 * Change the nop platform backend to raise an error if the
   firewall backend is 'iptables' or 'nftables'. Again this code
   path is unreachable, since we should already have failed to
   find the iptables/nftables binaries on non-Linux hosts, so
   this is just a sanity check.

 * 'none' is not permited as a value in 'firewall_backend_priority'
   meson option, since it is conceptually meaningless to ask for
   that on Linux.

NB, 'firewall_backend_priority' allows repeated options temporarily,
which we don't want. Meson intends to turn this into a hard error

  DEPRECATION: Duplicated values in array option is deprecated. This will become a hard error in the future.

and we can live with the reduced error checking until that happens.

Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-17 15:55:14 +01:00
Michal Privoznik
be58733d90 conf: Drop needless NULL checks guarding virBufferEscapeString()
There's no need to guard virBufferEscapeString() with a call to
NULL as the very first thing the function does is check all three
arguments for NULL.

This patch was generated using the following spatch:

  @@
  expression X, Y, E;
  @@

  - if (E)
      virBufferEscapeString(X, Y, E);

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
2024-06-17 16:13:28 +02:00
Michal Privoznik
0f5ce3afd4 virprocess: Debug affinity map in virProcessSetAffinity()
The aim of virProcessSetAffinity() is to set affinity of given
process to given CPUs. While we currently print the PID into
logs, the CPU map is not printed. It may help when debugging
weird scenarios.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-06-17 12:30:39 +02:00
Michal Privoznik
095f22db21 qemu_process: Issue an info message when subtracting isolcpus
In one of my previous commits I've made us substract isolcpus
from all online CPUs when setting affinity on QEMU threads. See
commit below for more info on that. Nevertheless, this is
something that surely deserves an entry in log. I've chosen INFO
priority for now. We can promote that to a regular WARN if users
complain.

Fixes: da95bcb6b2
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-06-17 12:30:39 +02:00
Daniel P. Berrangé
f2828880b6 meson: allow systemd sysusersdir to be changed
We currently hardcode the systemd sysusersdir, but it is desirable to be
able to choose a different location in some cases. For example, Fedora
flatpak builds change the RPM %_sysusersdir macro, but we can't currently
honour that.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reported-by: Yaakov Selkowitz <yselkowi@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-13 10:23:11 +01:00
Peter Krempa
39bfd6c888 qemu_validate: Validate support for SCSI emulation support in 'virtio-blk' devices
The support will be dropped soon by qemu, and libvirt is not rejecting
such configurations. Add validation of this explicitly requested config.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-06-12 08:21:12 +02:00
Peter Krempa
126f95c1fe qemuValidateDomainDeviceDefDiskFrontend: Refactor validation of <disk type='lun'>
Use a switch statement for checks based on the disk bus.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-06-12 08:21:11 +02:00
Daniel P. Berrangé
9f549eb8a5 rpc: split TLS cert validation into separate file
The TLS cert validation logic will be reused for the new impl of the
virt-pki-validate tool.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-11 12:50:23 +01:00
Daniel P. Berrangé
14f4de4c73 rpc: refactor method for checking session certificates
This will facilitate moving much of the code into a new file in the
subsequent commit.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-11 12:50:23 +01:00
Daniel P. Berrangé
e66c3bcd0c rpc: split out helpers for TLS cert path location
We'll want to access these paths from outside the TLS context code,
so split them into a standalone file.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-11 12:50:23 +01:00
Georgia Garcia
a2455fd53d virt-aa-helper: use 'include if exists' on .files
Change the 'include' in the AppArmor policy to use 'include if exists'
when including <uuid>.files. Note that 'if exists' is only available
after AppArmor 3.0, therefore a #ifdef check must be added.

When the <uuid>.files is not present, there are some failures in the
AppArmor tools like the following, since they expect the file to exist
when using 'include':

ERROR: Include file /etc/apparmor.d/libvirt/libvirt-8534a409-a460-4fab-a2dd-0e1dce4ff273.files not found

Signed-off-by: Georgia Garcia <georgia.garcia@canonical.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-11 09:46:19 +02:00
Daniel P. Berrangé
a7eb7de531 meson: allow systemd unitdir to be changed
We currently hardcode the systemd unitdir, but it is desirable to be
able to choose a different location in some cases. For examples, Fedora
flatpak builds change the RPM %_unitdir macro, but we can't currently
honour that.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-07 14:04:19 +01:00
Andrea Bolognani
971e767805 qemu: Reject TPM 1.2 in most scenarios
Everywhere we use TPM 2.0 as our default, the chances of TPM
1.2 being supported by the guest OS are very slim. Just reject
such configurations outright.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-06-07 11:13:19 +02:00
Andrea Bolognani
220b2690da qemu: Default to TPM 2.0 in most scenarios
TPM 1.2 is a pretty bad default these days, especially for
architectures which were introduced when TPM 2.0 already existed.

We're already carving out exceptions for several scenarios, but
that's basically backwards: at this point, using TPM 1.2 is the
exception.

Restructure the code so that it reflects reality and we don't
have to remember to update it every time a new architecture is
introduced.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-06-07 11:13:16 +02:00
Michal Privoznik
86e511fafb lib: Annotate more function as NULL terminated
While __attribute((sentinel)) (exposed by glib under
G_GNUC_NULL_TERMINATED macro) is a gcc extension, it's supported
by clang too. It's already being used throughout our code but
some functions that take variadic arguments and expect NULL at
the end were lacking such annotation. Fill them in.

After this, there are still some functions left untouched because
they expect a different sentinel than NULL. Unfortunately, glib
does not provide macro for different sentinels. We may come up
with our own, but let's save that for future work.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-06-06 09:29:58 +02:00
Daniel P. Berrangé
3499354e12 interface: fix udev reference leak with invalid flags
The udevInterfaceGetXMLDesc method takes a reference on the udev
driver as its first action. If the virCheckFlags() condition
fails, however, this reference is never released.

Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-05 12:19:12 +01:00
Daniel P. Berrangé
98f1cf88fa rpc: avoid leak of GSource in use for interrupting main loop
We never release the reference on the GSource created for
interrupting the main loop, nor do we remove it from the
main context if our thread is woken up prior to the wakeup
callback firing.

This can result in a leak of GSource objects, along with an
ever growing list of GSources attached to the main context,
which will gradually slow down execution of the loop, as
several operations are O(N) for the number of attached GSource
objects.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-05 12:03:24 +01:00
Peter Krempa
f3e8c10fe4 qemu: validate: Fix check for unsupported FS-device bootindex use on un-assigned addresses
When hot-plugging a FS device with un-assigned address with a bootindex
the recently-added validation check would fail as validation on hotplug
is done prior to address assignment.

To fix this problem we can simply relax the check to also pass on _NONE
addresses. Unsupported configurations will still be caught as previous
commit re-checks the definition after address assignment prior to
hotplug.

Resolves: https://issues.redhat.com/browse/RHEL-39271
Fixes: 4690058b6d
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-05-31 12:54:32 +02:00
Peter Krempa
63b32dbe8b qemu: hotplug: Validate definition of 'FS' device after address allocation
Some of the checks make sense only after the address is allocated and
thus we need to re-do the validation after the address is assigned.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-05-31 12:54:32 +02:00
Peter Krempa
c249f909f3 syms: Properly export 'virDomainDeviceDefValidate'
While the function is exported via header, the symbol itself was not.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-05-31 12:54:32 +02:00
Michal Privoznik
a57195e79e log_cleaner: Detect rotated filenames properly
When removing rotated log files, their name is matched against a
regex (@log_regex) and if they contain '.N' suffix the 'N' is
then parsed into an integer. Well, due to a bug in
virLogCleanerParseFilename() this is not how the code works. If
the suffix isn't found then g_match_info_fetch() returns an empty
string instead of NULL which then makes str2int parsing fail.
Just check for this case before parsing the string.

Based on the original patch sent by David.

Reported-by: David Negreira <david.negreira@canonical.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-05-31 08:34:29 +02:00
Michal Privoznik
805b1eec7d qemu_hotplug: Clear QoS if required in qemuDomainChangeNet()
In one of my recent commits, I've introduced
virDomainInterfaceClearQoS() which is a helper that either calls
virNetDevBandwidthClear() ('tc' implementation) or
virNetDevOpenvswitchInterfaceClearQos() (for ovs ifaces). But I
made a micro optimization which leads to a bug: the function
checks whether passed iface has any QoS set and returns early if
it has none. In majority of cases this is right thing to do, but
when removing QoS on virDomainUpdateDeviceFlags() this is
problematic. The new definition (passed as argument to
virDomainInterfaceClearQoS()) contains no QoS (because user
requested its removal) and thus instead of removing the old QoS
setting nothing is done.

Fortunately, the fix is simple - pass olddev which contains the
old QoS setting.

Fixes: 812a146dfe
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-05-30 14:08:07 +02:00
Pavel Hrdina
2ea493598f qemu_snapshot: fix memory leak when reverting external snapshot
The code cleaning up virStorageSource doesn't free data allocated by
virStorageSourceInit() so we need to call virStorageSourceDeinit()
explicitly.

Fixes: 8e66473781
Resolves: https://issues.redhat.com/browse/RHEL-33044
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-05-29 15:23:55 +02:00
Andrea Bolognani
957eea376b meson: Improve default firewall backend configuration
The current implementation requires users to configure the
preference as such:

  -Dfirewall_backend_default_1=iptables
  -Dfirewall_backend_default_2=nftables

In addition to being more verbose than one would hope, there
are several things that could go wrong.

First of all, meson performs no validation on the provided
values, so mistakes will only be caught by the compiler.
Additionally, it's entirely possible to provide nonsensical
combinations, such as repeating the same value twice.

Change things so that the preference can now be configured
as such:

  -Dfirewall_backend_priority=iptables,nftables

Checks have been added to prevent invalid values from being
accepted.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-05-28 19:28:58 +02:00
Laine Stump
a4f38f6ffe network: use iif/oif instead of iifname/oifname in nftables rules
iifname/oifname need to lookup the string that contains the name of
the interface each time a packet is checked, while iif/oif compare the
ifindex of the interface, which is included directly in the
packet. Conveniently, the rule is created using the *name* of the
interface (which gets converted to ifindex as the rule is added), so
no extra work is required other than changing the commandline option.

If it was the case that the interface could be deleted and re-added
during the life of the rule, we would have to use Xifname (since
deleting and re-adding the interface would result in ifindex
changing), but for our uses this never happens, so Xif works for us,
and undoubtedly improves performance by at least 0.0000001%.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-05-27 23:53:58 +02:00
Peter Krempa
65cdc37a7e virFileOpenForked: Fix handling of return value from virSocketSendFD()
Commit 91f4ebbac8 (v10.0.0-185-g91f4ebbac8)
changed the return value of virSocketSendFD() from 0 to 1 on success.

Unfortunately in 'virFileOpenForked' the return value was used to report
the error back to the main process from the fork'd child. As process
return codes are positive only, the code negates the value of 'ret' and
reports it. This resulted in the parent thinking the process exited with
failure:

 # virsh save avocado-vt-vm1 /mnt/save
 error: Failed to save domain 'avocado-vt-vm1' to /mnt/save
 error: Error from child process creating '/mnt/save': Unknown error 255

This error reproduces on NFS mounts with 'root_squash' enabled. I've
also observed it in one specific migration case when root_squash NFS is
used with following error:

  Failed to open file '/var/lib/libvirt/images/alpine.qcow2': Unknown error 255'

To fix the issue the code is refactored so that it doesn't actually
touch the 'ret' variable needlessly and assigns to it only on failure
cases, which prevents the '1' to be propagated to the parent process as
'255' after negating and storing in the process return code.

Fixes: 91f4ebbac8
Resolves: https://issues.redhat.com/browse/RHEL-36721
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-05-23 14:32:24 +02:00
Peter Krempa
f63cbc7365 virGetGroupList: Refactor and fix callers
Use contemporary style for declarations and automatic memory clearing
for a helper string.

Since the function can't fail any more, remove any mention of returning
errno and remove error checks from all callers.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-05-23 14:32:24 +02:00
Peter Krempa
f2648fca1a virfile: Modernize definition of virFileOpenForked/virFileOpenForceOwnerMode/virFileOpenAs
Declare one argument per line and one variable per line and use boolean
operators at the end of the line rather than at the beginning.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-05-23 14:32:24 +02:00
Laine Stump
afbd1bb89e network: eliminate pointless host input/output rules from nftables backend
The iptables backend (which was used as the model for the nftables
backend) used the same "filter" and "nat" tables used by other
services on the system (e.g. firewalld or any other host firewall
management application), so it was possible that one of those other
services would be blocking DNS, DHCP, or TFTP from guests to the host;
we added our own rules at the beginning of the chain to allow this
traffic no matter if someone else rejected it later.

But with nftables, each service uses their own table, and all traffic
must be acepted by all tables no matter what - it's not possible for
us to just insert a higher priority/earlier rule that will override
some reject rule put in by, e.g., firewalld. Instead the firewalld (or
other) table must be setup by that service to allow the traffic. That,
along with the fact that our table is already "accept by default",
makes it possible to eliminate the individual accept rules for DHCP,
DNS, and TFTP. And once those rules are eliminated, there is no longer
any need for the guest_to_host or host_to_guest tables.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-05-22 23:20:49 -04:00
Laine Stump
958aa7f274 network: rename chains used by network driver nftables backend
Because the chains added by the network driver nftables backend will
go into a table used only by libvirt, we don't need to have "libvirt"
in the chain names. Instead, we can make them more descriptive and
less abrasive (by using lower case, and using full words rather than
abbreviations).

Also (again because nobody else is using the private "libvirt_network"
table) we can directly put our rules into the input ("guest_to_host"),
output ("host_to_guest"), and postrouting ("guest_nat") chains rather
than creating a subordinate chain as done in the iptables backend.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-05-22 23:20:49 -04:00
Laine Stump
0bd7a47356 network: name the nftables table "libvirt_network" rather than "libvirt"
This way when we implement nftables for the nwfilter driver, we can
create a separate table called "libvirt_nwfilter" and everything will
look all symmetrical and stuff.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-05-22 23:20:49 -04:00
Laine Stump
b89c4991da network: add an nftables backend for network driver's firewall construction
Support using nftables to setup the firewall for each virtual network,
rather than iptables. The initial implementation of the nftables
backend creates (almost) exactly the same ruleset as the iptables
backend, determined by running the following commands on a host that
has an active virtual network:

  iptables-save >iptables.txt
  iptables-restore-translate -f iptables.txt

(and the similar ip6tables-save/ip6tables-restore-translate for an
IPv6 network). Correctness of the new backend was checked by comparing
the output of:

  nft list ruleset

when the backend is set to iptables and when it is set to nftables.

This page was used as a guide:

  https://wiki.nftables.org/wiki-nftables/index.php/Moving_from_iptables_to_nftables

The only differences between the rules created by the nftables backed
vs. the iptables backend (aside from a few inconsequential changes in
display order of some chains/options) are:

1) When we add nftables rules, rather than adding them in the
system-created "filter" and "nat" tables, we add them in a private
table (ie only we should be using it) created by us called "libvirt"
(the system-created "filter" and "nat" tables can't be used because
adding any rules to those tables directly with nft will cause failure
of any legacy application attempting to use iptables when it tries to
list the iptables rules (e.g. "iptables -S").

(NB: in nftables only a single table is required for both nat and
filter rules - the chains for each are differentiated by specifying
different "hook" locations for the toplevel chain of each)

2) Since the rules that were added to allow tftp/dns/dhcp traffic from
the guests to the host are unnecessary in the context of nftables,
those rules aren't added.

(Longer explanation: In the case of iptables, all rules were in a
single table, and it was always assumed that there would be some
"catch-all" REJECT rule added by "someone else" in the case that a
packet didn't match any specific rules, so libvirt added these
specific rules to ensure that, no matter what other rules were added
by any other subsystem, the guests would still have functional
tftp/dns/dhcp. For nftables though, the rules added by each subsystem
are in a separate table, and in order for traffic to be accepted, it
must be accepted by *all* tables, so just adding the specific rules to
libvirt's table doesn't help anything (as the default for the libvirt
table is ACCEPT anyway) and it just isn't practical/possible for
libvirt to find *all* other tables and add rules in all of them to
make sure the traffic is accepted. libvirt does this for firewalld (it
creates a "libvirt" zone that allows tftp/dns/dhcp, and adds all
virtual network bridges to that zone), however, so in that case no
extra work is required of the sysadmin.)

3) nftables doesn't support the "checksum mangle" rule (or any
equivalent functionality) that we have historically added to our
iptables rules, so the nftables rules we add have nothing related to
checksum mangling.

(NB: The result of (3) is that if you a) have a very old guest (RHEL5
era or earlier) and b) that guest is using a virtio-net network
device, and c) the virtio-net device is using vhost packet processing
(the default) then DHCP on the guest will fail. You can work around
this by adding <driver name='qemu'/> to the <interface> XML for the
guest).

There are certainly much better nftables rulesets that could be used
instead of those implemented here, and everything is in place to make
future changes to the rules that are used simple and free of surprises
(e.g. the rules that are added have coresponding "removal" commands
added to the network status so that we will always remove exactly the
rules that were previously added rather than trying to remove the
rules that "the current build of libvirt would have added" (which will
be incorrect the first time we run a libvirt with a newly modified
ruleset). For this initial implementation though, I wanted the
nftables rules to be as identical to the iptables rules as possible,
just to make it easier to verify that everything is working.

The backend can be manually chosen using the firewall_backend setting
in /etc/libvirt/network.conf. libvirtd/virtnetworkd will read this
setting when it starts; if there is no explicit setting, it will check
for availability of FIREWALL_BACKEND_DEFAULT_1 and then
FIREWALL_BACKEND_DEFAULT_2 (which are set at build time in
meson_options.txt or by adding -Dfirewall_backend_default_n=blah to
the meson commandline), and use the first backend that is available
(ie, that has the necessary programs installed). The standard
meson_options.txt is set to check for nftables first, and then
iptables.

Although it should be very safe to change the default backend from
iptables to nftables, that change is left for a later patch, to show
how the change in default can be undone if someone really needs to do
that.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Tested-by: Daniel P. Berrangé <berrange@redhat.com>
2024-05-22 23:20:07 -04:00
Laine Stump
865eea30f4 meson: stop looking for iptables/ip6tables/ebtables at build time
This was the only reason we required the iptables and ebtables
packages at build time, and many other external commands already have
their binaries found at runtime by looking through $PATH (virCommand
automatically does this), so we may as well do it for these commands
as well.

Since we no longer need iptables or iptables at build time, we can
also drop the BuildRequires for them from the rpm specfile.

Inspired-by: 6aa2fa38b0
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-05-22 23:20:07 -04:00
Laine Stump
110383fa30 network: save network status when firewall rules are reloaded
In the case that a new version of libvirt is started that uses
different rules to build the network firewall, we need to re-save the
status so that when the network is destroyed (or the *next* time
libvirt is restarted and wants to remove/re-add the firewall), it will
have the proper information to perform the firewall removal.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-05-22 23:20:07 -04:00
Laine Stump
97061d576b network: use previously saved list of firewall removal commands
When destroying a network, the network driver has always assumed that
it knew what firewall rules had been added as the network was
started. This was usually correct - I only recall one time in the past
that the firewall rules added by libvirt were changed. But if the
exact rules used for a network *were* ever changed from one
build/version of libvirt to another, then we would end up attempting
to remove rules that hadn't been added, and could possibly *not*
remove rules that had been added.

The solution to this to not make such brash assumptions about the
past, but instead to save (in the network status object at network
start time) a list of all the rules needed to remove the rules that
were added for the network, and then use that saved list during
network destroy to remove exactly what was previous added.

Beyond making net-destroy more precise, there are other benefits:

1) We can change the details of the rules we add for networks from one
build/release of libvirt to another and painlessly upgrade.

2) The user can switch from one firewall backend to another by simply
changing the setting in network.conf and restarting
libvirtd/virtnetworkd.

In both cases, the restarted libvirtd/virtnetworkd will remove all the
rules that had been previously added (based on the network status),
and then add new rules (saving the new removal commands back into the
network status)

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-05-22 23:20:07 -04:00
Laine Stump
0fa79844a1 conf: add a virFirewall object to virNetworkObj
This virFirewall object will store the list of actions required to
remove the firewall that was added for the currently active instance
of the network, so it has been named "fwRemoval" (and when parsed into
XML, the <firewall> element will have the name "fwRemoval").

There are no uses of the fwRemoval object in the virNetworkObj yet,
but everything is in place to add it to the XML when formatted, parse
it from the XML when reading network status, and free the virFirewall
object when the virNetworkObj is freed.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-05-22 23:20:07 -04:00
Laine Stump
df9a505961 util: new functions virFirewallParseXML() and virFirewallFormat()
These functions convert a virFirewall object to/from XML so that it
can be serialized to disk (in a virNetworkObj's status file) and
restored later (e.g. after libvirtd/virtnetworkd is restarted).

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-05-22 23:19:58 -04:00
Laine Stump
b77b2fc314 util: new function virFirewallNewFromRollback()
virFirewallNewFromRollback() creates a new virFirewall object that
contains a copy of the "rollback" commands from an existing
virFirewall object, but in reverse order. The intent is that this
virFirewall be saved and used later to remove the firewall rules that
were added for a network.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-05-22 23:19:48 -04:00
Laine Stump
d24b7501dc util: add name attribute to virFirewall
This will be used to label (via "name='blah'") a firewall when it is
formatted to XML and written to the network status.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-05-22 23:19:36 -04:00
Laine Stump
e1b6b0646f network: turn on auto-rollback for the rules added for virtual networks
So far this will only affect what happens if there is some failure
while applying the firewall rules; the rollback rules aren't yet
persistent beyond that time. More work is needed to remember the
rollback rules while the network is active, and use those rules to
remove the firewall for the network when it is destroyed.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-05-22 23:19:36 -04:00
Laine Stump
e23907635c util: implement rollback rule autocreation for iptables commands
If the VIR_FIREWALL_TRANSACTION_AUTO_ROLLBACK flag is set, each time
an iptables command is executed that is adding a rule or chain, a
corresponding command that will *delete* the same rule/chain is
constructed and added to the list of rollback commands. If we later
want to undo the entire firewall, we can just run those commands.

This isn't yet used anywhere, since
VIR_FIREWALL_TRANSACTION_AUTO_ROLLBACK isn't being set.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-05-22 23:19:36 -04:00
Laine Stump
f94c82b0a6 util: new functions to support adding individual firewall rollback commands
In the past virFirewall required all rollback commands for a group
(those commands necessary to "undo" any rules that had been added in
that group in case of a later failure) to be manually added by
switching into the virFirewall object into "rollback mode" and then
re-calling the inverse of the exact virFirewallAddCmd*() APIs that had
been called to add the original rules (ie. for each
"iptables --insert" command, for rollback we would need to add a
command with all arguments identical except that "--insert" would be
replaced by "--delete").

Because nftables can't search for rules to remove by comparing all the
arguments (it instead expects *only* a handle that is provided via
stdout when the rule was originally added), we won't be able to follow
the iptables method and manually construct the command to undo any
given nft command by just duplicating all the args of the command
(except the action). Instead we will need to be able to automatically
create a rollback command at the time the rule-adding command is
executed (e.g. an "nft delete rule" command that would include the
rule handle returned in stdout by an "nft add rule" command).

In order to make this happen, we need to be able to 1) learn whether
the user of the virFirewall API desires this behavior (handled by a new
transaction flag called VIR_FIREWALL_TRANSACTION_AUTO_ROLLBACK that
can be retrieved with the new virFirewallTransactionGetFlags() API),
and 2) add a new command to the current group's rollback command list (with
the new virFirewallAddRollbackCmd()).

We will actually use this capability in an upcoming patch.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-05-22 23:19:36 -04:00
Laine Stump
c737f225a9 network: framework to call backend-specific function to init private filter chains
Modify networkSetupPrivateChains() in the network driver to accept a
firewallBackend argument so it will know which backend to call. (right
now it always calls the iptables version of the lower level function,
but in the future it could instead call the nftables version based on
configuration).

But networkSetupPrivateChains() was being called with virOnce(), and
virOnce() doesn't support calling functions that require an argument
(it's based on pthread_once(), which accepts no arguments, so it's not
something we can easily fix in our implementation of virOnce()). To
solve this dilemma, this patch eliminates use of virOnce() by adding a
static lock, and putting all of networkSetupPrivateChains() (including
the setting of "chainInitDone") inside a lock guard - now the places
that used to call it via virOnce() can safely call it directly
instead (adding in the necessary argument to specify backend).

(If it turns out to be significant, we could optimize this by checking
for chainInitDone outside the lock guard, returning immediately if
it's already set, and then moving the setting of chainInitDone up to
the top of the guarded section.)

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-05-22 23:19:36 -04:00
Laine Stump
64b966558c network: support setting firewallBackend from network.conf
It still can have only one useful value ("iptables"), but once a 2nd
value is supported, it will be selectable by setting
"firewall_backend=nftables" in /etc/libvirt/network.conf.

If firewall_backend isn't set in network.conf, then libvirt will check
to see if FIREWALL_BACKEND_DEFAULT_1 is available and, if so, set
that. (Since FIREWALL_BACKEND_DEFAULT_1 is currently "iptables", this
means checking to see it the iptables binary is present on the
system).  If the default backend isn't available, that is considered a
fatal error (since no networks can be started anyway), so an error is
logged and startup of the network driver fails.

NB: network.conf is itself created from network.conf.in at build time,
and the advertised default setting of firewall_backend (in a commented
out line) is set from the meson_options.txt setting
"firewall_backend_default_1". This way the conf file will have correct
information no matter what ordering is chosen for default backend at
build time (as more backends are added, settings will be added for
"firewall_backend_default_n", and those will be settable in
meson_options.txt and on the meson commandline to change the ordering
of the auto-detection when no backend is set in network.conf).

virNetworkLoadDriverConfig() may look more complicated than necessary,
but as additional backends are added, it will be easier to add checks
for those backends (and to re-order the checks based on builders'
preferences).

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-05-22 23:19:18 -04:00
Laine Stump
45c4527f36 network: add (empty) network.conf file to distribution files
This file is generated from network.conf.in because it will soon have
an item that must be modified according to meson buildtime config.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-05-22 23:19:18 -04:00
Laine Stump
9293644d8a util/network: new virFirewallBackend enum
(This paragraph is for historical reference only, described only to
avoid confusion of past use of the name with its new use) In a past
life, virFirewallBackend had been a private static in virfirewall.c
that was set at daemon init time, and used to globally (i.e. for all
drivers in the daemon) determine whether to directly execute iptables
commands, or to run them indirectly via the firewalld passthrough
API. This was removed in commit d566cc55, since we decided that using
the firewalld passthrough API is never appropriate.

Now the same enum, virFirewallBackend, is being reintroduced, with a
different meaning and usage pattern. It will be used to pick between
using nftables commands or iptables commands (in either case directly
handled by libvirt, *not* via firewalld). Additionally, rather than
being a static known only within virfirewall.c and applying to all
firewall commands for all drivers, each virFirewall object will have
its own backend setting, which will be set during virFirewallNew() by
the driver who wants to add a firewall rule.

This will allow the nwfilter and network drivers to each have their
own backend setting, even when they coexist in a single unified
daemon. At least as important as that, it will also allow an instance
of the network driver to remove iptables rules that had been added by
a previous instance, and then add nftables rules for the new instance
(in the case that an admin, or possibly an update, switches the driver
backend from iptables to nftable)

Initially, the enum will only have one usable value -
VIR_FIREWALL_BACKEND_IPTABLES, and that will be hardcoded into all
calls to virFirewallNew(). The other enum value (along with a method
of setting it for each driver) will be added later, when it can be
used (when the nftables backend is in the code).

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-05-22 23:19:18 -04:00
Laine Stump
5543179cea util: determine ignoreErrors value when creating virFirewallCmd, not when applying
We know at the time a virFirewallCmd is created (with
virFirewallAddCmd*()) whether or not we will later want to ignore
errors encountered when attempting to apply that command - if
ignoreErrors is set in the AddCmd or if the group has already had
VIR_FIREWALL_TRANSACTION_IGNORE_ERRORS set, then we ignore the errors.

Rather than setting the fwCmd->ignoreErrors only according to the arg
sent to virFirewallAddCmdFull(), and then later (at ApplyCmd-time)
combining that with the group transactionFlags setting (and passing it
all the way down the call chain), just combine the two flags right
away and store this final value in fwCmd->ignoreErrors when the
virFirewallCmd is created (thus avoiding the need to look at anything
other than fwCmd->ignoreErrors at the time the command is applied). Once
that is done, we can simply grab ignoreErrors from the object down in
virFirewallApply() rather than cluttering up the argument list on the
entire call chain.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-05-22 23:19:18 -04:00
Laine Stump
4ee30ecd57 util: add -w/--concurrent when applying a FirewallCmd rather than when building it
We will already need a separate function for virFirewallApplyCmd for
iptables vs. nftables, but the only reason for needing a separate
function for virFirewallAddCmd* is that iptables/ebtables need to have
an extra arg added for locking (to prevent multiple iptables commands
from running at the same time). We can just as well add in the
-w/--concurrent during virFirewallApplyCmd, so move the arg-add to
ApplyCmd to keep AddCmd simple.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-05-22 23:19:18 -04:00
Laine Stump
ad96ee74ce util: check for 0 args when applying iptables rule
In normal practice a virFirewallCmd should never have 0 args by the
time it gets to the Apply stage, but at some time while debugging one
of the other patches in this series, exactly that happened (due to a
bug that was since squashed), and having a check for it helped
debugging, so let's permanently check for it.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-05-22 23:19:18 -04:00
Laine Stump
67362e6328 util: rename virNetFilterAction to iptablesAction, and add VIR_ENUM_DECL/IMPL
I had originally named these as VIR_NETFILTER_* because I assumed the
same enum would eventually be used by our nftables backend as well as
iptables. But it turns out that in most cases it's not possible to
delete an nftables rule, so we just never used the enum anyway, so
this patch is renaming the values to IPTABLES_ACTION_*, and taking
advantage of the newly defined (via VIR_ENUM_DECL/IMPL)
iptablesActionTypeToString() to replace all the ternary operators used
to translate the enum into a string for the iptables commandline with
iptablesActionTypeToString().

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-05-22 23:19:18 -04:00
Laine Stump
0817344ba7 util: change name of virFirewallRule to virFirewallCmd
These objects aren't rules, they are commands that are executed that
may create a firewall rule, delete a firewall rule, or simply list the
existing firewall rules. It's confusing for the objects to be called
"Rule" (especially in the case of the function
virFirewallRemoveRule(), which doesn't remove a rule from the
firewall, it takes one of the objects out of the list of commands to
execute! In order to remove a rule from the host's firewall, you have
to Add a "rule" (now "cmd" aka command) to the list that will, when
applied/run, remove a rule from the host firewall.)

Changing the name to virFirewallCmd makes it all much less confusing.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-05-22 23:19:18 -04:00
Laine Stump
5ac0dc4cef util: #define the names used for private packet filter chains
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-05-22 23:19:18 -04:00
Laine Stump
b4913820ec network: make all iptables functions used only in network_iptables.c static
Now that the toplevel iptables functions have been moved out of the
linux bridge driver into network_iptables.c, all of the utility
functions are used only within that same file, so simplify it.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-05-22 23:19:18 -04:00
Laine Stump
01fd85fed9 network: move all functions manipulating iptables rules into network_iptables.c
Although initially we will add exactly the same rules for the nftables
backend, the two may (hopefully) soon diverge as we take advantage of
nftables features that weren't available in iptables. When we do that,
there will need to be a different version of these functions (currently in
bridge_driver_linux.c) for each backend:

  networkAddFirewallRules()
  networkRemoveFirewallRules()
  networkSetupPrivateChains()

Although it will mean duplicating some amount of code (with just the
function names changed) for the nftables backend, this patch moves all
of the rule-related code in the above three functions into iptables*()
functions in network_iptables.c, and changes the functions in
bridge_driver_linux.c to call the iptables*() functions. When we make
a different backend, it will only need to make equivalents of those 3
functions publicly available to the upper layer.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-05-22 23:19:18 -04:00
Laine Stump
e1f6d2c205 util/network: move viriptables.[ch] from util to network directory
These functions are only ever used by the network driver, and are so
specific to the network driver's usage of iptables that they likely
won't ever be used elsewhere. The files are renamed to
network_iptables.[ch] to be more in line with driver-specific file
naming conventions.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-05-22 23:19:18 -04:00
Michal Privoznik
66b052263d src: Fix return types of .stateInitialize callbacks
The virStateDriver struct has .stateInitialize callback which is
declared to return virDrvStateInitResult enum. But some drivers
return a plain int in their implementation which is UB.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-05-22 13:41:42 +02:00
Jonathon Jongsma
7c8e606b64 qemu: fix qemu command for pci hostdevs and ramfb='off'
There was no test for this and we mistakenly used 'B' rather than 'T'
when constructing the json value for this parameter. Thus, a value of
'off' was VIR_TRISTATE_SWITCH_OFF=2, which was translated to a boolean
value of 'true'.

Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-05-20 12:42:18 -05:00
Rayhan Faizel
57f29f675d qemu: Implement support for hotplugging evdev input devices
Unlike other input types, evdev is not a true device since it's backed by
'-object'. We must use object-add/object-del monitor commands instead of
device-add/device-del in this particular case.

This patch adds support for handling live attachment and
detachment of evdev type devices.

Resolves: https://gitlab.com/libvirt/libvirt/-/issues/529
Signed-off-by: Rayhan Faizel <rayhan.faizel@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-05-16 14:56:59 +02:00
Michal Privoznik
9c1cfc337e meson: Bump glib version to 2.58.0
Now that we don't have any distro stuck with glib-2.56.0, we can
bump the glib version. In fact, this is needed, because of
g_clear_pointer. Since v7.4.0-rc1~301 we declare at compile time
what version of glib APIs we want to use (by setting
GLIB_VERSION_MIN_REQUIRED = GLIB_VERSION_MAX_ALLOWED = 2.56.0),
regardless of actual glib version in the host.

And since we currently require glib-2.56.0 and force glib to use
APIs of that version, some newer bits are slipping from us. For
instance: regular function version of g_clear_pointer() is used
instead of a fancy macro. So what? Well, g_clear_pointer()
function typecasts passed free function to void (*)(void *) and
then calls it. Well, this triggers UBSAN, understandably. But
with glib-2.58.0 the g_clear_pointer() becomes a macro which
calls the free function directly, with no typecasting and thus no
undefined behavior.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-05-14 15:17:20 +02:00
Michal Privoznik
1a4063ca20 security: Fix return types of .probe callbacks
The .probe member of virSecurityDriver struct is declared to
return virSecurityDriverStatus enum. But there are two instances
(AppArmorSecurityManagerProbe() and
virSecuritySELinuxDriverProbe()) where callbacks are defined to
return an integer. This is an undefined behavior because integer
has strictly bigger space of possible values than the enum.

Defined those aforementioned callbacks so that they return the
correct enum instead of int.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-05-14 15:11:30 +02:00
Rayhan Faizel
ffebb557f1 qemu_hotplug: Properly assign USB address to hotplugged usb-net device
Previously, the network device hotplug logic would try to ensure only CCW or
PCI addresses. With recent support for the usb-net model, this patch will
ensure USB addresses for usb-net network devices.

Resolves: https://gitlab.com/libvirt/libvirt/-/issues/14
Signed-off-by: Rayhan Faizel <rayhan.faizel@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-05-14 09:14:39 +02:00
Martin Kletzander
2482801608 vmx: Do not require DVS Port ID
It can be safely removed from the VMX, VMWare will still boot the
machine and once another ethernet is added it is updated in the VMX to
zero.  So do not require it and default to zero too since this part of
the XML is done as best effort and it is mentioned even in our
documentation.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-05-14 08:32:13 +02:00
Dr. David Alan Gilbert
9e59ba56c8 qemu_capabilities: Remove unused struct
'virQEMUCapsSearchData' has been unused since
commit bc33b8c639 ("qemu: capabilities: Drop the
virQEMUCapsCacheLookupByArch function")
Remove it.

Signed-off-by: Dr. David Alan Gilbert <dave@treblig.org>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-05-13 03:14:14 +02:00
Jiri Denemark
dda10ac8ac network: Register dnsmasq with resolved only when really requested
An incorrect check for domainRegister caused the DNS server for a
virtual domain to be registered with systemd-resolved even if
register='no' attribute was present. Only omitting the attribute
completely would disable the registration.

Reported-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-05-09 16:34:40 +02:00
Daniel P. Berrangé
a47e73d6e7 src/node_device: don't overwrite error messages
The nodedev code unhelpfully reports

  couldn't convert node device def to mdevctl JSON

which hides the actual error message

  No JSON parser implementation is available

Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2024-05-08 16:01:34 +01:00
Daniel P. Berrangé
08bfb18736 tests: build driver modules before virdrivermoduletest
The virdrivermoduletest will attempt to dlopen() each driver module,
so they must be build before the test can run.

Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2024-05-08 16:01:34 +01:00
Daniel P. Berrangé
0dc278dd02 src: ensure augeas test file is generated before running test
We fail to express an ordering between the custom target that
generates the combined augeas test input file, and the meson
test command.

Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2024-05-08 15:57:46 +01:00
Peter Krempa
df9ffb0256 udevListInterfacesByStatus: Don't try to return NULL names
In case when the interface is being detached/reattached it may happen
that udev will return NULL from 'udev_device_get_sysname()'.

As the RPC code requires nonnull strings in the return array it fails to
serialize such reply:

 libvirt: XML-RPC error : Unable to encode message payload

Fix this by simply ignoring such interfaces as there's nothing we can
report in such case.

A similar fix was done to 'udevConnectListAllInterfaces' in commit
2ca94317ac.

Resolves: https://issues.redhat.com/browse/RHEL-34615
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-05-07 14:55:57 +02:00
Peter Krempa
bc596f2751 interface_udev: Replace udevNumOfInterfacesByStatus by udevListInterfacesByStatus
Make the array-filling operation of udevListInterfacesByStatus optional
and replace the completely redundant udevNumOfInterfacesByStatus by it.

Further patches fixing the listing will not need to be duplicated.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-05-07 14:55:57 +02:00
Michal Privoznik
e6a5592787 datatypes: Declare g_autoptr cleanup functions for more public objects
Some public objects (like virDomain, virInterface, and so on) are
missing g_autoptr() cleanup functions. Provide missing
declarations. Note, this is only for our internal use - hence
datatypes.h.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-05-07 13:03:19 +02:00
Michal Privoznik
da95bcb6b2 qemu: Substract isolcpus from all online affinity
When starting a domain and there's no vCPU/emulator pinning set,
we query the list of all online physical CPUs and set affinity of
the child process (which eventually becomes QEMU) to that list.
We can't assume libvirtd itself had affinity to all online CPUs
and since affinity of the child process is inherited, we should
fix it afterwards. But that's not necessarily correct. Users
might isolate some physical CPUs and we should avoid touching
them unless explicitly told so (i.e. vCPU/emulator pinning told
us so).

Therefore, when attempting to set affinity to all online CPUs
subtract the isolated ones.

Before this commit:

  root@localhost:~# cat /sys/devices/system/cpu/isolated
  19,21,23
  root@virtlab414:~# taskset -cp $(pgrep qemu)
  pid 14835's current affinity list: 0-23

After:

  root@virtlab414:~# taskset -cp $(pgrep qemu)
  pid 17153's current affinity list: 0-18,20,22

Resolves: https://issues.redhat.com/browse/RHEL-33082
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-05-06 15:38:58 +02:00
Michal Privoznik
3c948ef699 virhostcpu: Introduce virHostCPUGetIsolated()
This is a helper that parses /sys/devices/system/cpu/isolated
into a virBitmap. It's going to be needed soon.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-05-06 15:36:17 +02:00
Michal Privoznik
f3c6c7623c virfile: Introduce virFileReadValueBitmapAllowEmpty()
Some sysfs files contain either string representation of a bitmap
or just a newline character. An example of such file is:
/sys/devices/system/cpu/isolated. Our current implementation of
virFileReadValueBitmap() fails in the latter case, unfortunately.
Introduce a slightly modified version that accepts empty files.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-05-06 15:29:36 +02:00
Michal Privoznik
b972cdc1a5 virbitmap: Introduce virBitmapParseUnlimitedAllowEmpty()
Some sysfs files contain either string representation of a bitmap
or just a newline character. An example of such file is:
/sys/devices/system/cpu/isolated. Our current implementation of
virBitmapParseUnlimited() fails in the latter case,
unfortunately. Introduce a slightly modified version that accepts
empty files.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-05-06 15:26:58 +02:00