35976 Commits

Author SHA1 Message Date
Michal Privoznik
b5c54df901 virt-aa-helper: Drop needless comments
When generating paths for a domain specific AppArmor profile each
path undergoes a validation where it's matched against an array
of well known prefixes (among other things). Now, for
OVMF/AAVMF/... images we have a list and some entries have
comments to which type of image the entry belongs to. For
instance:

  "/usr/share/OVMF/",                  /* for OVMF images */
  "/usr/share/AAVMF/",                 /* for AAVMF images */

But these comments are pretty useless. The path itself already
gives away the image type. Drop them.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jim Fehlig <jfehlig@suse.com>
2024-07-10 09:25:32 +02:00
hongmianquan
0d3e962d47 security_manager: Remove redundant qemuSecurityGetNested() call
This commit removes the redundant call to qemuSecurityGetNested() in
qemuStateInitialize(). In qemuSecurityGetModel(), the first security manager
in the stack is already used by default, so this change helps to
simplify the code.

Signed-off-by: hongmianquan <hongmianquan@bytedance.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-09 13:24:57 +02:00
hongmianquan
790b4d8067 security_manager: Ensure top lock is acquired before nested locks
Fix libvirtd hang since fork() was called while another thread had
security manager locked.

We have the stack security driver, which internally manages other security drivers;
let's call them "top" and "nested".

We call virSecurityStackPreFork() to lock the top one, and it also locks
and then unlocks the nested drivers prior to fork. Then in qemuSecurityPostFork(),
it unlocks the top one, but not the nested ones. Thus, if one of the nested
drivers ("dac" or "selinux") is still locked, it will cause a deadlock. If we always
take the top lock before any nested lock, this cannot happen, because the top lock
is already held when the child libvirtd is forked.

However, this is not always the case in the current code. We discovered this case:
the nested list obtained through the qemuSecurityGetNested() will be locked directly
for subsequent use, such as in virQEMUDriverCreateCapabilities(), where the nested list
is locked using qemuSecurityGetDOI, but the top one is not locked beforehand.

The problem stack is as follows:

libvirtd thread1          libvirtd thread2          child libvirtd
        |                           |                       |
        |                           |                       |
virsh capabilities      qemuProcessLaunch                   |
        |                           |                       |
        |                       lock top                    |
        |                           |                       |
    lock nested                     |                       |
        |                           |                       |
        |                           fork------------------->|(nested lock held by thread1)
        |                           |                       |
        |                           |                       |
    unlock nested               unlock top              unlock top
                                                            |
                                                            |
                                                qemuSecuritySetSocketLabel
                                                            |
                                                            |
                                                    lock nested (deadlock)

In this commit, we ensure that the top lock is acquired before the nested lock,
so during fork, it's not possible for another task to acquire the nested lock.
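
The ordering rule above can be sketched with plain pthread mutexes. This is only a minimal model under stated assumptions: `struct sec_stack`, `sec_stack_init()` and `sec_stack_read_doi()` are hypothetical names, not libvirt's security manager API.

```c
#include <pthread.h>

/* Hypothetical stand-in for the stacked security manager: one "top"
 * lock and one "nested" lock. Not libvirt's real data structures. */
struct sec_stack {
    pthread_mutex_t top;
    pthread_mutex_t nested;
    int doi_reads;              /* state protected by the nested lock */
};

void sec_stack_init(struct sec_stack *s)
{
    pthread_mutex_init(&s->top, NULL);
    pthread_mutex_init(&s->nested, NULL);
    s->doi_reads = 0;
}

/* Take the top lock first, then the nested one, and release them in
 * reverse order. A thread that holds the top lock across fork() then
 * knows that no other thread is inside the nested critical section. */
int sec_stack_read_doi(struct sec_stack *s)
{
    int n;

    pthread_mutex_lock(&s->top);
    pthread_mutex_lock(&s->nested);
    n = ++s->doi_reads;
    pthread_mutex_unlock(&s->nested);
    pthread_mutex_unlock(&s->top);
    return n;
}
```

Because every path to the nested lock goes through the top lock, holding the top lock before fork() is enough to guarantee the child never inherits a nested lock held by some other thread.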

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1303031

Signed-off-by: hongmianquan <hongmianquan@bytedance.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-09 13:22:26 +02:00
Miroslav Los via Devel
8515a178f8 qemuDomainChangeNet: check virtio options for non-virtio models
In a domain created with an interface with a <driver> subelement,
the device contains a non-NULL virDomainVirtioOptions struct, even
for non-virtio NIC models. The subelement need not be present again
after libvirt restarts, or when the interface is passed to clients.

When clients such as virsh domif-setlink put back the modified
interface XML, the new device's virtio attribute is NULL. This may
fail the equality checks for virtio options in qemuDomainChangeNet,
depending on whether libvirtd was restarted since define or not.

This patch modifies the check for non-virtio models, to ignore olddev
value of virtio (assumed valid), and to allow either NULL or a struct
with all values ABSENT in the new virtio options.
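
A minimal sketch of the relaxed check described above, assuming a simplified options struct. The type and field names (`struct virtio_opts`, `enum tristate`) are illustrative only and do not reproduce libvirt's definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tristate and options struct; names are illustrative. */
enum tristate { TRISTATE_ABSENT = 0, TRISTATE_ON, TRISTATE_OFF };

struct virtio_opts {
    enum tristate iommu;
    enum tristate ats;
    enum tristate packed;
};

/* True when the options carry no information: either there is no struct
 * at all, or every field in it is ABSENT. */
bool virtio_opts_unset(const struct virtio_opts *o)
{
    return o == NULL ||
           (o->iommu == TRISTATE_ABSENT &&
            o->ats == TRISTATE_ABSENT &&
            o->packed == TRISTATE_ABSENT);
}

/* For a non-virtio NIC model the old device's options are ignored and
 * the new device is accepted only if its options are unset. */
bool nonvirtio_change_allowed(const struct virtio_opts *olddev,
                              const struct virtio_opts *newdev)
{
    (void)olddev;               /* assumed valid, deliberately not compared */
    return virtio_opts_unset(newdev);
}
```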

Signed-off-by: Miroslav Los <mirlos@cisco.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-09 13:20:05 +02:00
Martin Kletzander
db622081e0 vmx: Do not require all ID data for VMWare Distributed Switch
Similarly to commit 2482801608b8 we can safely ignore connectionId,
portId and portgroupId in both XML and VMX as they are only a blind
pass-through between XML and VMX and an ethernet without such parameters
was spotted in the wild.  On top of that even our documentation says the
whole VMWare Distributed Switch configuration is a best-effort.

Resolves: https://issues.redhat.com/browse/RHEL-46099

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-07-08 15:18:22 +02:00
Michal Privoznik
893800be49 virt-aa-helper: Allow RO access to /usr/share/edk2-ovmf
When a binary version of edk2 is distributed, the files reside
under /usr/share/edk2-ovmf as can be seen from Gentoo's ebuild
[1]. Allow virt-aa-helper to generate paths under that dir.

1: https://gitweb.gentoo.org/repo/gentoo.git/tree/sys-firmware/edk2-ovmf-bin/edk2-ovmf-bin-202202.ebuild
Resolves: https://bugs.gentoo.org/911786
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2024-07-07 07:24:56 +02:00
Daniel P. Berrangé
e40a533118 qemu: set swtpm log level parameter
This wires up the emulator 'debug' parameter to control the
/usr/bin/swtpm 'level' parameter for logging.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2024-07-05 14:43:15 +01:00
Daniel P. Berrangé
5c77ecd5f3 conf: add support for 'debug' parameter on TPM emulator
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2024-07-05 14:43:15 +01:00
John Levon
9559130693 test_driver: support VIR_DOMAIN_AFFECT_LIVE in testUpdateDeviceFlags()
Pick up some more of the qemu_driver.c code so this function supports
both CONFIG and LIVE updates.

Note that qemuDomainUpdateDeviceFlags() passed vm->def to
virDomainDeviceDefParse() for the VIR_DOMAIN_AFFECT_CONFIG case, which
is technically incorrect; in the test driver code we'll fix this.

Signed-off-by: John Levon <john.levon@nutanix.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-04 15:29:33 +02:00
Rayhan Faizel
1ebb892472 conf: Fix out-of-bounds write during cleanup of virDomainNumaDefNodeDistanceParseXML
mem_nodes[i].ndistances is written outside the loop causing an out-of-bounds
write leading to heap corruption.

While we are at it, the entire cleanup portion can be removed as it can be
handled in virDomainNumaFree. One instance of VIR_FREE is also removed and
replaced with g_autofree.

This patch also adds a testcase which would be picked up by ASAN, if this
portion regresses.

Fixes: 742494eed8dbdde8b1d05a306032334e6226beea
Signed-off-by: Rayhan Faizel <rayhan.faizel@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-04 14:58:15 +02:00
Tim Wiederhake
f67b12ba35 cpu_map: Ignore feature "kvm-asyncpf-vmexit"
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-03 13:36:24 +02:00
Tim Wiederhake
9c46fb8d3d cpu_map: Add missing feature "vmx-nested-exception"
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-03 13:36:20 +02:00
Tim Wiederhake
7e395b4ef0 cpu_map: Add missing feature "rfds-clear"
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-03 13:36:18 +02:00
Tim Wiederhake
3ff2d2d502 cpu_map: Add missing feature "rfds-no"
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-03 13:36:16 +02:00
Tim Wiederhake
aba89e2f98 cpu_map: Add missing feature "succor"
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-03 13:36:13 +02:00
Tim Wiederhake
62dc5d44a7 cpu_map: Add missing feature "overflow-recov"
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-03 13:36:11 +02:00
Tim Wiederhake
bcb4b246a9 cpu_map: Add missing feature "lam"
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-03 13:36:09 +02:00
Tim Wiederhake
4b556699c6 cpu_map: Add missing feature "wrmsrns"
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-03 13:36:06 +02:00
Tim Wiederhake
261fe98dee cpu_map: Add missing feature "lkgs"
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-03 13:36:04 +02:00
Tim Wiederhake
4d981bdb2c cpu_map: Add missing feature "fred"
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-03 13:35:36 +02:00
Adam Julis
c3302ceb1d qemuDomainChangeNet: forbid changing portgroup
Changing the portgroup attribute caused unexpected behavior.
Although it can be implemented, it has a non-trivial solution.
No requirement or use has yet been found for implementing this
feature, so it has been disabled for hot-plug.

Resolves: https://issues.redhat.com/browse/RHEL-7299
Signed-off-by: Adam Julis <ajulis@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-03 09:59:10 +02:00
Rayhan Faizel
70e826ec6a conf: Fix rawio/sgio checks for non-scsi hostdev devices
The current hostdev parsing logic sets rawio or sgio even if the hostdev type
is not 'scsi'. The rawio field in virDomainHostdevSubsysSCSI overlaps with
wwpn field in virDomainHostdevSubsysSCSIVHost, consequently setting a bogus
pointer value such as 0x1 or 0x2 from virDomainHostdevSubsysSCSIVHost's
point of view. This leads to a segmentation fault when it attempts to free
wwpn.

While setting sgio does not appear to crash, it shares the same flawed logic
as setting rawio.

Instead, we ensure these are set only after the hostdev type check succeeds.
This patch also adds two test cases to exercise both scenarios.
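
The overlap problem and the fix can be illustrated with a small tagged union. This is a sketch under assumptions: the struct layout and the helper `hostdev_set_rawio()` are hypothetical and much simpler than the real virDomainHostdevSubsys types.

```c
#include <stddef.h>

enum hostdev_type { HOSTDEV_SCSI, HOSTDEV_SCSI_VHOST };

struct scsi_src  { int rawio; int sgio; };
struct vhost_src { char *wwpn; };

struct hostdev {
    enum hostdev_type type;
    union {
        struct scsi_src scsi;    /* valid only when type == HOSTDEV_SCSI */
        struct vhost_src vhost;  /* valid only when type == HOSTDEV_SCSI_VHOST */
    } u;
};

/* Write rawio only after confirming which union member is active.
 * Returns 0 on success, -1 when the attribute does not apply to this
 * hostdev type, so the storage shared with wwpn is never clobbered. */
int hostdev_set_rawio(struct hostdev *dev, int rawio)
{
    if (dev->type != HOSTDEV_SCSI)
        return -1;
    dev->u.scsi.rawio = rawio;
    return 0;
}
```

Without the type check, the integer written through `u.scsi` would later be read back through `u.vhost.wwpn` as a pointer and passed to free().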

Fixes: bdb95b520c53f9bacc6504fc51381bac4813be38
Signed-off-by: Rayhan Faizel <rayhan.faizel@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-03 09:54:43 +02:00
John Levon
738b201aad test_driver: add testUpdateDeviceFlags implementation
Add basic coverage of device update; for now, only support disk updates
until other types are needed or tested.

Signed-off-by: John Levon <john.levon@nutanix.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-02 16:06:19 +02:00
Michal Privoznik
cf7d495324 qemu: Drop _virQEMUDriver::hostFips
The 'hostFips' member of _virQEMUDriver struct is no longer
used, due to previous cleanups. Drop it.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-07-02 09:14:24 +02:00
Michal Privoznik
ce48d584cc qemu_capabilities: Retire QEMU_CAPS_VXHS
The support for VXHS device was removed in QEMU commit
v5.1.0-rc1~16^2~10. Since we require QEMU-5.2.0 at least there's
no QEMU that has the device and thus the corresponding capability
can be retired.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-07-02 09:14:23 +02:00
Michal Privoznik
295eb1b3d8 qemu_capabilities: Retire QEMU_CAPS_ENABLE_FIPS
The capability is no longer used. Retire it.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-07-02 09:14:22 +02:00
Michal Privoznik
8cf81de8bf qemu_capabilities: Drop version check for QEMU_CAPS_ENABLE_FIPS and QEMU_CAPS_NETDEV_USER
Now that the minimal required version of QEMU is 5.2.0 the
conditional setting of QEMU_CAPS_ENABLE_FIPS and
QEMU_CAPS_NETDEV_USER is effectively dead code. Drop it.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-07-02 09:14:20 +02:00
Michal Privoznik
073bf16784 qemu_capabilities: Require QEMU-5.2.0 or newer
According to repology.org and/or distro repos these are the versions of QEMU:

     CentOS Stream 9: qemu-kvm-9.0.0
           Debian 11: qemu-5.2.0
           Fedora 39: qemu-8.1.3
  openSUSE Leap 15.3: qemu-5.2.0
              RHEL-8: qemu-6.2.0
        Ubuntu 22.04: qemu-6.2.0

Since the minimal version is 5.2.0 we can bump from 4.2.0 to
5.2.0.
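
As a rough sketch of what such a minimum-version gate looks like, the snippet below packs a version triple into one integer and compares it with the new floor. The packing scheme (major * 1000000 + minor * 1000 + micro) and all names here are assumptions made for illustration.

```c
/* Assumed minimum version after the bump described above. */
#define MIN_MAJOR 5
#define MIN_MINOR 2
#define MIN_MICRO 0

/* Pack a version triple into one comparable integer
 * (major * 1000000 + minor * 1000 + micro; an assumed encoding). */
unsigned long version_pack(unsigned int major, unsigned int minor,
                           unsigned int micro)
{
    return major * 1000000UL + minor * 1000UL + micro;
}

/* Returns 1 when the given version meets the minimum, 0 otherwise. */
int version_supported(unsigned int major, unsigned int minor,
                      unsigned int micro)
{
    return version_pack(major, minor, micro) >=
           version_pack(MIN_MAJOR, MIN_MINOR, MIN_MICRO);
}
```

With this floor, every distro version in the list above passes, while the previous minimum of 4.2.0 is rejected.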

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-02 09:14:18 +02:00
Michal Privoznik
8f34fd0c4c qemu_domain: Set 'passt' net backend if 'default' is unsupported
It may happen that QEMU is compiled without SLIRP but with
support for passt. In such case it is acceptable to alter user
provided configuration and switch backend to passt as it offers
all the features as SLIRP.
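
The fallback can be sketched as follows. The enum, the capability struct and `net_backend_fixup()` are hypothetical names, and folding the support check into the same helper is a simplification for this example.

```c
#include <stdbool.h>

enum net_backend { NET_BACKEND_DEFAULT, NET_BACKEND_PASST };

/* Hypothetical summary of what the hypervisor build supports. */
struct backend_caps {
    bool slirp;   /* builtin SLIRP, i.e. the 'default' backend */
    bool passt;
};

/* Switch 'default' to passt when SLIRP is missing but passt exists.
 * Returns 0 when the resulting backend is usable, -1 otherwise. */
int net_backend_fixup(enum net_backend *backend,
                      const struct backend_caps *caps)
{
    if (*backend == NET_BACKEND_DEFAULT && !caps->slirp && caps->passt)
        *backend = NET_BACKEND_PASST;

    if (*backend == NET_BACKEND_DEFAULT)
        return caps->slirp ? 0 : -1;
    return caps->passt ? 0 : -1;
}
```

An explicit request for passt is never rewritten; only the unspecified (default) case is adjusted.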

Resolves: https://issues.redhat.com/browse/RHEL-45518
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-07-01 12:40:06 +02:00
Michal Privoznik
bd6060d1c3 qemu_validate: Use domaincaps to validate supported net backend type
Now that the logic for detecting supported net backend types has
been moved to domain capabilities generation, we can just use it
when validating net backend type. Just like we do for device
models and so on.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-07-01 12:39:10 +02:00
Michal Privoznik
751a327423 conf: Accept 'default' backend type for <interface type='user'/>
After previous commits, domain capabilities XML reports basically
two possible values for backend type: 'default' and 'passt'.
Despite its misleading name, 'default' really means 'use
hypervisor's builtin SLIRP'. Since it's reported in domain
capabilities as a value accepted, make our parser and XML schema
accept it.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-07-01 12:38:21 +02:00
Michal Privoznik
6a0f45a9e0 qemu_capabilities: Fill supported net backend types
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-07-01 12:37:27 +02:00
Michal Privoznik
2d3a42cb7c domain_capabilities: Introduce netdev capabilities
If mgmt apps on top of libvirt want to make a decision on the
backend type for <interface type='user'/> (e.g. whether passt is
supported) we currently offer them no way to learn this fact.
Domain capabilities were invented exactly for this reason. Report
supported net backend types there.

Now, because of backwards compatibility, specifying no backend
type (which translates to VIR_DOMAIN_NET_BACKEND_DEFAULT) means
"use hypervisor's builtin SLIRP". That behaviour can not be
changed. But it may happen that the hypervisor has no support for
SLIRP. So we have to report it.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-07-01 12:36:28 +02:00
Michal Privoznik
73fc20e262 qemu_validate: Validate net backends against QEMU caps
Now that we have a capability for each domain net backend we can
start validating user's selection against QEMU capabilities.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-07-01 12:33:14 +02:00
Michal Privoznik
e28bc15f09 qemu_capabilities: Introduce QEMU_CAPS_NETDEV_USER
Since -netdev user can be disabled during QEMU compilation, we
can't blindly expect it to just be there. We need a capability
that tracks its presence.

For qemu-4.2.0 we are not able to detect the capability so do the
next best thing - assume the capability is there. This is
consistent with our current behaviour where we blindly assume the
capability, anyway.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-07-01 12:32:16 +02:00
Michal Privoznik
e42f9e40b9 libvirt_private.syms: Export virDomainNetBackendType enum handlers
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-07-01 12:28:03 +02:00
Pavel Hrdina
67fdc636bf vircgroup: fix g_variant_new_parsed format string causing abort
The original code was incorrect and never tested because at the time of
implementing it the cgroup file `io.weight` was not available.

Resolves: https://issues.redhat.com/browse/RHEL-45185
Introduced-by: 9c1693eff427661616ce1bd2795688f87288a412
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-06-28 16:51:33 +02:00
Jon Kohler
76e2dae01a qemu: fix switchover-ack regression for old qemu
When enabling switchover-ack on qemu from libvirt, the .party value
was set to both source and target; however, qemuMigrationParamsCheck()
only takes that into account to validate that the remote side of the
migration supports the flag if it is marked optional or auto/always on.

In the case of switchover-ack, when enabled on only the dst and not
the src, the migration will fail if the src qemu does not support
switchover-ack, as the dst qemu will issue a switchover-ack msg:
qemu/migration/savevm.c ->
  loadvm_process_command ->
    migrate_send_rp_switchover_ack(mis) ->
      migrate_send_rp_message(mis, MIG_RP_MSG_SWITCHOVER_ACK, 0, NULL)

Since the src qemu doesn't understand messages with header_type ==
MIG_RP_MSG_SWITCHOVER_ACK, qemu will kill the migration with error:
  qemu-kvm: RP: Received invalid message 0x0007 length 0x0000
  qemu-kvm: Unable to write to socket: Bad file descriptor

Looking at the original commit [1] for optional migration capabilities,
it seems that the spirit of optional handling was to enhance a given
existing capability where possible. Given that switchover-ack
exclusively depends on return-path, adding it as optional to that cap
feels right.

[1] 61e34b08568 ("qemu: Add support for optional migration capabilities")

Fixes: 1cc7737f69e ("qemu: add support for qemu switchover-ack")

Signed-off-by: Jon Kohler <jon@nutanix.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Avihai Horon <avihaih@nvidia.com>
Cc: Jiri Denemark <jdenemar@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: YangHang Liu <yanghliu@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-06-28 08:50:12 +02:00
Michal Privoznik
ea73fcb3e3 remote_daemon_dispatch: Unref sasl session when closing client connection
In an ideal world, where clients close connections gracefully, their
SASL session is freed in virNetServerClientDispose() as it's
stored in client->sasl. Unfortunately, if client connection is
closed prematurely (e.g. the moment virsh asks for credentials),
the _virNetServerClient member is never set and corresponding
SASL session is never freed. The handler is still stored in
client private data, so free it in remoteClientCloseFunc().

  20,862 (288 direct, 20,574 indirect) bytes in 3 blocks are definitely lost in loss record 1,763 of 1,772
     at 0x50390C4: g_type_create_instance (in /usr/lib64/libgobject-2.0.so.0.7800.6)
     by 0x501BDAF: g_object_new_internal.part.0 (in /usr/lib64/libgobject-2.0.so.0.7800.6)
     by 0x501D43D: g_object_new_with_properties (in /usr/lib64/libgobject-2.0.so.0.7800.6)
     by 0x501E318: g_object_new (in /usr/lib64/libgobject-2.0.so.0.7800.6)
     by 0x49BAA63: virObjectNew (virobject.c:252)
     by 0x49BABC6: virObjectLockableNew (virobject.c:274)
     by 0x4B0526C: virNetSASLSessionNewServer (virnetsaslcontext.c:230)
     by 0x18EEFC: remoteDispatchAuthSaslInit (remote_daemon_dispatch.c:3696)
     by 0x15E128: remoteDispatchAuthSaslInitHelper (remote_daemon_dispatch_stubs.h:74)
     by 0x4B0FA5E: virNetServerProgramDispatchCall (virnetserverprogram.c:423)
     by 0x4B0F591: virNetServerProgramDispatch (virnetserverprogram.c:299)
     by 0x4B18AE3: virNetServerProcessMsg (virnetserver.c:135)
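
The shape of the fix, releasing a session that is still parked in per-client private data when the connection closes, can be sketched like this. All names (`struct session`, `struct client_private`, `client_close()`) are hypothetical; the real code uses libvirt's object reference counting.

```c
#include <stdlib.h>

/* Hypothetical reference-counted session object. */
struct session { int refs; };

/* Hypothetical per-client private data holding a pending session. */
struct client_private { struct session *sasl; };

struct session *session_new(void)
{
    struct session *s = calloc(1, sizeof(*s));

    if (s)
        s->refs = 1;
    return s;
}

/* Drop one reference and free the object on the last one.
 * Returns the number of references left (0 for NULL or freed). */
int session_unref(struct session *s)
{
    int left;

    if (!s)
        return 0;
    left = --s->refs;
    if (left == 0)
        free(s);
    return left;
}

/* Close handler: release a session that never made it into the client
 * object, and clear the pointer so a second call is harmless. */
void client_close(struct client_private *priv)
{
    session_unref(priv->sasl);
    priv->sasl = NULL;
}
```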

Resolves: https://issues.redhat.com/browse/RHEL-22574
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-27 17:02:51 +02:00
Michal Privoznik
fbe97ee17d qemu_validate: Use domaincaps to validate supported launchSecurity type
Now that the logic for detecting supported launchSecurity types
has been moved to domain capabilities generation, we can just use
it when validating launchSecurity type. Just like we do for
device models and so on.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-25 14:46:08 +02:00
Michal Privoznik
66df7992d8 qemu: Fill launchSecurity in domaincaps
The inspiration for these rules comes from
qemuValidateDomainDef().

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-25 14:46:05 +02:00
Michal Privoznik
d460e17282 domcaps: Report launchSecurity
In order to learn what types of <launchSecurity/> are supported
users can turn to domain capabilities and find <sev/> and
<s390-pv/> elements. While these may expose some additional info
on individual launchSecurity types, we are lacking clean
enumeration (like we do for say device models). And given that
SEV and SEV SNP share the same basis (info found under <sev/> is
applicable to SEV SNP too) we have no other way to report SEV SNP
support.

Therefore, report supported launchSecurity types in domain
capabilities.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-25 14:46:03 +02:00
Michal Privoznik
d00816209e qemu_capabilities: Probe SEV capabilities even for QEMU_CAPS_SEV_SNP_GUEST
While it's very unlikely to have QEMU that supports SEV-SNP but
doesn't support plain SEV, for completeness' sake we ought to
query SEV capabilities if QEMU supports either. And similarly to
QEMU_CAPS_SEV_GUEST we need to clear the capability if talking to
QEMU proves SEV is not really supported.

This in turn removes the 'sev-snp-guest' capability from one of
our test cases as Peter's machine he uses to refresh capabilities
is not SEV capable. But that's okay. It's consistent with
'sev-guest' capability.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-25 14:46:00 +02:00
Michal Privoznik
3a6ca064ca libvirt_private.syms: Export virDomainLaunchSecurity enum handlers
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-25 14:45:54 +02:00
Rayhan Faizel
9b0606ef8e qemu_block: Validate number of hosts for iSCSI disk device
An iSCSI device with zero hosts will result in a segmentation fault. This patch
adds a check for the number of hosts, which must be one in the case of iSCSI.
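
A minimal sketch of such a check, assuming a simplified source description. `struct net_source` and `validate_net_source()` are hypothetical names, not the functions touched by this patch.

```c
#include <stddef.h>

enum net_protocol { PROTO_ISCSI, PROTO_OTHER };

/* Hypothetical, simplified view of a network disk source. */
struct net_source {
    enum net_protocol protocol;
    size_t nhosts;              /* number of <host> elements parsed */
};

/* Returns 0 when the source is acceptable, -1 for an iSCSI source that
 * does not have exactly one host. Rejecting it here means later code
 * can dereference the first host without a NULL check. */
int validate_net_source(const struct net_source *src)
{
    if (src->protocol == PROTO_ISCSI && src->nhosts != 1)
        return -1;
    return 0;
}
```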

Minimal reproducing XML:

<domain type='qemu'>
    <name>MyGuest</name>
    <uuid>4dea22b3-1d52-d8f3-2516-782e98ab3fa0</uuid>
    <os>
        <type arch='x86_64'>hvm</type>
    </os>
    <memory>4096</memory>
    <devices>
        <disk type='network'>
            <source name='dummy' protocol='iscsi'/>
            <target dev='vda'/>
        </disk>
    </devices>
</domain>

Signed-off-by: Rayhan Faizel <rayhan.faizel@gmail.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-06-25 10:05:49 +02:00
Jon Kohler
1cc7737f69 qemu: add support for qemu switchover-ack
Add plumbing for QEMU's switchover-ack migration capability, which
helps lower the downtime during VFIO migrations. This capability is
enabled by default as long as both the source and destination support
it.

Note: switchover-ack depends on the return path capability, so this may
not be used when VIR_MIGRATE_TUNNELLED flag is set.

Extensive details about the qemu switchover-ack implementation are
available in the qemu series v6 cover letter [1] where the highlight is
the extreme reduction in guest visible downtime. In addition to the
original test results below, I saw a roughly 20% reduction in downtime
for VFIO VGPU devices at minimum.

  === Test results ===

  The below table shows the downtime of two identical migrations. In the
  first migration switchover ack is disabled and in the second it is
  enabled. The migrated VM is assigned with a mlx5 VFIO device which has
  300MB of device data to be migrated.

  +----------------------+-----------------------+----------+
  |    Switchover ack    | VFIO device data size | Downtime |
  +----------------------+-----------------------+----------+
  |       Disabled       |         300MB         |  1900ms  |
  |       Enabled        |         300MB         |  420ms   |
  +----------------------+-----------------------+----------+

  Switchover ack gives a roughly 4.5 times improvement in downtime.
  The 1480ms difference is time that is used for resource allocation for
  the VFIO device in the destination. Without switchover ack, this time is
  spent when the source VM is stopped and thus the downtime is much
  higher. With switchover ack, the time is spent when the source VM is
  still running.

[1] https://patchwork.kernel.org/project/qemu-devel/cover/20230621111201.29729-1-avihaih@nvidia.com/

Signed-off-by: Jon Kohler <jon@nutanix.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Avihai Horon <avihaih@nvidia.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: YangHang Liu <yanghliu@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-06-25 09:51:00 +02:00
Jiri Denemark
e622970c87 qemu: Fix migration with disabled vmx-* CPU features
When starting a domain on a host which lacks a vmx-* CPU feature which
is expected to be enabled by the CPU model specified in the domain XML,
libvirt properly marks such feature as disabled in the active domain
XML. But migrating the domain to a similar host which lacks the same
vmx-* feature will fail with libvirt reporting the feature as missing.
This is because of a bug in the hack ensuring backward compatibility:
libvirt running on the destination thinks the missing feature is
expected to be enabled.

https://issues.redhat.com/browse/RHEL-40899

Fixes: v10.1.0-85-g5fbfa5ab8a
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-06-25 09:41:16 +02:00
Jonathon Jongsma
af437d2d64 qemu: Don't specify vfio-pci.ramfb when ramfb is false
Commit 7c8e606b64c73ca56d7134cb16d01257f39c53ef attempted to fix
the specification of the ramfb property for vfio-pci devices, but it
failed when ramfb is explicitly set to 'off'. This is because only the
'vfio-pci-nohotplug' device supports the 'ramfb' property. Since we use
the base 'vfio-pci' device unless ramfb is enabled, attempting to set
the 'ramfb' parameter to 'off' will result in an error like the
following:

  error: internal error: QEMU unexpectedly closed the monitor
  (vm='rhel'): 2024-06-06T04:43:22.896795Z qemu-kvm: -device
  {"driver":"vfio-pci","host":"0000:b1:00.4","id":"hostdev0","display":"on
  ","ramfb":false,"bus":"pci.7","addr":"0x0"}: Property 'vfio-pci.ramfb'
  not found.

This also more closely matches what is done for mdev devices.
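
The rule can be sketched as a small formatter that emits the 'ramfb' property only together with the 'vfio-pci-nohotplug' driver. `format_vfio_props()` is a hypothetical helper and the JSON is reduced to the fields relevant here.

```c
#include <stdbool.h>
#include <stdio.h>

/* Format reduced -device properties for a VFIO hostdev. The 'ramfb'
 * key appears only when ramfb is enabled, because only the
 * 'vfio-pci-nohotplug' device knows that property.
 * Returns the value of snprintf(). */
int format_vfio_props(char *buf, size_t len, const char *host, bool ramfb)
{
    if (ramfb)
        return snprintf(buf, len,
                        "{\"driver\":\"vfio-pci-nohotplug\","
                        "\"host\":\"%s\",\"ramfb\":true}", host);

    /* ramfb off or unspecified: base device, no ramfb key at all */
    return snprintf(buf, len,
                    "{\"driver\":\"vfio-pci\",\"host\":\"%s\"}", host);
}
```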

Resolves: https://issues.redhat.com/browse/RHEL-28808

Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-24 08:55:50 -05:00
Adam Julis
3a9095976e qemuDomainDiskChangeSupported: Fill in missing check
The attribute 'discard_no_unref' of <disk/> is not allowed to be
changed while the virtual machine is running.

Resolves: https://issues.redhat.com/browse/RHEL-37542
Signed-off-by: Adam Julis <ajulis@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-24 11:14:56 +02:00
Laine Stump
43a0881274 network: allow for forward dev to be a transient interface
A user reported that if they set <forward mode='nat|route' dev='blah'>
starting the network would fail if the device 'blah' didn't already
exist.

This is caused by using "iif" and "oif" in nftables rules to check for
the forwarding device - these two commands work by saving the named
interface's ifindex (an unsigned integer) when the rule is added, and
comparing it to the ifindex associated with the packet's path at
runtime. This works great if the interface both 1) exists when the
rule is added, and 2) is never deleted and re-created after the rule
is added (since it would end up with a different ifindex).

When checking for the network's bridge device, it is okay for us to
use "iif" and "oif", because the bridge device is created before the
firewall rules are added, and will continue to exist until just after
the firewall rules are deleted when the network is shut down.

But since the forward device might be deleted/re-added during the
lifetime of the network's firewall rules, we must instead use "oifname"
and "iifname" - these are much less efficient than "Xif" because they
do a string compare of the interface's name rather than just comparing
two integers (ifindex), but they don't require the interface to exist
when the rule is added, and they can properly cope with the named
interface being deleted and re-added later.
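
The difference between the two matching styles can be modelled in a few lines. This is a toy model, not nftables code: the structs and match functions are hypothetical and only mimic "store the ifindex when the rule is added" versus "store the name".

```c
#include <stdbool.h>
#include <string.h>

/* Toy model of a network interface as seen at packet time. */
struct iface {
    char name[16];
    unsigned int ifindex;
};

/* "iif"/"oif" style: the rule remembers the ifindex it saw when added. */
struct rule_by_index { unsigned int ifindex; };

/* "iifname"/"oifname" style: the rule remembers the interface name. */
struct rule_by_name { char name[16]; };

bool match_by_index(const struct rule_by_index *r, const struct iface *dev)
{
    return r->ifindex == dev->ifindex;      /* cheap integer compare */
}

bool match_by_name(const struct rule_by_name *r, const struct iface *dev)
{
    return strcmp(r->name, dev->name) == 0; /* slower string compare */
}
```

If an interface named "blah" is deleted and re-created with a new ifindex, the index-based rule stops matching while the name-based rule keeps working.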

Fixes: a4f38f6ffe6a9edc001d18890ccfc3f38e72fb94
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-21 06:52:57 -04:00