The current description of the various foo_image_format settings can
be construed to imply that the setting is only used to control compression
of the image. Improve the documentation to clarify that the format describes
the representation of guest memory blocks on disk, which includes
compression among other possible layouts.
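For illustration only (not part of the original change), the qemu.conf settings this wording covers, shown with example values rather than defaults:
  save_image_format = "zstd"
  dump_image_format = "raw"
  snapshot_image_format = "raw"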
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
With the new snapshot QMP command we can select which block device
backend receives the VM state and thus the main issue with internal
snapshots with pflash was addressed.
Thus we can relax the check and allow snapshots if the pflash nvram is
on qcow2.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
As explained in the commit which added the new internal snapshot
deletion code we don't want to do any form of strict checking whether
the libvirt metadata is consistent with the on-disk state as we didn't
historically do that.
In order to be able to spot such cases, add a warning to the logs when
this state is encountered. While warnings are easy to miss, it's the only
reasonable way to do that. Users will be encouraged to file an issue
with the information, without requiring them to enable debug logs as
the reproduction of that issue may include very old historical state.
The checker is deliberately added separately so that it can be easily
reverted once it's no longer needed.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Switch to using the modern QMP command.
As the user visible logic when deleting internal snapshots using the old
'delvm' command was very lax in terms of catching inconsistencies
between the snapshot metadata and the on-disk state, we re-implement this
behaviour even when using the new command. We could improve the validation
but that'd go at the cost of possible failures which users might not
expect.
As 'delvm' was simply ignoring any kind of failure the selection of
devices to delete the snapshot from is based on first querying qemu for
which top-level images actually have the internal snapshot and then
continuing only with those.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The usage of HMP commands is highly discouraged by qemu. Moreover, the
current snapshot creation routine does not provide flexibility in
choosing the target device for the VM state snapshot.
This patch makes use of the QMP command snapshot-save and by default
chooses the first writable non-shared qcow2 disk (if present) as the
target for the VM state.
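As a rough illustration (the node names below are made up, not taken from this patch), the QMP command issued now looks like:
  { "execute": "snapshot-save",
    "arguments": { "job-id": "snapsave0",
                   "tag": "snap1",
                   "vmstate": "libvirt-1-format",
                   "devices": [ "libvirt-1-format", "libvirt-2-format" ] } }
Completion is then reported through the jobs API rather than the command's return value.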
Signed-off-by: Nikolai Barybin <nikolai.barybin@virtuozzo.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Store the names of internal snapshots present in supported images in the
data we dump from 'query-named-block-nodes' so that the upcoming changes
to the internal snapshot code can access it.
To test this we use the bitmap detection test cases which can be easily
extended to dump this data.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The 'snapshot-save/delete' QMP commands were introduced in QEMU 6.0.0,
so we add a corresponding capability to check whether the target QEMU binary supports them.
Signed-off-by: Nikolai Barybin <nikolai.barybin@virtuozzo.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The snapshot creation/deletion QMP commands use the qemu 'job' API
to signal completion thus we need to add corresponding job types.
As the job handles everything internally we don't store anything about
the job.
Signed-off-by: Nikolai Barybin <nikolai.barybin@virtuozzo.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Nikolai Barybin <nikolai.barybin@virtuozzo.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Callers must handle the return value of this function as the VM might
have died. Add compiler annotation to force it.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
QEMU calls the same feature differently, but translating the names in
libvirt does not make sense because the name in QEMU conflicts with
another feature. QEMU will not change the name for compatibility reasons
so we can just drop our invented name as it is not supported by QEMU.
Apart from this slightly different reason behind the feature being
unsupported by QEMU, the situation is similar to vmx-ept-{uc,wb} dropped
in the previous patch and so are the implications.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Although QEMU knows and enables the corresponding MSR bits, it does not
allow users to configure them (there are no names attached to them).
They should have never been added to the CPU map and definitely not to
CPU models as the features will always be considered disabled regardless
of their actual state, as QEMU will not report them.
While we cannot drop them completely for backward compatibility, we can
at least remove them from all CPU models.
This is effectively no change for CPU models where the features were
marked with added='yes' because migration source would always remove the
features from domain XML so not adding them to the live XML does not
hurt. On the other hand, the destination could never be surprised by
the features being suddenly enabled as QEMU never reports them, which
means libvirt considers them disabled all the time.
GraniteRapids CPU model is the only one which contains the feature ever
since it was introduced in libvirt, but it was never possible to migrate
a domain with such CPU. The source would always mark vmx-ept-wb as
disabled and the destination without the fixes in this series would drop
the feature from the XML completely as it is unsupported by QEMU and
disabled, but when probing for the actual CPU created by QEMU libvirt
would expect the feature to be enabled (as it is included in the CPU
model and not explicitly mentioned in the domain definition) and fail
the migration. There's nothing the source could do to work around the
behavior on the destination and migration to older libvirt will still be
broken. But it's possible to migrate a domain with GraniteRapids to a
destination with this series applied from both old and new source.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This feature is called "vmx-invept-single-context-noglobals" in QEMU and
our CPU map even contains the appropriate alias. But we failed to
actually translate the name when talking to QEMU.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
CPU features with policy='disable' which are unknown to QEMU may be
safely skipped when generating the -cpu command line, but we should
still keep them in the domain definition so that we can properly check
they are disabled after migrating the domain to a newer QEMU.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
When qemuDomainMakeCPUMigratable is called with origCPU == NULL, the code
removed all vmx-* features marked as added in the specified CPU model,
just as it does when origCPU is not NULL but does not list any of the
vmx-* features. This is wrong: we should not touch these features at
all when no origCPU is supplied, which happens when parsing XML passed
by a user (e.g., migration XML). Such XML is supposed to be generated by
libvirt as migration XML and contains only vmx-* features explicitly
requested by the user.
https://issues.redhat.com/browse/RHEL-52314
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
By not attempting to lock the lock file, which would fail.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This is needed when migrating a guest that has persistent TPM
state: relabeling (which implies locking) needs to happen
before the swtpm process is started on the destination host,
but the lock file won't be released by the swtpm process
running on the source host before a handshake with the target
process has happened, creating a catch-22 scenario.
In order to make migration possible, make it so that locking
for lock files can be explicitly skipped. All other state
files are handled as usual.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
In the case when the user exports images from the current host and there is an
incoming migration from a remote host, security label remembering would
be possible but would attempt to remember the label allowing access to
the image, as the image is already in use by a VM on the remote host.
To prevent remembering the wrong label, we'll skip the remembering of
the label for any shared resource, so that the code behaves identically
regardless of how the image is accessed.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Up until this point, we have avoided setting labels for
incoming migration when the TPM state is stored on a shared
filesystem. This seems to make sense, because since the
underlying storage is shared surely the labels will be as
well.
There's one problem, though: when a guest is migrated, the
SELinux context for the destination process is different from
the one of the source process.
We haven't hit any issues with the current approach so far
because NFS doesn't support SELinux, so effectively it doesn't
matter whether relabeling happens or not: even if the SELinux
contexts of the source and target processes are different,
both will be able to access the storage.
Now that it's possible for the local admin to manually mark
exported directories as shared filesystems, however, things
can get problematic.
Consider the case in which one host (mig-one) exports its
local filesystem /srv/nfs/libvirt/swtpm via NFS, and at the
same time bind-mounts it to /var/lib/libvirt/swtpm; another
host (mig-two) mounts the same filesystem to the same
location, this time via NFS. Additionally, in order to
allow migration in both directions, on mig-one the
/var/lib/libvirt/swtpm directory is listed in the
shared_filesystems qemu.conf option.
When migrating from mig-one to mig-two, things work just fine;
going in the opposite direction, however, results in an error:
# virsh migrate cirros qemu+ssh://mig-one/system
error: internal error: QEMU unexpectedly closed the monitor (vm='cirros'):
qemu-system-x86_64: tpm-emulator: Setting the stateblob (type 1) failed with a TPM error 0x1f
qemu-system-x86_64: error while loading state for instance 0x0 of device 'tpm-emulator'
qemu-system-x86_64: load of migration failed: Input/output error
This is because the directory on mig-one is considered a
shared filesystem and thus labeling is skipped, resulting in
a SELinux denial.
The solution is quite simple: remove the check and always
relabel. We know that it's okay to do so not just because it
makes the error seen above go away, but also because no such
check currently exists for disks and other types of persistent
storage such as NVRAM files, which always get relabeled.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
virFileIsSharedFS() is the function that ultimately decides
whether a filesystem should be considered shared, but the list
of manually configured shared filesystems is part of the QEMU
driver's configuration, so we need to pass the information
through several layers in order to make use of it.
Note that with this change the list is propagated all the way
through, but its contents are still ignored, so the behavior
remains the same for now.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
As explained in the comment, this can help in scenarios where
a shared filesystem can't be detected as such by libvirt, by
giving the admin the opportunity to provide this information
manually.
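For example (the paths are illustrative), the new qemu.conf option takes a list of mount points:
  shared_filesystems = [ "/var/lib/libvirt/swtpm", "/srv/nfs/images" ]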
https://issues.redhat.com/browse/RHEL-35752
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The new 'VIR_MIGRATE_PARAM_MIGRATE_DISKS_DETECT_ZEROES' migration
parameter allows users of migration to pass in a list of disks where
zero-detection (which avoids transferring zeroed blocks) should be
enabled for the migration connection. This comes at the cost of extra
CPU cycles needed to check whether each block is all-zero.
This is useful for storage backends where information about the
allocation state of a block is not available and thus without this the
image would become fully allocated on the destination.
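A minimal sketch of how a client could request this through the public API; the disk target "vda" and the destination URI are placeholders, and the new parameter constant is only available in a libvirt containing this patch:
  #include <libvirt/libvirt.h>
  int
  migrateWithDetectZeroes(virDomainPtr dom)
  {
      virTypedParameterPtr params = NULL;
      int nparams = 0;
      int maxparams = 0;
      int rc;
      /* enable zero-detection for the disk with target 'vda' (placeholder) */
      if (virTypedParamsAddString(&params, &nparams, &maxparams,
                                  VIR_MIGRATE_PARAM_MIGRATE_DISKS_DETECT_ZEROES,
                                  "vda") < 0)
          return -1;
      rc = virDomainMigrateToURI3(dom, "qemu+ssh://dst.example.com/system",
                                  params, nparams,
                                  VIR_MIGRATE_LIVE | VIR_MIGRATE_NON_SHARED_DISK);
      virTypedParamsFree(params, nparams);
      return rc;
  }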
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The migration code is checking the disk list provided via
VIR_MIGRATE_PARAM_MIGRATE_DISKS against existing disks. Extract it to a
helper function as we'll be passing another list of disk targets soon.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
'migration_disks' is a NULL-terminated string list, so the code can be
converted to either iterate the string-list, use existing accessors or
check the presence of the pointers instead of checking the count.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The actual number of disks to migrate is not important. The presence of
disks to migrate can be inferred from presence of the 'migrate_disks'
pointer which is logged.
Since 'nmigrate_disks' will eventually be removed, remove the logging
right now.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The function open-coded the check whether a disk is being migrated
with non-shared storage and did so badly (not taking into account the case
when the user doesn't explicitly provide a list of disks to migrate).
Use the existing helper instead.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Specify that the <allocation> parameter for the newly-created qcow2
image is 0 so that only metadata gets preallocated. Otherwise the
storage driver code instructs qemu to use 'fallocate' preallocation mode
and considers the image fully allocated.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The 'disk->mirrorJob' and 'disk->mirrorState' fields need to be cleared
after a blockjob, but should be kept around while 'disk->mirror' is
still in place. As 'disk->mirror' is cleared only after conclusion of
the job in 'qemuBlockJobEventProcessConcluded()' we should be resetting
them only afterwards.
Move the code later, but since the job is unregistered from the disk we
need to store the pointer to the disk before concluding the job.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
When concluding a job with a 'mirror' we first unplugged the appropriate
no-longer used images from qemu and then updated the definition.
Normally this wouldn't be a problem because for any other thread this is
done under the VM lock thus atomic. Unfortunately though, the AppArmor
security backend is using a VM XML to pass data to the helper process
and the state of the definition at that point was unsuitable to format a
valid XML thus making 'virt-aa-helper' report parsing failure.
Since we're removing the images the proper state of the VM definition
indeed should not include the mirror element any more at the point when
the images are removed.
Closes: https://gitlab.com/libvirt/libvirt/-/issues/601
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Commit a37bd2a15b eliminated a failure
to update *any* change in an interface that was connected via a
network that consisted of a pool of VFs using macvtap passthrough
mode. Unfortunately it caused a regression that results in failure to
update changes to bandwidth/vlan/trustGuestRxFilters in any interface
connected via a network that uses a bridge to connect tap devices.
This fixes that problem by narrowing the usage of the fix in the
earlier patch to only be done in the case that the interface is
connected via a macvtap+passthrough network.
Signed-off-by: Laine Stump <laine@redhat.com>
Fixes: a37bd2a15b
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
QEMU uses Linux extensions to madvise() to include/exclude guest
memory from core dump. These are obviously not available
everywhere. Currently, users have two options:
1) configure <memory dumpCore=''/> in domain XML, or
2) configure dump_guest_core in qemu.conf
While these work, they may harm user experience as "things just
don't work" out of the box. Provide sane default in
virQEMUDriverConfigNew() so neither of two options is required.
To have predictable results in tests, explicitly set
cfg->dumpGuestCore to false in qemuTestDriverInit() (which
creates cfg object for tests).
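For reference, the two pre-existing knobs mentioned above look like this (values are examples):
  <memory unit='KiB' dumpCore='off'>4194304</memory>    (domain XML)
  dump_guest_core = 1                                    (qemu.conf)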
Resolves: https://gitlab.com/libvirt/libvirt/-/issues/679
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
In qemu.conf.in we give examples of enabling/disabling core
dumps in domain XML. But the attribute is spelled wrong.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
This makes qemuDomainGenerateMemoryBackingPath() nicer to call.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This way they make sense not only based on where they are located but
the name also relates to what they are actually doing.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
After previous patches it is not used (and should not be used) outside
of qemu_domain.c.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The function qemuGetMemoryBackingPath() does not need the @def any more
and priv->memoryBackingDir can be used instead of constructing the path
by calling qemuGetMemoryBackingDomainPath().
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This way we keep the path for each running VM.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This way we _can_ (but do not, yet) remember the memory backing path for
running domains even after configuration change and daemon restart.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This way it does not use the driver, since it will be reworked later, and
it hopefully makes the following patches cleaner.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This patch simplifies (?) the qemuDomainChangeNet() code while
fixing some incorrect decisions about exactly when it's necessary to
re-attach an interface's bridge device, or to fail the device update
(needReconnect[*]) because the type of connection has changed (or
within bridge and direct (macvtap) type because some attribute of the
connection has changed that can't actually be modified after the
tap/macvtap device of the interface is created).
Example 1: it's pointless to require the bridge device to be
reattached just because the interface has been switched to a different
network (i.e. the name of the network is different), since the new
network could be using the same bridge as the old network (very
uncommon, but technically possible). Instead we should only care if
the name of the *bridge device* changes (or if something in
<virtualport> changes - see Example 3).
Example 2: wrt changing the "type" of the interface, a change should
be allowed if old and new type both used a bridge device (whether or
not the name of the bridge changes), or if old and new type are both
"direct" *and* the device being linked and macvtap mode remain the
same. Any other change in interface type cannot be accommodated and
should be a failure (i.e. needReconnect).
Example 3: there is no valid reason to fail just because the interface
has a <virtualport> element - the <virtualport> could just say
"type='openvswitch'" in both the before and after cases (in which case
it isn't a change by itself, and so is completely acceptable), and
even if the interfaceid changes, or the <virtualport> disappears
completely, that can still be reconciled by simply re-attaching the
bridge device. (If, on the other hand, the modified <virtualport> is
for a type='direct' interface, we can't modify that, and so must
fail (needReconnect).)
(I tried splitting this into multiple patches, but they were so
intertwined that the intermediate patches made no sense.)
[*] "needReconnect" was a flag added to this function way back in
2012, when I still believed that QEMU might someday support connecting
a new & different device backend (the way the virtual device connects
to the host) to an already existing guest netdev (the virtual device
as it appears to the guest). Sadly that has never happened, so for the
purposes of qemuDomainChangeNet() "needReconnect" is equivalent to
"fail".
Resolves: https://issues.redhat.com/browse/RHEL-7036
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The new function does what the old qemuDomainChangeNetbridge() did
manually, except that:
1) the new function supports changing from a bridge of one type to
another, e.g. from a Linux host bridge to an OVS
bridge. (previously that wasn't handled)
2) the new function doesn't emit audit log messages. This is actually
a good thing, because the old code would just log a "detach"
followed immediately by "attach" for the same MAC address, so it's
essentially a NOP. (the audit logs don't have any more detailed
info about the connection - just the VM name and MAC address, so it
makes no sense to log the detach/attach pair as it's not providing
any information).
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Attempts to use update-device to modify just the link state of a guest
interface were failing due to a supposed attempt to modify something
in the interface that can't be modified live (even though the only
thing that was changing was the link state, which *can* be modified
live).
It turned out that this failure happened because the guest interface
in question was type='network', and the network in question was a
'direct' network that provides each guest interface with one device
from a pool of network devices. As a part of qemuDomainChangeNet() we
would always allocate a new port from the network driver for the
updated interface definition (by way of calling
virDomainNetAllocateActualDevice(newdev)), and this new port (ie the
ActualNetDef in newdev) would of course be allocated a new host device
from the pool (which would of course be different from the one
currently in use by the guest interface (in olddev)). Because direct
interfaces don't support changing the host device in a live update,
this would cause the update to fail.
The solution to this is to realize that as long as the interface
doesn't get switched to a different network as a part of the update,
the network port information (ie the ActualNetDef) will not change as
a part of updating the guest interface itself. So for sake of
comparison we can just point the newdev at the ActualNetDef of olddev,
and then clear out one or the other when we're done (to avoid a double
free or, more likely, attempt to reference freed memory).
(If, on the other hand, the name of the network has changed, or if the
interface type has changed to type='network' from something else, then
we *do* need to allocate a new port (actual device) from the network
driver (as we used to do in all cases when the new type was
'network'), and also indicate that we'll need to replace olddev in the
domain with newdev (because either of these changes is major enough
that we shouldn't just try to fix up olddev).)
Partially-Resolves: https://issues.redhat.com/browse/RHEL-7036
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
'charstr' is unused since 36d06a5637, breaking the build on some
platforms. Remove it.
Fixes: 36d06a5637
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
QEMU supports only 'raw' and 'telnet' as values of the
<protocol type=''/>
element. Reject 'telnets' and 'tls'. TLS transport for qemu chardevs is
configured via the "tls='yes'" attribute added to the "<source>" element
instead, so this prevents a potential misconfig as the value would be
silently accepted.
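For illustration (host and port are placeholders), a TLS-enabled TCP chardev is expressed via the source attribute rather than the protocol type:
  <serial type='tcp'>
    <source mode='connect' host='127.0.0.1' service='5555' tls='yes'/>
    <protocol type='raw'/>
  </serial>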
Closes: https://gitlab.com/libvirt/libvirt/-/issues/412
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Now that we have a unified generator of chardev backend which is also
validated against the QMP schema we can replace the old generator with
it.
This patch modifies the monitor code to take virJSONValue 'props'
instead of the chardev definition and adds the conversion from the
chardev definition to JSON on higher levels.
The monitor code now also attempts to extract the returned 'pty' if
returned from qemu, so higher level code needs to report the error if
the path is needed and missing.
The current monitor generator is for now abandoned in place and will be
removed later.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The upcoming refactor of the monitor code will make the hotplug code
paths use the same generator we have for commandline -chardev backends
which doesn't refuse to format certain backends which can't be
hotplugged.
To prepare for this we add a check to qemuHotplugChardevAttach()
refusing such hotplug and remove 'qemumonitorjsontest' test cases which
will not make sense any more.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Similarly to how we approach the generators for
-device/-object/-blockdev/-netdev rewrite the generator of -chardev to
be unified with the generator for the monitor.
Unfortunately with -chardev it will be a bit more quirky when compared
to the others as the generator itself will need to know whether it
generates command line output or not as a few field names change and data
is nested differently.
This first step adds the generator and uses it only for command line
generation. This was possible to achieve without changing any of the
output in tests.
In further patches the same generator will then be used also in the
monitor code replacing both.
As basis for the generator I took the monitor code but modified it to
have the same field order as the commandline code and extended it
further to support all backend types, even those which are not
hotpluggable.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
I've added that capability a long time ago when I was converting various
stuff to use JSON but the support in '-chardev' didn't yet materialize.
Fix the comment to make that clear and also that it'll be used in tests
for the upcoming refactor of the chardev code (so that we can validate
the generator against the schema even if that doesn't yet work).
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Recent fix to use the proper 'async' monitor function would cause
libvirt to leak some of the objects it's supposed to clean up in other
places besides qemu.
Don't skip the whole function on failure to enter the job but just the
monitor section.
Fixes: 9b22c25548
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
'qemuBackupDiskDataCleanupOne()' is entering the monitor while we're in
the async backup job inside 'qemuBackupBegin()' which is semantically
wrong and per upstream report causes crashes if some monitoring commands
are run in parallel.
Use qemuDomainObjEnterMonitorAsync() instead.
Closes: https://gitlab.com/libvirt/libvirt/-/issues/668
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
The function can return directly rather than setting 'ret' as there's no
cleanup.
It also doesn't make sense to conditionally compile out the 'break'
statement when checking whether a disk has rawio enabled if
'CAP_SYS_RAWIO' is _not_ defined as the function will still behave the
same.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
pvpanic-pci is the only reasonable implementation of a panic
device for aarch64/virt guests. Right now we're asking users to
provide the model name manually, but we can be more helpful and
fill it in automatically instead.
With this change, the aarch64-panic-no-model test no longer
fails and so it's no longer useful to us. Instead, we can amend
the aarch64-virt-default-models test case to include panic
coverage, something that until now wasn't possible.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Right now the fallback behavior is to use MODEL_ISA if we
haven't been able to find a better match, but that's not very
useful as we're still going to hit an error later, when
QEMU_CAPS_DEVICE_PANIC is not found at Validate time.
Instead of doing that, allow MODEL_DEFAULT to get all the
way to Validate and report an error upon encountering it.
The reported error changes slightly, but other than that the
set of configurations that are allowed and blocked remains
the same.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Perform decisions based on the architecture and machine type
in a single place instead of duplicating them.
This technically adds new behavior for MODEL_ISA in
qemuDomainDefAddDefaultDevices(), but it doesn't make any
difference functionally since we don't set addPanicDevice
outside of ppc64(le) and s390(x). If we did, the lack of
handling for that value would be a latent bug.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This advertises the feature only for the architectures and
machine types where it can actually be used.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
We will soon need to use it in a context where we don't have
a virDomainDef handy.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
From: Praveen K Paladugu <prapal@linux.microsoft.com>
Move the methods that connect domain interfaces to host bridges into the
hypervisor code. This is to allow reuse between the qemu and ch drivers.
Signed-off-by: Praveen K Paladugu <praveenkpaladugu@gmail.com>
Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
qemu supports this enlightenment since version 7.1.0.
From the qemu commit:
Hyper-V specification allows to pass parameters for certain hypercalls
using XMM registers ("XMM Fast Hypercall Input"). When the feature is
in use, it allows for faster hypercalls processing as KVM can avoid
reading guest's memory.
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
qemu supports this enlightenment since version 7.1.0.
From the qemu commit:
The newly introduced enlightenment allow L0 (KVM) and L1 (Hyper-V)
hypervisors to collaborate to avoid unnecessary updates to L2
MSR-Bitmap upon vmexits.
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Currently, qemuProcessStop() unlocks given domain object right in
the middle of cleanup process. This is dangerous because there
might be another thread which is executing virDomainObjListAdd().
And since the domain object is on the list of domain objects AND
by the time qemuProcessStop() unlocks it the object is also
marked as inactive, the other thread acquires the lock and
switches vm->def pointer.
The unlocking of the domain object is needed though, to allow the event
processing thread to finish its queue. Well, the processing can be
done before any cleanup is attempted.
Therefore, use freshly introduced virEventThreadStop() to join
the event thread and drop lock/unlock from the middle of
qemuProcessStop().
Now, there's a comment being removed that mentions
qemuDomainObjStopWorker() and why it has to be called only after
the domain is marked as dead. This comment is no longer
applicable because the call to qemuDomainObjStopWorker() is removed
as well. Moreover, priv->beingDestroyed is set to true before
unlocking the domain object, thus any event processing callback
is going to see the domain being destroyed and can choose to
either exit early or finish processing the event.
Fixes: 3865410e7f
Resolves: https://issues.redhat.com/browse/RHEL-49607
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This introduces a new 'ps2' feature which, when disabled, results in
no implicit PS/2 bus input devices being automatically added to the
domain and addition of the 'i8042=off' machine option to the QEMU
command-line.
A notable side effect of disabling the i8042 controller in QEMU is that
the vmport device won't be created. For this reason we will not allow
setting the vmport feature if the ps2 feature is explicitly disabled.
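For illustration, disabling the new feature in the domain XML looks like:
  <features>
    <ps2 state='off'/>
  </features>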
Signed-off-by: Kamil Szczęk <kamil@szczek.dev>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This capability tells us whether given QEMU binary supports the
'-machine xxx,i8042=on/off' toggle used to enable/disable PS/2
controller emulation.
A few facts:
- This option was introduced in QEMU 7.0 and defaults to 'on'
- QEMU versions before 7.0 enabled i8042 controller emulation implicitly
- This option (and i8042 controller emulation itself) is only supported
by descendants of the generic PC machine type (e.g. i440fx, q35, etc.)
Signed-off-by: Kamil Szczęk <kamil@szczek.dev>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Up until now, we've assumed that all x86 machines have a PS/2
controller built-in. This assumption was correct until QEMU v4.2
introduced a new x86-based machine type - microvm.
Due to this assumption, a pair of unnecessary PS/2 inputs are implicitly
added to all microvm domains. This patch fixes that by whitelisting
machine types which are known to include the i8042 PS/2 controller.
Signed-off-by: Kamil Szczęk <kamil@szczek.dev>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Attempting to start qemu with, or hotplug, an empty 'usb-storage'-based
disk results in the following error:
qemu-system-x86_64: -device {"driver":"usb-storage","bus":"usb.0","port":"2","id":"usb-disk1","removable":true}: drive property not set
Reject such config at validation step and adjust tests.
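An example of a configuration that is now rejected (the target name is illustrative); a usb-bus disk, as opposed to a cdrom or floppy, with no source:
  <disk type='file' device='disk'>
    <target dev='sdb' bus='usb'/>
  </disk>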
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Some code paths, such as when hotplug of an empty cdrom fails, can cause
'qemuBlockStorageSourceChainDetach' to be called with NULL
@data as there is no backend for the disk.
The above case became possible once we allowed hotplug of cdroms and
subsequently fixed the case when users would hotplug an empty cdrom
which ultimately caused the possibility of having no backend in the
hotplug code path which was not possible before (see 'Fixes:' below and
also the commit linked from there).
Make 'qemuBlockStorageSourceChainDetach' tolerate NULL @data by simply
returning early.
Fixes: 894c6c5c16
Resolves: https://issues.redhat.com/browse/RHEL-54550
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
There is a family of convenient macros: NULLSTR, NULLSTR_EMPTY,
NULLSTR_STAR, NULLSTR_MINUS, which hide the ternary operator.
Generated using the following spatch (and its obvious variants):
@@
expression s;
@@
<+...
- s ? s : "<null>"
+ NULLSTR(s)
...+>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
On failure to plug the device the cleanup path didn't roll back the FD
passing to qemu thus qemu would hold the FDs indefinitely.
Resolves: https://issues.redhat.com/browse/RHEL-53964
Fixes: b79abf9c3c (vdpafd)
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Add dma-translation attribute to qemu command line if specified in
domain conf.
Signed-off-by: Sandesh Patel <sandesh.patel@nutanix.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Add the dma_translation attribute to <iommu/> to enable/disable DMA
translation for intel-iommu.
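A sketch of the intended XML, assuming the attribute sits on the <driver/> subelement like the other intel-iommu tunables:
  <iommu model='intel'>
    <driver dma_translation='off'/>
  </iommu>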
Signed-off-by: Sandesh Patel <sandesh.patel@nutanix.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Depending on timing between QEMU and libvirt an attempt to resume failed
post-copy migration could immediately report a failure in post-copy
phase again even though the migration actually resumed and is
progressing just fine.
This is caused by QEMU reporting the original migration state (i.e.,
postcopy-paused) until migration is successfully resumed and QEMU
switches to postcopy-active. QEMU 9.1 introduced a new
postcopy-recover-setup migration state which is entered immediately
after requesting migration to be resumed and we can reliably wait for
the migration to either continue or fail without being confused by the
old state.
https://issues.redhat.com/browse/RHEL-22166
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This patch adds support for recognizing the new migration state reported
by QEMU when post-copy recovery is requested. It is not actually used
for anything yet.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The s390(x) machines never supported ACPI. That didn't stop users
enabling ACPI in their config. As of libvirt-9.2 (98c4e3d073) with new
enough qemu we reject configs which require ACPI when qemu can't satisfy
the requirement.
This breaks migration of existing VMs with the old wrong configs to new
libvirt installations.
To address this introduce a post-parse fixup removing the ACPI flag
specifically for s390 machines which do enable it in the definition.
The advantage of doing it in post-parse, rather than simply relaxing the
ABI stability check to allow users to provide a fixed XML when migrating
(allowing a change of the ACPI flag for s390 in the ABI stability check, as it
doesn't impact ABI), is that only the destination installation needs to
be patched in order to preserve migration.
To mitigate the disadvantage of simply stripping it from all s390(x)
configs the hack is not applied when defining or starting a new domain
from the XML, to preserve the error about unsupported configuration.
Resolves: https://issues.redhat.com/browse/RHEL-49516
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
So far we are relying on QEMU or sysadmin to create the file for
pstore. This is suboptimal as, in the case of the former, we cannot
set proper seclabels (there's nothing to set seclabels on
until QEMU is started).
Therefore, make sure the file is created before launching QEMU
and that it has the correct size.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Introduced only a couple of commits ago (in
v10.5.0-84-g90e50e67c6), the pstore device acts as nonvolatile
storage, where the guest kernel can store information about crashes.
This device, however, expects a file in the host from which the
crash data is read. So far, we expected users to provide a path,
but we can autogenerate one if missing. Just put it next to
per-domain's NVRAM stores.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Nothing special going on here.
Resolves: https://issues.redhat.com/browse/RHEL-24746
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Kristina Hanicova <khanicov@redhat.com>
The aim of the pstore device is to provide a bit of NVRAM storage for
the guest kernel to record oops/panic logs just before it
crashes. Typical usage is in combination with a
watchdog so that the logs can be inspected after the watchdog
rebooted the machine. While Linux kernel (and possibly Windows
too) support many backends, in QEMU there's just the 'acpi-erst'
device, so stick with that for now. The device must be attached to
a PCI bus and needs two additional values (well, corresponding
memory-backend-file needs them): size and path. Despite using
memory-backend-file this does NOT add any additional RAM to the
guest and thus I've decided to expose it as another device type
instead of memory model.
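A sketch of the resulting XML (the path is an example of the autogenerated location; exact element names may differ slightly):
  <pstore backend='acpi-erst'>
    <path>/var/lib/libvirt/qemu/nvram/guest_PSTORE.raw</path>
    <size unit='KiB'>64</size>
  </pstore>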
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Kristina Hanicova <khanicov@redhat.com>
The new option style renamed one of the cache modes.
https://issues.redhat.com/browse/RHEL-50329
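Assuming this refers to the cache mode spelled 'none' in the legacy option style, the translation is roughly:
  legacy style:  -o cache=none
  new style:     --cache never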
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
In cases when a QEMU process takes longer to terminate than the time in
which sigterm and sigkill are issued to kill it, do not simply fail and
leave the VM in state VIR_DOMAIN_SHUTDOWN until the daemon stops.
Instead set up an fd on /proc/$pid and get notified when the QEMU
process has finally terminated, so that the VM state can be cleaned up.
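A minimal standalone sketch of the general mechanism (not libvirt's actual implementation) using a Linux pidfd, which becomes readable once the watched process exits:
  #include <poll.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <sys/syscall.h>
  #include <sys/types.h>
  #include <unistd.h>
  int
  main(int argc, char **argv)
  {
      pid_t pid;
      int pidfd;
      struct pollfd pfd;
      if (argc < 2)
          return 1;
      pid = (pid_t) atoi(argv[1]);                   /* PID of the process to watch */
      pidfd = (int) syscall(SYS_pidfd_open, pid, 0); /* requires Linux >= 5.3 */
      pfd = (struct pollfd) { .fd = pidfd, .events = POLLIN };
      poll(&pfd, 1, -1);                             /* POLLIN once the process terminated */
      printf("process %d terminated, cleanup can proceed\n", (int) pid);
      close(pidfd);
      return 0;
  }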
Resolves: https://issues.redhat.com/browse/RHEL-28819
Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
If there are absent values in an already existing element
specifying rom settings, we simply use the old ones. This
behaviour is not desired, as users might think that deleting the
element from XML would delete the setting (because the hotplug
succeeds) - which does not happen. Because of that, we should not
accept an interface without elements that cannot be changed.
Therefore, we should not allow absent values for an already existing
rom setting during hotplug.
Resolves: https://issues.redhat.com/browse/RHEL-7109
Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The newly added element had a confusing name. Since the patch
introducing it has not been released yet, the old name ('rlimit_nofile')
was changed to 'openfiles':
...
<binary>
<openfiles max='122333'/>
</binary>
...
Signed-off-by: Adam Julis <ajulis@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
ROM firmware images are read-only by definition. Accordingly, filter
them out when looking for a read/write image.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
If the configuration explicitly requests a specific type of
firmware image, be it pflash or ROM, we should ignore all images
that are not of that type.
If no specific type has been requested, of course, any type is
considered a match and the selection will be based upon the
other attributes.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Recent commit v10.4.0-87-gd9935a5c4f made a reasonable change to only
reset beingDestroyed back to false when vm->def->id is reset to make
sure other code can detect a domain is (about to become) inactive. It
even added a comment saying any caller of qemuProcessBeginStopJob is
supposed to call qemuProcessStop to clear beingDestroyed. But not every
caller really does so because they first call qemuProcessBeginStopJob
and then check whether a domain is still running. If not the
qemuProcessStop call is skipped leaving beingDestroyed=true. In case of
a persistent domain this may block incoming migrations of such domain as
the migration code would think the domain died unexpectedly (even though
it's still running).
The qemuProcessBeginStopJob function is a wrapper around
virDomainObjBeginJob, but virDomainObjEndJob was used directly for
cleanup. This patch introduces a new qemuProcessEndStopJob wrapper
around virDomainObjEndJob to properly undo everything
qemuProcessBeginStopJob did.
https://issues.redhat.com/browse/RHEL-43309
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Allow migration if the "migrate-precopy" capability is present or
libvirt is not the one running the virtiofs daemon.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Run the daemon with --print-capabilities first, to see what it supports.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
As of now, libvirt supports a few essential stats as
part of virDomainGetJobStats for live migration, such
as memory transferred, dirty rate, number of iterations,
etc. Currently it does not have support for the VFIO
stats returned by QEMU. This patch adds support for that.
Signed-off-by: Kshitij Jha <kshitij.jha@nutanix.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This commit removes the redundant call to qemuSecurityGetNested() in
qemuStateInitialize(). In qemuSecurityGetModel(), the first security manager
in the stack is already used by default, so this change helps to
simplify the code.
Signed-off-by: hongmianquan <hongmianquan@bytedance.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Fix a libvirtd hang caused by fork() being called while another thread had
the security manager locked.
We have the stack security driver, which internally manages other security
drivers; let's call them "top" and "nested".
We call virSecurityStackPreFork() to lock the top one, and it also locks
and then unlocks the nested drivers prior to fork. Then in qemuSecurityPostFork(),
it unlocks the top one, but not the nested ones. Thus, if one of the nested
drivers ("dac" or "selinux") is still locked, it will cause a deadlock. If we
always surround nested locks with the top lock, we are always safe, because
the top lock is taken before forking the child libvirtd.
However, that is not always the case in the current code. We discovered this
case: the nested list obtained through qemuSecurityGetNested() is locked
directly for subsequent use, such as in virQEMUDriverCreateCapabilities(),
where the nested list is locked using qemuSecurityGetDOI, but the top one is
not locked beforehand.
The problem stack is as follows:
  libvirtd thread1       libvirtd thread2        child libvirtd
        |                       |                      |
  virsh capabilities     qemuProcessLaunch             |
        |                       |                      |
        |                    lock top                  |
        |                       |                      |
   lock nested                  |                      |
        |                       |                      |
        |                     fork ------------------->| (nested lock held by thread1)
        |                       |                      |
  unlock nested             unlock top             unlock top
                                                       |
                                          qemuSecuritySetSocketLabel
                                                       |
                                            lock nested (deadlock)
In this commit, we ensure that the top lock is acquired before the nested lock,
so during fork, it's not possible for another task to acquire the nested lock.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1303031
Signed-off-by: hongmianquan <hongmianquan@bytedance.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
In a domain created with an interface with a <driver> subelement,
the device contains a non-NULL virDomainVirtioOptions struct, even
for non-virtio NIC models. The subelement need not be present again
after libvirt restarts, or when the interface is passed to clients.
When clients such as virsh domif-setlink put back the modified
interface XML, the new device's virtio attribute is NULL. This may
fail the equality checks for virtio options in qemuDomainChangeNet,
depending on whether libvirtd was restarted since define or not.
This patch modifies the check for non-virtio models to ignore the olddev
value of virtio (assumed valid), and to allow either NULL or a struct
with all values ABSENT in the new device's virtio options.
Signed-off-by: Miroslav Los <mirlos@cisco.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This wires up the emulator 'debug' parameter to control the
/usr/bin/swtpm 'level' parameter for logging.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Changing the portgroup attribute caused unexpected behavior.
Although it could be implemented, the solution is non-trivial.
No requirement or use case has yet been found for implementing this
feature, so it has been disabled for hot-plug.
Resolves: https://issues.redhat.com/browse/RHEL-7299
Signed-off-by: Adam Julis <ajulis@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The 'hostFips' member of _virQEMUDriver struct is not used
really, due to previous cleanups. Drop it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The support for VXHS device was removed in QEMU commit
v5.1.0-rc1~16^2~10. Since we require QEMU-5.2.0 at least there's
no QEMU that has the device and thus the corresponding capability
can be retired.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Now that the minimal required version of QEMU is 5.2.0 the
conditional setting of QEMU_CAPS_ENABLE_FIPS and
QEMU_CAPS_NETDEV_USER is effectively dead code. Drop it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
According to repology.org and/or distro repos these are the versions of QEMU:
CentOS Stream 9: qemu-kvm-9.0.0
Debian 11: qemu-5.2.0
Fedora 39: qemu-8.3.1
openSUSE Leap 15.3: qemu-5.2.0
RHEL-8: qemu-6.2.0
Ubuntu 22.04: qemu-6.2.0
Since the minimal version is 5.2.0 we can bump from 4.2.0 to
5.2.0.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
It may happen that QEMU is compiled without SLIRP but with
support for passt. In such a case it is acceptable to alter the user
provided configuration and switch the backend to passt as it offers
all the same features as SLIRP.
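For reference, the resulting configuration is equivalent to explicitly selecting passt in the domain XML:
  <interface type='user'>
    <backend type='passt'/>
  </interface>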
Resolves: https://issues.redhat.com/browse/RHEL-45518
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Now that the logic for detecting supported net backend types has
been moved to domain capabilities generation, we can just use it
when validating net backend type. Just like we do for device
models and so on.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Now that we have a capability for each domain net backend we can
start validating user's selection against QEMU capabilities.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Since -netdev user can be disabled during QEMU compilation, we
can't blindly expect it to just be there. We need a capability
that tracks its presence.
For qemu-4.2.0 we are not able to detect the capability so do the
next best thing - assume the capability is there. This is
consistent with our current behaviour where we blindly assume the
capability, anyway.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
When enabling switchover-ack on qemu from libvirt, the .party value
was set to both source and target; however, qemuMigrationParamsCheck()
only takes that into account to validate that the remote side of the
migration supports the flag if it is marked optional or auto/always on.
In the case of switchover-ack, when enabled on only the dst and not
the src, the migration will fail if the src qemu does not support
switchover-ack, as the dst qemu will issue a switchover-ack msg:
qemu/migration/savevm.c ->
loadvm_process_command ->
migrate_send_rp_switchover_ack(mis) ->
migrate_send_rp_message(mis, MIG_RP_MSG_SWITCHOVER_ACK, 0, NULL)
Since the src qemu doesn't understand messages with header_type ==
MIG_RP_MSG_SWITCHOVER_ACK, qemu will kill the migration with error:
qemu-kvm: RP: Received invalid message 0x0007 length 0x0000
qemu-kvm: Unable to write to socket: Bad file descriptor
Looking at the original commit [1] for optional migration capabilities,
it seems that the spirit of optional handling was to enhance a given
existing capability where possible. Given that switchover-ack
exclusively depends on return-path, adding it as optional to that cap
feels right.
[1] 61e34b0856 ("qemu: Add support for optional migration capabilities")
Fixes: 1cc7737f69 ("qemu: add support for qemu switchover-ack")
Signed-off-by: Jon Kohler <jon@nutanix.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Avihai Horon <avihaih@nvidia.com>
Cc: Jiri Denemark <jdenemar@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: YangHang Liu <yanghliu@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Now that the logic for detecting supported launchSecurity types
has been moved to domain capabilities generation, we can just use
it when validating launchSecurity type. Just like we do for
device models and so on.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
The inspiration for these rules comes from
qemuValidateDomainDef().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
While it's very unlikely to have QEMU that supports SEV-SNP but
doesn't support plain SEV, for completeness' sake we ought to
query SEV capabilities if QEMU supports either. And similarly to
QEMU_CAPS_SEV_GUEST we need to clear the capability if talking to
QEMU proves SEV is not really supported.
This in turn removes the 'sev-snp-guest' capability from one of
our test cases as Peter's machine he uses to refresh capabilities
is not SEV capable. But that's okay. It's consistent with
'sev-guest' capability.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
An iSCSI device with zero hosts will result in a segmentation fault. This patch
adds a check for the number of hosts, which must be one in the case of iSCSI.
Minimal reproducing XML:
  <domain type='qemu'>
    <name>MyGuest</name>
    <uuid>4dea22b3-1d52-d8f3-2516-782e98ab3fa0</uuid>
    <os>
      <type arch='x86_64'>hvm</type>
    </os>
    <memory>4096</memory>
    <devices>
      <disk type='network'>
        <source name='dummy' protocol='iscsi'/>
        <target dev='vda'/>
      </disk>
    </devices>
  </domain>
Signed-off-by: Rayhan Faizel <rayhan.faizel@gmail.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Add plumbing for QEMU's switchover-ack migration capability, which
helps lower the downtime during VFIO migrations. This capability is
enabled by default as long as both the source and destination support
it.
Note: switchover-ack depends on the return path capability, so this may
not be used when VIR_MIGRATE_TUNNELLED flag is set.
Extensive details about the qemu switchover-ack implementation are
available in the qemu series v6 cover letter [1] where the highlight is
the extreme reduction in guest visible downtime. In addition to the
original test results below, I saw a roughly ~20% reduction in downtime
for VFIO VGPU devices at minimum.
=== Test results ===
The below table shows the downtime of two identical migrations. In the
first migration switchover ack is disabled and in the second it is
enabled. The migrated VM is assigned a mlx5 VFIO device which has
300MB of device data to be migrated.
+----------------------+-----------------------+----------+
| Switchover ack | VFIO device data size | Downtime |
+----------------------+-----------------------+----------+
| Disabled | 300MB | 1900ms |
| Enabled | 300MB | 420ms |
+----------------------+-----------------------+----------+
Switchover ack gives a roughly 4.5 times improvement in downtime.
The 1480ms difference is time that is used for resource allocation for
the VFIO device in the destination. Without switchover ack, this time is
spent when the source VM is stopped and thus the downtime is much
higher. With switchover ack, the time is spent when the source VM is
still running.
[1] https://patchwork.kernel.org/project/qemu-devel/cover/20230621111201.29729-1-avihaih@nvidia.com/
Signed-off-by: Jon Kohler <jon@nutanix.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Avihai Horon <avihaih@nvidia.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: YangHang Liu <yanghliu@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Commit 7c8e606b64 attempted to fix
the specification of the ramfb property for vfio-pci devices, but it
failed when ramfb is explicitly set to 'off'. This is because only the
'vfio-pci-nohotplug' device supports the 'ramfb' property. Since we use
the base 'vfio-pci' device unless ramfb is enabled, attempting to set
the 'ramfb' parameter to 'off' will result in an error like the
following:
error: internal error: QEMU unexpectedly closed the monitor (vm='rhel'):
2024-06-06T04:43:22.896795Z qemu-kvm: -device
{"driver":"vfio-pci","host":"0000:b1:00.4","id":"hostdev0","display":"on","ramfb":false,"bus":"pci.7","addr":"0x0"}:
Property 'vfio-pci.ramfb' not found.
This also more closely matches what is done for mdev devices.
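A simplified sketch of the intended selection (local variable names
are illustrative; the real change lives in the qemu hostdev
device-props builder):
  const char *driver = "vfio-pci";

  if (ramfb == VIR_TRISTATE_SWITCH_ON) {
      /* only the nohotplug variant understands the 'ramfb' property */
      driver = "vfio-pci-nohotplug";

      if (virJSONValueObjectAdd(&props, "b:ramfb", true, NULL) < 0)
          return NULL;
  }
  /* ramfb absent or 'off': plain vfio-pci and no 'ramfb' key at all */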
Resolves: https://issues.redhat.com/browse/RHEL-28808
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The attribute 'discard_no_unref' of <disk/> is not allowed to be
changed while the virtual machine is running.
Resolves: https://issues.redhat.com/browse/RHEL-37542
Signed-off-by: Adam Julis <ajulis@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The firmware descriptors have an 'amd-sev-snp' feature which
describes whether the firmware is suitable for SEV-SNP guests.
Provide the necessary implementation to detect the feature and pick
the right firmware if the guest is SEV-SNP enabled.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Pretty straightforward, as qemu has a 'sev-snp-guest' object whose
attributes map pretty much 1:1 to our XML model. The exception is
@vcek, where QEMU has 'vcek-disabled', an inverted boolean, while we
model it as virTristateBool. But that's easy to map too.
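The inverted mapping, roughly (the libvirt-side field name is
illustrative):
  /* XML: @vcek is a virTristateBool; QEMU: 'vcek-disabled' is a plain
   * bool, so pass it only when an explicit value was given, inverted */
  if (def->vcek != VIR_TRISTATE_BOOL_ABSENT &&
      virJSONValueObjectAdd(&props,
                            "b:vcek-disabled",
                            def->vcek == VIR_TRISTATE_BOOL_NO,
                            NULL) < 0)
      return -1;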
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
SEV-SNP is an enhancement of SEV/SEV-ES and thus it shares some
fields with it. Nevertheless, on XML level, it's yet another type
of <launchSecurity/>.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
This capability tracks sev-snp-guest object availability.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
In QEMU commit v9.0.0-1155-g59d3740cb4 the return type of the
'query-sev' monitor command changed to accommodate SEV-SNP. Even
though we currently do not support launching SEV-SNP guests, this
will soon change.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
In a few instances there is a plain if() check for
_virDomainSecDef::sectype. While this works perfectly for now, soon
there'll be another type and we can utilize the compiler to identify
all the places that need adaptation. Switch those if() statements to
switch().
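The pattern, roughly, assuming the current virDomainLaunchSecurity
values plus the _LAST marker:
  switch (sec->sectype) {
  case VIR_DOMAIN_LAUNCH_SECURITY_SEV:
      /* the code that previously sat behind a plain if() */
      break;
  case VIR_DOMAIN_LAUNCH_SECURITY_PV:
  case VIR_DOMAIN_LAUNCH_SECURITY_NONE:
  case VIR_DOMAIN_LAUNCH_SECURITY_LAST:
      break;
  }
Once the new SEV-SNP value is added to the enum, -Wswitch flags every
such switch() that doesn't handle it.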
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
The sectype member of the _virDomainSecDef struct is already declared
as being of virDomainLaunchSecurity type. There's no need to typecast
it to the very same type when passing it to switch().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Some parts of SEV are to be shared with SEV-SNP. In order to reuse
the XML parsing / formatting code cleanly, let's move those common
bits into a new struct (virDomainSEVCommonDef) and adjust the rest of
the code.
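Roughly the shape of the split (the exact member list here is
illustrative):
  typedef struct _virDomainSEVCommonDef virDomainSEVCommonDef;
  struct _virDomainSEVCommonDef {
      /* bits shared between SEV and SEV-SNP */
      virTristateBool kernel_hashes;
      unsigned int cbitpos;
      bool haveCbitpos;
      unsigned int reduced_phys_bits;
      bool haveReducedPhysBits;
  };

  struct _virDomainSEVDef {
      virDomainSEVCommonDef common;
      /* SEV-only members (policy, dh_cert, session, ...) stay here */
  };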
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
While working on qemuMonitorJSONGetSEVMeasurement() and
qemuMonitorJSONGetSEVInfo() I've noticed that if these functions
fail, they do so without an appropriate error set. Fill in the error
reporting.
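For instance, a reply-parsing failure path that previously just
returned -1 now also reports an error along these lines (message text
illustrative):
  if (!(data = virJSONValueObjectGetObject(reply, "return"))) {
      virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
                     _("query-sev reply was missing 'return' data"));
      return -1;
  }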
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
When a VM terminates itself while it's being migrated in the running
state, libvirt would report the wrong error:
error: cannot get locked memory limit of process 2502057: No such file or directory
rather than the proper error:
error: operation failed: domain is not running
Remember the error on the error paths in qemuMigrationSrcConfirmPhase
and qemuMigrationSrcPerformPhase.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
'qemuProcessStop()' clears the 'current' job data. While the code under
the 'error' label in 'qemuMigrationSrcRun()' does check that the VM is
active before accessing the job, it also invokes multiple helper
functions to clean up the migration including
'qemuMigrationSrcNBDCopyCancel()', which calls 'qemuDomainObjWait()',
invalidating the result of the liveness check as it unlocks the VM.
Duplicate the liveness check and explain why. The rest of the code,
e.g. accessing the monitor, is safe as 'qemuDomainEnterMonitorAsync()'
performs a liveness check. The cleanup path just ignores the return
values of those functions.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The function is a pointless wrapper on top of
qemuMigrationDstWaitForCompletion.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Similarly to the change in commit 4d1a1fdffd we should be checking
that the VM is not yet being destroyed if we've invoked
qemuDomainObjWait().
Use the new helper qemuDomainObjIsActive().
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The helper checks whether the VM is active, including the internal
qemu state. This helper will become useful in situations when an
async job is in use, as VIR_JOB_DESTROY can run alongside async jobs
and thus both checks are necessary.
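A sketch of the helper, close to (though not necessarily verbatim)
what gets added:
  bool
  qemuDomainObjIsActive(virDomainObj *vm)
  {
      qemuDomainObjPrivate *priv = vm->privateData;

      if (!virDomainObjIsActive(vm))
          return false;

      /* VIR_JOB_DESTROY can run alongside an async job, so also make
       * sure the VM isn't already being torn down */
      return !priv->beingDestroyed;
  }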
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Prevent the possibility that a VM could be considered alive while
inside qemuProcessStop.
A recently fixed bug which unlocked the domain object while inside
qemuProcessStop showed that there's a possibility of the VM state
being mistaken for active while 'qemuProcessStop' is processing the
shutdown of the VM. Ensure that this doesn't happen by clearing the
'beingDestroyed' flag only after the VM id is cleared.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
There are a few function calls done while cleaning up a stopped VM
which do require the old VM id, e.g. to clean up paths containing the
'short' domain name.
Anything else that doesn't strictly require it can be moved after
clearing the 'id' in order to decrease the likelihood of potential
bugs.
This patch moves all the code which does not require the 'id' (except
for the log entry and closing the monitor socket) after the statement
clearing the id and adds a comment explaining that anything in that
section must not unlock the VM object.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
'qemuDomainObjStopWorker()', which is meant to dispose of the
monitor's event loop thread, unlocks the VM object while disposing of
the thread to prevent possible deadlocks with events waiting on the
monitor thread.
Unfortunately 'qemuDomainObjStopWorker()' is called *before* the VM is
marked as inactive by clearing 'vm->def->id', but at the same time it's
no longer marked as 'beingDestroyed' when we're inside
'qemuProcessStop()'.
If 'vm' were kept locked this wouldn't be a problem. Similarly it's
not a problem for anything that uses non-ASYNC VM jobs, or when the
monitor is accessed in an async job, as the 'destroy' job interlocks
with those.
It is a problem for code inside an async job which uses
'qemuDomainObjWait()' though. The API contract of qemuDomainObjWait()
guarantees the caller that the VM is still alive on successful
return, but in this specific case that doesn't hold, as
'beingDestroyed' is already false while 'vm->def->id' is not yet
cleared.
To fix the issue move the 'qemuDomainObjStopWorker()' call *after*
clearing 'vm->def->id' and also add a note stating what the function is
doing.
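Schematically, the resulting order inside qemuProcessStop()
(simplified):
  vm->def->id = -1;   /* from this point the VM is considered inactive */

  /* stopping the per-VM event loop worker temporarily unlocks @vm, so
   * it must happen only once the id is cleared; otherwise a
   * qemuDomainObjWait() caller could wake up, find 'beingDestroyed'
   * already false and the id still set, and wrongly treat the VM as
   * alive */
  qemuDomainObjStopWorker(vm);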
Fixes: 860a999802
Closes: https://gitlab.com/libvirt/libvirt/-/issues/640
Reported-by: luzhipeng <luzhipeng@cestc.cn>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Document why this function exists and the meaning of its return
values.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Clear the 'disk' member of 'blockjob' as we're freeing the disk
object at this point. While this should not normally happen, it was
observed when another bug allowed the VM to be cleared while other
threads hadn't yet finished.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Similarly to other blockjob handlers, if there's no disk associated with
the blockjob the handler needs to behave correctly. This is needed as
the disk might have been de-associated on unplug or other operations.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Sometimes in the release hook it is useful to know whether the VM
shutdown was graceful or not. This is especially useful for doing
cleanup based on the VM shutdown failure reason in the release hook.
This patch uses the last argument, 'extra', to pass the VM shutoff
reason in the call to the release hook.
This change is made for QEMU and LXC.
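Hooks receive the reason as the fourth positional argument; a toy
example hook is sketched below (any executable works, and the exact
reason strings follow virDomainShutoffReason, so treat the printed
value as informational):
  #include <stdio.h>
  #include <string.h>

  /* invoked as: /etc/libvirt/hooks/qemu <domain> release end <reason> */
  int main(int argc, char **argv)
  {
      if (argc >= 5 && strcmp(argv[2], "release") == 0)
          fprintf(stderr, "domain %s released, shutoff reason: %s\n",
                  argv[1], argv[4]);
      return 0;
  }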
Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Resolves: https://issues.redhat.com/browse/RHEL-23833
Signed-off-by: Adam Julis <ajulis@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Although virDomainDeviceDefValidate() is called as a part of the
device XML parsing routine, it validates only that single device. The
virDomainDefValidate() function performs a more comprehensive check.
It should detect errors resulting from dependencies between devices,
or between a device and some other part of the XML config. Therefore,
a call to virDomainDefValidate() is added at the end of
qemuDomainAttachDeviceConfig().
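Schematically (the argument list is quoted from memory and may differ
slightly in the actual patch):
  /* at the end of qemuDomainAttachDeviceConfig(), once the new device
   * has been added to the persistent definition */
  if (virDomainDefValidate(vmdef, 0, xmlopt, qemuCaps) < 0)
      return -1;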
Signed-off-by: Adam Julis <ajulis@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
In one of my previous commits I've made us subtract isolcpus from all
online CPUs when setting affinity on QEMU threads. See the commit
below for more info on that. Nevertheless, this is something that
surely deserves an entry in the log. I've chosen INFO priority for
now. We can promote that to a regular WARN if users complain.
Fixes: da95bcb6b2
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
We currently hardcode the systemd sysusersdir, but it is desirable to be
able to choose a different location in some cases. For example, Fedora
flatpak builds change the RPM %_sysusersdir macro, but we can't currently
honour that.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reported-by: Yaakov Selkowitz <yselkowi@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The support will soon be dropped by qemu, and libvirt is currently
not rejecting such configurations. Add validation of this explicitly
requested config.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Everywhere that we use TPM 2.0 as our default, the chances of TPM 1.2
being supported by the guest OS are very slim. Just reject such
configurations outright.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>