Commit Graph

36141 Commits

Author SHA1 Message Date
Jiri Denemark
ba1652e84a sync_qemu_features_i386: Add some removed features back
When removing features unknown to QEMU (they have a different name or
are completely missing as they are not configurable by a user) I should
not have removed them from the list of features unknown to QEMU in the
script for synchronizing QEMU features to the CPU map.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-11-26 13:04:33 +01:00
Jiri Denemark
88e7cc1477 cpu_x86: Promote added/removed from ancestor
When a CPU model is defined based on another model, we were completely
ignoring features marked as added to or removed from the original model
after it was released. For added features this is the right thing to do
as it will promote them to become normal features included in the new
model. But features marked as removed would become included in the new
model as well. We need to explicitly remove them as if they were never
included in the model.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-11-26 13:04:33 +01:00
Jiri Denemark
61dd5d6bef cpu_x86: Annotate virCPUx86Model fields
Document which fields are inherited when a CPU model is based on another
model.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-11-26 13:04:33 +01:00
Nikolai Barybin
59b1408860 qemu: block: Add support for 'data-file' feature of qcow2
Add the block infrastructure for detecting and landling the data file
for images and starting qemu with the configuration.

Signed-off-by: Nikolai Barybin <nikolai.barybin@virtuozzo.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-11-25 22:31:18 +01:00
Nikolai Barybin
cb5371a8bd qemu: factor out qemuDomainPrepareStorageSource()
This refactoring will simplify next changes.

Signed-off-by: Nikolai Barybin <nikolai.barybin@virtuozzo.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-11-25 22:31:18 +01:00
Nikolai Barybin
9ba2bef816 qemu: put data-file path to VM's cgroup and namespace
Signed-off-by: Nikolai Barybin <nikolai.barybin@virtuozzo.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-11-25 22:31:18 +01:00
Nikolai Barybin
702f8b5a77 security: apparmor: handle qcow2 data-file
Signed-off-by: Nikolai Barybin <nikolai.barybin@virtuozzo.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-11-25 22:31:18 +01:00
Nikolai Barybin
8fcc6c8025 security: selinux: handle qcow2 data-file on image label set/restore
Signed-off-by: Nikolai Barybin <nikolai.barybin@virtuozzo.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-11-25 22:31:18 +01:00
Nikolai Barybin
724a4c6dc4 security: DAC: handle qcow2 data-file on image label set/restore
Signed-off-by: Nikolai Barybin <nikolai.barybin@virtuozzo.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-11-25 22:31:18 +01:00
Nikolai Barybin
0a3d177d9b storage file: fill in src->dataFileStore during file probe
Signed-off-by: Nikolai Barybin <nikolai.barybin@virtuozzo.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-11-25 22:31:18 +01:00
Nikolai Barybin
63481eca80 storage file: add qcow2 data-file path parsing from header
In qcow2 header data file is represented by incompitible feature bit
and its path is saved to header extension table.
Thus, we implement here the logic similar to backing file probing.

Signed-off-by: Nikolai Barybin <nikolai.barybin@virtuozzo.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-11-25 22:31:18 +01:00
Nikolai Barybin
8799818036 storage file: add getDataFile function to FileTypeInfo
Signed-off-by: Nikolai Barybin <nikolai.barybin@virtuozzo.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-11-25 22:31:18 +01:00
Nikolai Barybin
b3171cf8da conf: implement XML parsing/formating for <dataStore> element of a storage <source>
Introduce parsing and formatting of <dataStore> element. The <dataStore
represents a different storage volume meant for storing the actual
blocks of guest-visible data. The original disk source is then just a
metadata storage for any advanced features.

This currently works only for 'qcow2' images.

Signed-off-by: Nikolai Barybin <nikolai.barybin@virtuozzo.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-11-25 22:31:13 +01:00
Nikolai Barybin
28d88e9fc3 Add schema and documentation for 'dataStore' sub-element of disk source
Signed-off-by: Nikolai Barybin <nikolai.barybin@virtuozzo.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-11-25 20:28:15 +01:00
Nikolai Barybin
64627d5645 conf: add data-file feature and related fields to virStorageSource
The 'data-file' is a qcow2 feature which allows storing the actual data
outside of the qcow2 image.

Signed-off-by: Nikolai Barybin <nikolai.barybin@virtuozzo.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-11-25 20:28:15 +01:00
shenjiatong
c54df8ed9e QEMU: allow to hot plugging virtio-serial-pci device
Virtio-serial-pci device is hot pluggable, loosen the restriction
and allow user to hot plug it.

Signed-off-by: shenjiatong <yshxxsjt715@163.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-11-21 16:24:27 +01:00
Peter Krempa
e4d058866e qemu: passt: Use 'reconnect-ms' instead of 'reconnect' with new qemus
The 'reconnect' field of 'stream' network backend type is about to be
deprecated so libvirt will need to start using 'reconnect-ms'.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-11-19 13:06:02 +01:00
Peter Krempa
9e273db203 qemu: capabilities: Introduce QEMU_CAPS_NETDEV_STREAM_RECONNECT_MILISECONDS
The 'stream' type for 'netdev-add' recently added support for
'reconnect-ms' which supersedes 'reconnect' (now deprecated). Add a
capability which will allow us to switch to the new property.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-11-19 13:06:02 +01:00
Peter Krempa
25f1fc8568 qemu: capabilities: Restore grouping in 'virQEMUCapsQMPSchemaQueries'
Historically the QMP schema lookup queries were grouped by the first
component of the query (which was also sorted), but not fully sorted.
This deteriorated over time. Re-group the query strings now that some
were added at the bottom.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-11-19 13:06:02 +01:00
Adam Julis
74ef4888ff qemuDomainDiskChangeSupported: Add missing iothreads check
GSList of iothreads is not allowed to be changed while the
virtual machine is running.

Resolves: https://issues.redhat.com/browse/RHEL-23607
Signed-off-by: Adam Julis <ajulis@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-11-19 13:04:58 +01:00
Peter Krempa
6f237f4642 Revert "qemu: migration: Improve handling of VIR_MIGRATE_PARAM_DEST_XML with VIR_MIGRATE_PERSIST_DEST"
The original intention was to improve the behaviour of the
VIR_MIGRATE_PERSIST_DEST flag which makes the VM persistent after
migration on the destination when used with VIR_MIGRATE_PARAM_DEST_XML.

While it worked as intended with p2p migration where the migration is
driven from the virtqemud instance on the source of the migration, which
can distinguish between the user-provided input XML and the one fetched
from the source of the migration, it's not easily possible to achieve
the same behaviour with normal migration driven from the client library.

The approach also still had corner cases (originally deemed worth
changing) such as if the persistent definition was modified it would be
overwritten.

As there is no clear fix which would improve both styles of migrations
with no corner cases revert the change.

Upcoming commits will modify the documentation to add warning about the
use of VIR_MIGRATE_PERSIST_DEST with VIR_MIGRATE_PARAM_DEST_XML/xmlin
without using VIR_MIGRATE_PARAM_PERSIST_XML instead of a code fix.

This reverts commit 6a38559092.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-11-19 13:02:11 +01:00
Richard W.M. Jones
7b73e681a2 vmx: Get the VMware boolean uefi.secureBoot.enabled
Some VMware guests have a boolean uefi.secureBoot.enabled.  If found,
and it's set to "TRUE", and if it's a UEFI guest, then add this clause
into the domain XML:

  <os firmware='efi'>
    <firmware>
      <feature enabled='yes' name='enrolled-keys'/>
      <feature enabled='yes' name='secure-boot'/>
    </firmware>
  </os>

This approximates the meaning of this VMware flag.

Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Fixes: https://issues.redhat.com/browse/RHEL-67836
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-11-18 15:29:43 +01:00
Peter Krempa
ec8be9aceb qemu: Avoid use of '-loadvm' commandline argument for internal snapshot reversion
The '-loadvm' commandline parameter has exactly the same semantics as
the HMP 'loadvm' command. This includes the selection of which block
device is considered to contain the 'vmstate' section.

Since libvirt recently switched to the new QMP commands which allow a
free selection of where the 'vmstate' is placed, snapshot reversion will
no longer work if libvirt's algorithm disagrees with qemu's. This is the
case when the VM has UEFI NVRAM image, in qcow2 format, present.

To solve this we'll use the QMP counterpart 'snapshot-load' to load the
snapshot instead of using '-loadvm'. We'll do this before resuming
processors after startup of qemu and thus the behaviour is identical to
what we had before.

The logic for selecting the images now checks both the snapshot metadata
and the VM definition. In case images not covered by the snapshot
definition do have the snapshot it's included in the reversion, but it's
fatal if the snapshot is not present in a disk covered in snapshot
metadata.

The vmstate is selected based on where it's present as libvirt doesn't
store this information.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-11-18 13:51:13 +01:00
Peter Krempa
2da32ff468 qemu: monitor: Extract vmstate presence for internal snapshots in qemuBlockGetNamedNodeData
Refactor the parts of qemuBlockGetNamedNodeData which fetch the names of
internal snapshots present in the on-disk state of QCOW2 images to also
extract the presence of the 'vmstate' section.

This requires conversion of the snapshot list to a hash table as we
always know the name of the snapshot that we're looking for, and the
hash table allows also storing of additional data which we'll use to
store the presence of the 'vmstate'.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-11-18 13:51:13 +01:00
Peter Krempa
6902e77c01 qemu: Add enum entries for 'snapshot-load' qemu job
The internal snapshot code will use the 'snapshot-load' command so we
need to add the corresponding job type.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-11-18 13:51:13 +01:00
Peter Krempa
2ed93e1a4b qemu: monitor: Add monitor infrastructure for 'snapshot-load' QMP command
Libvirt currently loads snapshots via the '-loadvm' commandline option
but that uses the same logic as the 'loadvm' text monitor command used
to pick the disk image with the 'vmstate' section. Since libvirt now
implements our own logic to pick the 'vmstate' device it can happen that
we pick a different than qemu and thus qemu would fail to load the
snapshot. This happens currently on VMs with UEFI firmware with NVRAM
image in qcow2 format.

To fix this libvirt will need to use the 'snapshot-load' QMP command
instead of relying on '-savevm'.

Implement the monitor bits for 'snapshot-load'.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-11-18 13:51:13 +01:00
Peter Krempa
c82dd60b2e qemuSnapshotForEachQcow2: Handle also NVRAM image for internal snapshots
The live VM snapshot code already does handle the NVRAM image when it's
in use, so we should also handle it when modifying/creating the
snapshots via qemu-img when inactive.

Add the handling to qemuSnapshotForEachQcow2 which is used for all
inactive operations.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-11-18 13:51:13 +01:00
Peter Krempa
5ca0552d31 qemuSnapshotForEachQcow2: Refactor
Refactor the function to avoid recursive call to rollback and simplify
calling parameters.

To achieve that most of the fatal checks are extracted into a dedicated
loop that runs before modifying the disk state thus removing the need to
rollback altoghether. Since rollback is still necessary when creation of
the snapshot fails half-way through the rollback is extracted to handle
only that scenario.

Additionally callers would only pass the old 'try_all' argument as true
on all non-creation ("-c") modes. This means that we can infer it from
the operation instead of passing it as an extra argument.

This refactor will also make it much simpler to implement handling of
the NVRAM pflash backing file (in case it's qcow2) for internal
snapshots.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-11-18 13:51:13 +01:00
Peter Krempa
5dfd0a0ce8 qemu: Move 'qemuDomainSnapshotForEachQcow2(Raw)' to qemu_snapshot.c
The functions are exclusively used in the snapshot module. Move and
rename them:

  qemuDomainSnapshotForEachQcow2Raw -> qemuSnapshotForEachQcow2Internal
  qemuDomainSnapshotForEachQcow2 -> qemuSnapshotForEachQcow2

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-11-18 13:51:13 +01:00
Peter Krempa
60838fee08 qemuDomainSnapshotForEachQcow2Raw: Remove 'driver' argument
Now that it's unused except for the recursive call it can be dropped
from all of the call tree.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-11-18 13:51:13 +01:00
Peter Krempa
20ffcb912f qemu: Don't store path to qemu img
The 'virCommand' helpers already look up the full path to the binary in
PATH if it's not specified. This means that the qemu driver doesn't have
to lookup and store the path to 'qemu-img' in the conf object but rather
can be cleaned up to use this new infrastructure.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-11-18 13:51:13 +01:00
Praveen K Paladugu
25fdb57d8e ch: Enable callbacks for ch domain events
Enable callbacks for define, undefine, started, booted, stopped,
destroyed events of ch guests.

Signed-off-by: Praveen K Paladugu <praveenkpaladugu@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-11-18 10:58:38 +01:00
Praveen K Paladugu
ed1cef6264 ch: enable virNodeGetMemoryStats API
Enable virNodeGetMemoryStats API to return the stats of host memory.

Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
Signed-off-by: Praveen K Paladugu <praveenkpaladugu@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-11-18 10:45:34 +01:00
Stefan Berger
d79542eec6 qemu: Read back the profile name after creation of a TPM instance
Get the JSON profile that the swtpm instance was created with from the
output of 'swtpm socket --tpm2 --print-info 0x20 --tpmstate ...'. Get the
name of the profile from the JSON and set it in the current and persistent
emulator descriptions as 'name' attribute and have the persistent
description stored with this update. The user should avoid setting this
'name' attribute since it is meant to be read-only. The following is
an example of how the XML could look like:

  <profile source='local:restricted' name='custom:restricted'/>

If the user provided no profile node, and therefore swtpm_setup picked its
default profile, the XML may now shows the 'name' attribute with the name
of the profile. This makes the 'source' attribute now optional.

  <profile name='default-v1'/>

Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-11-18 08:46:49 +01:00
Stefan Berger
957bda01c8 qemu: Move adding --tpmstate to swtpm command line into own function
Factor-out code related to adding the --tpmstate option to the swtpm
command line into its own function.

Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-11-18 08:46:44 +01:00
Stefan Berger
fc9a333f37 qemu: Move adding of keys to swtpm command line into own function
Factor-out code related to adding key to the swtpm command line into its
own function.

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-11-18 08:46:41 +01:00
Stefan Berger
cd37721d19 qemu: Extend swtpm_setup command line to set a profile by its name
Run swtpm_setup with the --profile-name option if the user provided the
name of a profile. swtpm_setup will try to load the profile from
directories with local profiles and distro profiles and if no profile
by this name with appended '.json' suffix could be found there, it will
fall back to try to use an internal profile with the given name.

Also set the --profile-remove-disabled option if the user provided a value
in the remove_disabled attribute in the profile XML node.

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-11-18 08:46:38 +01:00
Stefan Berger
90c40d3b9c conf: Add support for profile parameter on TPM emulator in domain XML
Extend the parser and XML builder with support for the profile parameter
and its remove_disabled attribute.

Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-11-18 08:46:33 +01:00
Stefan Berger
498b5b7440 schema: Extend schema for TPM emulator profile node
Extend the schema for the TPM emulator profile node. Require that the
profile the user provides is described in a 'source' attribute. An optional
remove_disabled attribute is also supported for swtpm to automatically
remove algorithms from the 'custom' profile if they are disabled by FIPS
mode on the host.

Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-11-18 08:46:30 +01:00
Stefan Berger
15ba6edabd conf: Define enum virDomainTPMProfileRemoveDisabled
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-11-18 08:46:27 +01:00
Stefan Berger
1079532d74 util: Add parsing support for swtpm_setup's cmdarg-profile capability
Add support for parsing swtpm_setup 'cmdarg-profile' capability
(since v0.10).

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-11-18 08:46:24 +01:00
Stefan Berger
279b14cb81 qemu: Pass virQEMUDriverConfig rather than some of its fields
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-11-18 08:46:22 +01:00
Stefan Berger
8bba15bdc1 conf: Move TPM emulator parameters into own struct
To avoid passing TPM emulator parameters around individually, move them
into a structure and pass around the structure.

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-11-18 08:46:18 +01:00
Jiri Denemark
b0aa9d31f2 qemu: Avoid useless tmp variable in qemuCanonicalizeMachine
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-11-15 15:22:23 +01:00
Praveen K Paladugu
5904676d2f ch: explicitly set INFILESIZE to 0
While sending API requests that don't need any body, explicitly set
CURLOPT_INFILESIZE to 0.

Without this option, curl sends a chunked request with `Expect: 100-continue`
header. The client, in this case curl, expects a response from the server,
ch in this case, to respond within a timeout period.

If guest definition has a PCI passthrough device configuration,
cloud-hypervisor process cannot respond within above mentioned timeout.
Even if cloud-hypervisor responds after the timeout, curl cannot read
the response. Because of this, virsh request to create a guest, hangs. This
only happens while using "mshv" hypervisor.

By setting CURLOPT_INFILESIZE to O, curl drops the Expect header and
sychronously waits for server to respond.

Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-11-15 13:15:41 +01:00
Praveen K Paladugu
cec5bb372a ch: reattach PCI devices to host while stopping guest
Reattach PCI devices to host, while stopping ch guest.

Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-11-15 13:15:38 +01:00
Praveen K Paladugu
1e8cc91f9d ch: allow hostdev in domain definitions
Allow hostdev configurations in ch guest definitions.

Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-11-15 13:15:35 +01:00
Wei Liu
c6dbc6042d ch: prepare host for PCI passthrough
Prepare host to passthrough PCI devices for ch guests.

Co-authored-by: Wei Liu <liuwe@microsoft.com>
Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-11-15 13:15:31 +01:00
Wei Liu
acfe2e7a50 ch: prepare domain definition for pci passthrough
Check if the domain definition is valid for PCI passthrough and update
it if necessary.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-11-15 13:15:27 +01:00
Wei Liu
5d4f9e1bdd ch: add host device manager to driver
Co-authored-by: Wei Liu <liuwe@microsoft.com>
Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-11-15 13:15:24 +01:00
Praveen K Paladugu
89ef0c0f2b hypervisor: move HostdevHostSupportsPassthroughVFIO
Move HostdevHostSupportsPassthroughVFIO method to hypervisor to be
shared between qemu and ch drivers.

Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-11-15 13:15:22 +01:00
Praveen K Paladugu
b05f6134c9 hypervisor: move HostdevNeedsVFIO to hypervisor
Move HostdevNeedsVFIO method to hypervisor to be reused between qemu
and ch drivers.

Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-11-15 13:15:12 +01:00
Peter Krempa
f35f817ebf qemu: process: Introduce setup of block-device backed NVRAM
In case when a management application will require to store the nvram in
a block device instead of a file libvirt needs to be able to set up the
block device.

This patch introduces support for setting up the block device by using
'qemu-img convert' to produce a qcow2-formatted block device.

The use of 'qcow2' is made mandatory as the UEFI firmware requires that
the NVRAM image has the exact expected size, which is almost impossible
with block devices. 'qcow2' also allows libvirt to detect wheher the
block device is formatted allowing file-like semantics.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-11-15 09:42:05 +01:00
Peter Krempa
fce4319f58 qemu: process: Extract setup of file-backed nvram from template
The setup of nvram will later be extended to also support block-device
backed nvram, so extract the file-backed nvram setup steps from
'qemuPrepareNVRAM' into 'qemuPrepareNVRAMFile'.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-11-15 09:42:05 +01:00
Peter Krempa
32228ecb02 conf: Remove nonsensical requirement of nvram format matching firmware format
The nvram image can have any supported format and there's no technical
requirement of them having the same format. In fact the actual nvram
image doesn't necessarily need to have the same format as the template
if the user is willing to format it themselves (as libvirt is not going
to convert it).

Remove the nonsensical check and adjust tests. The test case required
swapping around the format in order to work properly.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-11-15 09:42:04 +01:00
Peter Krempa
d3016e47be qemuFirmwareMatchDomain: Don't base firmware selection on nvram image format
Basing the selection on the format of the actual NVRAM image makes no
sense as user may format the image themselves.

Additionally it doesn't make much sense to even limit the firmware
selection based on the nvram template itself. As format of the template
is given and firmware images don't really provide any choice.

Remove the limitation so that autoselection can pick a template
regardless of the selected format or template format.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-11-15 09:42:04 +01:00
Peter Krempa
2aa644a2fc qemuPrepareNVRAM: Refuse conversion of NVRAM backing file format
Refuse situations where the user configures a different format for a
file-backed nvram than the template file has.

At this point it's still required that the NVRAM and firmware share
format, but that is going to be relaxed, thus we need to refuse
configurations that the code can't handle.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-11-15 09:42:04 +01:00
Peter Krempa
6540cc08b1 conf: Always format firmware image format
The code historically skipped the 'format' field for 'raw' images as we
didn't output it when no format support was present. Stop misleading and
output the format also for 'raw' images.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-11-15 09:42:04 +01:00
Peter Krempa
49ce561e3f conf: domain: Output 'format' attribute of '<nvram>' also for' raw images
As the 'format' field is meant to carry the format of the nvram image we
should output it even when the image is 'raw'.

Currently this is not a problem but later patches will allow mismatch
between the nvram format and loader format (as nothing really
technically requires them to be the same and this then could become
problem).

Modify the condition and update tests.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-11-15 09:42:04 +01:00
Peter Krempa
366907e520 firmware: Add 'templateFormat' XML attribute and plumb it in
Currently the qemu firmware code weirdly depends on the 'format' field
of the nvram image itself to do the auto-selection process as well as
then uses it to declare the actual type to qemu.

As it's not technically required that the template and the on disk image
share the type introduce a 'templateFormat' field which will split off
from the shared purpose of the type and will be used for the selection
and instantiation process, while 'format' will be left for the actual
type of the on disk image.

This patch introduces the field, adds XML infrastructure as well as
plumbs it to the firmware bits.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-11-15 09:42:04 +01:00
Peter Krempa
a448d4a18a conf: domain: Clarify nvram/loader format logic
Restructure the code to assign first (as this is simpler to refactor in
the future) and avoid mixing implicit value checks with explicit ones by
checking for _NONE.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-11-15 09:42:04 +01:00
Peter Krempa
d57630c282 qemu: Install backing store terminators for 'pflash' blockdevs
The qemu driver does support qcow2 images for the firmware and nvram
pflash devices, but we do not do the full backing chain setup for them
as we don't expect that those images would actually have a backing
store. We don't tell that to qemu though which theoretically can lead to
qemu probing the backing store from the image itself. We don't want that
for now.

Deny qemu probing the backing store by installing a "terminator" empty
virStorageSource as 'backingStore' for pflash and nvram.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-11-15 09:42:03 +01:00
Peter Krempa
6a8bcd1110 qemuFirmwareEnsureNVRAM: Don't try to setup non-local nvram
'qemuFirmwareEnsureNVRAM' which fills the NVRAM configuration bits which
may be missing was basing its decision to do something based on whether
the 'path' field was set. This is insufficient if remote storage is to
be considered.

Use 'virStorageSourceIsEmpty()' instead as that properly considers
remote filesystems and explain why the source is unref'd when the
function decides to rewrite the config.

The 'firmware-auto-efi-format-nvram-qcow2-network-nbd' is modified to
omit filling the 'path' field, which without this fix would result in
the nvram to be reset to a local file.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-11-15 09:42:03 +01:00
Peter Krempa
273157dd9f qemuPrepareNVRAM: Don't attempt to create NVRAM on block device
'virFileRewrite()' which is used to setup the NVRAM image if it doesn't
exist or when it is requested by the user forcibly replaces the
destination file by the file it creates. For block devices this
overwrites the device node file or the symlink pointing to the device
node by a regular file instead of formatting it.

As this not only makes the VM fail to start but also breaks user's /dev/
filesystem forbid it for now.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-11-15 09:42:03 +01:00
Michal Privoznik
24580d13d1 qemu: Move PostParse functions out of qemu_domain.c
Problem with qemu_domain.c is that it's constantly growing. But
there are few options for improvement. For instance, validation
functions were moved out and now live in qemu_validate.c. We can
do the same for PostParse functions, though since PostParse may
modify domain definition, some functions need to be exported from
qemu_domain.c.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2024-11-13 13:13:54 +01:00
Martin Kletzander
bf2af76ec2 qemu_hotplug: Do not report unknown error when hot-unplugging non-existing device
When qemuDomainDeleteDevice() gets "DeviceNotFound" error it is a
special case as we're trying to remove a device which does not exists
any more.  Such occasion is indicated by the return value -2.

Callers of the aforementioned function ought to base their behaviour on
the return value.  However not all callers take as much care for the
return value as one could realistically anticipate.

Follow the usual direction of removing possible backend object (in case
of character devices), remove the device from its XML without waiting
for the device removal from QEMU (since it is already not there) and
basically follow the same algorithm as there is when the device was
removed, skipping over the wait for the device removal.

The overall return value also needs to be adjusted since
qemuDomainDeleteDevice() does not set an error on the -2 return value
and would otherwise trigger an unknown error being reported to the user
or management application.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-11-12 15:41:34 +01:00
Michal Privoznik
aeebb30ba2 Drop unused function declarations
When moving function and/or renaming them sometimes corresponding
change to corresponding header file is not done. This leaves us
with functions that are declared in header files, but nowhere
implemented. Drop such declarations.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-11-12 15:35:53 +01:00
Michal Privoznik
6c50d11276 virnetserverclient.h: Fix typo in comment of virNetServerClientPrivPreExecRestart()
The function the comment is referring to is
virNetServerClientPrivNew() not virNetServerClintPrivNew(). The
latter doesn't even exist.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-11-12 15:35:53 +01:00
Michal Privoznik
d88ebd4374 ch_monitor: Report OS error when removing socket fails
When removing a socket in virCHMonitorClose() fails, a warning is
printed. But it doesn't contain errno nor g_strerror() which may
shed more light into why removing of the socket failed.

Oh, and since virCHMonitorClose() is registered as autoptr
cleanup for virCHMonitor() it may happen that virCHMonitorClose()
is called with mon->socketpath allocated but file not existing
yet (see virCHMonitorNew()). Thus ignore ENOENT and do not print
warning in that case - the file doesn't exist anyways.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-11-12 09:11:47 +01:00
Michal Privoznik
f1f4cbb50a ch_monitor: Avoid possible double free in virCHMonitorClose()
The virCHMonitorClose() is meant to be called when monitor to
cloud-hypervisor process closes. It removes the socket and frees
string containing path to the socket.

In general, there is a problem with the following pattern:

  if (var) {
      do_something();
      g_free(var);
  }

because if the pattern executes twice the variable is freed
twice. That's why we have VIR_FREE() macro. Well, replace plain
g_free() with g_clear_pointer(). Mind you, this is NOT a
destructor where clearing pointers is needless.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-11-12 09:11:36 +01:00
John Levon
4be361a385 test_driver: provide basic disk hotunplug support
Signed-off-by: John Levon <john.levon@nutanix.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-11-12 09:10:44 +01:00
John Levon
c530a96151 test_driver: provide basic disk hotplug support
Add some basic plumbing, based on the qemu driver.

Signed-off-by: John Levon <john.levon@nutanix.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-11-12 09:10:44 +01:00
Martin Kletzander
27ae5e602a qemu_hotplug: Report better error message for platform serial devices
This should be better than the current for both hotplug:

    error: internal error: Invalid target model for serial device

and hot-unplug:

    error: An error occurred, but the cause is unknown

which should not be reached at all.

Resolves: https://issues.redhat.com/browse/RHEL-66222
Resolves: https://issues.redhat.com/browse/RHEL-66223
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-11-11 12:48:42 +01:00
Martin Kletzander
52c2e3e0a7 qemu: Expose qemuChrIsPlatformDevice outside from qemu_command
Then it can be used from qemu_hotplug.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-11-11 12:48:42 +01:00
Boris Fiuczynski
bf0308b2d4 qemu: command: add multi boot device support on s390x
If QEMU supports multi boot device make use of it instead of using the
single boot device machine parameter.

Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-11-11 10:15:06 +01:00
Boris Fiuczynski
3ccf692e08 qemu: capabilities: Add QEMU_CAPS_VIRTIO_CCW_DEVICE_LOADPARM
Add capability QEMU_CAPS_VIRTIO_CCW_DEVICE_LOADPARM to detect multi boot
device support in QEMU by checking the virtio-blk-ccw device property
existence of loadparm.

Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-11-11 10:15:06 +01:00
Michal Privoznik
a3b8753db9 virnetdevopenvswitch: Warn on unsupported QoS settings
Let me preface this with stating the obvious: documentation on
QoS in OVS is very sparse. This is all based on my observation
and OVS codebase analysis.

For the following QoS setting:

  <bandwidth>
    <inbound average="512" peak="1024" burst="32"/>
  </bandwidth>

the following QoS setting is generated into OVS (NB, our XML
values are in KiB/s, OVS has them in bits/s):

  # ovs-vsctl list qos
  _uuid               : a087226b-2da6-4575-ad4c-bf570cb812a9
  external_ids        : {ifname=vnet1, vm-id="7714e6b5-4885-4140-bc59-2f77cc99b3b5"}
  other_config        : {burst="262144", max-rate="8192000", min-rate="4096000"}
  queues              : {0=655bf3a7-e530-4516-9caf-ec9555dfbd4c}
  type                : linux-htb

from which the following topology is generated:

  # for i in qdisc class; do tc -s -d -g $i show dev vnet1; done
  qdisc htb 1: root refcnt 2 r2q 10 default 0x1 direct_packets_stat 0 ver 3.17 direct_qlen 1000
   Sent 2186 bytes 16 pkt (dropped 0, overlimits 0 requeues 0)
   backlog 0b 0p requeues 0

  +---(1:fffe) htb rate 8192Kbit ceil 8192Kbit linklayer ethernet burst 1499b/1mpu 60b cburst 1499b/1mpu 60b level 7
       |       Sent 2186 bytes 16 pkt (dropped 0, overlimits 0 requeues 0)
       |       backlog 0b 0p requeues 0
       |
       +---(1:1) htb prio 0 quantum 51200 rate 4096Kbit ceil 8192Kbit linklayer ethernet burst 32Kb/1mpu 60b cburst 32Kb/1mpu 60b level 0
                 Sent 2186 bytes 16 pkt (dropped 0, overlimits 0 requeues 0)
                 backlog 0b 0p requeues 0

Long story short, the default class (1:) for an OVS interface has
average and peak set exactly as requested. But since it's nested
under another class (1:fffe), it can borrow unused bandwidth. And
the parent is set to have rate = ceil = peak from our XML. From
[1]: htb_tc_install() calls htb_parse_qdisc_details__() which
sets: 'hc->min_rate = hc->max_rate;' and then calls
htb_setup_class_(..., tc_make_handle(1, 0xfffe), tc_make_handle(1, 0), &hc);
to set up the top parent class.

In other words - the interface is set up to so that it can always
consume 'peak' bandwidth and there is no way for us to set it up
differently. It's too late to deny setting 'peak' different to
'average' at XML validation phase so do the next best thing -
throw a warning, just like we do in case <bandwidth/> is set for
an unsupported <interface/> type.

1: https://github.com/openvswitch/ovs/blob/main/lib/netdev-linux.c#L5039
Resolves: https://issues.redhat.com/browse/RHEL-53963
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2024-11-08 10:41:02 +01:00
Michal Privoznik
844d1036eb qemu_domain: Automagically add IOMMU if needed
If a Q35 domain has huge number of vCPUS (over 255, currently), then
it needs IOMMU with Extended Interrupt Mode enabled (see check in
qemuValidateDomainVCpuTopology()).

Well, we already add some devices and to other tricks when
parsing new domain XML. Might as well add IOMMU device if above
condition is met.

Resolves: https://issues.redhat.com/browse/RHEL-65844
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-11-07 10:46:33 +01:00
Michal Privoznik
b15047ff26 qemu: Turn EIM IOMMU on automagically
If a Q35 domain has huge number of vCPUS (over 255, currently), then
it needs IOMMU with Extended Interrupt Mode enabled (see check in
qemuValidateDomainVCpuTopology()).

Well, we already add some devices and to other tricks when
parsing new domain XML. Might as well turn the EIM on for IOMMU
device.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-11-07 10:46:33 +01:00
Michal Privoznik
a9797d7c43 libvirt_private.syms: Export virDomainIOMMUDefNew()
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-11-07 10:46:32 +01:00
Ján Tomko
e45313c031 ch: check return value of virJSONValueArrayAppend
It only errors out when presented with a non-array, but we do check
it everywhere else.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-11-06 17:12:32 +01:00
Ján Tomko
da66bf53b0 util: json: check return value of virJSONValueFromJsonC
In virJSONValueFromJsonC, the return value of virJSONValueFromJsonC
was not checked in one case.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-11-06 17:12:32 +01:00
Ján Tomko
13f40898ab qemu: chardev: avoid impossible overflow
In the rare case where int and long long are not the same size,
the multiplication of an int variable and an int constant might
overflow. Cast the constant to long long to avoid this.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Fixes: baa4edfb79
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-11-06 17:12:32 +01:00
Marc-André Lureau
bb5e26749f qemu: explicit swtpm state locking
With upcoming v0.10 swtpm (commit
aa483aeb6d),
file locking with "lock" option is now supported and reflected in
"tpmstate-opt-lock" capability.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
2024-11-05 15:25:53 +01:00
Marc-André Lureau
f1304cc566 qemu_tpm: handle file/block storage source
When swtpm reports "nvram-backend-dir", it can accepts a single file or
block device where TPM state will be stored. --tpmstate must be
backend-uri=file://<path>.

Teach the storage to use custom directory or file source location.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
2024-11-05 15:25:53 +01:00
Marc-André Lureau
a110042d0c schema: add TPM emulator <source type='dir' path='..'>
Learn to parse a directory for the TPM state.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
2024-11-05 15:25:53 +01:00
Marc-André Lureau
579fd44612 schema: add TPM emulator <source type='file' path='..'>
Learn to parse a file path for the TPM state.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
2024-11-05 15:25:53 +01:00
Marc-André Lureau
6d4eb07a55 tpm: rename 'storagepath' to 'source_path'
Mechanically replace existing 'storagepath' with 'source_path', as the
following patches introduce <source path='..'> configuration.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
2024-11-05 15:25:53 +01:00
Marc-André Lureau
cc0aab9395 util: check swtpm nvram-backend-{dir,file} capabilities
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
2024-11-05 15:25:53 +01:00
Martin Kletzander
a52cd504b3 qemu: Report supported panic device models in domcapabilities
Domain capabilities include information about support for various
devices and models.

Panic devices are not included in the output which means that management
applications need to include the logic for choosing the right device
model or request a default model and try defining such a domain.

Add reporting of panic device models into the domain capabilities based
on the logic in qemuValidateDomainDefPanic() and also report whether
panic devices are supported based on whether at least one model is
supported.  That way consumers of the domain capability XML can
differentiate between libvirt not reporting the panic device models or
no model being supported.

Resolves: https://issues.redhat.com/browse/RHEL-65187
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-11-05 09:57:37 +01:00
Ján Tomko
faf6edfa74 json: do not call json_tokener_free with NULL
Add an error message for the rare case if json_tokener_new
fails (allocation failure) and guard any use of json_tokener_free
where tok might be NULL (this was possible in libvirt-nss
when the json file could not be opened).

https://gitlab.com/libvirt/libvirt/-/issues/581

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reported-by: Simon Pilkington
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
2024-11-04 12:15:10 +01:00
Peter Krempa
d02140383d virstring: Use 'g_new0' instead of improper use of 'g_malloc0_n'
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-11-01 15:52:18 +01:00
Peter Krempa
bb4bd9d31f Replace improper use of g_malloc(0) with g_new0
Completely remove use of g_malloc (without zeroing of the allocated
memory) and forbid further use.

Replace use of g_malloc0 in cases where the variable holding the pointer
has proper type.

In all of the above cases we can use g_new0 instead.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-11-01 15:52:18 +01:00
Peter Krempa
354a3d2be4 virJSONValueFromString: Prefix error message from 'json-c'
The error message from 'json-c' was passed along without any libvirt
string which makes it hard to find in the source and isn't exactly clear
when present in logs:

 libvirtd[843]: internal error : invalid utf-8 string

Prefix the message with 'failed to parse JSON'.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-11-01 15:51:53 +01:00
Jiri Denemark
e71a510605 qemu: Fix maximum physical address size in baseline CPU
We should include maximum physical address size in the CPU definition
created by virConnectBaselineHypervisorCPU only if we know the value for
all input CPUs. Otherwise we would create a CPU definition that is not
usable on all hosts from which we gathered the CPU info.

https://issues.redhat.com/browse/RHEL-24850

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-11-01 10:19:24 +01:00
Laine Stump
7581e3b6d5 Revert "network: add rule to nftables backend that zeroes checksum of DHCP responses"
This reverts commit 42ab0148dd.

This patch was supposed to fix the checksum of dhcp response packets
by setting it to 0 (because having a non-0 but incorrect checksum was
causing the packets to be droppe on FreeBSD guests).

Early testing was positive, but after the patch was pushed upstream
and more people could test it, it turned out that while it fixed the
dhcp checksum problem for virtio-net interfaces on FreeBSD and
OpenBSD, it also *broke* dhcp checksums for the e1000 emulated NIC on
*all* guests (but not e1000e).

So we're reverting this fix and looking for something more universal
to be included in the next release.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2024-10-30 11:39:58 +01:00
Laine Stump
42ab0148dd network: add rule to nftables backend that zeroes checksum of DHCP responses
Many years ago (April 2010), soon after "vhost" in-kernel packet
processing was added to the virtio-net driver, people running RHEL5
virtual machines with a virtio-net interface connected via a libvirt
virtual network noticed that when vhost packet processing was enabled,
their VMs could no longer get an IP address via DHCP - the guest was
ignoring the DHCP response packets sent by the host.

(I've been informed by danpb that the same issue had been encountered,
and "fixed" even earlier than that, in 2006, with Xen as the
hypervisor.)

The "gory details" of the 2010 discussion are chronicled here:

  https://lists.isc.org/pipermail/dhcp-hackers/2010-April/001835.html

but basically it was because packet checksums weren't being fully
computed on the host side (because QEMU on the host and the NIC driver
in the guest had agreed between themselves to turn off checksums
because they were unnecessary due to the "link" between the two being
entirely in local memory rather than an error-prone physical cable),
but

1) a partial checksum was being put into the packets at some point by
   "someone"

2) the "don't use checksums" info was known by the guest kernel, which
   would properly ignore the "bad" checksum), and

3) the packets were being read by the dhclient application on the
   guest side with a "raw" socket (thus bypassing the guest kernel UDP
   processing that would have known the checksum was irrelevant and
   ignore it)),

The "fix" for this ended up being two-tiered:

1) The ISC DHCP package (which contains the aforementioned dhclient
program) made a fix to their dhclient code which caused it to accept
packets anyway even if they didn't have a proper checksum (NB: that's
not a full explanation, and possibly not accurate). This remedied the
problem for guests with an updated dhclient. Here is the code with the
fix to ISC DHCP:

  https://github.com/isc-projects/dhcp/blob/master/common/packet.c#L365

This eliminated the issue for any new/updated guests that had the
fixed dhclient, but it didn't solve the problem for existing/old guest
images that didn't/couldn't get their dhclient updated. This brings us
to:

2) iptables added a new "CHECKSUM" target and "--checksum-fill"
action:

  http://patchwork.ozlabs.org/patch/58525/

and libvirt added an iptables rule for each virtual network to match
DHCP response packets and perform --checksum-fill. This way by the
time dhclient on the guest read the raw packet, the checksum would be
corrected, and the packet would be accepted. This was pushed upstream
in libvirt commit v0.8.2-142-gfd5b15ff1a.

The word at the time from those more knowledgeable than me was that
the bad checksum problem was really specific to ISC's dhclient running
on Linux, and so once their fix was in use everywhere dhclient was
used, bad checksums would be a thing of the past and the
--checksum-fill iptables rules would no longer be needed (but would
otherwise be harmless if they were still there).

(Plot twist: the dhclient code in fix (1) above apparently is on a
Linux-only code path - this is very important later!)

Based on this information (and also due to the opinion that fixing it
by having iptables modify the packet checksum was really the wrong way
to permanently fix things, i.e. an "ugly hack"), the nftables
developers made the decision to not implement an equivalent to
--checksum-fill in nftables. As a result, when I wrote the nftables
firewall backend for libvirt virtual networks earlier this year, it
didn't add in any rule to "fix" broken UDP checksums (since there was
apparently no equivalent in nftables and, after all, that was fixed
somewhere else 14 years ago, right???)

But last week, when Rich Jones was doing routine testing using a Fedora
40 host (the first Fedora release to use the nftables backend of libvirt's
network driver by default) and a FreeBSD guest, for "some strange
reason", the FreeBSD guest was unable to get an IP address from DHCP!!

  https://www.spinics.net/linux/fedora/libvirt-users/msg14356.html

A few quick tests proved that it was the same old "bad checksum"
problem from 2010 come back to haunt us - it wasn't a Linux-only issue
after all.

Phil Sutter and Eric Garver (nftables people) pointed out that, while
nftables doesn't have an action that will *compute* the checksum of a
packet, it *does* have an action that will set the checksum to 0, and
suggested we try adding a "zero the checksum" rule for dhcp response
packets to our nftables ruleset. (Why? Because a checksum value of 0
in a IPv4 UDP packet is defined by RFC768 to mean "no checksum
generated", implying "checksum not needed").  It turns out that this
works - dhclient properly recognizes that a 0 checksum means "don't
bother with the checksum", and accepts the packet as valid.

So to once again fix this timeless bug, this patch adds such a
checksum zeroing rule to the nftables rules setup for each virtual
network.

This has been verified (on a Fedora 40 host) to fix DHCP with FreeBSD
and OpenBSD guests, while not breaking it for Fedora or Windows (10)
guests.

Fixes: b89c4991da
Reported-by: Rich Jones <rjones@redhat.com>
Fix-Suggested-by: Eric Garver <egarver@redhat.com>
Fix-Suggested-by: Phil Sutter <psutter@redhat.com>
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-10-25 12:00:52 -04:00
Laine Stump
d5af1e90bb network: don't unset the firewalld zone if it's going to be immediately re-set
Any time the firewalld zone for an interface is set, by definition
that removes it from any previous zone that it was in, so there is
really no point in unsetting the zone if it's just going to be
immediately set again.

This is useful because when firewalld reloads its rules, 3 things happen:

1) firewalld flushes *all* firewall rules (including those added by libvirt)

2) firewalld unsets the zones for all interfaces (including those set
   by libvirt)

3) firewalld re-adds its own rules, and sets the zone for all the
   interfaces it manages

4) firewalld sends a dbus message that libvirt is watching for, and
   when libvirt receives that message, it reloads all of the
   libvirt-generated rules, and also re-sets the firewalld zone for
   the bridge interfaces managed by libvirt.

libvirt accomplishes step 4 by a) calling
networkRemoveFirewallRules(), and then b) calling
networkAddFirewallRules(). But (because it is useful in other
contexts) networkRemoveFirewallRules() will attempt to *unset* the
zone for each bridge interface, and when firewalld receives this
request, it sees that the bridge interface *has no zone* (because it
was unset by firewalld in step (2) above), and thus logs an error
message.

There is no way for libvirt to suppress an error message that is
logged by firewalld when a request to firewalld fails. But what
libvirt *can* do is realize that in these cases, the firewalld zone is
about to be set again anyway, and so we don't need to unset the zone.

This patch handles that by adding a bool unsetZone to the arguments of
networkRemoveFirewallRules(); most calls to networkRemoveFirewallRules()
have unsetZone=true, but in two cases where the zone is about to be
reset, networkRemoveFirewallRules() is called with unsetZone=false,
which prevents the call to virFirewallDInterfaceUnsetZone() and thus
avoids the unnecessary (and confusing to users!) error message that
would have been logged by firewalld.

Signed-off-by: Laine Stump <laine@redat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-10-24 12:31:03 -04:00
Laine Stump
e8228a9e79 network: ignore/don't log errors when unsetting firewalld zone
The most common "error" when trying to unset the firewalld zone of an
interface is for firewalld to tell us that the interface already isn't
in any zone. Since this is what we want, no need to alarm the user by
logging it as an error.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-10-24 11:50:41 -04:00
Jiri Denemark
f4dc248a95 domain_capabilities: Report CPU blockers
When a CPU model is reported as usable='no' an additional
<blockers model='...'> element is added for that CPU model to show which
features are missing for the CPU model to become usable.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-10-24 15:53:51 +02:00
Jiri Denemark
016be5510a domain_capabilities: Sort CPU models
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-10-24 15:53:51 +02:00
Jiri Denemark
0c6134f190 util: Introduce virStringListRemoveDuplicates
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-10-24 15:53:51 +02:00
Jiri Denemark
f928eb5fc8 qemu: Change CPU comparison algorithm for future models
When starting a domain we check whether the guest CPU definition is
compatible with the host (i.e., when the host supports all features
required both explicitly and by the specified CPU model) as long as
check == 'partial', which is the default.

We are doing so by checking our definition of the CPU model in the CPU
map amending it with explicitly mentioned features and comparing it to
features QEMU would enabled when started with -cpu host. But since our
CPU model definitions often slightly differ from QEMU we may be checking
features which are not actually needed and on the other hand not
checking something that is part of the CPU model in QEMU.

This patch changes the algorithm for CPU models added in the future
(changing it for existing models could cause them to suddenly become
incompatible with the host and domains using them would fail to start).
The new algorithm uses information we probe from QEMU about features
that block each model from being directly usable. If all those features
are explicitly disabled in the CPU definition we consider the base model
compatible with the host. Then we only need to check that all explicitly
required features are supported by QEMU on the host to get the result
for the whole CPU definition.

After this we only use the model definitions (for newly added models)
from CPU map for creating a CPU definition for host-model.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-10-23 16:00:45 +02:00
Jiri Denemark
e373f87034 qemu: Introduce virQEMUCapsGetCPUBlockers
A function for accessing a list of features blocking CPU model
usability.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-10-23 16:00:44 +02:00
Jiri Denemark
5f8abbb7d0 cpu: Introduce virCPUCompareUnusable
As opposed to the existing virCPUCompare{,XML} this function does not
use CPU model definitions from CPU map. It relies on CPU model usability
info from a hypervisor with a list of blockers that make the selected
CPU model unusable. Explicitly requested features are checked against
the hypervisor's view of a host CPU.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-10-23 16:00:44 +02:00
Jiri Denemark
591b364f49 qemu: Separate partial CPU check into a function
The new qemuDomainCheckCPU function is used as a replacement for
virCPUCompare to make sure all callers use the same comparison
algorithm. As a side effect qemuConnectCompareHypervisorCPU now properly
reports CPU compatibility for CPU model that are considered runnable by
QEMU even if our definition of the model disagrees.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-10-23 16:00:44 +02:00
Jiri Denemark
52d2a8eb6c qemu: Use virCPUCompare in qemuConnectCompareHypervisorCPU directly
The function already parses CPU XML on s390. By parsing it consistently
on all architecture we can switch to virCPUCompare and easily replace it
with a QEMU specific helper in the following patch.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-10-23 16:00:44 +02:00
Jiri Denemark
1c45473b93 qemu: Use g_autoptr in qemuConnectCompareHypervisorCPU
Let's get rid of the only explicitly freed variable left in
qemuConnectCompareHypervisorCPU.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-10-23 16:00:44 +02:00
Jiri Denemark
5475688a29 cpu: Introduce virCPUGetCheckMode
On x86 the function returns whether an old style compat check mode
should be used for a specified CPU model according to the CPU map. All
other architectures will always use compat mode.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-10-23 16:00:44 +02:00
Jiri Denemark
cd93f7ddab cpu_map: Use compat partial check for all x86 CPU models
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-10-23 16:00:44 +02:00
Jiri Denemark
f8ade72c2b cpu_x86: Introduce <check> element for CPU models
CPU models in the CPU map may be marked with <check partial="compat"/>
to indicate a backward compatible partial check (comparing our
definition of the model with the host CPU) should be performed. Other
models will be checked using just runnability info from QEMU.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-10-23 16:00:44 +02:00
Peter Krempa
36080e1b57 qemu: snapshot: Delete leftover overlay files for <transient/> disks
When a VM is terminated by host reboot libvirt doesn't get to cleaning
out the temporary overlay file used for transient disks. Since we create
those files with a very specific suffix it's almost guaranteed that if
it exists it's a leftover from a libvirt run. Delete them instead of
complaining to preserve functionality.

Closes: https://gitlab.com/libvirt/libvirt/-/issues/684
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-10-22 08:15:36 +02:00
Peter Krempa
7cbe9e94c4 util: bitmap: Rewrite virBitmapShrink using new helpers
Rather than reimplement everything manually use virBitmapBuffsize to
find the current number of units, realloc the buffer and clear the tail
using virBitmapClearTail().

This fixes a corner case where the buffer would be over-allocated by one
unit when shrinking to the boundary of the unit size.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-10-17 17:09:24 +02:00
Peter Krempa
e506e0b3f1 util: virbitmap: Extract clearing of unused bits at the end of the last unit
Extract the clearing of the traling bits from 'virBitmapSetAll' into a
new helper.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-10-17 17:09:24 +02:00
Peter Krempa
e572150ebe virbitmap: Extract and reuse buffer size calculation into a function
Calculating the number of element can come handy in multiple places,
extract it from virBitmapNew.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-10-17 17:09:24 +02:00
Peter Krempa
cfe638ef80 virBitmapNewCopy: Honor sizes of either bitmap when doing memcpy()
'virBitmapNewCopy()' allocates a new bitmap with the same number of bits
but uses the internal allocation length as argument for the memcpy()
operation to copy the bits. Due to bugs in other code these may not be
the same resulting into a buffer overflow if the source is
over-allocated. Use the buffer length of the target bitmap instead.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-10-17 17:09:24 +02:00
Martin Kletzander
f7c89763b1 qemu: Do not hardcode Hyper-V feature names on command line
When constructing the command line for QEMU, some Hyper-V features were
hardcoded, probably due to the fact that they could not have been
automatically translated from the libvirt feature name to QEMU CPU
feature name.

Well now they can be, thanks to their additions to the
virQEMUCapsCPUFeaturesX86 translation table.

Translate all such features the same way when constructing the command
line.  This way any future feature that is not translated will be caught
by tests (if a test is added for it) which was not the case when it was
just hardcoded.  Hopefully this avoids at least some possible future
issues.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-10-17 00:43:36 +02:00
Martin Kletzander
ca8c0862ac qemu: Add more translations to virQEMUCapsCPUFeatureTranslationTable
Hyper-V enlightenment features can have hyphenated names which libvirt
exposes under Hyper-V features with underscored names.  When libvirt
checks that all requested features were enabled by QEMU (on x86
architectures) it first queries for all those that QEMU knows and
compiles them in a map while using the virQEMUCapsCPUFeaturesX86 for
translations.

Some features (well, all Hyper-V features with underscores) were not
present in the translation table and were incorrectly reported as not
enabled, consequently failing the start of any such domain.

Add all hyphenated/underscored Hyper-V feature names into the
aforementioned translation table.  That way domains with these features
enabled can be started when QEMU and the kernel support them.

Resolves: https://issues.redhat.com/browse/RHEL-7122
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-10-17 00:43:35 +02:00
Adam Julis
0fd36e9656 lxc: fix variable storage order before call
virDomainConfNWFilterInstantiate() was called without updated
net->ifname, it caused in some cases throwing error message. If
function failed, change is reverted.

Resolves: https://gitlab.com/libvirt/libvirt/-/issues/658
Signed-off-by: Adam Julis <ajulis@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-10-16 16:30:19 +02:00
Martin Kletzander
f2710260d4 qemu_namespace: Only replicate labels on created files
Function qemuNamespaceMknodOne() is trying to replicate a file from the
parent namespace as perfectly as possible, with the same permissions,
labels, ACLs, etc.

If that file already existed it means that the qemu process is probably
using it already and the current setting is probably more correct than
the ones from the parent namespace.

In order to reflect that only replicate the file metadata when it was
(re-)created in this function.

Resolves: https://issues.redhat.com/browse/RHEL-62174
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-10-16 15:07:10 +02:00
Martin Kletzander
26f249034d qemu_namespace: Properly report new files
Function qemuNamespaceMknodOne() is supposed to return 0 if the file did
not exist before this function.  If, however, the file existed, but was
removed and recreated by this function the @existed flag should be reset
to its proper state (false) because the function then behaves the same
way as if the file did not exist as it needed to be recreated.

So reset the @existed flag to properly reflect what happened.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-10-16 15:07:10 +02:00
Martin Kletzander
2b19f4b82d qemu_namespace: Rename variable
The boolean actually tells whether the file existed when the function
was called and using it in more places later on makes them
confusing (e.g. do something with a file if it does not exist).  To
better reflect the above and prepare for next patch rename this
variable.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-10-16 15:07:10 +02:00
Peter Krempa
baa4edfb79 qemu: chardev: Use 'reconnect-ms' instead of deprecated 'reconnect'
qemu-9.2 will deprecate the 'reconnect' field in favor of
'reconnect-ms'. As libvirt currently doesn't track the timeouts in
milliseconds we simply convert them to avoid use of the deprecated
field.

Quite a lot of churn is caused by the need to plumb 'qemuCaps' into the
chardev props generator.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-10-16 14:39:09 +02:00
Peter Krempa
23fa1d2184 qemu: capabilities: Introduce QEMU_CAPS_CHARDEV_RECONNECT_MILISECONDS
New qemu introduced the 'reconnect-ms' field for character devices
allowing the reconnect timeout to be specified in milliseconds, which
also deprecates the existing 'reconnect' field that libvirt uses.

To avoid use of deprecated interfaces add a capability which will allow
us to use the new field.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-10-16 14:39:09 +02:00
Andrea Bolognani
81493d8eb6 apparmor: Allow running i686 VMs on Debian 12
In Debian 12, the qemu-system-i386 binary in /usr/bin is a wrapper
script, with the actual executable living in /usr/libexec instead.
This makes it impossible to run i686 VMs when AppArmor is enabled.

Allow running the actual binary.

https://bugs.debian.org/1030926

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Jim Fehlig <jfehlig@suse.com>
2024-10-16 09:46:49 +02:00
Ján Tomko
e996536a3b Remove pointless bool conversions
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-10-15 14:48:35 +02:00
Peter Krempa
e2c6f4c800 qemu: snapshot: Remove dead code in 'qemuSnapshotDeleteBlockJobRunning'
'qemuSnapshotDeleteBlockJobIsRunning' returns only 0 and 1. Convert it
to bool and remove the dead code handling -1 return in the caller.

Closes: https://gitlab.com/libvirt/libvirt/-/issues/682
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-10-14 16:25:21 +02:00
Peter Krempa
04d6a0ec5d qemu: migration: Fix blockdev config with VIR_MIGRATE_PARAM_MIGRATE_DISKS_DETECT_ZEROES
The idea of migration with VIR_MIGRATE_PARAM_MIGRATE_DISKS_DETECT_ZEROES
populated is to sparsify the image. The QEMU NBD client as it was
configured in commit 621f879adf would
signal to the destination to do thick allocation of holes which would
result in a non-sparse image for any backend except a qcow2 image which
I used to test it.

Switch to VIR_DOMAIN_DISK_DETECT_ZEROES_UNMAP and
VIR_DOMAIN_DISK_DISCARD_UNMAP which tells the NBD client (and that in
turn the NBD server) to preserve the sparse blocks it detected from the
image.

Fixes: 621f879adf
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-10-14 16:25:21 +02:00
Jiri Denemark
0c653fc9a5 util: Rename variable "major" in virIsDevMapperDevice
major() is a macro defined in sys/sysmacros.h so luckily the code works,
but it's very confusing. Let's rename the local variable to make the
difference between it and the macro more obvious. And while touching the
line we can also initialize it to make sure "clever" analyzers do not
think it may be used uninitialized.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
2024-10-14 11:48:50 +02:00
Laine Stump
37800af9a4 network: inhibit idle timeout of daemon if there are any active networks
When the daemons were split out from the monolithic libvirtd, the
network driver didn't implement "inhibit idle timeout if there are any
active objects" as was done for other drivers, so virtnetworkd would
always exit after 120 seconds of no incoming connections. This didn't
every cause any visible problem, although it did mean that anytime a
network API was called after an idle time > 120 seconds, that the
restarting virtnetworkd would flush and reload all the
iptables/nftables rules for any active networks.

This patch replicates what is done in the QEMU driver - an nactive is
added to the network driver object, along with an inhibitCallback; the
latter is passed into networkStateInitialize when the driver is
loaded, and the former is incremented for each already-active network,
then incremented/decremented each time a network is started or
stopped. If nactive transitions from 0 to 1 or 1 to 0, inhibitCallback
is called, and it "does the right stuff" to prevent/enable the idle
timeout.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-10-10 14:07:12 -04:00
Jim Fehlig
d721b6840f libxl: Reject VM config referencing nwfilters
The Xen libxl driver does not support nwfilter. Introduce a
deviceValidateCallback function with a check for nwfilters, returning
VIR_ERR_CONFIG_UNSUPPORTED if any are found. Also fail to start any
existing VMs referencing nwfilters.

Drivers generally ignore unrecognized XML configuration, but ignoring
a user's request to filter VM network traffic can be viewed as a
security issue.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2024-10-10 08:39:12 -06:00
Laine Stump
c0ba3ed69d network: a different implementation of *un*setting firewalld zone when network is destroyed
(this is a remake of commit v10.7.0-78-g200f60b2e1, which was reverted
due to a regression in another patch it was dependent on. The new
implementation just adds the call to virFirewallDInterfaceUnsetZone()
into the existing networkRemoveFirewallRules() (but only if we had set
a zone when the network was first started).

Replaces: 200f60b2e1
Resolves: https://issues.redhat.com/browse/RHEL-61576
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-10-09 15:54:08 -04:00
Laine Stump
cb4e38d4b1 network: a different way of supporting firewalld zone for mode='open' networks
Now that networkAddFirewallRules and networkRemoveFirewallRules() are
being called for mode='open' networks, we just need to move the code
that sets the zone outside of the if (mode != ...OPEN) clause, so that
it's done for all forward modes, with the exception of setting the
implied 'libvirt*' zones, which are set when no zone is specified for
all forward modes *except* 'open'.

This was previously done in commit v10.7.0-76-g1a72b83d56, but in a
manner that caused the zone to be unset whenever firewalld reloaded
its rules. That patch was reverted, and this new better patch takes
its place.

Replaces: 1a72b83d56
Resolves: https://issues.redhat.com/browse/RHEL-61576
Re-Resolves: https://gitlab.com/libvirt/libvirt/-/issues/215
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-10-09 15:54:08 -04:00
Laine Stump
d552d810b9 network: call network(Add|Remove)FirewallRules() for forward mode='open'
Previously networkAddFirewallRules() and networkRemoveFirewallRules()
were only called if the forward mode was none, 'route', or 'nat', so
those functions didn't check the forward mode. Although their current
contents shouldn't be executed for forward mode='open', soon they will
have extra functionality that should be executed for all the current
forward modes and also mode='open'.

This patch modifies all places either of the functions are called to
make sure they are called for mode='open' in addition to current modes
(by either adding 'case ..._OPEN:' to the case of a switch statement,
or just removing an 'if (mode != ...OPEN)' around the calls; to
balance out for that, it puts the entirety of the contents of both
functions inside if (mode != ...OPEN) to retain current behavior. (an
upcoming patch will add code outside that if clause).

debug log messages were also added to make it easier to test that the
right thing is being done in all cases.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-10-09 15:54:08 -04:00
Laine Stump
ef760a4133 Revert "network: support setting firewalld zone for bridge device of open networks"
This reverts commit 1a72b83d56. That
patch had made the incorrect assumption that the firewalld zone of a
bridge would not be changed/removed when firewalld reloaded its rules
(e.g. with "killall -HUP firewalld"). It turns out my memory was
faulty, and this *does* remove the bridge interface's zone, which
results in guest networking failure after a firewalld reload, until
the virtual network is restarted.

The functionality reverted as a result of this patch reversion will be
added back in an upcoming patch that keeps the zone setting in
networkAddFirewallRules() (rather than moving it into a separate
function) so that it is called every time the network's firewall rules
are reloaded (including the reload that happens in response to a
reload notification from firewalld).

Signed-off-by: Laine Stump
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-10-09 15:54:08 -04:00
Laine Stump
816876f517 Revert "network: *un*set the firewalld zone while shutting down a network"
This reverts commit 200f60b2e1. The same
functionality will be re-added in a different way in an upcoming patch.

Signed-off-by: Laine Stump
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-10-09 15:54:08 -04:00
Jim Fehlig
bd6d7ebf62 qemu: Use consistent naming for save image format
The image format setting in qemu.conf is named 'save_image_format'. The
enum of supported format types is declared with name 'virQEMUSaveFormat'.
Let's be consistent and use 'format' instead of 'compressed' when referring
to the save image format.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2024-10-09 13:48:39 -06:00
Jim Fehlig
b0dc8a923d qemu: conf: Improve the foo_image_format setting descriptions
The current description of the various foo_image_format settings can
be construded to imply the setting is only used to control compression
of the image. Improve the documentation to clarify that format describes
the representation of guest memory blocks on disk, which includes
compression among other possible layouts.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2024-10-09 13:48:38 -06:00
Peter Krempa
aa08a30048 qemu: snapshot: Allow internal snapshots with PFLASH nvram
With the new snapshot QMP command we can select which block device
backend receives the VM state and thus the main issue with internal
snapshots with pflash was addressed.

Thus we can relax the check and allow snapshots if the pflash nvram is
on qcow2.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-10-09 16:00:43 +02:00
Peter Krempa
8be8b7de78 qemuSnapshotActiveInternalDeleteGetDevices: Add warning when deleting inconsistent snapshot
As explained in the commit which added the new internal snapshot
deletion code we don't want to do any form of strict checking whether
the libvirt metadata is consistent with the on-disk state as we didn't
historically do that.

In order to be able to spot the cases add a warning into the logs if
such state is encountered. While warnings are easy to miss it's the only
reasonable way to do that. Users will be encouraged to file an issue
with the information, without requiring them to enable debug logs as
the reproduction of that issue may include very old historical state.

The checker is deliberately added separately so that it can be easily
reverted once it's no longer needed.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-10-09 16:00:43 +02:00
Peter Krempa
eac1a86f72 qemu snapshot: use QMP snapshot-delete for internal snapshots deletion
Switch to using the modern QMP command.

As the user visible logic when deleting internal snapshots using the old
'delvm' command was very lax in terms of catching inconsistencies
between the snapshot metadata and on-disk state we re-implement this
behaviour even using the new command. We could improve the validation
but that'd go at the cost of possible failures which users might not
expect.

As 'delvm' was simply ignoring any kind of failure the selection of
devices to delete the snapshot from is based on querying qemu first
which top level images do have the internal snapshot and then continuing
only on those.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-10-09 16:00:43 +02:00
Nikolai Barybin via Devel
b93af62c40 qemu snapshot: use QMP snapshot-save for internal snapshots creation
The usage of HMP commands are highly discouraged by qemu. Moreover,
current snapshot creation routine does not provide flexibility in
choosing target device for VM state snapshot.

This patch makes use of QMP commands snapshot-save and by
default chooses first writable non-shared qcow2 disk (if present)
as target for VM state.

Signed-off-by: Nikolai Barybin <nikolai.barybin@virtuozzo.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-10-09 16:00:43 +02:00
Peter Krempa
6d8ae98fa0 qemu: monitor: Store internal snapshot names from 'query-named-block-nodes'
Store the names of internal snapshots present in supported images in the
data we dump from 'query-named-block-nodes' so that the upcoming changes
to the internal snapshot code can access it.

To test this we use the bitmap detection test cases which can be easily
extended to dump this data.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-10-09 16:00:43 +02:00
Nikolai Barybin via Devel
9df1453db8 qemu: capabilities: Introduce QEMU_CAPS_SNAPSHOT_INTERNAL_QMP capability
The 'snapshot-save/delete' QMP commands were introduced in QEMU 6.0.0,
so we add a compatible capability to check if target QEMU binary supports it.

Signed-off-by: Nikolai Barybin <nikolai.barybin@virtuozzo.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-10-09 15:22:00 +02:00
Nikolai Barybin via Devel
ce4ed8deef qemu: blockjob: Add job types for 'snapshot-save/delete'
The snapshot creation/deletion QMP commands use the qemu 'job' API
to signal completion thus we need to add corresponding job types.

As the job handles everything internally we don't store anything about
the job.

Signed-off-by: Nikolai Barybin <nikolai.barybin@virtuozzo.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-10-09 15:22:00 +02:00
Nikolai Barybin via Devel
5d0773633a qemu: monitor: Add plumbing for 'snaphot-save'/'snapshot-delete' QMP commands
Signed-off-by: Nikolai Barybin <nikolai.barybin@virtuozzo.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-10-09 15:22:00 +02:00
Peter Krempa
2e325804cc qemuDomainObjWait: Annotate with G_GNUC_WARN_UNUSED_RESULT
Callers must handle the return value of this function as the VM might
have died. Add compiler annotation to force it.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-10-09 15:22:00 +02:00
Jiri Denemark
93d97d8fa2 cpu_map: Drop vmx-invvpid-single-context from CPU models
QEMU calls the same feature differently, but translating the names in
libvirt does not make sense because the name in QEMU conflicts with
another feature. QEMU will not change the name for compatibility reasons
so we can just drop our invented name as it is not supported by QEMU.
Apart from this slightly different reason behind the feature being
unsupported by QEMU the situation is similar to vmx-ept-{uc,wb} dropped
in the previous patch and so is the implications.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-10-09 14:46:51 +02:00
Jiri Denemark
b1d4196580 cpu_map: Drop vmx-ept-{uc,wb} features from CPU models
Although QEMU knows and enables the corresponding MSR bits, it does not
allow users to configure them (there are no names attached to them).
They should have never been added to the CPU map and definitely not to
CPU models as the features will always be considered disabled regardless
on their actual state as QEMU will not report them.

While we cannot drop them completely for backward compatibility, we can
at least remove them from all CPU models.

This is effectively no change for CPU models where the features were
marked with added='yes' because migration source would always remove the
features from domain XML so not adding them to the live XML does not
hurt. On the other side the destination could not ever be surprised by
the features being suddenly enabled as QEMU never reports them, which
means libvirt considers them disabled all the time.

GraniteRapids CPU model is the only one which contains the feature ever
since it was introduced in libvirt, but it was never possible to migrate
a domain with such CPU. The source would always mark vmx-ept-wb as
disabled and the destination without the fixes in this series would drop
the feature from the XML completely as it is unsupported by QEMU and
disabled, but when probing for the actual CPU created by QEMU libvirt
would expect the feature to be enabled (as it is included in the CPU
model and not explicitly mentioned in the domain definition) and fail
the migration. There's nothing the source could do to workaround the
behavior on the destination and migration to older libvirt will still be
broken. But it's possible to migrate a domain with GraniteRapids to a
destination with this series applied from both old and new source.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-10-09 14:46:51 +02:00
Jiri Denemark
29aa9b02aa qemu: Replace big condition in virQEMUCapsCPUFilterFeatures with array
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-10-09 14:46:51 +02:00