Commit Graph

13705 Commits

Author SHA1 Message Date
Daniel P. Berrangé
e40a533118 qemu: set swtpm log level parameter
This wires up the emulator 'debug' parameter to control the
/usr/bin/swtpm 'level' parameter for logging.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2024-07-05 14:43:15 +01:00
Adam Julis
c3302ceb1d qemuDomainChangeNet: forbid changing portgroup
Changing the postgroup attribute caused unexpected behavior.
Although it can be implemented, it has a non-trivial solution.
No requirement or use has yet been found for implementing this
feature, so it has been disabled for hot-plug.

Resolves: https://issues.redhat.com/browse/RHEL-7299
Signed-off-by: Adam Julis <ajulis@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-03 09:59:10 +02:00
Michal Privoznik
cf7d495324 qemu: Drop _virQEMUDriver::hostFips
The 'hostFips' member of _virQEMUDriver struct is not used
really, due to previous cleanups. Drop it.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-07-02 09:14:24 +02:00
Michal Privoznik
ce48d584cc qemu_capabilities: Retire QEMU_CAPS_VXHS
The support for VXHS device was removed in QEMU commit
v5.1.0-rc1~16^2~10. Since we require QEMU-5.2.0 at least there's
no QEMU that has the device and thus the corresponding capability
can be retired.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-07-02 09:14:23 +02:00
Michal Privoznik
295eb1b3d8 qemu_capabilities: Retire QEMU_CAPS_ENABLE_FIPS
The capability is no longer used. Retire it.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-07-02 09:14:22 +02:00
Michal Privoznik
8cf81de8bf qemu_capabilities: Drop version check for QEMU_CAPS_ENABLE_FIPS and QEMU_CAPS_NETDEV_USER
Now that the minimal required version of QEMU is 5.2.0 the
conditional setting of QEMU_CAPS_ENABLE_FIPS and
QEMU_CAPS_NETDEV_USER is effectively a dead code. Drop it.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-07-02 09:14:20 +02:00
Michal Privoznik
073bf16784 qemu_capabilities: Require QEMU-5.2.0 or newer
According to repology.org and/or distro repos these are the version of QEMU:

     CentOS Stream 9: qemu-kvm-9.0.0
           Debian 11: qemu-5.2.0
           Fedora 39: qemu-8.3.1
  openSUSE Leap 15.3: qemu-5.2.0
              RHEL-8: qemu-6.2.0
        Ubuntu 22.04: qemu-6.2.0

Since the minimal version is 5.2.0 we can bump from 4.2.0 to
5.2.0.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-07-02 09:14:18 +02:00
Michal Privoznik
8f34fd0c4c qemu_domain: Set 'passt' net backend if 'default' is unsupported
It may happen that QEMU is compiled without SLIRP but with
support for passt. In such case it is acceptable to alter user
provided configuration and switch backend to passt as it offers
all the features as SLIRP.

Resolves: https://issues.redhat.com/browse/RHEL-45518
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-07-01 12:40:06 +02:00
Michal Privoznik
bd6060d1c3 qemu_validate: Use domaincaps to validate supported net backend type
Now that the logic for detecting supported net backend types has
been moved to domain capabilities generation, we can just use it
when validating net backend type. Just like we do for device
models and so on.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-07-01 12:39:10 +02:00
Michal Privoznik
6a0f45a9e0 qemu_capabilities: Fill supported net backend types
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-07-01 12:37:27 +02:00
Michal Privoznik
73fc20e262 qemu_validate: Validate net backends against QEMU caps
Now that we have a capability for each domain net backend we can
start validating user's selection against QEMU capabilities.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-07-01 12:33:14 +02:00
Michal Privoznik
e28bc15f09 qemu_capabilities: Introduce QEMU_CAPS_NETDEV_USER
Since -netdev user can be disabled during QEMU compilation, we
can't blindly expect it to just be there. We need a capability
that tracks its presence.

For qemu-4.2.0 we are not able to detect the capability so do the
next best thing - assume the capability is there. This is
consistent with our current behaviour where we blindly assume the
capability, anyway.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-07-01 12:32:16 +02:00
Jon Kohler
76e2dae01a qemu: fix switchover-ack regression for old qemu
When enabling switchover-ack on qemu from libvirt, the .party value
was set to both source and target; however, qemuMigrationParamsCheck()
only takes that into account to validate that the remote side of the
migration supports the flag if it is marked optional or auto/always on.

In the case of switchover-ack, when enabled on only the dst and not
the src, the migration will fail if the src qemu does not support
switchover-ack, as the dst qemu will issue a switchover-ack msg:
qemu/migration/savevm.c ->
  loadvm_process_command ->
    migrate_send_rp_switchover_ack(mis) ->
      migrate_send_rp_message(mis, MIG_RP_MSG_SWITCHOVER_ACK, 0, NULL)

Since the src qemu doesn't understand messages with header_type ==
MIG_RP_MSG_SWITCHOVER_ACK, qemu will kill the migration with error:
  qemu-kvm: RP: Received invalid message 0x0007 length 0x0000
  qemu-kvm: Unable to write to socket: Bad file descriptor

Looking at the original commit [1] for optional migration capabilities,
it seems that the spirit of optional handling was to enhance a given
existing capability where possible. Given that switchover-ack
exclusively depends on return-path, adding it as optional to that cap
feels right.

[1] 61e34b0856 ("qemu: Add support for optional migration capabilities")

Fixes: 1cc7737f69 ("qemu: add support for qemu switchover-ack")

Signed-off-by: Jon Kohler <jon@nutanix.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Avihai Horon <avihaih@nvidia.com>
Cc: Jiri Denemark <jdenemar@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: YangHang Liu <yanghliu@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-06-28 08:50:12 +02:00
Michal Privoznik
fbe97ee17d qemu_validate: Use domaincaps to validate supported launchSecurity type
Now that the logic for detecting supported launchSecurity types
has been moved to domain capabilities generation, we can just use
it when validating launchSecurity type. Just like we do for
device models and so on.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-25 14:46:08 +02:00
Michal Privoznik
66df7992d8 qemu: Fill launchSecurity in domaincaps
The inspiration for these rules comes from
qemuValidateDomainDef().

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-25 14:46:05 +02:00
Michal Privoznik
d00816209e qemu_capabilities: Probe SEV capabilities even for QEMU_CAPS_SEV_SNP_GUEST
While it's very unlikely to have QEMU that supports SEV-SNP but
doesn't support plain SEV, for completeness sake we ought to
query SEV capabilities if QEMU supports either. And similarly to
QEMU_CAPS_SEV_GUEST we need to clear the capability if talking to
QEMU proves SEV is not really supported.

This in turn removes the 'sev-snp-guest' capability from one of
our test cases as Peter's machine he uses to refresh capabilities
is not SEV capable. But that's okay. It's consistent with
'sev-guest' capability.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-25 14:46:00 +02:00
Rayhan Faizel
9b0606ef8e qemu_block: Validate number of hosts for iSCSI disk device
An iSCSI device with zero hosts will result in a segmentation fault. This patch
adds a check for the number of hosts, which must be one in the case of iSCSI.

Minimal reproducing XML:

<domain type='qemu'>
    <name>MyGuest</name>
    <uuid>4dea22b3-1d52-d8f3-2516-782e98ab3fa0</uuid>
    <os>
        <type arch='x86_64'>hvm</type>
    </os>
    <memory>4096</memory>
    <devices>
        <disk type='network'>
            <source name='dummy' protocol='iscsi'/>
            <target dev='vda'/>
        </disk>
    </devices>
</domain>

Signed-off-by: Rayhan Faizel <rayhan.faizel@gmail.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-06-25 10:05:49 +02:00
Jon Kohler
1cc7737f69 qemu: add support for qemu switchover-ack
Add plumbing for QEMU's switchover-ack migration capability, which
helps lower the downtime during VFIO migrations. This capability is
enabled by default as long as both the source and destination support
it.

Note: switchover-ack depends on the return path capability, so this may
not be used when VIR_MIGRATE_TUNNELLED flag is set.

Extensive details about the qemu switchover-ack implementation are
available in the qemu series v6 cover letter [1] where the highlight is
the extreme reduction in guest visible downtime. In addition to the
original test results below, I saw a roughly ~20% reduction in downtime
for VFIO VGPU devices at minimum.

  === Test results ===

  The below table shows the downtime of two identical migrations. In the
  first migration swithcover ack is disabled and in the second it is
  enabled. The migrated VM is assigned with a mlx5 VFIO device which has
  300MB of device data to be migrated.

  +----------------------+-----------------------+----------+
  |    Switchover ack    | VFIO device data size | Downtime |
  +----------------------+-----------------------+----------+
  |       Disabled       |         300MB         |  1900ms  |
  |       Enabled        |         300MB         |  420ms   |
  +----------------------+-----------------------+----------+

  Switchover ack gives a roughly 4.5 times improvement in downtime.
  The 1480ms difference is time that is used for resource allocation for
  the VFIO device in the destination. Without switchover ack, this time is
  spent when the source VM is stopped and thus the downtime is much
  higher. With switchover ack, the time is spent when the source VM is
  still running.

[1] https://patchwork.kernel.org/project/qemu-devel/cover/20230621111201.29729-1-avihaih@nvidia.com/

Signed-off-by: Jon Kohler <jon@nutanix.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Avihai Horon <avihaih@nvidia.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: YangHang Liu <yanghliu@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-06-25 09:51:00 +02:00
Jonathon Jongsma
af437d2d64 qemu: Don't specify vfio-pci.ramfb when ramfb is false
Commit 7c8e606b64 attempted to fix
the specification of the ramfb property for vfio-pci devices, but it
failed when ramfb is explicitly set to 'off'. This is because only the
'vfio-pci-nohotplug' device supports the 'ramfb' property. Since we use
the base 'vfio-pci' device unless ramfb is enabled, attempting to set
the 'ramfb' parameter to 'off' this will result in an error like the
following:

  error: internal error: QEMU unexpectedly closed the monitor
  (vm='rhel'): 2024-06-06T04:43:22.896795Z qemu-kvm: -device
  {"driver":"vfio-pci","host":"0000:b1:00.4","id":"hostdev0","display":"on
  ","ramfb":false,"bus":"pci.7","addr":"0x0"}: Property 'vfio-pci.ramfb'
  not found.

This also more closely matches what is done for mdev devices.

Resolves: https://issues.redhat.com/browse/RHEL-28808

Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-24 08:55:50 -05:00
Adam Julis
3a9095976e qemuDomainDiskChangeSupported: Fill in missing check
The attribute 'discard_no_unref' of <disk/> is not allowed to be
changed while the virtual machine is running.

Resolves: https://issues.redhat.com/browse/RHEL-37542
Signed-off-by: Adam Julis <ajulis@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-24 11:14:56 +02:00
Michal Privoznik
58b5219961 qemu_firmware: Pick the right firmware for SEV-SNP guests
The firmware descriptors have 'amd-sev-snp` feature which
describes whether firmware is suitable for SEV-SNP guests.
Provide necessary implementation to detect the feature and pick
the right firmware if guest is SEV-SNP enabled.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-21 09:59:04 +02:00
Michal Privoznik
a1d850b300 qemu: Build cmd line for SEV-SNP
Pretty straightforward as qemu has 'sev-snp-guest' object which
attributes maps pretty much 1:1 to our XML model. Except for
@vcek where QEMU has 'vcek-disabled`, an inverted boolean, while
we model it as virTristateBool. But that's easy to map too.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-21 09:58:10 +02:00
Michal Privoznik
c65eba1f57 conf: Introduce SEV-SNP support
SEV-SNP is an enhancement of SEV/SEV-ES and thus it shares some
fields with it. Nevertheless, on XML level, it's yet another type
of <launchSecurity/>.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-21 09:56:57 +02:00
Michal Privoznik
1abcba9d4d qemu_capabilities: Introduce QEMU_CAPS_SEV_SNP_GUEST
This capability tracks sev-snp-guest object availability.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-21 09:56:18 +02:00
Michal Privoznik
be26d0ebbe qemu: Report snp-policy in virDomainGetLaunchSecurityInfo()
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-21 09:36:04 +02:00
Michal Privoznik
914b986275 qemu_monitor: Allow querying SEV-SNP state in 'query-sev'
In QEMU commit v9.0.0-1155-g59d3740cb4 the return type of
'query-sev' monitor command changed to accommodate SEV-SNP. Even
though we currently support launching plain SNP guests, this will
soon change.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-21 09:35:32 +02:00
Michal Privoznik
7d16c296e3 src: Convert some _virDomainSecDef::sectype checks to switch()
In a few instances there is a plain if() check for
_virDomainSecDef::sectype. While this works perfectly for now,
soon there'll be another type and we can utilize compiler to
identify all the places that need adaptation. Switch those if()
statements to switch().

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-21 09:32:09 +02:00
Michal Privoznik
a44a43361f Drop needless typecast to virDomainLaunchSecurity
The sectype member of _virDomainSecDef struct is already declared
as of virDomainLaunchSecurity type. There's no need to typecast
it to the very same type when passing it to switch().

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-21 09:31:33 +02:00
Michal Privoznik
d2cad18ca3 conf: Move some members of virDomainSEVDef into virDomainSEVCommonDef
Some parts of SEV are to be shared with SEV SNP. In order to
reuse XML parsing / formatting code cleanly, let's move those
common bits into a new struct (virDomainSEVCommonDef) and adjust
rest of the code.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-21 09:28:54 +02:00
Michal Privoznik
66efdfabd9 qemu_monitor_json: Report error in error paths in SEV related code
While working on qemuMonitorJSONGetSEVMeasurement() and
qemuMonitorJSONGetSEVInfo() I've noticed that if these functions
fail, they do so without appropriate error set. Fill in error
reporting.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-21 09:25:32 +02:00
Peter Krempa
e6b94cba7e qemu: migration: Preserve error across qemuDomainSetMaxMemLock() on error paths
When a VM terminates itself while it's being migrated in running state
libvirt would report wrong error:

 error: cannot get locked memory limit of process 2502057: No such file or directory

rather than the proper error:

 error: operation failed: domain is not running

Remember the error on error paths in qemuMigrationSrcConfirmPhase and
qemuMigrationSrcPerformPhase.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-20 09:58:52 +02:00
Peter Krempa
e00a58c10a qemuMigrationSrcRun: Re-check whether VM is active before accessing job data
'qemuProcessStop()' clears the 'current' job data. While the code under
the 'error' label in 'qemuMigrationSrcRun()' does check that the VM is
active before accessing the job, it also invokes multiple helper
functions to clean up the migration including
'qemuMigrationSrcNBDCopyCancel()' which calls 'qemuDomainObjWait()'
invalidating the result of the liveness check as it unlocks the VM.

Duplicate the liveness check and explain why. The rest of the code e.g.
accessing the monitor is safe as 'qemuDomainEnterMonitorAsync()'
performs a liveness check. The cleanup path just ignores the return
values of those functions.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-20 09:58:52 +02:00
Peter Krempa
9243e87820 qemu: migration: Inline 'qemuMigrationDstFinishResume()'
The function is a pointless wrapper on top of
qemuMigrationDstWaitForCompletion.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-20 09:58:52 +02:00
Peter Krempa
a52e125d56 qemu: migration: Properly check for live VM after qemuDomainObjWait()
Similarly to the one change in commit 4d1a1fdffd
we should be checking that the VM is not being yet destroyed if we've
invoked qemuDomainObjWait().

Use the new helper qemuDomainObjIsActive().

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-20 09:58:52 +02:00
Peter Krempa
9eb33b7f03 qemu: domain: Introduce qemuDomainObjIsActive helper
The helper checks whether VM is active including the internal qemu
state. This helper will become useful in situations when an async job
is in use as VIR_JOB_DESTROY can run along async jobs thus both checks
are necessary.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-20 09:58:52 +02:00
Peter Krempa
d9935a5c4f qemu: process: Ensure that 'beingDestroyed' gets cleared only after VM id is reset
Prevent the possibility that a VM could be considered as alive while
inside qemuProcessStop.

A recently fixed bug which unlocked the domain object while inside
qemuProcessStop showed that there's possibility to confuse the state of
the VM to be considered active while 'qemuProcessStop' is processing
shutdown of the VM. Ensure that this doesn't happen by clearing the
'beingDestroyed' flag only after the VM id is cleared.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-20 09:58:52 +02:00
Peter Krempa
3865410e7f qemuProcessStop: Move code not depending on 'vm->def->id' after reset of the ID
There are few function calls done while cleaning up a stopped VM which
do require the old VM id, to e.g. clean up paths containing the 'short'
domain name in the path.

Anything else, which doesn't strictly require it can be moved after
clearing the 'id' in order to decrease likelyhood of potential bugs.

This patch moves all the code which does not require the 'id' (except
for the log entry and closing the monitor socket) after the statement
clearing the id and adds a comment explaining that anything in the
section must not unlock the VM object.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-20 09:58:52 +02:00
Peter Krempa
d29e0f3d4a qemuProcessStop: Prevent crash when qemuDomainObjStopWorker() unlocks the VM
'qemuDomainObjStopWorker()' which is meant to dispose of the event loop
thread for the monitor unlocks the VM object while disposing the thread
to prevent possible deadlocks with events waiting on the monitor thread.

Unfortunately 'qemuDomainObjStopWorker()' is called *before* the VM is
marked as inactive by clearing 'vm->def->id', but at the same time it's
no longer marked as 'beingDestroyed' when we're inside
'qemuProcessStop()'.

If 'vm' would be kept locked this wouldn't be a problem. Same way it's
not a problem for anything that uses non-ASYNC VM jobs, or when the
monitor is accessed in an async job, as the 'destroy' job interlocks
with those.

It is a problem for code inside an async job which uses
'qemuDomainObjWait()' though. The API contract of qemuDomainObjWait()
ensures the caller that the VM on successful return from it, but in this
specific reason it's not the case, as both 'beingDestroyed' is already
false, and 'vm->def->id' is not yet cleared.

To fix the issue move the 'qemuDomainObjStopWorker()' call *after*
clearing 'vm->def->id' and also add a note stating what the function is
doing.

Fixes: 860a999802
Closes: https://gitlab.com/libvirt/libvirt/-/issues/640
Reported-by: luzhipeng <luzhipeng@cestc.cn>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-20 09:58:21 +02:00
Peter Krempa
da8d97e4e2 qemuDomainObjWait: Add documentation
Document why this function exists and meaning of return values.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-20 09:52:55 +02:00
Peter Krempa
f9ad21996d qemuDomainDeviceBackendChardevForeach: Fix typo in comment
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-20 09:52:54 +02:00
Peter Krempa
b4423a753b qemuDomainDiskPrivateDispose: Prevent dangling 'disk' pointer in blockjob data
Clear the 'disk' member of 'blockjob' as we're freeing the disk object
at this point. While this should not normally happen it was observed
when other bug allowed the VM to be cleared while other threads didn't
yet finish.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-20 09:52:54 +02:00
Peter Krempa
737f897c29 qemuBlockJobProcessEventConcludedBackup: Handle potentially NULL 'job->disk'
Similarly to other blockjob handlers, if there's no disk associated with
the blockjob the handler needs to behave correctly. This is needed as
the disk might have been de-associated on unplug or other operations.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-20 09:52:54 +02:00
Swapnil Ingle
c772f1982d Pass shutoff reason to release hook
Sometimes in release hook it is useful to know if the VM shutdown was graceful
or not. This is especially useful to do cleanup based on the VM shutdown failure
reason in release hook. This patch proposes to use the last argument 'extra'
to pass VM shutoff reason in the call to release hook.
Making this change for Qemu and LXC.

Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-19 12:15:26 +02:00
Adam Julis
e145d182a6 qemu: implement iommu coldplug/unplug
Resolves: https://issues.redhat.com/browse/RHEL-23833
Signed-off-by: Adam Julis <ajulis@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-18 12:17:50 +02:00
Adam Julis
59f6e226bb qemu_driver: add validation of potential dependencies on cold plug
Although virDomainDeviceDefValidate() is called as a part of
parsing device XML routine, it validates only that single device.
The virDomainDefValidate() function performs a more comprehensive
check. It should detect errors resulting from dependencies
between devices, or a device and some other part of XML config.
Therefore, a call to virDomainDefValidate() is added at the end
of qemuDomainAttachDeviceConfig().

Signed-off-by: Adam Julis <ajulis@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-18 08:46:28 +02:00
Michal Privoznik
095f22db21 qemu_process: Issue an info message when subtracting isolcpus
In one of my previous commits I've made us substract isolcpus
from all online CPUs when setting affinity on QEMU threads. See
commit below for more info on that. Nevertheless, this is
something that surely deserves an entry in log. I've chosen INFO
priority for now. We can promote that to a regular WARN if users
complain.

Fixes: da95bcb6b2
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-06-17 12:30:39 +02:00
Daniel P. Berrangé
f2828880b6 meson: allow systemd sysusersdir to be changed
We currently hardcode the systemd sysusersdir, but it is desirable to be
able to choose a different location in some cases. For example, Fedora
flatpak builds change the RPM %_sysusersdir macro, but we can't currently
honour that.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reported-by: Yaakov Selkowitz <yselkowi@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-13 10:23:11 +01:00
Peter Krempa
39bfd6c888 qemu_validate: Validate support for SCSI emulation support in 'virtio-blk' devices
The support will be dropped soon by qemu, and libvirt is not rejecting
such configurations. Add validation of this explicitly requested config.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-06-12 08:21:12 +02:00
Peter Krempa
126f95c1fe qemuValidateDomainDeviceDefDiskFrontend: Refactor validation of <disk type='lun'>
Use a switch statement for checks based on the disk bus.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-06-12 08:21:11 +02:00
Andrea Bolognani
971e767805 qemu: Reject TPM 1.2 in most scenarios
Everywhere we use TPM 2.0 as our default, the chances of TPM
1.2 being supported by the guest OS are very slim. Just reject
such configurations outright.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-06-07 11:13:19 +02:00
Andrea Bolognani
220b2690da qemu: Default to TPM 2.0 in most scenarios
TPM 1.2 is a pretty bad default these days, especially for
architectures which were introduced when TPM 2.0 already existed.

We're already carving out exceptions for several scenarios, but
that's basically backwards: at this point, using TPM 1.2 is the
exception.

Restructure the code so that it reflects reality and we don't
have to remember to update it every time a new architecture is
introduced.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-06-07 11:13:16 +02:00
Michal Privoznik
86e511fafb lib: Annotate more function as NULL terminated
While __attribute((sentinel)) (exposed by glib under
G_GNUC_NULL_TERMINATED macro) is a gcc extension, it's supported
by clang too. It's already being used throughout our code but
some functions that take variadic arguments and expect NULL at
the end were lacking such annotation. Fill them in.

After this, there are still some functions left untouched because
they expect a different sentinel than NULL. Unfortunately, glib
does not provide macro for different sentinels. We may come up
with our own, but let's save that for future work.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-06-06 09:29:58 +02:00
Peter Krempa
f3e8c10fe4 qemu: validate: Fix check for unsupported FS-device bootindex use on un-assigned addresses
When hot-plugging a FS device with un-assigned address with a bootindex
the recently-added validation check would fail as validation on hotplug
is done prior to address assignment.

To fix this problem we can simply relax the check to also pass on _NONE
addresses. Unsupported configurations will still be caught as previous
commit re-checks the definition after address assignment prior to
hotplug.

Resolves: https://issues.redhat.com/browse/RHEL-39271
Fixes: 4690058b6d
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-05-31 12:54:32 +02:00
Peter Krempa
63b32dbe8b qemu: hotplug: Validate definition of 'FS' device after address allocation
Some of the checks make sense only after the address is allocated and
thus we need to re-do the validation after the address is assigned.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-05-31 12:54:32 +02:00
Michal Privoznik
805b1eec7d qemu_hotplug: Clear QoS if required in qemuDomainChangeNet()
In one of my recent commits, I've introduced
virDomainInterfaceClearQoS() which is a helper that either calls
virNetDevBandwidthClear() ('tc' implementation) or
virNetDevOpenvswitchInterfaceClearQos() (for ovs ifaces). But I
made a micro optimization which leads to a bug: the function
checks whether passed iface has any QoS set and returns early if
it has none. In majority of cases this is right thing to do, but
when removing QoS on virDomainUpdateDeviceFlags() this is
problematic. The new definition (passed as argument to
virDomainInterfaceClearQoS()) contains no QoS (because user
requested its removal) and thus instead of removing the old QoS
setting nothing is done.

Fortunately, the fix is simple - pass olddev which contains the
old QoS setting.

Fixes: 812a146dfe
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-05-30 14:08:07 +02:00
Pavel Hrdina
2ea493598f qemu_snapshot: fix memory leak when reverting external snapshot
The code cleaning up virStorageSource doesn't free data allocated by
virStorageSourceInit() so we need to call virStorageSourceDeinit()
explicitly.

Fixes: 8e66473781
Resolves: https://issues.redhat.com/browse/RHEL-33044
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-05-29 15:23:55 +02:00
Michal Privoznik
66b052263d src: Fix return types of .stateInitialize callbacks
The virStateDriver struct has .stateInitialize callback which is
declared to return virDrvStateInitResult enum. But some drivers
return a plain int in their implementation which is UB.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-05-22 13:41:42 +02:00
Jonathon Jongsma
7c8e606b64 qemu: fix qemu command for pci hostdevs and ramfb='off'
There was no test for this and we mistakenly used 'B' rather than 'T'
when constructing the json value for this parameter. Thus, a value of
'off' was VIR_TRISTATE_SWITCH_OFF=2, which was translated to a boolean
value of 'true'.

Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-05-20 12:42:18 -05:00
Rayhan Faizel
57f29f675d qemu: Implement support for hotplugging evdev input devices
Unlike other input types, evdev is not a true device since it's backed by
'-object'. We must use object-add/object-del monitor commands instead of
device-add/device-del in this particular case.

This patch adds support for handling live attachment and
detachment of evdev type devices.

Resolves: https://gitlab.com/libvirt/libvirt/-/issues/529
Signed-off-by: Rayhan Faizel <rayhan.faizel@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-05-16 14:56:59 +02:00
Rayhan Faizel
ffebb557f1 qemu_hotplug: Properly assign USB address to hotplugged usb-net device
Previously, the network device hotplug logic would try to ensure only CCW or
PCI addresses. With recent support for the usb-net model, this patch will
ensure USB addresses for usb-net network devices.

Resolves: https://gitlab.com/libvirt/libvirt/-/issues/14
Signed-off-by: Rayhan Faizel <rayhan.faizel@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-05-14 09:14:39 +02:00
Dr. David Alan Gilbert
9e59ba56c8 qemu_capabilities: Remove unused struct
'virQEMUCapsSearchData' has been unused since
commit bc33b8c639 ("qemu: capabilities: Drop the
virQEMUCapsCacheLookupByArch function")
Remove it.

Signed-off-by: Dr. David Alan Gilbert <dave@treblig.org>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-05-13 03:14:14 +02:00
Michal Privoznik
da95bcb6b2 qemu: Substract isolcpus from all online affinity
When starting a domain and there's no vCPU/emulator pinning set,
we query the list of all online physical CPUs and set affinity of
the child process (which eventually becomes QEMU) to that list.
We can't assume libvirtd itself had affinity to all online CPUs
and since affinity of the child process is inherited, we should
fix it afterwards. But that's not necessarily correct. Users
might isolate some physical CPUs and we should avoid touching
them unless explicitly told so (i.e. vCPU/emulator pinning told
us so).

Therefore, when attempting to set affinity to all online CPUs
subtract the isolated ones.

Before this commit:

  root@localhost:~# cat /sys/devices/system/cpu/isolated
  19,21,23
  root@virtlab414:~# taskset -cp $(pgrep qemu)
  pid 14835's current affinity list: 0-23

After:

  root@virtlab414:~# taskset -cp $(pgrep qemu)
  pid 17153's current affinity list: 0-18,20,22

Resolves: https://issues.redhat.com/browse/RHEL-33082
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2024-05-06 15:38:58 +02:00
Adam Julis
142ed263c0 qemu_saveimage: add zstd to supported compression formats
Extend the list of supported formats, update and clarify comment
in qemu.conf.in (removed misleading sentence about the order of
compression format types).

Resolves: https://gitlab.com/libvirt/libvirt/-/issues/589
Signed-off-by: Adam Julis <ajulis@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-05-06 14:56:58 +02:00
Jiri Denemark
a396f76f70 qemu: Enable removing features from CPU models
Features removed from a CPU model are marked with "removed='yes'"
attribute in the CPU map. Such features will always be present in a CPU
definition produced by libvirt regardless on their state. In other words
a running domain (even saved in a file) will always explicitly contain
states of all features removed from the specified CPU model. This
enables migration to older libvirt which would otherwise think the
affected features should be enabled as they are still included in the
CPU model in the older version of CPU map. Migration from an old libvirt
to a new one would be broken as the new libvirt would think the removed
features should be disabled (because they are not included in the CPU
model anymore), which might not be the case on the source host. Thus we
were refusing to remove CPU features unless they were never working and
no domain could even be running with those features enabled.

This patch removes the limitation. When handling CPU definitions with
missing features marked as removed in the specified CPU model, we know
whether it comes from a running domain, in which case it must have been
created by older libvirt where the missing CPU features were not removed
yet. This means the features must have been enabled on the source and we
can automatically fix the definition by adding the missing features with
correct states.

We can safely remove any CPU feature from our CPU models now, but it
should only be used for features removed from all versions of a given
CPU model in QEMU because unversioned models correspond to v1.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-05-02 19:56:45 +02:00
Jiri Denemark
30458c6071 cpu: Add removedPolicy parameter to virCPUUpdate
virCPUUpdate check the CPU definition for features that were marked as
removed in the specified CPU model and explicitly adds those that were
not mentioned in the definition. So far such features were added with
VIR_CPU_FEATURE_DISABLE policy, but the caller may want to use a
different policy in some situations, which is now possible via the
removedPolicy parameter.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-05-02 19:56:45 +02:00
Jiri Denemark
8c1b07b088 conf: Change return value of some CPU feature APIs
The virCPUDefAddFeatureInternal helper function only fails if it is
called with VIR_CPU_ADD_FEATURE_MODE_EXCLUSIVE, which is only used in
virCPUDefAddFeature. The other callers (virCPUDefUpdateFeature and
virCPUDefAddFeatureIfMissing) will never get anything but 0 from
virCPUDefAddFeatureInternal and their return type can be changed to
void.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-05-02 19:56:45 +02:00
Rayhan Faizel
a1a3da94f5 qemu: Generate command line for sound devices with model 'virtio'
Allow generation of command line for virtio-sound-pci and virtio-sound-device
devices along with additional virtio options.

A new testcase is added to test virtio-sound-pci. The
arm-vexpressa9-virtio testcase is also extended to test virtio-sound-device.

Signed-off-by: Rayhan Faizel <rayhan.faizel@gmail.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-05-02 15:38:34 +02:00
Rayhan Faizel
bb593e3743 conf: Introduce support for virtio-sound devices
This patch adds parsing of the virtio sound model, along with parsing
of virtio options and PCI/virtio-mmio address assignment.

A new 'streams' attribute is added for configuring number of PCM streams
(default is 2) in virtio sound devices. QEMU additionally has jacks and chmaps
parameters but these are currently stubbed, hence they are excluded in this
patch series.

Signed-off-by: Rayhan Faizel <rayhan.faizel@gmail.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-05-02 15:38:32 +02:00
Rayhan Faizel
9081320b53 qemu_capabilities: Add QEMU_CAPS_DEVICE_VIRTIO_SOUND capability
Signed-off-by: Rayhan Faizel <rayhan.faizel@gmail.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-05-02 15:37:53 +02:00
Kristina Hanicova
c95cc67efb qemu: format machine virt ras feature and test it
Resolves: https://issues.redhat.com/browse/RHEL-7489
Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-05-02 13:17:17 +02:00
Kristina Hanicova
a43007b3c4 qemu: validate machine virt ras feature
Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-05-02 13:17:17 +02:00
Kristina Hanicova
aaf4196843 conf: parse and format machine virt ras feature
Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-05-02 13:17:16 +02:00
Kristina Hanicova
ffaf77a30d qemu: introduce QEMU_CAPS_MACHINE_VIRT_RAS capability
The capability can be used to detect if the qemu binary already
supports 'ras' feature for 'virt' machine type.

Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-05-02 13:17:16 +02:00
Peter Krempa
5d48c5d215 qemu: migration: Don't use empty string for 'tls-hostname' NBD blockdev
While QEMU accepts and interprets an empty string in the tls-hostname
field in migration parametes as if it's unset, the same does not apply
for the 'tls-hostname' field when 'blockdev-add'-ing a NBD backend for
non-shared storage migration.

When libvirt sets up migation with TLS in 'qemuMigrationParamsEnableTLS'
the QEMU_MIGRATION_PARAM_TLS_HOSTNAME migration parameter will be set to
empty string in case when the 'hostname' argument is passed as NULL.

Later on when setting up the NBD connections for non-shared storage
migration 'qemuMigrationParamsGetTLSHostname', which fetches the value
of the aforementioned TLS parameter.

This bug was mostly latent until recently as libvirt used
MIGRATION_DEST_CONNECT_HOST mode in most cases which required the
hostname to be passed, thus the parameter was set properly.

This changed with 8d693d79c4 for post-copy migration, where libvirt now
instructs qemu to connect and thus passes NULL hostname to
qemuMigrationParamsEnableTLS, which in turn causes libvirt to try to
add NBD connection with empty string as tls-hostname resulting in:

  error: internal error: unable to execute QEMU command 'blockdev-add': Certificate does not match the hostname

To address this modify 'qemuMigrationParamsGetTLSHostname' to undo the
weird semantics the migration code uses to handle TLS hostname and make
it return NULL if the hostname is an empty string.

Fixes: e8fa09d66b
Resolves: https://issues.redhat.com/browse/RHEL-32880
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-04-24 13:45:56 +02:00
Peter Krempa
4690058b6d qemu_validate: Reject virtiofs with bootindex on s390x with CCW
The CCW variant of the 'vhost-user-fs' device in qemu doesn't
deliberately support the 'bootindex' attribute as the machine is unable
to boot from such device.

Reject '<boot order' on non-PCI virtiofs, add tests validating that it's
rejected as well as that virtiofs on PCI-based hosts but without address
specified will be accepted.

Resolves: https://issues.redhat.com/browse/RHEL-22728
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
2024-04-24 10:30:36 +02:00
Michal Privoznik
c38720b337 qemu_command: Generate mem-reserve for controllers
Pretty straightforward. Just put mem-reserve attribute whenever
it's set. Previous commit ensures it's set only for valid
controller models.

Resolves: https://issues.redhat.com/browse/RHEL-7461
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2024-04-19 14:27:30 +02:00
Michal Privoznik
772e33487a qemu_validate: Restrict setting @memReserve only to some controllers
Only two controller models allow setting mem-reserve:
pcie-root-port and pci-bridge. Reflect this fact during
validation.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2024-04-19 14:26:45 +02:00
Jiri Denemark
6eb4c6ad20 qemu: Change return type of qemuDomainFixupCPUs to void
The function never fails.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-04-17 17:36:59 +02:00
Jiri Denemark
efac33bfaa qemu: Change return type of qemuDomainUpdateCPU to void
The function never fails.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-04-17 17:36:59 +02:00
Jiri Denemark
4331048257 qemu: Fix migration with custom XML
Ages ago origCPU in domain private data was introduced to provide
backward compatibility when migrating to an old libvirt, which did not
support fetching updated CPU definition from QEMU. Thus origCPU will
contain the original CPU definition before such update. But only if the
update actually changed anything. Let's always fill origCPU with the
original definition when starting a domain so that we can rely on it
being always set, even if it matches the updated definition.

This fixes migration or save operations with custom domain XML after
commit v10.1.0-88-g14d3517410, which expected origCPU to be always set
to the CPU definition from inactive XML to check features explicitly
requested by a user.

https://issues.redhat.com/browse/RHEL-30622

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Tested-by: Han Han <hhan@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-04-17 17:36:59 +02:00
Jiri Denemark
ded74b3369 qemu: Use g_autoptr in qemuProcessInit
The only thing we need to free in the cleanup code is virCPUDef and for
that we already have g_autoptr handler.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-04-16 17:58:23 +02:00
Michal Privoznik
812a146dfe domain_interface: Introduce and use virDomainInterfaceClearQoS()
In QEMU and LXC drivers in a few places only
virNetDevBandwidthClear() is called. This means that if an
interface is of openvswitch vport profile, its QoS is not
removed. And to make matters worse - OVS is designed to remember
state even when corresponding interface is gone. This leads to
stale QoS settings piling up in OVS database.

To resolve this, introduce virDomainInterfaceClearQoS() which
looks at given interface and calls corresponding QoS clear
function. Then, basically replace virNetDevBandwidthClear() calls
in those hypervisor drivers with this new function.

Resolves: https://issues.redhat.com/browse/RHEL-30373
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-04-12 20:45:15 +02:00
Michal Privoznik
f8b5bd855f hypervisor: Introduce and use virDomainInterfaceVportRemove()
Both LXC and QEMU drivers have the same code to remove vport when
removing a domain's interface. Instead of repeating the same
pattern in both drivers, move the code into hypervisor agnostic
location (src/hypervisor/) and switch to calling this new
function.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-04-12 20:44:52 +02:00
Rayhan Faizel
e18c69bcd8 conf: Automatically assign address to usb-net device
This patch will allow usb-net devices to be automatically assigned a USB
address (and skip any attempt to assign a PCI one).

Signed-off-by: Rayhan Faizel <rayhan.faizel@gmail.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-04-03 10:40:14 +02:00
Jonathon Jongsma
21af003084 qemu: enable display/ramfb for vfio pci hostdevs
Implement display="on" and ramfb="on" for vfio PCI host devices in qemu.
This enables passthrough PCI devices for display just like we did for
mdevs.

Resolves: https://issues.redhat.com/browse/RHEL-28808

Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-04-02 11:45:54 -05:00
Pavel Hrdina
0a164b74eb qemu_snapshot: correctly update metadata when deleting external snapshot with multiple branches
XML metadata for snapshot contains only single list of disk overlays
from the moment when the snapshot was taken. When user creates multiple
branches of snapshots the parent snapshot will still list only the
original disk overlays. This may cause an issue in a specific scenario:

     s1
      |
      +- s2
      +- s3 (active)

For this snapshot topology when we delete s2 metadata for s1 are not
updated. Now when we delete s1 the code operated with incorrect
overlays from s1 metadata in order to update s3 metadata resulting in no
changes to s3 metadata.

Now when user tries to delete s3 it fails with following error:

    error: Failed to delete snapshot s3
    error: operation failed: snapshot VM disk source and parent disk source are not the same

For the actual deletion there is a code to figure out the correct disk
source but it was not used to update metadata as well. Due to reasons
how block commit in libvirt works we need to create a copy of that disk
source in order to have it available when updating metadata as the
original source will be freed at that point.

Resolves: https://issues.redhat.com/browse/RHEL-26276
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-04-02 14:14:26 +02:00
Pavel Hrdina
79654f425c qemu_snapshot: call qemuSnapshotDeleteUpdateDisks only for external snapshots
Calling this function when deleting internal snapshot isn't required
because with internal snapshots all changes are done within the file
itself so there is no file deletion and no need to update snapshot
metadata.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-04-02 14:14:26 +02:00
Adam Julis
6821271123 qemuDomainChangeNet: Error when boot index changes in live XML
If the original code detected a missing or null boot index in the
new XML, it automatically added the current value. This
autocompletion was incorrect because it was impossible to
distinguish between user intent and user error - changing the
boot order itself is forbidden and should always be an error.

Resolves: https://issues.redhat.com/browse/RHEL-23416
Fixes: aa3e07caec
Signed-off-by: Adam Julis <ajulis@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-03-22 10:49:40 +01:00
Lennart Fricke
71a604fce6 qemu: warn on pausing of guest due to watchdog or io error
Change the log level for pauses of guests due to watchdog timeouts
or io errors from debug to warn to enhance the visibility of such
events.

Signed-off-by: Lennart Fricke <lennart.fricke@drehpunkt.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-03-21 15:00:00 +01:00
Xianglai Li
3243783c32 Support for loongarch64 in the QEMU driver
Implement support for loongarch64 in the QEMU driver.

Signed-off-by: Xianglai Li <lixianglai@loongson.cn>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2024-03-21 14:42:24 +01:00
Xianglai Li
a4e3718981 Add loongarch cpu support
Add loongarch cpu support, Define new cpu type 'loongarch64'
and implement it's driver functions.

Signed-off-by: Xianglai Li <lixianglai@loongson.cn>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2024-03-21 14:42:20 +01:00
Andrea Bolognani
5fb47c5bed qemu: Tweak augeas schema
Current entries should always be listed before obsolete ones.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
2024-03-20 18:37:58 +01:00
Rayhan Faizel
c836887a02 qemu_command: Generate command line for MTP filesystem
The source tag sets the rootdir property of the device, which is
the directory exposed to the guest via the MTP device. The target
tag sets the desc property.  This device supports read-only mode
as well. Like virtiofs, it does not support additional access
modes.

Signed-off-by: Rayhan Faizel <rayhan.faizel@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-03-19 17:36:19 +01:00
Rayhan Faizel
5c70a7e328 conf: Introduce support for usb-mtp devices
Expose usb-mtp device as another type of <filesystem/>.

Signed-off-by: Rayhan Faizel <rayhan.faizel@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-03-19 17:36:19 +01:00
Rayhan Faizel
e529b7b5c4 qemu_capabilities: Add QEMU_CAPS_DEVICE_USB_MTP capability
This capability reflects presence of -device usb-mtp.

Signed-off-by: Rayhan Faizel <rayhan.faizel@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-03-19 17:36:19 +01:00
Ján Tomko
c9de7a1c3b qemu: virtiofs: error out if getting the group or user name fails
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-03-18 15:20:24 +01:00
Ján Tomko
4c5b2e1e0d qemu: virtiofs: set correct label when creating the socket
Use svirt_t instead of virtd_t, since virtd_t is not available in the
session mode and qemu with svirt_t won't be able to talk to unconfined_t
socket.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-03-18 15:20:24 +01:00
Ján Tomko
a9da009219 qemu: virtiofs: do not crash if cgroups are missing
On domain startup, qemuSetupCgroupForExtDevices checks
if a cgroup controller is present and skips the setup if not.

Add a similar check to qemuVirtioFSSetupCgroup to prevent
crashing when hotplugging a virtiofs filesystem.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-03-18 15:20:24 +01:00
Jiri Denemark
14d3517410 qemu: domain: Drop added features from migratable CPU
Features marked with added='yes' in CPU model definitions have to be
removed before migration, otherwise older libvirt would complain about
unknown CPU features. We only do this for features that were enabled for
a given CPU model even with older libvirt, which just ignored the
features. And only for features we added ourselves when updating CPU
definition during domain startup, that is we do not remove features
which were explicitly mentioned by a user.

That said, this is not the safest thing we could do, but it's
effectively the same thing we did before the affected features were
added: we ignored them completely on both sides of migration.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-03-14 16:15:06 +01:00
Jiri Denemark
909564c365 qemu: domain: Check arch in qemuDomainMakeCPUMigratable
The content is arch specific and checking for Icelake-Server CPU model
on non-x86 architectures does not make sense.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-03-14 16:15:06 +01:00
Michal Privoznik
36c6d40943 capabilities: Allow suppressing error message from virCapabilitiesDomainDataLookup()
In near future we will want to check whether capabilities for
given virtType exist, but report an error on our own. Introduce
reportError argument which makes the function report an error iff
set.

In one specific case (virQEMUCapsGetDefaultVersion()) we were
even overwriting (more specific) error message reportd by
virCapabilitiesDomainDataLookup(). Drop that too.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-03-12 17:39:09 +01:00
Zheng Yan
a74897efe6 qemu: implement qemuDomainGraphicsReload
The 'display-reload' QMP command had been introduced from QEMU 6.0.0:

9cc0765165

Currently it only supports reloading TLS certificates for VNC.

Resloves: https://issues.redhat.com/browse/RHEL-16333

Signed-off-by: Zheng Yan <yanzheng759@huawei.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-03-08 17:00:15 +01:00
Zheng Yan
bec963f878 qemu_capabilities: Add QEMU_CAPS_DISPLAY_RELOAD
The 'display-reload' QMP command was introduced in QEMU 6.0.0, so we
add a compatible capability to check if target QEMU binary supports it.

{"execute":"display-reload", "arguments":{"type": "vnc", "tls-certs": true}}

The new QMP refer to:
9cc0765165

Signed-off-by: Zheng Yan <yanzheng759@huawei.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-03-08 17:00:15 +01:00
Peter Krempa
317ac911f6 qemu: command: Remove fallback '-usb' handling
Currently all machine types which do honour '-usb' are already covered
by code which will either select a proper controller model or would
select the same one which '-usb' would use.

Thus all of the legacy -usb controller code can be removed.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2024-03-06 16:30:37 +01:00
Peter Krempa
a07544c0d7 qemu: command: Don't downgrade to '-usb' for arm based machines
- 'virt*' machines already don't allow downgrade
 - 'versatilepb' and 'realview' machines use 'pci-ohci' controller with '-usb'
 - all other machines ignore '-usb' (some have sysbus-based USB
   controller which we don't even consider)

For the 'versatilepb' and 'realview' machines libvirt would already
resort to picking either an existing controller model or trying to pick
the one which '-usb' would select and thus fail either way.

All other machine types ignore it.

We can thus remove the fallback for all arm-based machines.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2024-03-06 16:30:37 +01:00
Peter Krempa
5b136eba6d qemu: command: Don't downgrade to '-usb' for ppc based machines
- 'pseries' machines already don't allow downgrade
 - 'g3beige' and 'mac99' machines use 'pci-ohci' controller with '-usb'
 - all other machines ignore '-usb'

For 'g3beige' and 'mac99' libvirt already has 'pci-ohci' as contoller it
would select as one of the options when picking a model, thus it's
impossible to reach situation when '-usb' would be honoured.

All other machine types ignore it.

We can thus remove the fallback for all ppc-based machines.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2024-03-06 16:30:36 +01:00
Peter Krempa
5e84c6c1ce qemu: command: Don't downgrade to '-usb' with 'pseries' machines
The default USB device auto-selection code for 'pseries' machines picks
controller models which are also selected when '-usb' is used thus it's
impossible to end up in the case when using '-usb' would be possible:

 $ qemu-system-ppc64 --machine pseries,usb=on
 qemu-system-ppc64: could not find a module for type 'nec-usb-xhci'
 $ qemu-system-ppc64 --machine pseries-2.5,usb=on
 qemu-system-ppc64: could not find a module for type 'pci-ohci'

Remove the impossible downgrade and adjust tests.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2024-03-06 16:30:36 +01:00
Peter Krempa
ae642084ce qemu: command: Don't downgrade to '-usb' for x86 based machines
- 'q35' machine type already explicitly forbids fallback
- 'isapc' never supported USB and '-usb' is ignored
- 'i440fx' does support '-usb' and translates it into 'piix3-uhci' which
  is identical to what libvirt selects
- we currently don't care about 'microvm'

Attempting to start an 'pc' (i440fx) machine with -usb when 'piix3-uhci'
is compiled out will fail and in any other case libvirt will use the
proper explicitly selected controller.

Drop the '-usb' downgrade for x86 arch.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2024-03-06 16:30:36 +01:00
Peter Krempa
b37096778b qemuDomainControllerDefPostParse: Use 'pci-ohci' as last-resort fallback USB controller
This controller is used as the default/implicit USB controller by
multiple machine types which honour the '-usb' flag of qemu. Add it as
fallback in libvirt too.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2024-03-06 16:30:36 +01:00
Peter Krempa
c6d71bf813 qemuDomainDefAddDefaultDevices: Populate default USB for 'versatilepb' and 'realview' ARM machines
The machine types historically have a default USB controller populated
via '-usb' which libvirt assumed implicitly. Qemu will use 'pci-ohci'
for both if '-usb' is used.

Unfortunately an USB controller instantiated via '-usb' is unusable as
the bus name libvirt generates doesn't reflect the real name qemu uses,
and thus no libvirt-defined USB devices can be put on the controller.

This patch will populate the default USB controller into the XML and
select it's model to 'pci-ohci' unconditionally as the machine would
fail to start with '-usb' if that controller model is not available.

This patch doesn't try to make any other assumptions about
auto-populated model of USB controllers, which means that for an
explicit USB controller without model a different model will be picked.

Note that this will likely cause ABI differences and break migration for
the two machine types, in the corner case when the default USB
controller would be populated, but given that both are obsolete board
types and USB was unusable it doesn't make sense to keep supporting this
specific case when '-usb' was formatted.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2024-03-06 16:30:36 +01:00
Peter Krempa
d885d39f10 qemuDomainControllerDefPostParse: Use proper enum value for default USB controller model
Assign VIR_DOMAIN_CONTROLLER_MODEL_USB_DEFAULT rather than -1.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2024-03-06 16:30:36 +01:00
Peter Krempa
1dd0744b29 qemuDomainDefAddDefaultDevices: Handle defaults for all ARM arches together
Most machine types are avaliable in all arches by qemu. This is also
true for the 'versatilepb' machine type example in the tests.

Move all the ARM architectures together so that they are handled in
sync.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2024-03-06 16:30:36 +01:00
Jiri Denemark
2ba73ca83b qemu: Optimize CPU check='partial' for usable CPUs
Ideally check='partial' would check exactly the features QEMU would want
to enable when asked for a specific CPU model (and features). But there
is no way we could ask QEMU how a specific CPU would look like. So we
use our definition from CPU map, which may slightly differ as QEMU adds
or removes features from CPU models, and thus we may end up checking
features which QEMU would not enable while missing some required ones.

We can do better in specific cases, though. If a CPU definition uses
only a model and disabled features (or none at all), we already know
whether QEMU can enable all features required by the CPU model as that's
what we use to set usable='yes' attribute in the list of available CPU
models in domain capbilities XML. So when a usable CPU model is
requested without asking for additional features (disabling features is
fine) we can avoid our possible inaccurate check using our CPU map.

For backward compatibility we only consider usable models. If a
specified model is not usable, we still check it the old way and even
let QEMU start it (and disable some features) in case our definition
lacks some features compared to QEMU.

Fixes: https://gitlab.com/libvirt/libvirt/-/issues/608
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-03-05 16:04:28 +01:00
Andrea Bolognani
e4abb5f0fd qemu: Make firmware parsing failures non-fatal
At the moment, any kind of issue being detected in any of the
firmware descriptor files will result in the entire process
being aborted.

In particular, installing a build of edk2 for an architecture
that libvirt doesn't yet know about, for example loongarch64,
will break most firmware-related functionality: it will no
longer be possible to define new EFI VMs, start existing ones,
or even just obtain the domcapabilities for any architecture.

This is obviously unnecessarily harsh. Adopt a more relaxed
approach and simply ignore the firmware descriptors that we
are unable to parse correctly.

https://bugzilla.redhat.com/show_bug.cgi?id=2258946

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-03-04 14:36:39 +01:00
Andrea Bolognani
e0438b2e80 qemu: Rewrite qemuFirmwareFetchParsedConfigs()
Instead of returning the list of paths exactly as obtained
from qemuFirmwareFetchConfigs(), and allocating the list of
firmwares to be exactly that size right away, start with two
empty lists and add elements to them one by one.

At the moment this only makes things more verbose, but later
we're going to change things so that it's possible that some
of the paths/firmwares are not included in the lists returned
to the caller, and at that point the changes will pay off.

Note that we can't use g_auto() for the new list of paths,
because until the very last moment it's not null-terminated,
so g_strfreev() wouldn't be able to handle it correctly.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-03-04 14:36:37 +01:00
Andrea Bolognani
dcad670212 qemu: Add missing early returns
In a couple of cases, we were reporting an error without
actually terminating the parse process.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-03-04 14:36:26 +01:00
ray
04397de2a1 qemu: Fix guest-sync response time in qga command
The current implementation sets the guest-sync timeout to the
smaller value between the default value (QEMU_AGENT_WAIT_TIME)
and agent->timeout, without considering the timeout passed
via the qga command.

This patch enhances the guest-sync timeout logic to use the
minimum value among the default value, agent->timeout, and
the timeout passed via the qga command.

Resolves: https://gitlab.com/libvirt/libvirt/-/issues/590
Signed-off-by: ray <honglei.wang@smartx.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-02-22 09:51:23 +01:00
Peter Krempa
737e3daf5a qemuMigrationDstPrepareStorage: Annotate that existance of 'volume' disks is checked elswhere
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-02-21 14:15:49 +01:00
Peter Krempa
eeb1cecf1e qemuMigrationDstPrepareStorage: Move assumption that 'network' disks always exist
Move the assumption from the code pre-creating the storage to
qemuMigrationDstPrepareStorage where it's checked for other cases.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-02-21 14:15:49 +01:00
Peter Krempa
360fd479ae qemuMigrationDstPrepareStorage: Reject migration into 'dir' and 'vhost-user' types
Migrating into a 'directory' won't ever work as we ask qemu to emulate a
fat filesystem, so restoring of the files won't be possible. Same for
'vhost-user' disks which don't support blockjobs as there's no block
backend used in qemu.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-02-21 14:15:49 +01:00
Peter Krempa
d6ba6cbaa4 qemuMigrationDstPrepareStorage: Rework storage existence check
Check the existance of storage per-type rather than trying to come up
with a common "path".

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-02-21 14:15:49 +01:00
Peter Krempa
1e12394d3b qemuMigrationDstPrepareStorage: Move block device specific logic
Now that we have a switch statement, the code adding the 'slice' for
block devices of non-equal sizes can be moved to appropriate location.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-02-21 14:15:48 +01:00
Peter Krempa
e42a3935ac qemuMigrationDstPrecreateDisk: Refactor cleanup
Automatically free helper variables, remove the 'cleanup' label and
use virBufferCurrentContent() to take the XML from the buffer rather
than extracting it to a separate variable.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-02-21 14:15:48 +01:00
Peter Krempa
00c0a94ab5 qemuMigrationDstPrepareStorage: Properly consider path for 'vdpa' devices
Allow storage migration of VDPA devices by properly checking that they
exist on the destionation. Pre-creation is not supported but if the
device exists the migration should be able to succeed.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-02-21 14:15:48 +01:00
Peter Krempa
e158b523b8 qemuMigrationDstPrepareStorage: Use 'switch' statement to include all storage types
Decrease the likelyhood that addition of a new storage type will be
forgotten.

This patch also unifies the type check to consult the 'actual' type of
the storage in both cases as the NVMe check looked for the XML declared
type while virStorageSourceIsLocalStorage() looks for the
actual/translated type.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-02-21 14:15:48 +01:00
Jiri Denemark
2a6799fd43 build: Add userfaultfd_sysctl build option
This option controls whether the sysctl config for enabling unprivileged
userfaultfd will be installed.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-02-13 17:44:26 +01:00
Jiri Denemark
66643931e7 qemu: Add support for /dev/userfaultfd
/dev/userfaultfd device is preferred over userfaultfd syscall for
post-copy migrations. Unless qemu driver is configured to disable mount
namespace or to forbid access to /dev/userfaultfd in cgroup_device_acl,
we will copy it to the limited /dev filesystem QEMU will have access to
and label it appropriately. So in the default configuration post-copy
migration will be allowed even without enabling
vm.unprivileged_userfaultfd sysctl.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-02-13 17:44:26 +01:00
Timothée Ravier
a2c3e390f7 qemu: Add sysusers config file for qemu & kvm user/groups
Install a systemd sysusers config file for the qemu & kvm user/groups.

We can not use the sysusers_create_compat macro in the RPM specfile to
create those users as we want to keep the specfile standalone and not
relying on additionnal files.

Update the specfile to make the commands closer to what is generated by
the current macro.

See: https://src.fedoraproject.org/rpms/libvirt/pull-request/22
See: https://gitlab.com/libvirt/libvirt/-/merge_requests/319
See: https://bugzilla.redhat.com/show_bug.cgi?id=2095429
See: https://docs.fedoraproject.org/en-US/packaging-guidelines/UsersAndGroups/

Based on previous work by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Timothée Ravier <tim@siosm.fr>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-02-13 16:59:57 +01:00
Jonathon Jongsma
94365a4871 qemu: handle adding/removing nbdkit-backed disk sources
Previously we were only starting or stopping nbdkit when the guest was
started or stopped or when hotplugging/unplugging a disk. But when doing
block operations, the disk backing store sources can also be be added or
removed independently of the disk device. When this happens the nbdkit
backend was not being handled properly. For example, when doing a
blockcopy from a nbdkit-backed disk to a new disk and pivoting to that
new location, the nbdkit process did not get cleaned up properly. Add
some functionality to qemuDomainStorageSourceAccessModify() to handle
this scenario.

Since we're now starting nbdkit from the ChainAccessAllow/Revoke()
functions, we no longer need to explicitly start nbdkit in hotplug code
paths because the hotplug functions already call these allow/revoke
functions and will start/stop nbdkit if necessary.

Add a check to qemuNbdkitProcessStart() to report an error if we
are trying to start nbdkit for a disk source that already has a running
nbdkit process. This shouldn't happen, and if it does it indicates an
error in another part of our code.

Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-02-12 16:13:17 -06:00
Jonathon Jongsma
4495ec7d9b qemu: roll back if not all nbdkit backends are successful
When starting nbdkit processes for the backing store of a disk, we were
returning an error if any backing store failed, but we were not cleaning
up processes that succeeded higher in the chain. Make sure that if we
return a failure status from qemuNbdkitStartStorageSource() that we roll
back any processes that had been started.

Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-02-12 16:13:17 -06:00
Jonathon Jongsma
7a03785d88 qemu: add a 'chain' parameter to nbdkit start/stop
This will allow us to start or stop nbdkit for just a single disk source
or for every source in the backing chain. This will be used in following
patches.

Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-02-12 16:13:17 -06:00
Michal Privoznik
f2a5494fbd qemu_monitor: Simplify qemuMonitorIOWriteWithFD()
After previous cleanups, qemuMonitorIOWriteWithFD() is but a thin wrapper
over virSocketSendMsgWithFDs(). Replace the body of the former
with a call to the latter.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-02-07 12:35:02 +01:00
Peter Krempa
6eaf3614b6 qemuBlockStorageSourceNeedsFormatLayer: Stop formatting 'raw' driver when not needed
The 'raw' driver without any special configuration is not needed and
creates overhead in qemu.

Stop using the 'raw' format driver in cases when it's not needed. A
special case when it is needed is for FD passed images with only a
single writable FD passed, where we need an overlay driver to properly
reflect the 'read-only' flag.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-02-02 16:03:08 +01:00
Peter Krempa
abdcb46012 qemu: monitor: Use 'backing-mask-protocol' for blockjobs when available
Store whether qemu supports the appropriate option for block-stream and
block-commit commands and always use it if available.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-02-02 16:03:08 +01:00
Peter Krempa
c38b4e34c8 qemu: capabilities: Introduce QEMU_CAPS_BLOCKJOB_BACKING_MASK_PROTOCOL
The capability is asserted when both block-stream and block-commit QMP
commands support the 'backing-mask-protocol' argument.

The argument causes qemu to record 'raw' as the backing file format in
case when a protocol node is used directly. This is needed to preserve
compatibility of images after a block-commit or block-pull libvirt
operation with older libvirt versions in case when we'll want to remove
the unneded 'raw' format drivers from the block graph.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-02-02 16:03:08 +01:00
Praveen K Paladugu
4bfd513d92 hypervisor: Move domain interface mgmt methods
Move domain interface management methods from qemu to hypervisor. This
refactoring allows the domain management methods to be shared between CH and
qemu drivers.

This commit does not introduce any functional changes.

Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-02-02 10:58:26 +01:00
Praveen K Paladugu
a22d7fde17 conf: Drop unused parameter
Drop unused parameter from virDomainNetReleaseActualDevice method.

Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-02-02 10:58:21 +01:00
Andrea Bolognani
ac29f9396c qemu: Use virDomainControllerDefNew() more
Instead of open-coding a partial version of it.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
2024-02-01 10:37:26 +01:00
Andrea Bolognani
518e70158b qemu: Handle MODEL_SCSI_{AUTO,DEFAULT} appropriately
The qemuDomainGetSCSIControllerModel() function, which is
responsible for choosing a model for a SCSI controller that
didn't have one provided by the user, considers values >0 to
mean "model has been set".

Since MODEL_SCSI_AUTO == 0, this means that such a value is
considered the same as MODEL_SCSI_DEFAULT (-1). This makes
sense, as not specifying a model name or explicitly asking for
one to be automatically chosen intuitively should result in
the same behavior.

Specifically, there is no case in which a value of
MODEL_SCSI_AUTO or MODEL_SCSI_DEFAULT is encountered after the
initial controller creation: it is either replaced with an
actual model, or an error is raised.

Despite this, there are a few places in the QEMU driver where
we incorrectly treat these values as if they were actual
model names. To reduce confusion, make sure that no longer
happens.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
2024-02-01 10:37:22 +01:00
Peter Krempa
a9f76d6ab7 Don't overwrite error message from 'virXPathNodeSet'
'virXPathNodeSet' returns -1 only when 'ctxt' or 'xpath' are NULL or
when the 'xpath' string is invalid. Both are programming errors. It
doesn't make sense for the code to overwrite the error message for
anything supposedly more relevant.

The majority of calls to 'virXPathNodeSet' already didn't do this, so
this patch fixes the rest to prevent it from spreading again.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-31 17:24:07 +01:00
Peter Krempa
36e11cca83 qemuMigrationDstStartNBDServer: Refactor cleanup
There's nothing under the 'cleanup:' label thus the whole code can be
simplified.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2024-01-31 15:25:54 +01:00
Peter Krempa
43f027b57c qemu: migration: Properly handle reservation of manually specified NBD port
Originally the migration code didn't register the NBD disk port with the
port allocator when it was manually specified. Later when commit
e74d627bb3 refactored the code and started registering it, the
old logic which was clearing 'priv->nbdPort' in case when it was manually
specified was not removed.

This caused following problems:
 - the port was not released after successful migration
 - the port was released even when it was not allocated on failures
   regarding the NBD server start
 - the port was not released on other failures of the migration after
   NBD server startup

To address this we remove the assumption that 'priv->nbdPort' is used
only for auto-allocated port and fill it only once the port is
allocated and make the caller of qemuMigrationDstStartNBDServer
responsible for releasing it.

Fixes: e74d627bb3
Resolves: https://issues.redhat.com/browse/RHEL-21543
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2024-01-31 15:25:54 +01:00
Pavel Hrdina
189fdeff10 qemu_snapshot: create: don't require disk-only flag for offline external snapshot
Historically creating offline external snapshot required disk-only flag
as well. Now when user requests new snapshot for offline VM and at least
one disk is specified to use external snapshot we will no longer require
disk-only flag as all other not specified disk will use external
snapshots as well.

Resolves: https://issues.redhat.com/browse/RHEL-22797
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-30 13:54:42 +01:00
Pavel Hrdina
faa2e3bb54 qemu_snapshot: create: refactor external snapshot detection
Introduce new function qemuSnapshotCreateUseExternal() that will return
true if we will use external snapshots as default location.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-30 13:54:32 +01:00
Pavel Hrdina
7143c4e1f9 qemu_snapshot: fix detection if non-leaf snapshot isn't in active chain
The condition was completely wrong. As per the comment for function
virDomainMomentIsAncestor() it checks that the first argument is
descendant of the second argument.

Consider the following snapshot tree for VM:

  s1
    |
    +- s2
    |   |
    |   +- s3
    |
    +- s4
        |
        +- s5 (current)

When deleting s2 with the original code we checked if
virDomainMomentIsAncestor(s2, s5) which would return false basically for
any snapshot as s5 is leaf snapshot so no children.

When deleting s2 with fixed code we check if
virDomainMomentIsAncestor(s5, s2) which still returns false but when
deleting s4 it will correctly return true.

Before this fix it fails with the following error:

    error: Failed to delete snapshot s2
    error: invalid argument: could not find base disk source in disk source chain

After the fix it fails with correct error:

    error: Failed to delete snapshot s2
    error: unsupported configuration: deletion of non-leaf external snapshot that is not in active chain is not supported

Resolves: https://issues.redhat.com/browse/RHEL-23212
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-30 13:32:04 +01:00
Andrea Bolognani
19dc73d16e qemu: Move qemuDomainGetSCSIControllerModel()
It has nothing to do with assigning addresses, so it makes more
sense to have it in qemu_domain.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-30 10:58:13 +01:00
Andrea Bolognani
89a8862d42 qemu: Add missing error handling
qemuDomainGetSCSIControllerModel() can return -1 on failure,
but qemuDomainFindOrCreateSCSIDiskController() didn't implement
any handling for this scenario.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-30 10:58:13 +01:00
Andrea Bolognani
27cb524a9f qemu: Drop qemuDomainSetSCSIControllerModel()
It only has a single caller.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-30 10:58:13 +01:00
Andrea Bolognani
7d6ec89243 qemu: Drop qemuDomainFindSCSIControllerModel()
It only has a single caller.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-30 10:58:13 +01:00
Andrea Bolognani
d0087e65d8 qemu: Clean up qemuDomainDefaultNetModel()
Group things together where it makes sense, avoid unnecessary
uses of 'else if', plus other tweaks.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-30 10:58:13 +01:00
Andrea Bolognani
d583ff601f qemu: Default to no USB and no memballoon for new architectures
The current defaults, that can be altered on a per-architecture
basis, are derived from the historical x86 behavior.

Every time support for a new architecture is added to libvirt,
care must be taken to override these default: if that doesn't
happen, guests will end up with additional hardware, which is
something that's generally undesirable.

Turn things around, and require architectures to explicitly
ask for the devices to be created by default instead. The
behavior for existing architectures is preserved.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-30 10:58:13 +01:00
Andrea Bolognani
95e54bce7d qemu: Fix a few comments
They reference functions that have since been renamed.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-30 10:58:13 +01:00
Michal Privoznik
dab99eedcd qemu_command: Generate cmd line for virtio-mem dynamicMemslots
This is pretty straightforward.

Resolves: https://issues.redhat.com/browse/RHEL-15316
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-30 10:44:36 +01:00
Michal Privoznik
6be07af817 qemu_validate: Check capability for virtio-mem dynamicMemslots
The QEMU_CAPS_DEVICE_VIRTIO_MEM_PCI_DYNAMIC_MEMSLOTS reflects
whether QEMU is capable of .dynamic-memslots for virtio-mem.
Use it when validating domain configuration.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-30 10:44:36 +01:00
Michal Privoznik
497cab753b qemu_capabilities: Add QEMU_CAPS_DEVICE_VIRTIO_MEM_PCI_DYNAMIC_MEMSLOTS capability
Starting from v8.2.0-rc0~74^2~2 QEMU has .dynamic-memslots
attribute for virtio-mem-pci device. Introduce a capability which
reflects that.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-30 10:44:36 +01:00
Or Ozeri
66aee6e5c2 qemu: block: fix error when blockcopy target is librbd encrypted
Encryption secrets are considered a format dependency, even
when being used by the storage node itself, as in the case of
using encryption engine=librbd.
Currently, the storage node is created (blockdev-add) before
creating the format dependencies (including encryption secrets).
As a result, when trying to perform a blockcopy when the target
disk uses librbd encryption, an error of this form is returned:

  "error: internal error: unable to execute QEMU command 'blockdev-add': No secret with id 'libvirt-5-format-encryption-secret0'"

To overcome this error, we change the order of commands so that
format dependencies are created BEFORE creating the storage node.

Signed-off-by: Or Ozeri <oro@il.ibm.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
2024-01-26 09:52:19 +01:00
Michal Privoznik
ccfc5c1e16 qemu_hotplug: Don't lose 'created' flag in qemuDomainChangeNet()
After v9.1.0-rc1~116 we track whether it's us who created a
macvtap or not. But when updating a vNIC its definition might be
replaced with a new one (though, ifname is not allowed to
change), e.g. to reflect new QoS, link state, etc.

Now, the fact whether we created macvtap for given vNIC is stored
in net->privateData->created. And replacing definition is done by
simply freeing the old definition and making the pointer point to
the new one. But this does not preserve the 'created' flag, which
in turn means when a domain is shutting off, the macvtap is not
removed (see loop inside of qemuProcessStop()).

Copy this flag into new definition and leave a note in
_qemuDomainNetworkPrivate struct.

Fixes: 61d1b9e659
Resolves: https://issues.redhat.com/browse/RHEL-22714
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-25 16:41:12 +01:00
Michal Privoznik
bee5301afa qemu_process: Skip over non-virtio non-TAP NIC models when refreshing rx-filter
After guest is started, or we are reconnecting to already running
one (after daemon restart), qemuProcessRefreshRxFilters() is
called to refresh rx-filters (basically MAC addresses of guest
NICs) as they might have changed while we were not running (for
the case when reconnecting to an already running guest), or we
need to enable them by running a command (for freshly started
guest - see processNicRxFilterChangedEvent()).

Now, our XML parser allowed trustGuestRxFilters attribute for all
types and models of <interface/> while in reality, only virtio
model AND TUN/TAP based types can see MAC address changes. For
other combinations, QEMU reports an error.

This all means that when the daemon is restarted and it
reconnects to a guest with, well invalid configuration, or when
such guest is restored from a saved image, or migrated then we
issue the monitor command, to which QEMU replies with an error
which is then propagated to users:

  error: internal error: unable to execute QEMU command 'query-rx-filter': invalid net client name: hostdev0

While on one hand users should fix their configuration (and after
v10.0.0-rc1~123 they can do that even on live domains), libvirt
can also has some logic built in that prevent issuing the command
in the first place (for obviously wrong cases).

Fixes: 060d4c83ef
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-25 15:55:33 +01:00
Jonathon Jongsma
95c843eae3 qemu: Fix bug in nbdkit-backed backing chains
When trying to start nbdkit-backed disks in backing chains, we were
accidentally always checking the private data of the top of the chain
instead of using the loop variable.

Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
2024-01-24 07:45:34 -06:00
Andrea Bolognani
5df470f47d qemu: Improve qemuDomainSupportsPCIMultibus()
Rewrite the function so that it's more compact and easier to
extend as new architectures, which will likely come with
multibus support right out the gate, are introduced.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-17 19:28:50 +01:00
Andrea Bolognani
176c3b105e qemu: Move qemuDomainSupportsPCIMultibus()
It belongs next to qemuDomainSupportsPCI().

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-17 19:28:41 +01:00
Andrea Bolognani
11a861e9e9 qemu: Improve qemuDomainSupportsPCI()
The way the function is currently written sort of obscures this
fact, but ultimately we already unconditionally assume PCI
support on most architectures.

Arm and RISC-V need some additional checks to maintain
compatibility with existing configurations but for all future
architectures, such as the upcoming LoongArch64, we expect PCI
support to come out of the box.

Last but not least, the functions is made const-correct.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-17 19:28:29 +01:00
Andrea Bolognani
e622233eda qemu: Retire QEMU_CAPS_OBJECT_GPEX
It's no longer used anywhere.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-17 19:28:19 +01:00
Andrea Bolognani
ac48405fa7 qemu: Stop checking QEMU_CAPS_OBJECT_GPEX
For all versions of QEMU that we support, the virt machine type
has a hard dependency on this device, so we can stop checking
whether the capability is present and just use it unconditionally.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-17 19:28:07 +01:00
Andrea Bolognani
f39f15313a qemu: Fix handling of user aliases for default PHB
The bus name for the default PHB is always "pci.0".

Fixes: 937f319536
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-17 19:18:20 +01:00
Peter Krempa
f51c6b5b02 qemuDomainAssignPCIAddresses: Assign extension addresses when auto-assigning PCI address
Assigning a PCI address needs to also assign any extension addresses
right away. Otherwise they'd be assigned only after subsequent
format->parse cycle and thus be potentially missing on first run after
defining the VM and thus could change.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-17 17:31:12 +01:00
Peter Krempa
7dd3d77940 qemu: Move 'shmem' device size validation to qemu_validate
The 'size' of a 'shmem' device is parsed and formatted as a "scaled"
value, stored in bytes, but the formatting scale is mebibytes. This
precission loss combined with the fact that the value was validated only
when starting and the size is formatted only when non-zero meant that
on first parse a value < 1 MiB would be accepted, but would be formatted
to the XML as 0 MiB as it was non-zero but truncated and a subsequent
parse would parse of such XML would parse it as 0 bytes, which in turn
would be interpreted as 'default' size.

Fix the issue by moving the validator, which ensures that the number is
a power of two and more than 1 MiB to the validator code so that it'll
be rejected at XML parsing time.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-17 17:31:12 +01:00
Andrea Bolognani
763381df53 qemu: Make monitor aware of CPU clusters
This makes it so libvirt can obtain accurate information about
guest CPUs from QEMU, and should make it possible to correctly
perform operations such as CPU hotplug.

Of course this is mostly moot at the moment: only aarch64 can use
CPU clusters, and CPU hotplug is not yet implemented on that
architecture.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-15 14:56:36 +01:00
Andrea Bolognani
655459420a qemu: Use CPU clusters for guests
https://issues.redhat.com/browse/RHEL-7043

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-15 14:56:35 +01:00
Andrea Bolognani
beb27dc61e qemu: Introduce QEMU_CAPS_SMP_CLUSTERS
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-15 14:56:35 +01:00
Andrea Bolognani
ef5c397584 conf: Allow specifying CPU clusters
The default number of CPU clusters is 1, and values other than
that one are currently rejected by all hypervisor drivers.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-15 14:56:35 +01:00
Michal Privoznik
9cbda34e98 qemu: Be less aggressive when dropping channel source paths
In v9.7.0-rc1~130 I've shortened the path that's generated for
<channel/> source. With that, I had to adjust regex that matches
all versions of paths we have ever generated so that we can drop
them (see comment around qemuDomainChrDefDropDefaultPath()). But
as it is usually the case with regexes - they are write only. And
while I attempted to make one portion of the path optional
("/target/") I accidentally made regex accept more, which
resulted in libvirt dropping the user provided path and
generating our own instead.

Fixes: d3759d3674
Resolves: https://issues.redhat.com/browse/RHEL-20807
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2024-01-10 08:44:32 +01:00
Daniel P. Berrangé
024d6dc263 qemu: tighten semantics of 'size' when resizing block devices
When VIR_DOMAIN_BLOCK_RESIZE_CAPACITY is set, the 'size' parameter
is currently ignored. Since applications must none the less pass a
value for this parameter, it is preferrable to declare some explicit
semantics for it.

This declare that the parameter must be 0, or the exact size of the
underlying block device. The latter gives the management application
the ability to sanity check that the block device size matches what
they think it should be.

Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2024-01-09 11:57:13 +00:00
Jiri Denemark
8d693d79c4 qemu: Enable postcopy-preempt migration capability
During post-copy migration (once it actually switches to post-copy mode)
dirty memory pages are continued to be migrated iteratively, while the
destination can explicitly request a specific page to be migrated before
the iterative process gets to it (which happens when a guest wants to
read a page that was not migrated yet). Without the postcopy-preempt
capability enabled such pages need to wait until all other pages already
queued are transferred. Enabling this capability will instruct the
hypervisor to create a separate migration channel for explicitly
requested pages so that they can preempt the queue.

The only requirement for the feature to work is running a migration over
a protocol that supports multiple connections. In other words, we can't
pre-create the connection and pass its file descriptor to QEMU (i.e.,
using MIGRATION_DEST_CONNECT_SOCKET), but we have to let QEMU open the
connections itself (using MIGRATION_DEST_SOCKET). This change is applied
to all post-copy migrations even if postcopy-preempt is not supported to
avoid making the code even uglier than it is now. There's no real
difference between the two methods with modern QEMU (which can properly
report connection failures) anyway.

This capability is enabled for all post-copy migration as long as the
capability is supported on both sides of migration.

https://issues.redhat.com/browse/RHEL-7100

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-08 22:41:23 +01:00
Jiri Denemark
61e34b0856 qemu: Add support for optional migration capabilities
We enable various migration capabilities according to the flags passed
to a migration API. Missing support for such capabilities results in an
error because they are required by the corresponding flag. This patch
adds support for additional optional capability we may want to enable
for a given API flag in case it is supported. This is useful for
capabilities which are not critical for the flags to be supported, but
they can make things work better in some way.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-08 22:39:56 +01:00
Jiri Denemark
efc26a665d qemu: Rename remoteCaps parameter in qemuMigrationParamsCheck
The migration cookie contains two bitmaps of migration capabilities:
supported and automatic. qemuMigrationParamsCheck expects the letter so
lets make it more obvious by renaming the parameter as remoteAuto.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-08 22:38:53 +01:00
Jiri Denemark
c941106f7c qemu: Use C99 initializers for qemuMigrationParamsFlagMap
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-08 22:38:36 +01:00
Jiri Denemark
ff128d3761 qemu: Document qemuMigrationParamsFlagMapItem fields
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-08 22:38:24 +01:00
Peter Krempa
397218c433 qemu: Implement support for configuring iothread to virtqueue mapping for disks
Add validation and formatting of the commandline arguments for
'iothread-vq-mapping' parameter. The validation logic mirrors what qemu
allows.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-08 09:27:31 +01:00
Peter Krempa
ee7121ab8e qemu: capabilities: Introduce QEMU_CAPS_VIRTIO_BLK_IOTHREAD_MAPPING
The capability represents the support for mapping virtqueues to
iothreads for the 'virtio-blk' device.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-08 09:27:31 +01:00
Laine Stump
b9a1e7c436 conf: replace virHostdevIsVFIODevice with virHostdevIsPCIDevice
virHostdevIsVFIODevice() and virDomainDefHasVFIOHostdev() are only ever
called from the QEMU driver, and in the case of the QEMU driver, any
PCI hostdev by definition uses VFIO, so really all these callers only
need to know if the device is a PCI hostdev.

(It turned out that the less specific virHostdevIsPCIDevice() already
existed in hypervisor/virhostdev.c, so I had to remove one of them;
since conf is a lower level directory than hypervisor, and the
function is called from conf, keeping the copy in hypervisor would
have required moving its caller (virDomainDefHasPCIHostdev()) into
hypervisor as well, so I just removed the copy in hypervisor.)

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-07 23:58:44 -05:00
Laine Stump
e04ca000bd conf: put hostdev PCI backend into a struct
The new struct is virDeviceHostdevPCIDriverInfo, and the "backend"
enum in the hostdevDef will be replaced with a
virDeviceHostdevPCIDriverInfo named "driver'. Since the enum value in
this new struct is called "name", it means that all references to
"backend" will become "driver.name".

This will allow easily adding other items for new attributes in the
<driver> element / C struct, which will be useful once we are using
this new struct in multiple places.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-07 23:57:09 -05:00
Laine Stump
a435e7e6c8 conf: move/rename hostdev PCI driver type enum to device_conf.h
Currently this enum is defined in domain_conf.h and named
virDomainHostdevSubsysPCIDriverType. I want to use it in parts of the
network and networkport config, so am moving its definition to
device_conf.h which is / can be included by all interested parties,
and renaming it to match the name of the corresponding XML attribute
("driver name"). The name change (which includes enum values) does cause a
lot of churn, but it's all mechanical.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-07 23:57:09 -05:00
Peter Krempa
2da71d8e43 qemu: process: Separate setup of network device objects
Separate the SLIRP bits from 'qemuProcessNetworkPrepareDevices' and do
the setup of the internal data when setting up domain data.

This will allow tests to use the same code path to lookup data for a
network.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-01-04 22:26:10 +01:00
Peter Krempa
b448abd972 qemuxml2argvmock: Mock qemuInterfaceBridgeConnect
Prepare for test cases which would want to call that function.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-01-04 22:26:10 +01:00
Jonathon Jongsma
9eabf14afb qemu: add runtime config option for nbdkit
Currently when we build with nbdkit support, libvirt will always try to
use nbdkit to access remote disk sources when it is available. But
without an up-to-date selinux policy allowing this, it will fail.
because the required selinux policies are not yet widely available, we
have disabled nbdkit support on rpm builds for all distributions before
Fedora 40.

Unfortunately, this makes it more difficult to test nbdkit support.
After someone updates to the necessary selinux policies, they would also
need to rebuild libvirt to enable nbdkit support. By introducing a
configure option (nbdkit_config_default), we can build packages with
nbdkit support but have it disabled by default.

Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Suggested-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2024-01-04 14:34:40 -06:00
Artem Chernyshev
d05cdd1879 virprocess: virProcessGetNamespaces() to void
virProcessGetNamespaces() return value is invariant, so change it
type and remove all dependent checks.

Signed-off-by: Artem Chernyshev <artem.chernyshev@red-soft.ru>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-01-04 17:06:14 +01:00
Artem Chernyshev
88903f9abf conf: virDomainNetUpdate() to void
virDomainNetUpdate() return value is invariant, so change it type
and remove all dependent checks.

Signed-off-by: Artem Chernyshev <artem.chernyshev@red-soft.ru>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-01-04 17:06:04 +01:00
Artem Chernyshev
46c9458654 cpu: : virCPUx86DataAddItem() to void
virCPUx86DataAddItem() return value is invariant, so change it
type and remove all dependent checks.

Functions changed to void:

virCPUx86DataAddItem()
x86DataAdd()
virCPUx86DataAdd()
x86DataAddSignature()
virCPUx86DataSetSignature()
libxlCapsAddCPUID()
cpuidSetLeaf4()
cpuidSetLeaf7()
cpuidSetLeafB()
cpuidSetLeafD()
cpuidSetLeafResID()
cpuidSetLeaf12()
cpuidSetLeaf14()
cpuidSetLeaf17()

Signed-off-by: Artem Chernyshev <artem.chernyshev@red-soft.ru>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-01-04 17:05:02 +01:00
Guoyi Tu
dd2f36d66e qemu_driver: Don't handle the EOF event if vm get restarted
Currently, libvirt creates a thread pool with only on thread to handle all
qemu monitor events for virtual machines, In the cases that if the thread
gets stuck while handling a monitor EOF event, such as unable to kill the
virtual machine process or release resources, the events of other virtual
machine will be also blocked, which will lead to the abnormal behavior of
other virtual machines.

For instance, when another virtual machine completes a shutdown operation
and the monitor EOF event has been queued but remains unprocessed, we
immediately destroy and start the virtual machine again, at a later time
when EOF event get processed, the processMonitorEOFEvent() will kill the
virtual machine that just started.

To address this issue, in the processMonitorEOFEvent(), we check whether
the current virtual machine's id is equal to the the one at the time
the event was generated. If they do not match, we immediately return.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Guoyi Tu <tugy@chinatelecom.cn>
Signed-off-by: dengpengcheng <dengpc12@chinatelecom.cn>
2024-01-03 17:13:23 +00:00
Han Han
b72d7c46e5 qemu: Replace the deprecated short-formed option "unix"
Change to the boolean option "unix=on"

Signed-off-by: Han Han <hhan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-12-21 12:21:10 +01:00
Egor Makrushin
c3a8d04980 conf: fix integer overflow in virDomainControllerDefParseXML
Multiplication results in integer overflow.
Thus, replace it with ULLONG_MAX and change
def->opts.pciopts.pcihole64size type to ULL.
Update variable usage according to new type.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Signed-off-by: Egor Makrushin <emakrushin@astralinux.ru>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2023-12-20 15:57:07 +01:00
Peter Krempa
19ce02c773 qemuDomainBlockPeek: Fix format checking logic
Recent refactor which changed the format check to use
qemuBlockStorageSourceIsRaw accidentaly inverted the condition.

Caught by the CI test suite.

Fixes: b600b69f82
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
2023-12-16 09:20:41 +01:00
Ján Tomko
1a4412f568 qemu: virtiofs: auto-fill idmap for unprivileged use
If the user did not specify any uid mapping, map its own
user ID to ID 0 inside the container and the rest of the IDs
to the first found user's authorized range in /etc/sub[ug]id

https://issues.redhat.com/browse/RHEL-7386
https://gitlab.com/libvirt/libvirt/-/issues/535

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2023-12-14 17:10:22 +01:00
Ján Tomko
42edb10f17 qemu: allow running virtiofsd in session mode
https://gitlab.com/libvirt/libvirt/-/issues/535

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2023-12-14 17:10:22 +01:00
Ján Tomko
d27b6e5f49 qemu: virtiofs: do not force UID 0
Remove the explicit setting of uid 0 when running virtiofsd.

It is not required for privileged mode, where virtiofsd will be run
as root anyway. And for unprivileged mode, virtiofsd no longer requires
to be run as root.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2023-12-14 17:10:22 +01:00
Ján Tomko
bdf96a0f72 qemu: format uid/gid map for virtiofs
Pass the ID map to virtiofsd, which will run the suid `newuidmap`
binary for us.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2023-12-14 17:10:22 +01:00
Peter Krempa
641ed83e3d qemuDomainBlockResize: Properly resize disks with storage slice
Until now resizing a disk with a storage slice would break in one of the
following ways:

1) for a non-raw format, the virtual size would change, but the slice
would still remain in place
2) for raw disks qemu would refuse to change the size

The only reasonable scenario we want to support is a 'raw' image with 0
offset (inside a block device), where we can just drop the slice.

Anything else comes from a non-standard storage setup that we don't want
to touch.

To facilitate the resize, we first remove the 'size' parameter in qemu
thus dropping the slice and then instructing qemu to resize the disk.

Resolves: https://issues.redhat.com/browse/RHEL-18782
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-12-14 16:12:04 +01:00
Peter Krempa
48704d4605 qemu: block: Format storage slice properties optionally
Prepare the blockdev props formatter to skip formatting the slice props
in case they are not applicable.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-12-14 16:11:07 +01:00
Peter Krempa
a2cc772031 qemu: block: Make 'slice' layer effective for 'raw' storage source
Rather than pulling the configuration of the storage slice into the
'format' layer make the 'slice' layer effective for raw disks with a
storage slice. This was made possible by the recent refactors which made
the 'format' layer optional if not needed.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-12-14 16:11:07 +01:00
Peter Krempa
d09f46a9fb qemuDomainBlockResize: Implement VIR_DOMAIN_BLOCK_RESIZE_CAPACITY
Resizing of block-backed storage requires the user to pass the exact
capacity of the device. Implement code which will query it instead so
the user doesn't need to do that.

Closes: https://gitlab.com/libvirt/libvirt/-/issues/449
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-12-14 16:11:07 +01:00
Peter Krempa
59ec4c6619 qemuDomainBlockResize: Agregate all checks at the beginning
Move the check for readonly and empty disks to the top where all other
checks will be done.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-12-14 16:06:23 +01:00
Peter Krempa
b600b69f82 qemu: Use qemuBlockStorageSourceIsLUKS/qemuBlockStorageSourceIsRaw
Refactor code checking whether image is raw. This fixes multiple places
where a LUKS encrypted disk could be mistreated.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-12-14 16:04:27 +01:00
Peter Krempa
04b94593d1 qemu: block: Introduce helpers for properly testing for 'raw' and 'luks' images
Unfortunately a LUKS image to be decrypted by qemu has
VIR_STORAGE_FILE_RAW as format, but has encryption properties populated.

Many places in the code don't check it properly and also don't check
properly whether the image is indeed LUKS to be decrypted by qemu.

Introduce helpers which will simplify this task.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Spellchecked-by: Ján Tomko <jtomko@redhat.com>
2023-12-14 16:03:40 +01:00
Peter Krempa
aded3c622f qemu: migration: Automatically fix non-shared-storage migration to bigger block devices
QEMU's blockdev-mirror job doesn't allow copy into a destination which
isn't exactly the same size as source. This is a problem for
non-shared-storage migration when migrating into a raw block device, as
there it's very hard to ensure that the destination size will match the
source size.

Rather than failing the migration, we can add a storage slice in such
case automatically and thus make the migration pass.

To do this we need to probe the size of the block device on the
destination and if it differs form the size detected on the source we'll
install the 'slice'.

An additional handling is required when persisting the VM as we want to
propagate the slice even there to ensure that the device sizes won't
change.

Resolves: https://issues.redhat.com/browse/RHEL-4607
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-12-13 20:15:50 +01:00
Peter Krempa
814069bd56 qemu: Move and export qemuDomainStorageUpdatePhysical and dependencies
Move qemuDomainStorageUpdatePhysical, qemuDomainStorageOpenStat,
qemuDomainStorageCloseStat to qemu_domain.c and export them. They'll be
reused in the migration code.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-12-13 20:15:49 +01:00
Peter Krempa
426eeb0d3c qemu: migration: Rename qemuMigrationDstPrecreateStorage to qemuMigrationDstPrepareStorage
The function will be used to setup storage for non-shared-storage
migration, not just precreate images.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-12-13 20:15:49 +01:00
Peter Krempa
f0fe94605c qemuDomainStorageOpenStat: Remove unused 'driver' argument and untangle callers
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-12-13 20:15:49 +01:00
Peter Krempa
6a38559092 qemu: migration: Improve handling of VIR_MIGRATE_PARAM_DEST_XML with VIR_MIGRATE_PERSIST_DEST
When a user provides a migration XML via the VIR_MIGRATE_PARAM_DEST_XML
it's expected that they want to change ABI-compatible aspects of the XML
such as the disk paths or similar.

If the user requests persisting of the VM but does not provide an
explicit persistent XML libvirt would take the persistent XML from the
source of the migration as the persistent config. This usually involves
the old paths to images.

Doing this would result into failure to start the VM.

It makes more sense to take the XML used for migration and use that as
the base for persisting the config.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-12-13 20:15:49 +01:00
Peter Krempa
7c1244f3a5 qemuMigrationDstPrecreateStorage: Fix and clarify logic
While it's intended that qemuMigrationDstPrecreateDisk is called with
any kind of the disk, the logic in qemuMigrationDstPrecreateStorage
which checks the existence of the image wouldn't properly handle e.g.
network backed disks, where it would attempt to use virFileExists() on
the disk's 'src->path'.

Fix the logic by first skipping disks not meant for migration, then do
the existence check only when 'disk->src' is local storage.

Since qemuMigrationDstPrecreateDisk has a debug statement there's no
need to have an extra one right before calling into it.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-12-13 20:15:49 +01:00
Peter Krempa
9864486966 qemuMigrationDstPrecreateStorage: Refactor cleanup
Use automatic pointer freeing for 'conn' and remove the 'cleanup' label.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-12-13 20:15:49 +01:00
Peter Krempa
16832e0dd2 qemuMigrationDstPrecreateStorage: Improve error messages
Change the error messages so that they can be used to identify the
problematic disk or image.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-12-13 20:15:49 +01:00
Peter Krempa
b99c709d8d qemu: migration: Validate migration XML
There's no point in skiping the validation step:
- on the source, the VM is parsed for ABI stability checking, thus the
  equivalent config was validated when the VM was started

- on the destination, the XML will be validated inside qemuProcessInit
  very soon after it is parsed

This fixes problems such as if the user uses a relative path in the disk
source or omits the source, as the disk migration code reasonably
expects that all checks were performed.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2023-12-07 16:42:32 +01:00
Michal Privoznik
a3129ae6df qemuDomainChangeNet: Reflect trustGuestRxFilters change
On device-update, when user requests change of
trustGuestRxFilters we currently do nothing. Nor error out, nor
act on the request. While we can just throw an error,
implementing this is pretty trivial.

Resolves: https://issues.redhat.com/browse/RHEL-735
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-12-06 12:36:03 +01:00
Michal Privoznik
d6169ad739 qemuMonitorJSONQueryRxFilter: Allow @filter to be NULL
Sometimes it may be handy to just issue the query-rx-filter
monitor command without actually parsing the output. Adapt
qemuMonitorJSONQueryRxFilter() to this behavior.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-12-06 12:30:10 +01:00
Michal Privoznik
cab49d394f qemu: Relax check for memory device coldplug
When cold plugging a memory device we check whether there's
enough free memory slots to accommodate new module. Well, this
checks makes sense only for those memory devices that are plugged
into DIMM slots (DIMM and NVDIMM models). Other memory device
models, like VIRTIO_MEM, VIRTIO_PMEM or SGX_EPC are attached into
PCI bus, or no bus at all.

Resolves: https://issues.redhat.com/browse/RHEL-15480
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2023-12-05 16:36:53 +01:00
Michal Privoznik
b41f730c33 qemu: Move memory device coldplug into a separate function
The code that handles coldplug of a memory device is pretty
trivial and such could continue to live in the huge switch()
where other devices are handled. But the code is about to get
more complicated. To help with code readability, move it into a
separate function.

And while at it, make the function accept a double pointer to the
memory device definition to make the ownership transfer obvious
(the device is part of the domain on successful run).

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2023-12-05 16:36:46 +01:00
Andrea Bolognani
6aa2fa38b0 meson: Stop looking for passt at build time
We only use it at runtime, not during the build process.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2023-12-05 11:50:44 +01:00
Michal Privoznik
cc9f439be9 qemu_command: Don't open code virPCIDeviceAddressAsString()
When building a hostdev props, its PCI address is formatted via
g_strdup_printf(VIR_PCI_DEVICE_ADDRESS_FMT, ...); Well, we have a
function that does exactly that: virPCIDeviceAddressAsString().
Use the latter.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-12-05 09:26:47 +01:00
Peter Krempa
69880584e6 qemuProcessStartWithMemoryState: Don't start qemu with '-loadvm SNAP' and '-incoming defer' together
A bug in qemuProcessStartWithMemoryState caused that we would start qemu
with '-loadvm SNAP' and '-incoming defer' together.  qemu doesn't expect
that and crashes on an assertion failure [1].

[1]: https://issues.redhat.com/browse/RHEL-16782

Fixes: 8a88d3e586
Resolves: https://issues.redhat.com/browse/RHEL-17841
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2023-12-01 11:35:14 +01:00
Peter Krempa
0728bc47af qemuMigrationSrcNBDStorageCopyBlockdevPrepareSource: Don't setup 'raw' layer for migration NBD connection
The raw driver layer is not needed in this case and can be dropped.
Removing the nodename will cause other pieces of the code to pick up and
stop adding the layer.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-11-27 10:14:20 +01:00
Peter Krempa
734e4e9783 qemu: block: Remove unused qemuBlockStorageSourceDetachOneBlockdev
The only caller was converted to use the common blockdev infrastructure
thus this function is no longer needed.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-11-27 10:14:20 +01:00
Peter Krempa
e534d19b5d qemuMigrationSrcNBDCopyCancel: Use qemuBlockStorageSourceAttachRollback to detach migration NBD blockdevs
Rewrite the code to use the common tooling for removing blockdevs
instead of the ad-hoc qemuBlockStorageSourceDetachOneBlockdev helper.

Use of the common infrastructure will properly handle cases when the raw
driver is ommited from the block graph.

Since the TLS data object is shared for all migration QMP commands and
objects we need to strip its alias from the definition of the storage
source before attempting to detach it.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-11-27 10:14:20 +01:00
Peter Krempa
9ec0e28e87 qemuBlockReopenAccess: prepare for removal of 'raw' format layer
Make the helper reopening a blockdev for access pick the correct layer
to reopen based on what is currently in use.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-11-27 10:14:20 +01:00
Peter Krempa
dee5b3fb8e qemu: block: Absorb qemuBlockReopenFormatMon into qemuBlockReopenAccess
Move all the code into the now only caller.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-11-27 10:14:20 +01:00
Peter Krempa
7e66ff4fd1 qemu: monitor: Sanitize arguments of qemuMonitorBlockdevReopen
Take the virJSONValue array object which is passed to the
'blockdev-reopen' command as the 'options' argument rather than making
the caller wrap all the properties.

The code was a leftover from the time when the blockdev-reopen command
had a different syntax, and thus can be cleaned up.

Also note that the logging of the node name never worked as the top
level object didn't ever contain a 'node-name' property.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-11-27 10:14:20 +01:00
Peter Krempa
24b667eeed qemu: block: Absorb logic from qemuBlockReopenFormat to qemuBlockReopenAccess
Move all the logic into the new function and remove the old one.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-11-27 10:14:20 +01:00
Peter Krempa
214794c9c7 qemu: block: Extract logic from qemuBlockReopenReadWrite/ReadOnly
We want to preserve the wrappers for clarity but the inner logic can be
extracted to a common function qemuBlockReopenAccess. In further patches
the code from qemuBlockReopenFormat will be merged into the new wrapper
as well as logic for handling scenarios with missing 'format' layers
will be added.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-11-27 10:14:20 +01:00
Peter Krempa
834d283bcf qemuBlockStorageSourceGetEffectiveNodename: Prepare for missing 'format' driver
Return the effective storage nodename if the format layer is not
present.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-11-27 10:14:20 +01:00
Peter Krempa
6ab5ee9a9a qemuDomainPrepareStorageSourceBlockdevNodename: Restructure code to allow missing 'format' layer
Similarly to other bits of code, we don't need to setup the format layer
if it will not be formatted. Add logic which uses
qemuBlockStorageSourceNeedsFormatLayer to see whether the setup of the
format node is needed.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-11-27 10:14:20 +01:00
Peter Krempa
813ccd553b qemuBlockStorageSourceDetachPrepare: Prepare for possibly missing 'format' layer
Setup the data for detaching of the 'format' layer only when it's
present.

Restructure the logic to follow the same order as
qemuBlockStorageSourceAttachPrepareBlockdev in terms of
format/slice/storage -blockdev objects, and drop the now-misleading
comment for 'slice' of raw disks.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-11-27 10:14:20 +01:00
Peter Krempa
10cc057074 qemuBlockStorageSourceAttachPrepareBlockdev: Prepare for optionally missing format layer
Restructure the code logic so that the function is prepared for the
possibility that the 'format' blockdev layer may be missing if not
needed.

To achieve this we need to introduce logic that selects which node
(format/slice/storage) becomes the effective node and thus formats the
correct set of arguments.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-11-27 10:14:20 +01:00
Peter Krempa
b27e8b5a0b qemuBlockStorageSourceGetBlockdevStorageSliceProps: Allow turning the slice layer into effective blockdev layer
Allow using the slice layer as effective layer once we stop formatting
the unnecessary 'raw' driver.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-11-27 10:14:20 +01:00
Peter Krempa
7f19a55a9e qemu: block: Introduce helper for deciding when a 'format' layer is needed
The 'format' layer is not required in certain cases. As the logic for
this will be a bit more involved create a helper function to do the
decision.

For now we'll keep to always format the 'format' -blockdev layer.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-11-27 10:14:20 +01:00
Peter Krempa
aaf828d3d4 qemu: block: Use qemuBlockStorageSourceNeedsStorageSliceLayer only for setup
Add a note stating that qemuBlockStorageSourceNeedsStorageSliceLayer
must be used only when setting up a new blockdev, any other case when
the device might been already set up must use the existence of the
nodename to do so.

Adjust qemuBlockStorageSourceAttachPrepareBlockdev to do so and refactor
qemuBlockStorageSourceDetachPrepare to use the same logic.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-11-27 10:14:20 +01:00
Peter Krempa
05326395a9 qemu: block: Introduce qemuBlockStorageSourceGetSliceNodename
The helper retrieves the nodename of the slice layer if it's present.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-11-27 10:14:20 +01:00
Michal Privoznik
10594bb311 qemu: Generate cmd line for pipewire audio backend
This is mostly straightforward, except for a teensy-weensy
detail: usually, there's no system wide daemon running, no system
wide available socket that anybody could connect to. PipeWire
uses a per user daemon approach instead. But this in turn means,
that the socket location floats between various locations and is
derived from various environment variables (just like the actual
socket name) and thus we must pass the variables to QEMU.

Resolves: https://gitlab.com/libvirt/libvirt/-/issues/560
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-11-24 17:49:20 +01:00
Michal Privoznik
c472ce024b conf: Introduce pipewire audio backend
QEMU gained support for PipeWire audio backend (see QEMU commit
of v8.0.0-403-gc2d3d1c294). Its configuration knobs are basically
the same as pulseaudio's, except for PA's server name. Therefore,
a lot of code is copied over from pulseadio and fixed by
s/Pulse/Pipewire/ or s/pulseaudio/pipewire/.

There's one ley difference to PA though: pipewire daemon is
usually on per user basis (just like our qemu:///session).
Therefore, introduce this 'runtimeDir' attribute, which allows
specifying path to pipewire daemon socket (useful for
qemu:///system for instance).

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-11-24 17:49:02 +01:00
Peter Krempa
0a1d2b43e0 qemu: block: Don't try to merge bitmaps into 'raw' images
If any of the images in a chain above a raw image have bitmaps, libvirt
would attempt to merge them when doing a block commit or block copy
operation, which would result into a error in the logs as creating
persistent bitmaps in a raw image is not supported.

Since libvirt cares only about persistent bitmaps we can simply skip the
operation if the target of a block copy or block commit is a raw image.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-11-24 15:53:41 +01:00
Peter Krempa
94f1883c89 qemu: hotplug: Detect disk backing images before setting up security access
The VM will require access also to the detected images. Unfortunately a
recent reordering of the code introduced a bug where the backing chain
was probed after setting up cgroups/selinux/namespaces, which caused
that any detected images were not allowed/added and qemu was then not
able to use them.

Fixes: 9b8bb536ff
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-11-24 15:53:38 +01:00
Michal Privoznik
a6fec3881c qemu_domain: Drop qemuCheckMemoryDimmConflict()
The virDomainMemoryDefCheckConflict() already does the same set
of checks. There's no need to duplicate them.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2023-11-24 12:37:39 +01:00
Michal Privoznik
cfcbba4c2b lib: Replace qsort() with g_qsort_with_data()
While glibc provides qsort(), which usually is just a mergesort,
until sorting arrays so huge that temporary array used by
mergesort would not fit into physical memory (which in our case
is never), we are not guaranteed it'll use mergesort. The
advantage of mergesort is clear - it's stable. IOW, if we have an
array of values parsed from XML, qsort() it and produce some
output based on those values, we can then compare the output with
some expected output, line by line.

But with newer glibc this is all history. After [1], qsort() is
no longer mergesort but introsort instead, which is not stable.
This is suboptimal, because in some cases we want to preserve
order of equal items. For instance, in ebiptablesApplyNewRules(),
nwfilter rules are sorted by their priority. But if two rules
have the same priority, we want to keep them in the order they
appear in the XML. Since it's hard/needless work to identify
places where stable or unstable sorting is needed, let's just
play it safe and use stable sorting everywhere.

Fortunately, glib provides g_qsort_with_data() which indeed
implement mergesort and it's a drop in replacement for qsort(),
almost. It accepts fifth argument (pointer to opaque data), that
is passed to comparator function, which then accepts three
arguments.

We have to keep one occurance of qsort() though - in NSS module
which deliberately does not link with glib.

1: https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=03bf8357e8291857a435afcc3048e0b697b6cc04
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2023-11-24 09:53:14 +01:00
Peter Krempa
894c6c5c16 qemu: hotplug: Don't try to setup disk image when hotplugging empty cdrom drive
Originally the disk hotplug code didn't know how to attach a CD-ROM
drive, thus didn't have the necessary logic to handle empty cdroms.

Other disks can't be empty which is enforced by the parser validation
logic.

When support for hotplugging cdroms was added the code was not adjusted
to deal with empty drives thus attempted to setup the blockdev backend
for it.

Fixes: 3078799fef
Resolves: https://issues.redhat.com/browse/RHEL-16870
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-11-23 14:31:05 +01:00
Peter Krempa
fe42189d76 qemuDomainAttachDeviceDiskLiveInternal: Add missing jump to 'cleanup' on error
Commit allowing hotplug of CDROMs moved the logic forbidding the hotplug
to the appropriate blocks based on the disk frontend but forgot to
actually bail out on such error.

Fixes: 3078799fef
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-11-23 14:31:05 +01:00
Peter Krempa
16f8daf2df qemuDomainAttachDeviceDiskLiveInternal: Fix jumps on error
When I've originally refactored the function in commit 0d981bcefc
the logic was still correct, but then later in commit 52f8655439
I've moved most of the image setup logic into the function neglecting to
add the 'goto cleanup;' needed to skip over the setup of the disk
images.

Fixes: 52f8655439
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-11-23 14:31:05 +01:00
Peter Krempa
a1f7faa402 qemu: validate: Reword error message when CCW addresses are not supported for a machine
Reword the error message to clearly state that the machine type doesn't
support the address type. It doesn't matter which device it's for.

Additionally the alias may be still NULL at the point when the error is
being reported misleading users that they have something wrong with a
specific device.

Resolves: https://issues.redhat.com/browse/RHEL-16878
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
2023-11-23 14:29:48 +01:00
Pavel Hrdina
3ad5817053 qemu_snapshot: fix reverting to inactive snapshot
When reverting to inactive snapshot updating the domain definition needs
to happen after the new overlays are created otherwise qemu-img will
correctly fail with error:

    Trying to create an image with the same filename as the backing file

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2023-11-16 15:29:45 +01:00
Pavel Hrdina
03a9a39c42 qemu_snapshot: fix snapshot deletion that had multiple children
When we revert to non-leaf snapshot and create new branch or branches
the overlay in snapshot metadata is no longer usable as a disk source
for deletion of that snapshot. We need to use other places to figure out
the correct storage source.

Fixes: https://gitlab.com/libvirt/libvirt/-/issues/534

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2023-11-16 15:29:03 +01:00
Pavel Hrdina
4f4a8dce94 qemu_process: fix crash in qemuSaveImageDecompressionStart
Commit changing the code to allow passing NULL as @data into
qemuSaveImageDecompressionStart() was not correct as it left the
original call into the function as well.

Introduced-by: 2f3e582a1a
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2247754
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-11-03 14:17:06 +01:00
Peter Krempa
47ee78048c qemu: block: Remove unused flags QEMU_BLOCK_STORAGE_SOURCE_BACKEND_PROPS_ flags
QEMU_BLOCK_STORAGE_SOURCE_BACKEND_PROPS_SKIP_UNMAP is no longer
referenced inside the code.

QEMU_BLOCK_STORAGE_SOURCE_BACKEND_PROPS_AUTO_READONLY is passed from
various code paths to the qemuBlockStorageSourceGetBackendProps helper,
but it's no longer used.

Both thus can be removed.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-11-02 15:32:43 +01:00