Commit Graph

50770 Commits

Author SHA1 Message Date
Yuri Chornoivan
1949f028e3 Translated using Weblate (Ukrainian)
Currently translated at 100.0% (10497 of 10497 strings)

Translation: libvirt/libvirt
Translate-URL: https://translate.fedoraproject.org/projects/libvirt/libvirt/uk/

Co-authored-by: Yuri Chornoivan <yurchor@ukr.net>
Signed-off-by: Yuri Chornoivan <yurchor@ukr.net>
2024-06-27 15:28:08 +01:00
Weblate
32cd35bf60 Update translation files
Updated by "Update PO files to match POT (msgmerge)" hook in Weblate.

Translation: libvirt/libvirt
Translate-URL: https://translate.fedoraproject.org/projects/libvirt/libvirt/

Co-authored-by: Weblate <noreply@weblate.org>
Signed-off-by: Fedora Weblate Translation <i18n@lists.fedoraproject.org>
2024-06-27 15:28:06 +01:00
Göran Uddeborg
889eb95301 Translated using Weblate (Swedish)
Currently translated at 77.0% (8086 of 10497 strings)

Translation: libvirt/libvirt
Translate-URL: https://translate.fedoraproject.org/projects/libvirt/libvirt/sv/

Translated using Weblate (Swedish)

Currently translated at 77.3% (8082 of 10454 strings)

Translation: libvirt/libvirt
Translate-URL: https://translate.fedoraproject.org/projects/libvirt/libvirt/sv/

Co-authored-by: Göran Uddeborg <goeran@uddeborg.se>
Signed-off-by: Göran Uddeborg <goeran@uddeborg.se>
2024-06-27 15:17:23 +01:00
Jiri Denemark
0c94ec428f po: Refresh potfile for v10.5.0
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2024-06-25 15:41:07 +02:00
Michal Privoznik
1a8f646f29 virt-host-validate: Detect SEV-ES and SEV-SNP
With a simple cpuid (Section "E.4.17 Function
8000_001Fh—Encrypted Memory Capabilities" in "AMD64 Architecture
Programmer’s Manual Vol. 3") we can detect whether CPU is capable
of running SEV-ES and/or SEV-SNP guests. Report these in
virt-host-validate tool.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-25 14:59:30 +02:00
Michal Privoznik
30c01e535d virt-host-validate: Move AMD SEV into a separate func
The code that validates AMD SEV is going to be expanded soon.
Move it into its own function to avoid lengthening
virHostValidateSecureGuests() where the code lives now, even
more.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-25 14:59:28 +02:00
Michal Privoznik
fbe97ee17d qemu_validate: Use domaincaps to validate supported launchSecurity type
Now that the logic for detecting supported launchSecurity types
has been moved to domain capabilities generation, we can just use
it when validating launchSecurity type. Just like we do for
device models and so on.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-25 14:46:08 +02:00
Michal Privoznik
66df7992d8 qemu: Fill launchSecurity in domaincaps
The inspiration for these rules comes from
qemuValidateDomainDef().

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-25 14:46:05 +02:00
Michal Privoznik
d460e17282 domcaps: Report launchSecurity
In order to learn what types of <launchSecurity/> are supported
users can turn to domain capabilities and find <sev/> and
<s390-pv/> elements. While these may expose some additional info
on individual launchSecurity types, we are lacking clean
enumeration (like we do for say device models). And given that
SEV and SEV SNP share the same basis (info found under <sev/> is
applicable to SEV SNP too) we have no other way to report SEV SNP
support.

Therefore, report supported launchSecurity types in domain
capabilities.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-25 14:46:03 +02:00
Michal Privoznik
d00816209e qemu_capabilities: Probe SEV capabilities even for QEMU_CAPS_SEV_SNP_GUEST
While it's very unlikely to have QEMU that supports SEV-SNP but
doesn't support plain SEV, for completeness sake we ought to
query SEV capabilities if QEMU supports either. And similarly to
QEMU_CAPS_SEV_GUEST we need to clear the capability if talking to
QEMU proves SEV is not really supported.

This in turn removes the 'sev-snp-guest' capability from one of
our test cases as Peter's machine he uses to refresh capabilities
is not SEV capable. But that's okay. It's consistent with
'sev-guest' capability.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-25 14:46:00 +02:00
Michal Privoznik
3ec87cd4b8 qemuxmlconftest; Explicitly enable QEMU_CAPS_SEV_SNP_GUEST for "launch-security-sev-snp"
Soon, QEMU_CAPS_SEV_SNP_GUEST is going to be dependant on more
than plain presence of "sev-snp-guest" object in QEMU. Explicitly
enable the capability for "launch-security-sev-snp" test so that
we can continue testing cmd line and xml2xml.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-25 14:45:58 +02:00
Michal Privoznik
3a6ca064ca libvirt_private.syms: Export virDomainLaunchSecurity enum handlers
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-25 14:45:54 +02:00
Rayhan Faizel
9b0606ef8e qemu_block: Validate number of hosts for iSCSI disk device
An iSCSI device with zero hosts will result in a segmentation fault. This patch
adds a check for the number of hosts, which must be one in the case of iSCSI.

Minimal reproducing XML:

<domain type='qemu'>
    <name>MyGuest</name>
    <uuid>4dea22b3-1d52-d8f3-2516-782e98ab3fa0</uuid>
    <os>
        <type arch='x86_64'>hvm</type>
    </os>
    <memory>4096</memory>
    <devices>
        <disk type='network'>
            <source name='dummy' protocol='iscsi'/>
            <target dev='vda'/>
        </disk>
    </devices>
</domain>

Signed-off-by: Rayhan Faizel <rayhan.faizel@gmail.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-06-25 10:05:49 +02:00
Jon Kohler
1cc7737f69 qemu: add support for qemu switchover-ack
Add plumbing for QEMU's switchover-ack migration capability, which
helps lower the downtime during VFIO migrations. This capability is
enabled by default as long as both the source and destination support
it.

Note: switchover-ack depends on the return path capability, so this may
not be used when VIR_MIGRATE_TUNNELLED flag is set.

Extensive details about the qemu switchover-ack implementation are
available in the qemu series v6 cover letter [1] where the highlight is
the extreme reduction in guest visible downtime. In addition to the
original test results below, I saw a roughly ~20% reduction in downtime
for VFIO VGPU devices at minimum.

  === Test results ===

  The below table shows the downtime of two identical migrations. In the
  first migration swithcover ack is disabled and in the second it is
  enabled. The migrated VM is assigned with a mlx5 VFIO device which has
  300MB of device data to be migrated.

  +----------------------+-----------------------+----------+
  |    Switchover ack    | VFIO device data size | Downtime |
  +----------------------+-----------------------+----------+
  |       Disabled       |         300MB         |  1900ms  |
  |       Enabled        |         300MB         |  420ms   |
  +----------------------+-----------------------+----------+

  Switchover ack gives a roughly 4.5 times improvement in downtime.
  The 1480ms difference is time that is used for resource allocation for
  the VFIO device in the destination. Without switchover ack, this time is
  spent when the source VM is stopped and thus the downtime is much
  higher. With switchover ack, the time is spent when the source VM is
  still running.

[1] https://patchwork.kernel.org/project/qemu-devel/cover/20230621111201.29729-1-avihaih@nvidia.com/

Signed-off-by: Jon Kohler <jon@nutanix.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Avihai Horon <avihaih@nvidia.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: YangHang Liu <yanghliu@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2024-06-25 09:51:00 +02:00
Jiri Denemark
e622970c87 qemu: Fix migration with disabled vmx-* CPU features
When starting a domain on a host which lacks a vmx-* CPU feature which
is expected to be enabled by the CPU model specified in the domain XML,
libvirt properly marks such feature as disabled in the active domain
XML. But migrating the domain to a similar host which lacks the same
vmx-* feature will fail with libvirt reporting the feature as missing.
This is because of a bug in the hack ensuring backward compatibility
libvirt running on the destination thinks the missing feature is
expected to be enabled.

https://issues.redhat.com/browse/RHEL-40899

Fixes: v10.1.0-85-g5fbfa5ab8a
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-06-25 09:41:16 +02:00
Göran Uddeborg
ba6cd2d5a8 Translated using Weblate (Swedish)
Currently translated at 77.1% (8062 of 10454 strings)

Translation: libvirt/libvirt
Translate-URL: https://translate.fedoraproject.org/projects/libvirt/libvirt/sv/

Translated using Weblate (Swedish)

Currently translated at 76.9% (8042 of 10454 strings)

Translation: libvirt/libvirt
Translate-URL: https://translate.fedoraproject.org/projects/libvirt/libvirt/sv/

Co-authored-by: Göran Uddeborg <goeran@uddeborg.se>
Signed-off-by: Göran Uddeborg <goeran@uddeborg.se>
2024-06-24 15:53:18 +02:00
Jonathon Jongsma
af437d2d64 qemu: Don't specify vfio-pci.ramfb when ramfb is false
Commit 7c8e606b64 attempted to fix
the specification of the ramfb property for vfio-pci devices, but it
failed when ramfb is explicitly set to 'off'. This is because only the
'vfio-pci-nohotplug' device supports the 'ramfb' property. Since we use
the base 'vfio-pci' device unless ramfb is enabled, attempting to set
the 'ramfb' parameter to 'off' this will result in an error like the
following:

  error: internal error: QEMU unexpectedly closed the monitor
  (vm='rhel'): 2024-06-06T04:43:22.896795Z qemu-kvm: -device
  {"driver":"vfio-pci","host":"0000:b1:00.4","id":"hostdev0","display":"on
  ","ramfb":false,"bus":"pci.7","addr":"0x0"}: Property 'vfio-pci.ramfb'
  not found.

This also more closely matches what is done for mdev devices.

Resolves: https://issues.redhat.com/browse/RHEL-28808

Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-24 08:55:50 -05:00
Laine Stump
397c0f4b01 network: add more firewall test cases
This patch adds some previously missing test cases that test for
proper firewall rule creation when the following are included in the
network definition:

* <forward dev='blah'>
* no forward element (an "isolated" network)
* nat port range when only ipv4 is nat-ed
* nat port range when both ipv4 & ipv6 are nated

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Laine Stump <laine@redhat.com>
2024-06-24 13:51:04 +01:00
Laine Stump
aabf279ca0 tests: fix broken nftables test data so that individual tests are successful
When the chain names and table name used by the nftables firewall
backend were changed in commit
958aa7f274, I forgot to change the test
data file base.nftables, which has the extra "list" and "add
chain/table" commands that are generated for the first test case of
networkxml2firewalltest.c. When the full set of tests is run, the
first test will be an iptables test case, so those extra commands
won't be added to any of the nftables cases, and so the data in
base.nftables never matches, and the tests are all successful.

However, if the test are limited with, e.g. VIR_TEST_RANGE=2 (test #2
will be the nftables version of the 1st test case), then the commands
to add nftables table/chains *will* be generated in the test output,
and so the test will fail. Because I was only running the entire test
series after the initial commits of nftables tests, I didn't notice
this. Until now.

base.nftables has now been updated to reflect the current names for
chains/table, and running individual test cases is once again
successful.

Fixes: 958aa7f274
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Laine Stump <laine@redhat.com>
2024-06-24 13:49:26 +01:00
Adam Julis
3a9095976e qemuDomainDiskChangeSupported: Fill in missing check
The attribute 'discard_no_unref' of <disk/> is not allowed to be
changed while the virtual machine is running.

Resolves: https://issues.redhat.com/browse/RHEL-37542
Signed-off-by: Adam Julis <ajulis@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-24 11:14:56 +02:00
Laine Stump
43a0881274 network: allow for forward dev to be a transient interface
A user reported that if they set <forward mode='nat|route' dev='blah'>
starting the network would fail if the device 'blah' didn't already
exist.

This is caused by using "iif" and "oif" in nftables rules to check for
the forwarding device - these two commands work by saving the named
interface's ifindex (an unsigned integer) when the rule is added, and
comparing it to the ifindex associated with the packet's path at
runtime. This works great if the interface both 1) exists when the
rule is added, and 2) is never deleted and re-created after the rule
is added (since it would end up with a different ifindex).

When checking for the network's bridge device, it is okay for us to
use "iif" and "oif", because the bridge device is created before the
firewall rules are added, and will continue to exist until just after
the firewall rules are deleted when the network is shutdown.

But since the forward device might be deleted/re-added during the
lifetime of the network's firewall rules, we must instead us "oifname"
and "iifname" - these are much less efficient than "Xif" because they
do a string compare of the interface's name rather than just comparing
two integers (ifindex), but they don't require the interface to exist
when the rule is added, and they can properly cope with the named
interface being deleted and re-added later.

Fixes: a4f38f6ffe
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-21 06:52:57 -04:00
Michal Privoznik
da082e5927 domain_validate: Add missing 'break' in virDomainDefLaunchSecurityValidate()
A few commits ago (v10.4.0-101-gc65eba1f57) I've introduced
virDomainDefLaunchSecurityValidate() and a switch() statement in
it. Some cases are empty but are lacking 'break' statement which
is not valid. Provide missing 'break' statement.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-21 10:37:35 +02:00
Michal Privoznik
58b5219961 qemu_firmware: Pick the right firmware for SEV-SNP guests
The firmware descriptors have 'amd-sev-snp` feature which
describes whether firmware is suitable for SEV-SNP guests.
Provide necessary implementation to detect the feature and pick
the right firmware if guest is SEV-SNP enabled.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-21 09:59:04 +02:00
Michal Privoznik
a1d850b300 qemu: Build cmd line for SEV-SNP
Pretty straightforward as qemu has 'sev-snp-guest' object which
attributes maps pretty much 1:1 to our XML model. Except for
@vcek where QEMU has 'vcek-disabled`, an inverted boolean, while
we model it as virTristateBool. But that's easy to map too.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-21 09:58:10 +02:00
Michal Privoznik
c65eba1f57 conf: Introduce SEV-SNP support
SEV-SNP is an enhancement of SEV/SEV-ES and thus it shares some
fields with it. Nevertheless, on XML level, it's yet another type
of <launchSecurity/>.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-21 09:56:57 +02:00
Michal Privoznik
1abcba9d4d qemu_capabilities: Introduce QEMU_CAPS_SEV_SNP_GUEST
This capability tracks sev-snp-guest object availability.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-21 09:56:18 +02:00
Michal Privoznik
be26d0ebbe qemu: Report snp-policy in virDomainGetLaunchSecurityInfo()
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-21 09:36:04 +02:00
Michal Privoznik
914b986275 qemu_monitor: Allow querying SEV-SNP state in 'query-sev'
In QEMU commit v9.0.0-1155-g59d3740cb4 the return type of
'query-sev' monitor command changed to accommodate SEV-SNP. Even
though we currently support launching plain SNP guests, this will
soon change.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-21 09:35:32 +02:00
Michal Privoznik
7d16c296e3 src: Convert some _virDomainSecDef::sectype checks to switch()
In a few instances there is a plain if() check for
_virDomainSecDef::sectype. While this works perfectly for now,
soon there'll be another type and we can utilize compiler to
identify all the places that need adaptation. Switch those if()
statements to switch().

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-21 09:32:09 +02:00
Michal Privoznik
a44a43361f Drop needless typecast to virDomainLaunchSecurity
The sectype member of _virDomainSecDef struct is already declared
as of virDomainLaunchSecurity type. There's no need to typecast
it to the very same type when passing it to switch().

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-21 09:31:33 +02:00
Michal Privoznik
faa3548ed5 conf: Separate SEV formatting into a function
To avoid convolution of switch() inside of virDomainSecDefFormat() even
more (as new sectypes are added), move formatting into a separate
function.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-21 09:30:24 +02:00
Michal Privoznik
d2cad18ca3 conf: Move some members of virDomainSEVDef into virDomainSEVCommonDef
Some parts of SEV are to be shared with SEV SNP. In order to
reuse XML parsing / formatting code cleanly, let's move those
common bits into a new struct (virDomainSEVCommonDef) and adjust
rest of the code.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-21 09:28:54 +02:00
Michal Privoznik
66efdfabd9 qemu_monitor_json: Report error in error paths in SEV related code
While working on qemuMonitorJSONGetSEVMeasurement() and
qemuMonitorJSONGetSEVInfo() I've noticed that if these functions
fail, they do so without appropriate error set. Fill in error
reporting.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-21 09:25:32 +02:00
Peter Krempa
e6b94cba7e qemu: migration: Preserve error across qemuDomainSetMaxMemLock() on error paths
When a VM terminates itself while it's being migrated in running state
libvirt would report wrong error:

 error: cannot get locked memory limit of process 2502057: No such file or directory

rather than the proper error:

 error: operation failed: domain is not running

Remember the error on error paths in qemuMigrationSrcConfirmPhase and
qemuMigrationSrcPerformPhase.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-20 09:58:52 +02:00
Peter Krempa
e00a58c10a qemuMigrationSrcRun: Re-check whether VM is active before accessing job data
'qemuProcessStop()' clears the 'current' job data. While the code under
the 'error' label in 'qemuMigrationSrcRun()' does check that the VM is
active before accessing the job, it also invokes multiple helper
functions to clean up the migration including
'qemuMigrationSrcNBDCopyCancel()' which calls 'qemuDomainObjWait()'
invalidating the result of the liveness check as it unlocks the VM.

Duplicate the liveness check and explain why. The rest of the code e.g.
accessing the monitor is safe as 'qemuDomainEnterMonitorAsync()'
performs a liveness check. The cleanup path just ignores the return
values of those functions.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-20 09:58:52 +02:00
Peter Krempa
9243e87820 qemu: migration: Inline 'qemuMigrationDstFinishResume()'
The function is a pointless wrapper on top of
qemuMigrationDstWaitForCompletion.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-20 09:58:52 +02:00
Peter Krempa
a52e125d56 qemu: migration: Properly check for live VM after qemuDomainObjWait()
Similarly to the one change in commit 4d1a1fdffd
we should be checking that the VM is not being yet destroyed if we've
invoked qemuDomainObjWait().

Use the new helper qemuDomainObjIsActive().

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-20 09:58:52 +02:00
Peter Krempa
9eb33b7f03 qemu: domain: Introduce qemuDomainObjIsActive helper
The helper checks whether VM is active including the internal qemu
state. This helper will become useful in situations when an async job
is in use as VIR_JOB_DESTROY can run along async jobs thus both checks
are necessary.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-20 09:58:52 +02:00
Peter Krempa
d9935a5c4f qemu: process: Ensure that 'beingDestroyed' gets cleared only after VM id is reset
Prevent the possibility that a VM could be considered as alive while
inside qemuProcessStop.

A recently fixed bug which unlocked the domain object while inside
qemuProcessStop showed that there's possibility to confuse the state of
the VM to be considered active while 'qemuProcessStop' is processing
shutdown of the VM. Ensure that this doesn't happen by clearing the
'beingDestroyed' flag only after the VM id is cleared.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-20 09:58:52 +02:00
Peter Krempa
3865410e7f qemuProcessStop: Move code not depending on 'vm->def->id' after reset of the ID
There are few function calls done while cleaning up a stopped VM which
do require the old VM id, to e.g. clean up paths containing the 'short'
domain name in the path.

Anything else, which doesn't strictly require it can be moved after
clearing the 'id' in order to decrease likelyhood of potential bugs.

This patch moves all the code which does not require the 'id' (except
for the log entry and closing the monitor socket) after the statement
clearing the id and adds a comment explaining that anything in the
section must not unlock the VM object.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-20 09:58:52 +02:00
Peter Krempa
d29e0f3d4a qemuProcessStop: Prevent crash when qemuDomainObjStopWorker() unlocks the VM
'qemuDomainObjStopWorker()' which is meant to dispose of the event loop
thread for the monitor unlocks the VM object while disposing the thread
to prevent possible deadlocks with events waiting on the monitor thread.

Unfortunately 'qemuDomainObjStopWorker()' is called *before* the VM is
marked as inactive by clearing 'vm->def->id', but at the same time it's
no longer marked as 'beingDestroyed' when we're inside
'qemuProcessStop()'.

If 'vm' would be kept locked this wouldn't be a problem. Same way it's
not a problem for anything that uses non-ASYNC VM jobs, or when the
monitor is accessed in an async job, as the 'destroy' job interlocks
with those.

It is a problem for code inside an async job which uses
'qemuDomainObjWait()' though. The API contract of qemuDomainObjWait()
ensures the caller that the VM on successful return from it, but in this
specific reason it's not the case, as both 'beingDestroyed' is already
false, and 'vm->def->id' is not yet cleared.

To fix the issue move the 'qemuDomainObjStopWorker()' call *after*
clearing 'vm->def->id' and also add a note stating what the function is
doing.

Fixes: 860a999802
Closes: https://gitlab.com/libvirt/libvirt/-/issues/640
Reported-by: luzhipeng <luzhipeng@cestc.cn>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-20 09:58:21 +02:00
Peter Krempa
da8d97e4e2 qemuDomainObjWait: Add documentation
Document why this function exists and meaning of return values.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-20 09:52:55 +02:00
Peter Krempa
f9ad21996d qemuDomainDeviceBackendChardevForeach: Fix typo in comment
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-20 09:52:54 +02:00
Peter Krempa
b4423a753b qemuDomainDiskPrivateDispose: Prevent dangling 'disk' pointer in blockjob data
Clear the 'disk' member of 'blockjob' as we're freeing the disk object
at this point. While this should not normally happen it was observed
when other bug allowed the VM to be cleared while other threads didn't
yet finish.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-20 09:52:54 +02:00
Peter Krempa
737f897c29 qemuBlockJobProcessEventConcludedBackup: Handle potentially NULL 'job->disk'
Similarly to other blockjob handlers, if there's no disk associated with
the blockjob the handler needs to behave correctly. This is needed as
the disk might have been de-associated on unplug or other operations.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-20 09:52:54 +02:00
Boris Fiuczynski
09cc83dcf6 nodedev: add ccw device state and remove fencing
Instead of fencing offline ccw devices add the state to the ccw
capability.

Resolves: https://issues.redhat.com/browse/RHEL-39497
Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-20 09:38:46 +02:00
Boris Fiuczynski
69d8a327f1 nodedev: prevent invalid DASD node object creation
Prevent the creation of a new DASD node object when the device does not
exist.

Resolves: https://issues.redhat.com/browse/RHEL-39497
Reviewed-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-20 09:34:54 +02:00
Boris Fiuczynski
e9c23d906f nodedev: improve DASD detection
In newer DASD driver versions the ID_TYPE tag is supported. This tag is
missing after a system reboot but when the ccw device is set offline and
online the tag is included. To fix this version independently we need to
check if devices detected as type disk is actually a DASD to maintain
the node object consistency and not end up with multiple node objects
for DASDs.

Resolves: https://issues.redhat.com/browse/RHEL-39497
Reviewed-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-20 09:34:19 +02:00
Boris Fiuczynski
4062440b4b nodedev: refactor storage type fixup
Refactor the storage type fixup into a reusable method.

Reviewed-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-06-20 09:33:32 +02:00
Michal Privoznik
43d2edc08f virnetworkobj: Free fwRemoval before setting another one in virNetworkObjSetFwRemoval()
The virNetworkObjSetFwRemoval() function is called at least two
times when there's a network running and network driver
initializes:

1) when loading state XMLs:
  #0  virNetworkObjSetFwRemoval (obj=0x7fffd4028250, fwRemoval=0x7fffd4020ad0) at ../src/conf/virnetworkobj.c:258
  #1  0x00007ffff7a69c68 in virNetworkLoadState (...) at ../src/conf/virnetworkobj.c:952
  #2  0x00007ffff7a6a35d in virNetworkObjLoadAllState (...) at ../src/conf/virnetworkobj.c:1072
  #3  0x00007ffff7f9625f in networkStateInitialize (...) at ../src/network/bridge_driver.c:624

2) when firewall rules are being reloaded:
  #0  virNetworkObjSetFwRemoval (obj=0x7fffd4028250, fwRemoval=0x7fffd402e5b0) at ../src/conf/virnetworkobj.c:258
  #1  0x00007ffff7f997b4 in networkReloadFirewallRulesHelper (obj=0x7fffd4028250, opaque=0x0) at ../src/network/bridge_driver.c:1703
  #2  0x00007ffff7a6b09b in virNetworkObjListForEachHelper (payload=0x7fffd4028250, ...) at ../src/conf/virnetworkobj.c:1414
  #3  0x00007ffff79287b6 in virHashForEachSafe (...) at ../src/util/virhash.c:387
  #4  0x00007ffff7a6b119 in virNetworkObjListForEach (...) at ../src/conf/virnetworkobj.c:1441
  #5  0x00007ffff7f99978 in networkReloadFirewallRules (...) at ../src/network/bridge_driver.c:1742
  #6  0x00007ffff7f962f2 in networkStateInitialize (...) at ../src/network/bridge_driver.c:645

Since virNetworkObjSetFwRemoval() does not free the object stored
in the first call, the second call just overwrites the stored
pointer leading to a memory leak:

  5,530 (48 direct, 5,482 indirect) bytes in 1 blocks are definitely lost in loss record 1,863 of 1,880
     at 0x4848C43: calloc (vg_replace_malloc.c:1595)
     by 0x4F1E979: g_malloc0 (in /usr/lib64/libglib-2.0.so.0.7800.6)
     by 0x4976E32: virFirewallNew (virfirewall.c:118)
     by 0x4979BA9: virFirewallParseXML (virfirewall.c:1071)
     by 0x4ABEB1E: virNetworkLoadState (virnetworkobj.c:938)
     by 0x4ABF35C: virNetworkObjLoadAllState (virnetworkobj.c:1072)
     by 0x4E9A25E: networkStateInitialize (bridge_driver.c:624)
     by 0x4CB1FA6: virStateInitialize (libvirt.c:665)
     by 0x15A6C6: daemonRunStateInit (remote_daemon.c:611)
     by 0x49E69F0: virThreadHelper (virthread.c:256)
     by 0x532B428: start_thread (in /lib64/libc.so.6)
     by 0x5397373: clone (in /lib64/libc.so.6)

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-06-19 16:31:23 +02:00