Commit 7c8e606b64c73ca56d7134cb16d01257f39c53ef attempted to fix
the specification of the ramfb property for vfio-pci devices, but it
failed when ramfb is explicitly set to 'off'. This is because only the
'vfio-pci-nohotplug' device supports the 'ramfb' property. Since we use
the base 'vfio-pci' device unless ramfb is enabled, attempting to set
the 'ramfb' parameter to 'off' this will result in an error like the
following:
error: internal error: QEMU unexpectedly closed the monitor
(vm='rhel'): 2024-06-06T04:43:22.896795Z qemu-kvm: -device
{"driver":"vfio-pci","host":"0000:b1:00.4","id":"hostdev0","display":"on
","ramfb":false,"bus":"pci.7","addr":"0x0"}: Property 'vfio-pci.ramfb'
not found.
This also more closely matches what is done for mdev devices.
Resolves: https://issues.redhat.com/browse/RHEL-28808
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This patch adds some previously missing test cases that test for
proper firewall rule creation when the following are included in the
network definition:
* <forward dev='blah'>
* no forward element (an "isolated" network)
* nat port range when only ipv4 is nat-ed
* nat port range when both ipv4 & ipv6 are nated
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Laine Stump <laine@redhat.com>
When the chain names and table name used by the nftables firewall
backend were changed in commit
958aa7f274904eb8e4678a43eac845044f0dcc38, I forgot to change the test
data file base.nftables, which has the extra "list" and "add
chain/table" commands that are generated for the first test case of
networkxml2firewalltest.c. When the full set of tests is run, the
first test will be an iptables test case, so those extra commands
won't be added to any of the nftables cases, and so the data in
base.nftables never matches, and the tests are all successful.
However, if the test are limited with, e.g. VIR_TEST_RANGE=2 (test #2
will be the nftables version of the 1st test case), then the commands
to add nftables table/chains *will* be generated in the test output,
and so the test will fail. Because I was only running the entire test
series after the initial commits of nftables tests, I didn't notice
this. Until now.
base.nftables has now been updated to reflect the current names for
chains/table, and running individual test cases is once again
successful.
Fixes: 958aa7f274904eb8e4678a43eac845044f0dcc38
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Laine Stump <laine@redhat.com>
The attribute 'discard_no_unref' of <disk/> is not allowed to be
changed while the virtual machine is running.
Resolves: https://issues.redhat.com/browse/RHEL-37542
Signed-off-by: Adam Julis <ajulis@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
A user reported that if they set <forward mode='nat|route' dev='blah'>
starting the network would fail if the device 'blah' didn't already
exist.
This is caused by using "iif" and "oif" in nftables rules to check for
the forwarding device - these two commands work by saving the named
interface's ifindex (an unsigned integer) when the rule is added, and
comparing it to the ifindex associated with the packet's path at
runtime. This works great if the interface both 1) exists when the
rule is added, and 2) is never deleted and re-created after the rule
is added (since it would end up with a different ifindex).
When checking for the network's bridge device, it is okay for us to
use "iif" and "oif", because the bridge device is created before the
firewall rules are added, and will continue to exist until just after
the firewall rules are deleted when the network is shutdown.
But since the forward device might be deleted/re-added during the
lifetime of the network's firewall rules, we must instead us "oifname"
and "iifname" - these are much less efficient than "Xif" because they
do a string compare of the interface's name rather than just comparing
two integers (ifindex), but they don't require the interface to exist
when the rule is added, and they can properly cope with the named
interface being deleted and re-added later.
Fixes: a4f38f6ffe6a9edc001d18890ccfc3f38e72fb94
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
A few commits ago (v10.4.0-101-gc65eba1f57) I've introduced
virDomainDefLaunchSecurityValidate() and a switch() statement in
it. Some cases are empty but are lacking 'break' statement which
is not valid. Provide missing 'break' statement.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The firmware descriptors have 'amd-sev-snp` feature which
describes whether firmware is suitable for SEV-SNP guests.
Provide necessary implementation to detect the feature and pick
the right firmware if guest is SEV-SNP enabled.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Pretty straightforward as qemu has 'sev-snp-guest' object which
attributes maps pretty much 1:1 to our XML model. Except for
@vcek where QEMU has 'vcek-disabled`, an inverted boolean, while
we model it as virTristateBool. But that's easy to map too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
SEV-SNP is an enhancement of SEV/SEV-ES and thus it shares some
fields with it. Nevertheless, on XML level, it's yet another type
of <launchSecurity/>.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
This capability tracks sev-snp-guest object availability.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
In QEMU commit v9.0.0-1155-g59d3740cb4 the return type of
'query-sev' monitor command changed to accommodate SEV-SNP. Even
though we currently support launching plain SNP guests, this will
soon change.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
In a few instances there is a plain if() check for
_virDomainSecDef::sectype. While this works perfectly for now,
soon there'll be another type and we can utilize compiler to
identify all the places that need adaptation. Switch those if()
statements to switch().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
The sectype member of _virDomainSecDef struct is already declared
as of virDomainLaunchSecurity type. There's no need to typecast
it to the very same type when passing it to switch().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
To avoid convolution of switch() inside of virDomainSecDefFormat() even
more (as new sectypes are added), move formatting into a separate
function.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Some parts of SEV are to be shared with SEV SNP. In order to
reuse XML parsing / formatting code cleanly, let's move those
common bits into a new struct (virDomainSEVCommonDef) and adjust
rest of the code.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
While working on qemuMonitorJSONGetSEVMeasurement() and
qemuMonitorJSONGetSEVInfo() I've noticed that if these functions
fail, they do so without appropriate error set. Fill in error
reporting.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
When a VM terminates itself while it's being migrated in running state
libvirt would report wrong error:
error: cannot get locked memory limit of process 2502057: No such file or directory
rather than the proper error:
error: operation failed: domain is not running
Remember the error on error paths in qemuMigrationSrcConfirmPhase and
qemuMigrationSrcPerformPhase.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
'qemuProcessStop()' clears the 'current' job data. While the code under
the 'error' label in 'qemuMigrationSrcRun()' does check that the VM is
active before accessing the job, it also invokes multiple helper
functions to clean up the migration including
'qemuMigrationSrcNBDCopyCancel()' which calls 'qemuDomainObjWait()'
invalidating the result of the liveness check as it unlocks the VM.
Duplicate the liveness check and explain why. The rest of the code e.g.
accessing the monitor is safe as 'qemuDomainEnterMonitorAsync()'
performs a liveness check. The cleanup path just ignores the return
values of those functions.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The function is a pointless wrapper on top of
qemuMigrationDstWaitForCompletion.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Similarly to the one change in commit 4d1a1fdffda19a62d62fa2457d162362
we should be checking that the VM is not being yet destroyed if we've
invoked qemuDomainObjWait().
Use the new helper qemuDomainObjIsActive().
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The helper checks whether VM is active including the internal qemu
state. This helper will become useful in situations when an async job
is in use as VIR_JOB_DESTROY can run along async jobs thus both checks
are necessary.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Prevent the possibility that a VM could be considered as alive while
inside qemuProcessStop.
A recently fixed bug which unlocked the domain object while inside
qemuProcessStop showed that there's possibility to confuse the state of
the VM to be considered active while 'qemuProcessStop' is processing
shutdown of the VM. Ensure that this doesn't happen by clearing the
'beingDestroyed' flag only after the VM id is cleared.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
There are few function calls done while cleaning up a stopped VM which
do require the old VM id, to e.g. clean up paths containing the 'short'
domain name in the path.
Anything else, which doesn't strictly require it can be moved after
clearing the 'id' in order to decrease likelyhood of potential bugs.
This patch moves all the code which does not require the 'id' (except
for the log entry and closing the monitor socket) after the statement
clearing the id and adds a comment explaining that anything in the
section must not unlock the VM object.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
'qemuDomainObjStopWorker()' which is meant to dispose of the event loop
thread for the monitor unlocks the VM object while disposing the thread
to prevent possible deadlocks with events waiting on the monitor thread.
Unfortunately 'qemuDomainObjStopWorker()' is called *before* the VM is
marked as inactive by clearing 'vm->def->id', but at the same time it's
no longer marked as 'beingDestroyed' when we're inside
'qemuProcessStop()'.
If 'vm' would be kept locked this wouldn't be a problem. Same way it's
not a problem for anything that uses non-ASYNC VM jobs, or when the
monitor is accessed in an async job, as the 'destroy' job interlocks
with those.
It is a problem for code inside an async job which uses
'qemuDomainObjWait()' though. The API contract of qemuDomainObjWait()
ensures the caller that the VM on successful return from it, but in this
specific reason it's not the case, as both 'beingDestroyed' is already
false, and 'vm->def->id' is not yet cleared.
To fix the issue move the 'qemuDomainObjStopWorker()' call *after*
clearing 'vm->def->id' and also add a note stating what the function is
doing.
Fixes: 860a999802d3c82538373bb3f314f92a2e258754
Closes: https://gitlab.com/libvirt/libvirt/-/issues/640
Reported-by: luzhipeng <luzhipeng@cestc.cn>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Document why this function exists and meaning of return values.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Clear the 'disk' member of 'blockjob' as we're freeing the disk object
at this point. While this should not normally happen it was observed
when other bug allowed the VM to be cleared while other threads didn't
yet finish.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Similarly to other blockjob handlers, if there's no disk associated with
the blockjob the handler needs to behave correctly. This is needed as
the disk might have been de-associated on unplug or other operations.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Instead of fencing offline ccw devices add the state to the ccw
capability.
Resolves: https://issues.redhat.com/browse/RHEL-39497
Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Prevent the creation of a new DASD node object when the device does not
exist.
Resolves: https://issues.redhat.com/browse/RHEL-39497
Reviewed-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
In newer DASD driver versions the ID_TYPE tag is supported. This tag is
missing after a system reboot but when the ccw device is set offline and
online the tag is included. To fix this version independently we need to
check if devices detected as type disk is actually a DASD to maintain
the node object consistency and not end up with multiple node objects
for DASDs.
Resolves: https://issues.redhat.com/browse/RHEL-39497
Reviewed-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Refactor the storage type fixup into a reusable method.
Reviewed-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The virNetworkObjSetFwRemoval() function is called at least two
times when there's a network running and network driver
initializes:
1) when loading state XMLs:
#0 virNetworkObjSetFwRemoval (obj=0x7fffd4028250, fwRemoval=0x7fffd4020ad0) at ../src/conf/virnetworkobj.c:258
#1 0x00007ffff7a69c68 in virNetworkLoadState (...) at ../src/conf/virnetworkobj.c:952
#2 0x00007ffff7a6a35d in virNetworkObjLoadAllState (...) at ../src/conf/virnetworkobj.c:1072
#3 0x00007ffff7f9625f in networkStateInitialize (...) at ../src/network/bridge_driver.c:624
2) when firewall rules are being reloaded:
#0 virNetworkObjSetFwRemoval (obj=0x7fffd4028250, fwRemoval=0x7fffd402e5b0) at ../src/conf/virnetworkobj.c:258
#1 0x00007ffff7f997b4 in networkReloadFirewallRulesHelper (obj=0x7fffd4028250, opaque=0x0) at ../src/network/bridge_driver.c:1703
#2 0x00007ffff7a6b09b in virNetworkObjListForEachHelper (payload=0x7fffd4028250, ...) at ../src/conf/virnetworkobj.c:1414
#3 0x00007ffff79287b6 in virHashForEachSafe (...) at ../src/util/virhash.c:387
#4 0x00007ffff7a6b119 in virNetworkObjListForEach (...) at ../src/conf/virnetworkobj.c:1441
#5 0x00007ffff7f99978 in networkReloadFirewallRules (...) at ../src/network/bridge_driver.c:1742
#6 0x00007ffff7f962f2 in networkStateInitialize (...) at ../src/network/bridge_driver.c:645
Since virNetworkObjSetFwRemoval() does not free the object stored
in the first call, the second call just overwrites the stored
pointer leading to a memory leak:
5,530 (48 direct, 5,482 indirect) bytes in 1 blocks are definitely lost in loss record 1,863 of 1,880
at 0x4848C43: calloc (vg_replace_malloc.c:1595)
by 0x4F1E979: g_malloc0 (in /usr/lib64/libglib-2.0.so.0.7800.6)
by 0x4976E32: virFirewallNew (virfirewall.c:118)
by 0x4979BA9: virFirewallParseXML (virfirewall.c:1071)
by 0x4ABEB1E: virNetworkLoadState (virnetworkobj.c:938)
by 0x4ABF35C: virNetworkObjLoadAllState (virnetworkobj.c:1072)
by 0x4E9A25E: networkStateInitialize (bridge_driver.c:624)
by 0x4CB1FA6: virStateInitialize (libvirt.c:665)
by 0x15A6C6: daemonRunStateInit (remote_daemon.c:611)
by 0x49E69F0: virThreadHelper (virthread.c:256)
by 0x532B428: start_thread (in /lib64/libc.so.6)
by 0x5397373: clone (in /lib64/libc.so.6)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
As a part of parsing XML, virFirewallParseXML() calls
virXMLNodeContentString() and then passes the return value
further. But virXMLNodeContentString() is documented so that it's
the caller's responsibility to free the returned string, which
virFirewallParseXML() never does. This leads to a memory leak:
14,300 bytes in 220 blocks are definitely lost in loss record 1,879 of 1,891
at 0x4841858: malloc (vg_replace_malloc.c:442)
by 0x5491E3C: xmlBufCreateSize (in /usr/lib64/libxml2.so.2.12.6)
by 0x54C2401: xmlNodeGetContent (in /usr/lib64/libxml2.so.2.12.6)
by 0x49F7791: virXMLNodeContentString (virxml.c:354)
by 0x4979F25: virFirewallParseXML (virfirewall.c:1134)
by 0x4ABEB1E: virNetworkLoadState (virnetworkobj.c:938)
by 0x4ABF35C: virNetworkObjLoadAllState (virnetworkobj.c:1072)
by 0x4E9A25E: networkStateInitialize (bridge_driver.c:624)
by 0x4CB1FA6: virStateInitialize (libvirt.c:665)
by 0x15A6C6: daemonRunStateInit (remote_daemon.c:611)
by 0x49E69F0: virThreadHelper (virthread.c:256)
by 0x532B428: start_thread (in /lib64/libc.so.6)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Commit 23c47944882b added parsing of serial ports connected to vspc, but
the VM can also have a network serial port with an empty filename or no
filename at all. Parse these the same way, as a <serial type='null'>.
Resolves: https://issues.redhat.com/browse/RHEL-32182
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Remove unused declaration of the virDomainDiskFindByBusAndDst()
function. Removed in v5.9.0-rc1~91 and then mistakenly
re-introduced in v5.9.0-rc1~65.
Signed-off-by: Adam Julis <ajulis@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Sometimes in release hook it is useful to know if the VM shutdown was graceful
or not. This is especially useful to do cleanup based on the VM shutdown failure
reason in release hook. This patch proposes to use the last argument 'extra'
to pass VM shutoff reason in the call to release hook.
Making this change for Qemu and LXC.
Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Instead of accessing the global `driver` object pass the `udevEventData` as
parameter to the thread handler and watch callback. This has the advantage that:
1. proper refcounting
2. easier to read and test
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Signed-off-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Use this feature in `udevEventDataNew`.
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Signed-off-by: Marc Hartmayer <mhartmay@linux.ibm.com>
There is only one case where force is true, therefore let's inline that case.
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Signed-off-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Use a worker pool for processing the events (e.g. udev, mdevctl config changes)
and the initialization instead of a separate initThread and a mdevctl-thread.
This has the large advantage that we can leverage the job API and now this
thread pool is responsible to do all the "costly-work" and emitting the libvirt
nodedev events.
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Signed-off-by: Marc Hartmayer <mhartmay@linux.ibm.com>
It's better practice for all functions called by the threads to pass the driver
via parameter and not global variables. Easier to test and cleaner.
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Signed-off-by: Marc Hartmayer <mhartmay@linux.ibm.com>
The documentation of gobject signals reads:
"If you are connecting handlers to signals and using a GObject instance as your
signal handler user data, you should remember to pair calls to
g_signal_connect() with calls to g_signal_handler_disconnect() or
g_signal_handlers_disconnect_by_func(). While signal handlers are automatically
disconnected when the object emitting the signal is finalised..." [1]
This means that the signal handlers are automatically disconnected as soon as
the `priv->mdevCtlMonitors` are finalised/released by `udevEventDataDispose`.
But this also means that it's possible that new work is tried to be scheduled
for the workerpool by the `mdevctlEventHandleCallback` (main thread context)
even if the workerpool has already been stopped by `nodeStateShutdownWait`. To
fully understand this, it's important to know that the main loop of the main
thread is still running for some time even after `nodeStateShutdownPrepare` has
been called. Let's avoid this situation by explicitly disconnect the signals
during `nodeStateShutdownPrepare`, which is called in the main thread, so that
no new work is attempted to be scheduled for the worker pool.
[1] https://docs.gtk.org/gobject/signals.html#memory-management-of-signal-handlers
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Signed-off-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Introduce and use the driver functions for the node state shutdown preparation
and wait. As they're also called in the error/cleanup path of
`nodeStateInitialize`, they must be written in a way, that they can safely be
executed even if not everything is initialized.
In the next commit, these functions will be extended.
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Signed-off-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Even if `priv->udev_monitor` was never initialized, the mdevctlLock, udevThread
were. Therefore let's match the order of releasing the resources the order of
allocating the resources in `nodeStateInitialize`.
In addition, use `g_steal_pointer` in `g_list_free_full`.
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Signed-off-by: Marc Hartmayer <mhartmay@linux.ibm.com>