35322 Commits

Author SHA1 Message Date
Sergio Durigan Junior
6fa82fd8e2 apparmor: Allow access to /sys/devices/system/node/*/cpumap for libnuma
A QEMU change (10218ae6d006f76410804cc4dc690085b3d008b5) introduced
some libnuma calls that require read access to
/sys/devices/system/node/*/cpumap, which currently is forbidden by the
standard apparmor profile.

This commit allows read-only access to the file specified above.

Closes: https://gitlab.com/libvirt/libvirt/-/issues/515

Signed-off-by: Sergio Durigan Junior <sergio.durigan@canonical.com>
Reviewed-by: Jim Fehlig <jfehlig@suse.com>
2024-01-11 15:15:23 -07:00
Michal Privoznik
9cbda34e98 qemu: Be less aggressive when dropping channel source paths
In v9.7.0-rc1~130 I've shortened the path that's generated for
<channel/> source. With that, I had to adjust regex that matches
all versions of paths we have ever generated so that we can drop
them (see comment around qemuDomainChrDefDropDefaultPath()). But
as it is usually the case with regexes - they are write only. And
while I attempted to make one portion of the path optional
("/target/") I accidentally made regex accept more, which
resulted in libvirt dropping the user provided path and
generating our own instead.

Fixes: d3759d3674ab9453e5fb5a27ab6c28b8ff8d5569
Resolves: https://issues.redhat.com/browse/RHEL-20807
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2024-01-10 08:44:32 +01:00
Daniel P. Berrangé
7cb03e6a28 rpc: fix race in waking up client event loop
The first thread to issue a client RPC request will own the event
loop execution, sitting in the virNetClientIOEventLoop function.

It releases the client lock while running:

   virNetClientUnlock()
   g_main_loop_run()
   virNetClientLock()

If a second thread arrives with an RPC request, it will queue it
for the first thread to process. To inform the first thread that
there's a new request it calls g_main_loop_quit() to break it out
of the main loop.

This works if the first thread is in g_main_loop_run() at that
time. There is a small window of opportunity, however, where
the first thread has released the client lock, but not yet got
into g_main_loop_run(). If that happens, the wakeup from the
second thread is lost.

This patch deals with that by changing the way the wakeup is
performed. Instead of directly calling g_main_loop_quit(), the
second thread creates an idle source to run the quit function
from within the first thread. This guarantees that the first
thread will see the wakeup.

Tested by: Fima Shevrin <efim.shevrin@virtuozzo.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2024-01-09 11:58:41 +00:00
Daniel P. Berrangé
024d6dc263 qemu: tighten semantics of 'size' when resizing block devices
When VIR_DOMAIN_BLOCK_RESIZE_CAPACITY is set, the 'size' parameter
is currently ignored. Since applications must none the less pass a
value for this parameter, it is preferrable to declare some explicit
semantics for it.

This declare that the parameter must be 0, or the exact size of the
underlying block device. The latter gives the management application
the ability to sanity check that the block device size matches what
they think it should be.

Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2024-01-09 11:57:13 +00:00
Jiri Denemark
8d693d79c4 qemu: Enable postcopy-preempt migration capability
During post-copy migration (once it actually switches to post-copy mode)
dirty memory pages are continued to be migrated iteratively, while the
destination can explicitly request a specific page to be migrated before
the iterative process gets to it (which happens when a guest wants to
read a page that was not migrated yet). Without the postcopy-preempt
capability enabled such pages need to wait until all other pages already
queued are transferred. Enabling this capability will instruct the
hypervisor to create a separate migration channel for explicitly
requested pages so that they can preempt the queue.

The only requirement for the feature to work is running a migration over
a protocol that supports multiple connections. In other words, we can't
pre-create the connection and pass its file descriptor to QEMU (i.e.,
using MIGRATION_DEST_CONNECT_SOCKET), but we have to let QEMU open the
connections itself (using MIGRATION_DEST_SOCKET). This change is applied
to all post-copy migrations even if postcopy-preempt is not supported to
avoid making the code even uglier than it is now. There's no real
difference between the two methods with modern QEMU (which can properly
report connection failures) anyway.

This capability is enabled for all post-copy migration as long as the
capability is supported on both sides of migration.

https://issues.redhat.com/browse/RHEL-7100

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-08 22:41:23 +01:00
Jiri Denemark
61e34b0856 qemu: Add support for optional migration capabilities
We enable various migration capabilities according to the flags passed
to a migration API. Missing support for such capabilities results in an
error because they are required by the corresponding flag. This patch
adds support for additional optional capability we may want to enable
for a given API flag in case it is supported. This is useful for
capabilities which are not critical for the flags to be supported, but
they can make things work better in some way.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-08 22:39:56 +01:00
Jiri Denemark
efc26a665d qemu: Rename remoteCaps parameter in qemuMigrationParamsCheck
The migration cookie contains two bitmaps of migration capabilities:
supported and automatic. qemuMigrationParamsCheck expects the letter so
lets make it more obvious by renaming the parameter as remoteAuto.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-08 22:38:53 +01:00
Jiri Denemark
c941106f7c qemu: Use C99 initializers for qemuMigrationParamsFlagMap
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-08 22:38:36 +01:00
Jiri Denemark
ff128d3761 qemu: Document qemuMigrationParamsFlagMapItem fields
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-08 22:38:24 +01:00
Peter Krempa
397218c433 qemu: Implement support for configuring iothread to virtqueue mapping for disks
Add validation and formatting of the commandline arguments for
'iothread-vq-mapping' parameter. The validation logic mirrors what qemu
allows.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-08 09:27:31 +01:00
Peter Krempa
0cb7b1b2c3 conf: Add possibility to configure multiple iothreads per disk
Introduce a new <iothreads> sub-element of disk's <driver> which will
allow configuring multiple iothreads and also map them to specific
virt-queues of virtio devices.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-08 09:27:31 +01:00
Peter Krempa
ee7121ab8e qemu: capabilities: Introduce QEMU_CAPS_VIRTIO_BLK_IOTHREAD_MAPPING
The capability represents the support for mapping virtqueues to
iothreads for the 'virtio-blk' device.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-08 09:27:31 +01:00
Peter Krempa
08a7fc834c util: xml: Return GPtrArray from virXMLNodeGetSubelement
Rework the helper to use a GPtrArray structure to simplify callers.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2024-01-08 09:27:31 +01:00
Laine Stump
82e2fac297 qemu: automatically bind to a vfio variant driver, if available
Rather than always binding to the vfio-pci driver, use the new
function virPCIDeviceFindBestVFIOVariant() to see if the running
kernel has a VFIO variant driver available that is a better match for
the device, and if one is found, use that instead.

virPCIDeviceFindBestVFIOVariant() function reads the modalias file for
the given device from sysfs, then looks through
/lib/modules/${kernel_release}/modules.alias for the vfio_pci alias
that matches with the least number of wildcard ('*') fields.

The appropriate "VFIO variant" driver for a device will be the PCI
driver implemented by the discovered module - these drivers are
compatible with (and provide the entire API of) the standard vfio-pci
driver, but have additional device-specific APIs that can be useful
for, e.g., saving/restoring state for migration.

If a specific driver is named (using <driver model='blah'/> in the
device XML), that will still be used rather than searching
modules.alias; this makes it possible to force binding of vfio-pci if
there is an issue with the auto-selected variant driver.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-08 01:00:11 -05:00
Laine Stump
8b93d78c83 conf: support manually specifying VFIO variant driver in <hostdev> XML
This patch makes it possible to manually specify which VFIO variant
driver to use for PCI hostdev device assignment, so that, e.g. you
could force use of a VFIO "variant" driver, with e.g.

  <driver model='mlx5_vfio_pci'/>

or alternately to force use of the generic vfio-pci driver with

  <driver model='vfio-pci'/>

when libvirt would have normally (after applying a subsequent patch)
found a "better match" for a device in the active kernel's
modules.alias file. (The main potential use of this manual override
would probably be to work around a bug in a new VFIO variant driver by
temporarily not using that driver).

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-08 01:00:08 -05:00
Laine Stump
9363c1cb69 xen: explicitly set hostdev driver.name at runtime, not in postparse
Xen only supports a single type of PCI hostdev assignment, so it is
superfluous to have <driver name='xen'/> peppered throughout the
config. It *is* necessary to have the driver type explicitly set in
the hostdev object before calling into the hypervisor-agnostic "hostdev
manager" though (otherwise the hostdev manager doesn't know whether it
should do Xen-specific setup, or VFIO-specific setup).

Historically, the Xen driver has checked for "default" driver name
(i.e. not set in the XML), and set it to "xen', during the XML
postparse, thus guaranteeing that it will be set by the time the
object is sent to the hostdev manager at runtime, but also setting it
so early that a simple round-trip of parse-format results in the XML
always containing an explicit <driver name='xen'/>, even if that
wasn't specified in the original XML.

The QEMU driver *doesn't* set driver.name during postparse though;
instead, it waits until domain startup time (or device attach time for
hotplug), and sets the driver.name then. The result is that a
parse-format round trip of the XML in the QEMU driver *doesn't* add in
the <driver name='vfio'/>.

This patch modifies the Xen driver to behave similarly to the QEMU
driver - the PostParse just checks for a driver.name that isn't
supported by the Xen driver, and any explicit setting to "xen" is
deferred until domain runtime rather than during the postparse, thus
Xen domain XML also doesn't get extraneous <driver name='xen'/>.

This delayed setting of driver.name of course results in slightly
different xml2xml parse-format results, so the unit test data is
modified accordingly.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-07 23:59:00 -05:00
Laine Stump
b9a1e7c436 conf: replace virHostdevIsVFIODevice with virHostdevIsPCIDevice
virHostdevIsVFIODevice() and virDomainDefHasVFIOHostdev() are only ever
called from the QEMU driver, and in the case of the QEMU driver, any
PCI hostdev by definition uses VFIO, so really all these callers only
need to know if the device is a PCI hostdev.

(It turned out that the less specific virHostdevIsPCIDevice() already
existed in hypervisor/virhostdev.c, so I had to remove one of them;
since conf is a lower level directory than hypervisor, and the
function is called from conf, keeping the copy in hypervisor would
have required moving its caller (virDomainDefHasPCIHostdev()) into
hypervisor as well, so I just removed the copy in hypervisor.)

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-07 23:58:44 -05:00
Laine Stump
bb1acb9ca2 conf: use new common parser/formatter for hostdev driver in network XML
Now if a new attribute is added to <driver>, we only need to update
the formatting/parsing in one place.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-07 23:57:09 -05:00
Laine Stump
195522ae87 conf: split out hostdev <driver> parse/format to their own functions
This is done so that we can re-use the same parser/formatter for
<network> and <networkport>

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-07 23:57:09 -05:00
Laine Stump
8bc3f01080 conf: use virDeviceHostdevPCIDriverInfo in network and networkport objects
The next step in consolidating parsing/formatting of the <driver>
element of these objects using a common struct and common code. This
eliminates the virNetworkForwardDriverNameType enum which is nearly
identical to virDeviceHostdevPCIDriverName (the only non-identical bit
was just because they'd gotten out of sync over time) and replaces its
uses with a virDeviceHostdevPCIDriverInfo (which is a struct that
contains a virDeviceHostdevPCIDriverName).

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-07 23:57:09 -05:00
Laine Stump
e04ca000bd conf: put hostdev PCI backend into a struct
The new struct is virDeviceHostdevPCIDriverInfo, and the "backend"
enum in the hostdevDef will be replaced with a
virDeviceHostdevPCIDriverInfo named "driver'. Since the enum value in
this new struct is called "name", it means that all references to
"backend" will become "driver.name".

This will allow easily adding other items for new attributes in the
<driver> element / C struct, which will be useful once we are using
this new struct in multiple places.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-07 23:57:09 -05:00
Laine Stump
e7d31d8b00 conf: normalize hostdev <driver> parsing to simplify adding new attr
The hostdev version of the <driver> subelement appears in four places:

 * The domain XML in the <hostdev> and <interface type='hostdev'>
   elements (that's 2)

 * The network XML inside <forward> when the network is a pool of
   SRIOV VFs

 * the <networkport> XML, which is used to communicate between the
   hypervisor driver and network driver.

In order to make the pending addition of a new attribute to <driver>
in all these cases simpler, this patch refactors the parsing of
<driver> in all four places to use virXMLProp*() and
virXMLFormatElement().

Making all of the different instances of the separate parse/format for
<driver> look nearly identical will make it easier to see that the
upcoming patch that converges all four to use a common
parser/formatter is a functional NOP.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-07 23:57:09 -05:00
Laine Stump
a435e7e6c8 conf: move/rename hostdev PCI driver type enum to device_conf.h
Currently this enum is defined in domain_conf.h and named
virDomainHostdevSubsysPCIDriverType. I want to use it in parts of the
network and networkport config, so am moving its definition to
device_conf.h which is / can be included by all interested parties,
and renaming it to match the name of the corresponding XML attribute
("driver name"). The name change (which includes enum values) does cause a
lot of churn, but it's all mechanical.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-07 23:57:09 -05:00
Laine Stump
deefaf8f1c schema: consolidate RNG for all hostdev <driver> elements
The exact same element can appear in <hostdev> and <interface
type='hostdev'>, and nearly identical in <network> and <networkport>
(these latter two don't include "xen" as a possible driver, but that's
coincidental - there's no reason Xen couldn't also use the VF pools in
virtual networks, it just doesn't).

This patch modifies all 4 to use the same <ref name="hostdevDriver"/>
so that it is simpler to add something new.

A side effect of this patch is that the grammar for the <interface>
element in domain XML has been tightened up a bit - previously it was
accepted by the schema (but nonsensical) to have virtio and network
interface options specified; as a part of making the two different
<driver> choices each a complete element (rather than each being a
collection of attributes and subelements) these extra
attributes/subelements that were irrelevant to the hostdev-type
<driver> were made to be valid only for an emulated interface's
<driver>.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-07 23:57:09 -05:00
Laine Stump
568efef729 util: properly deal with VFIO module name vs. driver name
Historically libvirt hasn't differentiated between the name of a
loadable kernel module, and the name of the device driver that module
implements, but these two names can be (and usually are) at least
subtly different.

For example, the loadable module called "vfio_pci" implements a PCI
driver called "vfio-pci". We have always used the name "vfio-pci" both
to load the module (with modprobe) and to check (in
/sys/bus/pci/drivers) if the driver is available. (This has happened
to work because modprobe "normalizes" all the names it is given by
replacing "-" with "_", so "vfio-pci" works for both loading the
module and checking for the driver.)

When we recently gained the ability to manually specify the driver for
"virsh nodedev-detach", the fragility of this system became apparent -
if a user gave the "driver name" as "vfio_pci", then we would modprobe
the module correctly, but then erroneously believe it hadn't been
loaded because /sys/bus/pci/drivers/vfio_pci didn't exist. For manual
specification of the driver name, we could deal with this by telling
the user "always use the correct name for the driver, don't assume
that it has the same name as the module", but it would still end up
confusing people, especially since some drivers do use underscore in
their name (e.g. the mlx5_vfio_pci driver/module).

This will only get worse when an upcoming patch starts automatically
determining the driver to use for VFIO-assigned devices - it will look
in the kernel's modules.alias file to find "best" VFIO variant
*module* for a device, and 3 out of 4 current examples of
vfio-pci/variant drivers have a mismatch between module name and
driver name, so the current code would end up properly loading the
module, but then erroneously think that the driver wasn't available.

This patch makes the code more forgiving by

1) checking for both $drivername and underscore($drivername) in
   /sys/bus/pci/drivers

2) when we determine a module needs to be loaded, look at the link in
   /sys/module/$modulename/driver/pci:$drivername to determine the
   name of the driver we need to bind to the device(rather than just
   assuming the driver has the same name as the module

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2024-01-07 23:57:08 -05:00
Peter Krempa
1948244461 qemuxml2argvmock: Mock virNetDevSetMTU
Unfortunately the network backend commandline formatter attempts to also
setup the backend itself, which it really should not.

For now make sure qemuxml2argvtest can call virNetDevSetMTU.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-01-04 22:26:10 +01:00
Peter Krempa
2da71d8e43 qemu: process: Separate setup of network device objects
Separate the SLIRP bits from 'qemuProcessNetworkPrepareDevices' and do
the setup of the internal data when setting up domain data.

This will allow tests to use the same code path to lookup data for a
network.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-01-04 22:26:10 +01:00
Peter Krempa
b448abd972 qemuxml2argvmock: Mock qemuInterfaceBridgeConnect
Prepare for test cases which would want to call that function.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-01-04 22:26:10 +01:00
Jonathon Jongsma
9eabf14afb qemu: add runtime config option for nbdkit
Currently when we build with nbdkit support, libvirt will always try to
use nbdkit to access remote disk sources when it is available. But
without an up-to-date selinux policy allowing this, it will fail.
because the required selinux policies are not yet widely available, we
have disabled nbdkit support on rpm builds for all distributions before
Fedora 40.

Unfortunately, this makes it more difficult to test nbdkit support.
After someone updates to the necessary selinux policies, they would also
need to rebuild libvirt to enable nbdkit support. By introducing a
configure option (nbdkit_config_default), we can build packages with
nbdkit support but have it disabled by default.

Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Suggested-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2024-01-04 14:34:40 -06:00
Artem Chernyshev
a43fb797b5 node_device: udevGetStringSysfsAttr() to void
udevGetStringSysfsAttr() return value is invariant, so change it
type and remove all dependent checks.

Signed-off-by: Artem Chernyshev <artem.chernyshev@red-soft.ru>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-01-04 17:06:30 +01:00
Artem Chernyshev
0e37f55bb1 node_device: udevTranslatePCIIds() to void
udevTranslatePCIIds() return value is invariant, so change it
type and remove all dependent checks.

Signed-off-by: Artem Chernyshev <artem.chernyshev@red-soft.ru>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-01-04 17:06:24 +01:00
Artem Chernyshev
d05cdd1879 virprocess: virProcessGetNamespaces() to void
virProcessGetNamespaces() return value is invariant, so change it
type and remove all dependent checks.

Signed-off-by: Artem Chernyshev <artem.chernyshev@red-soft.ru>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-01-04 17:06:14 +01:00
Artem Chernyshev
88903f9abf conf: virDomainNetUpdate() to void
virDomainNetUpdate() return value is invariant, so change it type
and remove all dependent checks.

Signed-off-by: Artem Chernyshev <artem.chernyshev@red-soft.ru>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-01-04 17:06:04 +01:00
Artem Chernyshev
270e363046 lxc: virLXCControllerAddConsole() to void
virLXCControllerAddConsole() return value is invariant, so change
it type and remove all dependent checks.

Signed-off-by: Artem Chernyshev <artem.chernyshev@red-soft.ru>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-01-04 17:05:52 +01:00
Artem Chernyshev
bce48d99a7 rpc: virnetserver: virNetServerAddService() to void
virNetServerAddService() return value is invariant, so change it
type and remove all dependent checks.

Signed-off-by: Artem Chernyshev <artem.chernyshev@red-soft.ru>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-01-04 17:05:34 +01:00
Artem Chernyshev
46c9458654 cpu: : virCPUx86DataAddItem() to void
virCPUx86DataAddItem() return value is invariant, so change it
type and remove all dependent checks.

Functions changed to void:

virCPUx86DataAddItem()
x86DataAdd()
virCPUx86DataAdd()
x86DataAddSignature()
virCPUx86DataSetSignature()
libxlCapsAddCPUID()
cpuidSetLeaf4()
cpuidSetLeaf7()
cpuidSetLeafB()
cpuidSetLeafD()
cpuidSetLeafResID()
cpuidSetLeaf12()
cpuidSetLeaf14()
cpuidSetLeaf17()

Signed-off-by: Artem Chernyshev <artem.chernyshev@red-soft.ru>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-01-04 17:05:02 +01:00
Guoyi Tu
dd2f36d66e qemu_driver: Don't handle the EOF event if vm get restarted
Currently, libvirt creates a thread pool with only on thread to handle all
qemu monitor events for virtual machines, In the cases that if the thread
gets stuck while handling a monitor EOF event, such as unable to kill the
virtual machine process or release resources, the events of other virtual
machine will be also blocked, which will lead to the abnormal behavior of
other virtual machines.

For instance, when another virtual machine completes a shutdown operation
and the monitor EOF event has been queued but remains unprocessed, we
immediately destroy and start the virtual machine again, at a later time
when EOF event get processed, the processMonitorEOFEvent() will kill the
virtual machine that just started.

To address this issue, in the processMonitorEOFEvent(), we check whether
the current virtual machine's id is equal to the the one at the time
the event was generated. If they do not match, we immediately return.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Guoyi Tu <tugy@chinatelecom.cn>
Signed-off-by: dengpengcheng <dengpc12@chinatelecom.cn>
2024-01-03 17:13:23 +00:00
Jonathan Wright
c9056e682a conf: Restore setting default bus for input devices
Prior to v9.3.0-rc1~30 we used to set default bus for <input/>
devices, during XML parsing. In the commit this code was moved to
a post parse callback. But somehow the line that sets the bus in
one specific case disappeared. Bring it back.

Resolves: https://gitlab.com/libvirt/libvirt/-/issues/577
Fixes: c4bc4d3b82fbe22e03c986ca896090f481df5c10
Signed-off-by: Jonathan Wright <jonathan@almalinux.org>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2024-01-03 17:22:04 +01:00
Martin Kletzander
fbf5fc0fb2 Improve error message in remoteGetUNIXSocket
By adding a link to an explanation in the kbase.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
2024-01-03 15:52:33 +01:00
Han Han
b72d7c46e5 qemu: Replace the deprecated short-formed option "unix"
Change to the boolean option "unix=on"

Signed-off-by: Han Han <hhan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-12-21 12:21:10 +01:00
Egor Makrushin
c3a8d04980 conf: fix integer overflow in virDomainControllerDefParseXML
Multiplication results in integer overflow.
Thus, replace it with ULLONG_MAX and change
def->opts.pciopts.pcihole64size type to ULL.
Update variable usage according to new type.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Signed-off-by: Egor Makrushin <emakrushin@astralinux.ru>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2023-12-20 15:57:07 +01:00
Ján Tomko
49f1406de8 remote: DeserializeDomainDiskErrors: remove dead code
As of commit b2d079c113a which converted this function to use g_strdup,
the error label is only reached when i = 0, rendering it useless.

Remove it.

Fixes: https://gitlab.com/libvirt/libvirt/-/issues/572
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2023-12-19 17:07:29 +01:00
Jim Fehlig
405f479d0e apparmor: Add capabilities for PCI passthrough to virtxend profile
When splitting out the apparmor modular daemon profiles from the
libvirtd profile, the net_admin and sys_admin capabilities were
dropped from the virtxend profile. It was not known at the time
that these capabilities were needed for PCI passthrough. Without
the capabilities, the following messages are emitted from the audit
subsystem

audit: type=1400 audit(1702939277.946:63): apparmor="DENIED" \
operation="capable" class="cap" profile="virtxend" pid=3611 \
comm="rpc-virtxend" capability=21  capname="sys_admin"
audit: type=1400 audit(1702940304.818:63): apparmor="DENIED" \
operation="capable" class="cap" profile="virtxend" pid=3731 \
comm="rpc-virtxend" capability=12  capname="net_admin"

It appears sys_admin is needed to simply read from the PCI dev's
sysfs config file. The net_admin capability is needed when setting
the MAC address of an SR-IOV virtual function.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2023-12-19 08:53:32 -07:00
Peter Krempa
19ce02c773 qemuDomainBlockPeek: Fix format checking logic
Recent refactor which changed the format check to use
qemuBlockStorageSourceIsRaw accidentaly inverted the condition.

Caught by the CI test suite.

Fixes: b600b69f8291226095e0fdb2eb7503baa79063fc
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
2023-12-16 09:20:41 +01:00
Ján Tomko
1a4412f568 qemu: virtiofs: auto-fill idmap for unprivileged use
If the user did not specify any uid mapping, map its own
user ID to ID 0 inside the container and the rest of the IDs
to the first found user's authorized range in /etc/sub[ug]id

https://issues.redhat.com/browse/RHEL-7386
https://gitlab.com/libvirt/libvirt/-/issues/535

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2023-12-14 17:10:22 +01:00
Ján Tomko
2ef4be0a3e util: add virGetSubUIDs
A function for parsing /etc/sub[ug]id

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2023-12-14 17:10:22 +01:00
Ján Tomko
42edb10f17 qemu: allow running virtiofsd in session mode
https://gitlab.com/libvirt/libvirt/-/issues/535

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2023-12-14 17:10:22 +01:00
Ján Tomko
d27b6e5f49 qemu: virtiofs: do not force UID 0
Remove the explicit setting of uid 0 when running virtiofsd.

It is not required for privileged mode, where virtiofsd will be run
as root anyway. And for unprivileged mode, virtiofsd no longer requires
to be run as root.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2023-12-14 17:10:22 +01:00
Ján Tomko
bdf96a0f72 qemu: format uid/gid map for virtiofs
Pass the ID map to virtiofsd, which will run the suid `newuidmap`
binary for us.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2023-12-14 17:10:22 +01:00
Ján Tomko
6de2068dd6 conf: add idmap element to filesystem
Allow the user to manually tweak the ID mapping that will allow
virtiofsd to run unprivileged.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2023-12-14 17:10:22 +01:00