Rewrite the function so that it's more compact and easier to
extend as new architectures, which will likely come with
multibus support right out the gate, are introduced.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
It belongs next to qemuDomainSupportsPCI().
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The way the function is currently written sort of obscures this
fact, but ultimately we already unconditionally assume PCI
support on most architectures.
Arm and RISC-V need some additional checks to maintain
compatibility with existing configurations but for all future
architectures, such as the upcoming LoongArch64, we expect PCI
support to come out of the box.
Last but not least, the functions is made const-correct.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
It's no longer used anywhere.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
For all versions of QEMU that we support, the virt machine type
has a hard dependency on this device, so we can stop checking
whether the capability is present and just use it unconditionally.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The bus name for the default PHB is always "pci.0".
Fixes: 937f319536723fec57ad472b002a159d0f67a77c
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Assigning a PCI address needs to also assign any extension addresses
right away. Otherwise they'd be assigned only after subsequent
format->parse cycle and thus be potentially missing on first run after
defining the VM and thus could change.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The 'size' of a 'shmem' device is parsed and formatted as a "scaled"
value, stored in bytes, but the formatting scale is mebibytes. This
precission loss combined with the fact that the value was validated only
when starting and the size is formatted only when non-zero meant that
on first parse a value < 1 MiB would be accepted, but would be formatted
to the XML as 0 MiB as it was non-zero but truncated and a subsequent
parse would parse of such XML would parse it as 0 bytes, which in turn
would be interpreted as 'default' size.
Fix the issue by moving the validator, which ensures that the number is
a power of two and more than 1 MiB to the validator code so that it'll
be rejected at XML parsing time.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This makes it so libvirt can obtain accurate information about
guest CPUs from QEMU, and should make it possible to correctly
perform operations such as CPU hotplug.
Of course this is mostly moot at the moment: only aarch64 can use
CPU clusters, and CPU hotplug is not yet implemented on that
architecture.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The default number of CPU clusters is 1, and values other than
that one are currently rejected by all hypervisor drivers.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
In v9.7.0-rc1~130 I've shortened the path that's generated for
<channel/> source. With that, I had to adjust regex that matches
all versions of paths we have ever generated so that we can drop
them (see comment around qemuDomainChrDefDropDefaultPath()). But
as it is usually the case with regexes - they are write only. And
while I attempted to make one portion of the path optional
("/target/") I accidentally made regex accept more, which
resulted in libvirt dropping the user provided path and
generating our own instead.
Fixes: d3759d3674ab9453e5fb5a27ab6c28b8ff8d5569
Resolves: https://issues.redhat.com/browse/RHEL-20807
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
When VIR_DOMAIN_BLOCK_RESIZE_CAPACITY is set, the 'size' parameter
is currently ignored. Since applications must none the less pass a
value for this parameter, it is preferrable to declare some explicit
semantics for it.
This declare that the parameter must be 0, or the exact size of the
underlying block device. The latter gives the management application
the ability to sanity check that the block device size matches what
they think it should be.
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
During post-copy migration (once it actually switches to post-copy mode)
dirty memory pages are continued to be migrated iteratively, while the
destination can explicitly request a specific page to be migrated before
the iterative process gets to it (which happens when a guest wants to
read a page that was not migrated yet). Without the postcopy-preempt
capability enabled such pages need to wait until all other pages already
queued are transferred. Enabling this capability will instruct the
hypervisor to create a separate migration channel for explicitly
requested pages so that they can preempt the queue.
The only requirement for the feature to work is running a migration over
a protocol that supports multiple connections. In other words, we can't
pre-create the connection and pass its file descriptor to QEMU (i.e.,
using MIGRATION_DEST_CONNECT_SOCKET), but we have to let QEMU open the
connections itself (using MIGRATION_DEST_SOCKET). This change is applied
to all post-copy migrations even if postcopy-preempt is not supported to
avoid making the code even uglier than it is now. There's no real
difference between the two methods with modern QEMU (which can properly
report connection failures) anyway.
This capability is enabled for all post-copy migration as long as the
capability is supported on both sides of migration.
https://issues.redhat.com/browse/RHEL-7100
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
We enable various migration capabilities according to the flags passed
to a migration API. Missing support for such capabilities results in an
error because they are required by the corresponding flag. This patch
adds support for additional optional capability we may want to enable
for a given API flag in case it is supported. This is useful for
capabilities which are not critical for the flags to be supported, but
they can make things work better in some way.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The migration cookie contains two bitmaps of migration capabilities:
supported and automatic. qemuMigrationParamsCheck expects the letter so
lets make it more obvious by renaming the parameter as remoteAuto.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Add validation and formatting of the commandline arguments for
'iothread-vq-mapping' parameter. The validation logic mirrors what qemu
allows.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The capability represents the support for mapping virtqueues to
iothreads for the 'virtio-blk' device.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
virHostdevIsVFIODevice() and virDomainDefHasVFIOHostdev() are only ever
called from the QEMU driver, and in the case of the QEMU driver, any
PCI hostdev by definition uses VFIO, so really all these callers only
need to know if the device is a PCI hostdev.
(It turned out that the less specific virHostdevIsPCIDevice() already
existed in hypervisor/virhostdev.c, so I had to remove one of them;
since conf is a lower level directory than hypervisor, and the
function is called from conf, keeping the copy in hypervisor would
have required moving its caller (virDomainDefHasPCIHostdev()) into
hypervisor as well, so I just removed the copy in hypervisor.)
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The new struct is virDeviceHostdevPCIDriverInfo, and the "backend"
enum in the hostdevDef will be replaced with a
virDeviceHostdevPCIDriverInfo named "driver'. Since the enum value in
this new struct is called "name", it means that all references to
"backend" will become "driver.name".
This will allow easily adding other items for new attributes in the
<driver> element / C struct, which will be useful once we are using
this new struct in multiple places.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Currently this enum is defined in domain_conf.h and named
virDomainHostdevSubsysPCIDriverType. I want to use it in parts of the
network and networkport config, so am moving its definition to
device_conf.h which is / can be included by all interested parties,
and renaming it to match the name of the corresponding XML attribute
("driver name"). The name change (which includes enum values) does cause a
lot of churn, but it's all mechanical.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Separate the SLIRP bits from 'qemuProcessNetworkPrepareDevices' and do
the setup of the internal data when setting up domain data.
This will allow tests to use the same code path to lookup data for a
network.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Prepare for test cases which would want to call that function.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Currently when we build with nbdkit support, libvirt will always try to
use nbdkit to access remote disk sources when it is available. But
without an up-to-date selinux policy allowing this, it will fail.
because the required selinux policies are not yet widely available, we
have disabled nbdkit support on rpm builds for all distributions before
Fedora 40.
Unfortunately, this makes it more difficult to test nbdkit support.
After someone updates to the necessary selinux policies, they would also
need to rebuild libvirt to enable nbdkit support. By introducing a
configure option (nbdkit_config_default), we can build packages with
nbdkit support but have it disabled by default.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Suggested-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
virProcessGetNamespaces() return value is invariant, so change it
type and remove all dependent checks.
Signed-off-by: Artem Chernyshev <artem.chernyshev@red-soft.ru>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
virDomainNetUpdate() return value is invariant, so change it type
and remove all dependent checks.
Signed-off-by: Artem Chernyshev <artem.chernyshev@red-soft.ru>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
virCPUx86DataAddItem() return value is invariant, so change it
type and remove all dependent checks.
Functions changed to void:
virCPUx86DataAddItem()
x86DataAdd()
virCPUx86DataAdd()
x86DataAddSignature()
virCPUx86DataSetSignature()
libxlCapsAddCPUID()
cpuidSetLeaf4()
cpuidSetLeaf7()
cpuidSetLeafB()
cpuidSetLeafD()
cpuidSetLeafResID()
cpuidSetLeaf12()
cpuidSetLeaf14()
cpuidSetLeaf17()
Signed-off-by: Artem Chernyshev <artem.chernyshev@red-soft.ru>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Currently, libvirt creates a thread pool with only on thread to handle all
qemu monitor events for virtual machines, In the cases that if the thread
gets stuck while handling a monitor EOF event, such as unable to kill the
virtual machine process or release resources, the events of other virtual
machine will be also blocked, which will lead to the abnormal behavior of
other virtual machines.
For instance, when another virtual machine completes a shutdown operation
and the monitor EOF event has been queued but remains unprocessed, we
immediately destroy and start the virtual machine again, at a later time
when EOF event get processed, the processMonitorEOFEvent() will kill the
virtual machine that just started.
To address this issue, in the processMonitorEOFEvent(), we check whether
the current virtual machine's id is equal to the the one at the time
the event was generated. If they do not match, we immediately return.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Guoyi Tu <tugy@chinatelecom.cn>
Signed-off-by: dengpengcheng <dengpc12@chinatelecom.cn>
Multiplication results in integer overflow.
Thus, replace it with ULLONG_MAX and change
def->opts.pciopts.pcihole64size type to ULL.
Update variable usage according to new type.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Signed-off-by: Egor Makrushin <emakrushin@astralinux.ru>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Recent refactor which changed the format check to use
qemuBlockStorageSourceIsRaw accidentaly inverted the condition.
Caught by the CI test suite.
Fixes: b600b69f8291226095e0fdb2eb7503baa79063fc
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
If the user did not specify any uid mapping, map its own
user ID to ID 0 inside the container and the rest of the IDs
to the first found user's authorized range in /etc/sub[ug]id
https://issues.redhat.com/browse/RHEL-7386https://gitlab.com/libvirt/libvirt/-/issues/535
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Remove the explicit setting of uid 0 when running virtiofsd.
It is not required for privileged mode, where virtiofsd will be run
as root anyway. And for unprivileged mode, virtiofsd no longer requires
to be run as root.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Pass the ID map to virtiofsd, which will run the suid `newuidmap`
binary for us.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Until now resizing a disk with a storage slice would break in one of the
following ways:
1) for a non-raw format, the virtual size would change, but the slice
would still remain in place
2) for raw disks qemu would refuse to change the size
The only reasonable scenario we want to support is a 'raw' image with 0
offset (inside a block device), where we can just drop the slice.
Anything else comes from a non-standard storage setup that we don't want
to touch.
To facilitate the resize, we first remove the 'size' parameter in qemu
thus dropping the slice and then instructing qemu to resize the disk.
Resolves: https://issues.redhat.com/browse/RHEL-18782
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Prepare the blockdev props formatter to skip formatting the slice props
in case they are not applicable.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Rather than pulling the configuration of the storage slice into the
'format' layer make the 'slice' layer effective for raw disks with a
storage slice. This was made possible by the recent refactors which made
the 'format' layer optional if not needed.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Resizing of block-backed storage requires the user to pass the exact
capacity of the device. Implement code which will query it instead so
the user doesn't need to do that.
Closes: https://gitlab.com/libvirt/libvirt/-/issues/449
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Move the check for readonly and empty disks to the top where all other
checks will be done.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Refactor code checking whether image is raw. This fixes multiple places
where a LUKS encrypted disk could be mistreated.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Unfortunately a LUKS image to be decrypted by qemu has
VIR_STORAGE_FILE_RAW as format, but has encryption properties populated.
Many places in the code don't check it properly and also don't check
properly whether the image is indeed LUKS to be decrypted by qemu.
Introduce helpers which will simplify this task.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Spellchecked-by: Ján Tomko <jtomko@redhat.com>
QEMU's blockdev-mirror job doesn't allow copy into a destination which
isn't exactly the same size as source. This is a problem for
non-shared-storage migration when migrating into a raw block device, as
there it's very hard to ensure that the destination size will match the
source size.
Rather than failing the migration, we can add a storage slice in such
case automatically and thus make the migration pass.
To do this we need to probe the size of the block device on the
destination and if it differs form the size detected on the source we'll
install the 'slice'.
An additional handling is required when persisting the VM as we want to
propagate the slice even there to ensure that the device sizes won't
change.
Resolves: https://issues.redhat.com/browse/RHEL-4607
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Move qemuDomainStorageUpdatePhysical, qemuDomainStorageOpenStat,
qemuDomainStorageCloseStat to qemu_domain.c and export them. They'll be
reused in the migration code.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The function will be used to setup storage for non-shared-storage
migration, not just precreate images.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
When a user provides a migration XML via the VIR_MIGRATE_PARAM_DEST_XML
it's expected that they want to change ABI-compatible aspects of the XML
such as the disk paths or similar.
If the user requests persisting of the VM but does not provide an
explicit persistent XML libvirt would take the persistent XML from the
source of the migration as the persistent config. This usually involves
the old paths to images.
Doing this would result into failure to start the VM.
It makes more sense to take the XML used for migration and use that as
the base for persisting the config.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>