Commit Graph

12470 Commits

Author SHA1 Message Date
Michal Privoznik
46b03819ae qemu_namespace: Fix a corner case in qemuDomainGetPreservedMounts()
When setting up namespace for QEMU we look at mount points under
/dev (like /dev/pts, /dev/mqueue/, etc.) because we want to
preserve those (which is done by moving them to a temp location,
unshare(), and then moving them back). We have a convenience
helper - qemuDomainGetPreservedMounts() - that processes the
mount table and (optionally) moves the other filesystems too.
This helper is also used when attempting to create a path in NS,
because the path, while starting with "/dev/" prefix, may
actually lead to one of those filesystems that we preserved.

And here comes the corner case: while we require the parent mount
table to be in shared mode (equivalent of `mount --make-rshared /'),
these mount events propagate iff the target path exist inside the
slave mount table (= QEMU's private namespace). And since we
create only a subset of /dev nodes, well, that assumption is not
always the case.

For instance, assume that a domain is already running, no
hugepages were configured for it nor any hugetlbfs is mounted.
Now, when a hugetlbfs is mounted into '/dev/hugepages', this is
propagated into the QEMU's namespace, but since the target dir
does not exist in the private /dev, the FS is not mounted in the
namespace.

Fortunately, this difference between namespaces is visible when
comparing /proc/mounts and /proc/$PID/mounts (where PID is the
QEMU's PID). Therefore, if possible we should look at the latter.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2022-09-23 16:32:51 +02:00
Michal Privoznik
687374959e qemu_namespace: Tolerate missing ACLs when creating a path in namespace
When creating a path in a domain's mount namespace we try to set
ACLs on it, so that it's a verbatim copy of the path in parent's
namespace. The ACLs are queried upfront (by
qemuNamespaceMknodItemInit()) but this is fault tolerant so the
pointer to ACLs might be NULL (meaning no ACLs were queried, for
instance because the underlying filesystem does not support
them). But then we take this NULL and pass it to virFileSetACLs()
which immediately returns an error because NULL is invalid value.

Mimic what we do with SELinux label - only set ACLs if they are
non-NULL which includes symlinks.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2022-09-23 15:47:54 +02:00
Michal Privoznik
a8947db1a4 qemu_domain: Ignore all but SCSI hostdevs in qemuDomainDeviceHostdevDefPostParseRestoreBackendAlias()
When retiring QEMU_CAPS_BLOCKDEV_HOSTDEV_SCSI capability the
commit removed a bit too much. Previously, all other devices than
VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI were ignored in
qemuDomainDeviceHostdevDefPostParseRestoreBackendAlias(). But the
commit in question removed not only the capability check but also
this return early statement. Restore it back.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2129239
Fixes: dc8dbb27d4
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2022-09-23 15:28:34 +02:00
Peter Krempa
4b70a0519c virStateInitialize: Propagate whether running in monolithic daemon mode to stateful driver init
Upcoming patch which is fixing the opening of drivers in monolithic mode
needs to know whether we are inside 'libvirtd' but the code where the
decision needs to happen is not re-compiled per daemon. Thus we need to
pass this information to the stateful driver init function so that it
can be remebered.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-09-13 10:50:02 +02:00
Michal Privoznik
f14f8dff93 qemu_process: Don't require a hugetlbfs mount for memfd
The aim of qemuProcessNeedHugepagesPath() is to determine whether
a hugetlbfs mount point is required for given domain (as in
whether qemuBuildMemoryBackendProps() picks up
memory-backend-file pointing to a hugetlbfs mount point). Well,
when domain is configured to use memfd backend then that
condition can never be true. Therefore, skip creating domain's
private path under hugetlbfs mount points.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2022-09-12 12:04:55 +02:00
Peter Krempa
b0c680853a qemu: monitor: Renumber QEMU_MONITOR_MIGRATE_RESUME
Now that all preceding flags were deleted we can fix the enum value.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-09-09 16:10:47 +02:00
Peter Krempa
bc753aa6f7 qemu: migration: Remove QEMU_MONITOR_MIGRATE_BACKGROUND
'qemuMonitorJSONMigrate' is called from:
 - qemuMonitorMigrateToHost
 - qemuMonitorMigrateToSocket
   Both of the above function are called only from
   qemuMigrationSrcStart.

 - qemuMonitorMigrateToFd
   - called from:
     - qemuMigrationSrcToFile
       Both instances here pass QEMU_MONITOR_MIGRATE_BACKGROUND
       directly.
     - qemuMigrationSrcStart

qemuMigrationSrcStart is then called from qemuMigrationSrcRun and
qemuMigrationSrcResume, both of which always add QEMU_MONITOR_MIGRATE_BACKGROUND
to the flags.

Thus any caller always passes the flag so that we can remove the flag
altogether.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-09-09 16:10:47 +02:00
Peter Krempa
d5fb23bc6e qemu: monitor: Drop support for old-style non-shared storage migration
Remove the support for enabling the 'blk' and 'inc' parameters of the
'migrate' command as there are no users any more.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-09-09 16:10:47 +02:00
Peter Krempa
62b3f97aee qemu: migration: Don't attempt to fall back to old-style storage migration
QEMU supported the NBD server required for the new-style migration for a
long time already and when coupled with -blockdev the old style
migration doesn't even work, thus remove support for it.

This patch modifies the code to check that the destination returned data
for the NBD migration and returns an error if it did not and deletes the
fallback code paths which would not work.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-09-09 16:10:47 +02:00
Peter Krempa
2980268b22 qemu: capabilities: Retire QEMU_CAPS_NBD_SERVER
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-09-09 16:10:47 +02:00
Peter Krempa
94ff4f2f91 qemu: migration: Always assume support for QEMU_CAPS_NBD_SERVER
The NBD server (detected via 'nbd-server-start' qmp command) was added
to qemu in v1.3 and can't be compiled out.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-09-09 16:10:47 +02:00
Peter Krempa
83ffeae75a qemu: migration: Fix setup of non-shared storage migration in qemuMigrationSrcBeginPhase
In commit 6111b23522 removing pre-blockdev code paths I've
improperly refactored the setup of non-shared storage migration.

Specifically the code checking that there are disks and setting up the
NBD data in the migration cookie was originally outside of the loop
checking the user provided list of specific disks to migrate, but became
part of the block as it was not un-indented when a higher level block
was being removed.

The above caused that if non-shared storage migration is requested, but
the user doesn't provide the list of disks to migrate (thus implying to
migrate every appropriate disk) the code doesn't actually setup the
migration and then later on falls back to the old-style migration which
no longer works with blockdev.

Move the check that there's anything to migrate out of the
'nmigrate_disks' block.

Fixes: 6111b23522
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2125111
Resolves: https://gitlab.com/libvirt/libvirt/-/issues/373
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-09-09 16:10:47 +02:00
Kristina Hanicova
ecc742126a qemu & conf: move BeginNestedJob & BeginJobNowait into src/conf
Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2022-09-07 12:15:28 +02:00
Kristina Hanicova
4435c026b7 qemu & conf: move BeginAsyncJob & EndAsyncJob into src/conf
Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2022-09-07 12:15:06 +02:00
Kristina Hanicova
421f1e749f qemu & conf: move BeginAgentJob & EndAgentJob into src/conf/virdomainjob
Although these and functions in the following two patches are for
now just being used by the qemu driver, it makes sense to have all
begin job functions in the same file.

Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2022-09-07 12:14:43 +02:00
Kristina Hanicova
9085ccbfb4 qemu: use virDomainObjEndJob()
This patch moves qemuDomainObjEndJob() into
src/conf/virdomainjob as universal virDomainObjEndJob().

Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2022-09-07 12:14:07 +02:00
Kristina Hanicova
0d22febfc6 qemu: use virDomainObjBeginJob()
This patch moves qemuDomainObjBeginJob() into
src/conf/virdomainjob as universal virDomainObjBeginJob().

Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2022-09-07 12:13:30 +02:00
Kristina Hanicova
0150f7a8c1 virdomainjob: make drivers use job object in the domain object
This patch uses the job object directly in the domain object and
removes the job object from private data of all drivers that use
it as well as other relevant code (initializing and freeing the
structure).

Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-09-07 12:13:13 +02:00
Kristina Hanicova
84e9fd068c conf: extend xmlopt with job config & add job object into domain object
This patch adds the generalized job object into the domain object
so that it can be used by all drivers without the need to extract
it from the private data.

Because of this, the job object needs to be created and set
during the creation of the domain object. This patch also extends
xmlopt with possible job config containing virDomainJobObj
callbacks, its private data callbacks and one variable
(maxQueuedJobs).

This patch includes:
* addition of virDomainJobObj into virDomainObj (used in the
  following patches)
* extending xmlopt with job config structure
* new function for freeing the virDomainJobObj

Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2022-09-07 12:06:18 +02:00
Kristina Hanicova
2378f9d86e move files: hypervisor/domain_job -> conf/virdomainjob
The following patches move job object as a member into the domain
object.  Because of this, domain_conf (where the domain object is
defined) needs to import the file with the job object.

It makes sense to move jobs to the same level as the domain_conf:
into src/conf/

Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2022-09-07 12:06:17 +02:00
Kristina Hanicova
3b1ad4cb17 qemu & hypervisor: move qemuDomainObjBeginJobInternal() into hypervisor
This patch moves qemuDomainObjBeginJobInternal() as
virDomainObjBeginJobInternal() into hypervisor in order to be
used by other hypervisors in the following patches.

Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2022-09-07 12:06:17 +02:00
Ján Tomko
d2e767d237 qemu: do not probe for properties of nec-usb-xhci
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2022-09-07 12:05:40 +02:00
Ján Tomko
8650e7a202 qemu: remove qemuValidateDomainVirtioOptions
Now that we assume all the virtio capabilities, this function does not
check anything.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2022-09-07 12:05:40 +02:00
Ján Tomko
6600c0ff18 qemu: retire QEMU_CAPS_VIRTIO_PACKED_QUEUES
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2022-09-07 12:05:40 +02:00
Ján Tomko
b710fcaff7 qemu: assume QEMU_CAPS_VIRTIO_PACKED_QUEUES
Added by QEMU commit:

commit 74b3e46630446568aecb0be1c77c4875d7a52f6d
Author:     Jason Wang <jasowang@redhat.com>
CommitDate: 2019-10-25 07:46:22 -0400

    virtio: add property to enable packed virtqueue

    Signed-off-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
    Reviewed-by: Jens Freimann <jfreimann@redhat.com>
    Message-Id: <20191025083527.30803-9-eperezma@redhat.com>
    Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
    Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

git describe: v4.1.0-1780-g74b3e46630 contains: v4.2.0-rc0~32^2~17

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2022-09-07 12:05:40 +02:00
Ján Tomko
3ae85b6a69 qemu: retire QEMU_CAPS_VIRTIO_SCSI_IOTHREAD
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2022-09-07 12:05:40 +02:00
Ján Tomko
efb3ca87d5 qemu: assume QEMU_CAPS_VIRTIO_SCSI_IOTHREAD
All the supported QEMU versions should have iothread support
on the virtio-scsi controllers if they are compiled in.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2022-09-07 12:05:40 +02:00
Ján Tomko
d8e274253a qemu: retire QEMU_CAPS_NEC_USB_XHCI_PORTS
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2022-09-07 12:05:40 +02:00
Ján Tomko
04bf98a418 qemu: assume QEMU_CAPS_NEC_USB_XHCI_PORTS
Introduced by QEMU commit 0846e6359c407e372f446723b8b7b09ac20d0f03
released in QEMU 1.3.0

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2022-09-07 12:05:40 +02:00
Ján Tomko
935865e057 qemu: retire QEMU_CAPS_CHARDEV_LOGFILE
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2022-09-07 12:05:40 +02:00
Ján Tomko
72768bde3d qemu: assume QEMU_CAPS_CHARDEV_LOGFILE
Introduced in QEMU 2.6

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2022-09-07 12:05:40 +02:00
Ján Tomko
be217321eb qemu: retire QEMU_CAPS_CHARDEV_FILE_APPEND
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2022-09-07 12:05:40 +02:00
Ján Tomko
0a5b820f8f qemu: assume QEMU_CAPS_CHARDEV_FILE_APPEND
Introduced in QEMU 2.6

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2022-09-07 12:05:40 +02:00
Jiri Denemark
2d7b22b561 qemu: Make qemuMigrationSrcCancel optionally synchronous
We have always considered "migrate_cancel" QMP command to return after
successfully cancelling the migration. But this is no longer true (to be
honest I'm not sure it ever was) as it just changes the migration state
to "cancelling". In most cases the migration is canceled pretty quickly
and we don't really notice anything, but sometimes it takes so long we
even get to clearing migration capabilities before the migration is
actually canceled, which fails as capabilities can only be changed when
no migration is running. So to avoid this issue, we can wait for the
migration to be really canceled after sending migrate_cancel. The only
place where we don't need synchronous behavior is when we're cancelling
migration on user's request while it is actively watched by another
thread.

https://bugzilla.redhat.com/show_bug.cgi?id=2114866

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2022-09-06 18:28:10 +02:00
Jiri Denemark
4e55fe21b5 qemu: Create wrapper for qemuMonitorMigrateCancel
We will need a little bit more code around qemuMonitorMigrateCancel to
make sure it works as expected. The new qemuMigrationSrcCancel helper
will avoid repeating the code in several places.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2022-09-06 18:28:10 +02:00
Jiri Denemark
0ff8c175f7 qemu: Rename qemuMigrationSrcCancel
Let's call this qemuMigrationSrcCancelUnattended as the function is
supposed to be used when no other thread is watching the migration.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2022-09-06 18:28:10 +02:00
Michal Privoznik
e4f577b25e qemu_driver: Fix order of arguments in qemuDomainGetStatsCpuProc()
Just before pushing my earlier commit I've switch order of two
arguments of virProcessGetStatInfo() (as suggested in review).
However, I forgot to swap the arguments in
qemuDomainGetStatsCpuProc() which leads to userTime and sysTime
being swapped.

Fixes: 044b8744d6
Reported-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2022-09-06 17:24:38 +02:00
Michal Privoznik
044b8744d6 qemu: Implement qemuDomainGetStatsCpu fallback for qemu:///session
For domains started under session URI, we don't set up CGroups
(well, how could we since we're not running as root anyways).
Nevertheless, fetching CPU statistics exits early because of
lacking cpuacct controller. But with recent extension to
virProcessGetStatInfo() we can get the values we need from the
proc filesystem. Implement the fallback for the session URI as
some of virt tools rely on cpu.* stats to be reported (virt-top,
virt-manager).

Resolves: https://gitlab.com/libvirt/libvirt/-/issues/353
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1693707
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-09-05 13:55:11 +02:00
Michal Privoznik
cdc22d9a21 util: Extend virProcessGetStatInfo() for sysTime and userTime
The virProcessGetStatInfo() helper parses /proc stat file for
given PID and/or TID and reports cumulative cpuTime which is just
a sum of user and sys times. But in near future, we'll need those
times separately, so make the function return them too (if caller
desires).

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-09-05 13:55:11 +02:00
Peter Krempa
f2f5090ef1 qemuValidateDomainDef: Clarify error message when S390 PV launch security is unsupported by the kernel
Split up the condition and report a different error message when the
host or host config results in S390 PV launch security being
unavailable.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2122534
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Marc Hartmayer <mhartmay@linux.ibm.com>
2022-09-01 13:11:10 +02:00
Peter Krempa
d34be15c6c qemu: command: Don't use deprecated chardev backend drivers 'tty' and 'parport'
The replacement is 'serial' and 'parallel' respectively introduced at
least in qemu-2.9 and the old versions are deprecated since qemu-6.0
(qemu commit 5965243641d797b22 ).

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-09-01 13:11:10 +02:00
Peter Krempa
45e5648350 qemu: capabilities: Retire QEMU_CAPS_VIRTIO_PCI_DISABLE_LEGACY
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-09-01 13:11:10 +02:00
Peter Krempa
89c40977f2 qemu: Remove extra logic around QEMU_CAPS_VIRTIO_PCI_DISABLE_LEGACY
The virtio-*-(non-)-transitional device models which replace the use of
'disable-legacy'/'disable-modern' features were introduced in qemu-4.0.

This means we can remove the specific parts of the code for formatting
the old-style device options and replace all other code to solely depend
on the QEMU_CAPS_VIRTIO_PCI_TRANSITIONAL flag.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-09-01 13:11:10 +02:00
Peter Krempa
d1385232a0 qemu: address: Use PCIe for virtio devices also with QEMU_CAPS_VIRTIO_PCI_TRANSITIONAL
QEMU_CAPS_VIRTIO_PCI_TRANSITIONAL is the evolution of
QEMU_CAPS_VIRTIO_PCI_DISABLE_LEGACY from qemu's point of view. Make sure
that we consider both when assesing whether a device belongs on PCIe.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-09-01 13:11:09 +02:00
Martin Kletzander
6457619d18 Rename iterface type='dummy' to type='null'
When commit bac6b266fb added this "functionality" this was the only
naming I could think of, but after discussion with Dan we found the name
'null' fits a bit better, so change it before we make a release with the
old name.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2022-08-25 13:27:04 +02:00
Martin Kletzander
3c2d06d78e qemu: Do not keep swtpm pidfile around after stopping
Just like the socket, remove the pidfile when TPM emulator is being stopped.  In
order to make this a bit cleaner, try to remove it even if swtpm_ioctl does not
exist.

https://bugzilla.redhat.com/show_bug.cgi?id=2111301

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-08-24 23:31:12 +02:00
Laine Stump
9a64c66d34 qemu: remove superfluous cleanup: labels and ret return variables
After converting virNetworkDef * to g_autoptr(virNetworkDef) the
cleanup codepath was empty, so it has been removed.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-08-24 12:22:47 -04:00
Laine Stump
175d8a0852 qemu: replace explicit virNetworkDefFree() with g_autoptr(virNetworkDef)
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-08-24 12:22:46 -04:00
Michal Privoznik
da255ce831 lib: Don't check for retval for virCommandNew*()
The virCommand module is specifically designed so that no caller
has to check for retval of individual virCommand*() APIs except
for virCommandRun() where the actual error is reported. Moreover,
virCommandNew*() use g_new0() to allocate memory and thus it's
not really possible for those APIs to return NULL. Which is why
they are even marked as ATTRIBUTE_NONNULL. But there are few
places where we do check the retval which is a dead code
effectively. Drop those checks.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
2022-08-23 16:14:05 +02:00
Jonathon Jongsma
8d5704e2c4 qemu: adjust memlock for multiple vfio/vdpa devices
When multiple VFIO or VDPA devices are assigned to a guest, the guest
can fail to start because the guest fails to map enough memory. For
example, the case mentioned in
https://bugzilla.redhat.com/show_bug.cgi?id=2111317 results in this
failure:

    2021-08-05T09:51:47.692578Z qemu-kvm: failed to write, fd=31, errno=14 (Bad address)
    2021-08-05T09:51:47.692590Z qemu-kvm: vhost vdpa map fail!
    2021-08-05T09:51:47.692594Z qemu-kvm: vhost-vdpa: DMA mapping failed, unable to continue

The current memlock limit calculation does not work for scenarios where
there are multiple such devices assigned to a guest. The root causes are
a little bit different between VFIO and VDPA devices.

For VFIO devices, the issue only occurs when a vIOMMU is present. In
this scenario, each vfio device is assigned a separate AddressSpace
fully mapping guest RAM. When there is no vIOMMU, the devices are all
within the same AddressSpace so no additional memory limit is needed.

For VDPA devices, each device requires the full memory to be mapped
regardless of whether there is a vIOMMU or not.

In order to enable these scenarios, we need to multiply memlock limit
by the number of VDPA devices plus the number of VFIO devices for guests
with a vIOMMU. This has the potential for pushing the memlock limit
above the host physical memory and negating any protection that these
locked memory limits are providing, but there is no other short-term
solution.

In the future, there should be have a revised userspace iommu interface
(iommufd) that the VFIO and VDPA backends can make use of. This will be
able to share locked memory limits between both vfio and vdpa use cases
and address spaces and then we can disable these short term hacks. But
this is still in development upstream.

Resolves: https://bugzilla.redhat.com/2111317

Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
2022-08-22 13:41:40 -05:00