Commit Graph

48711 Commits

Author SHA1 Message Date
Michal Privoznik
e53291514c qemu_hotplug: Temporarily allow emulator thread to access other NUMA nodes during mem hotplug
Again, this fixes the same problem as one of previous commits,
but this time for memory hotplug. Long story short, if there's a
domain running and the emulator thread is restricted to a subset
of host NUMA nodes, but the memory that's about to be hotplugged
requires memory from a host NUMA node that's not in the set we
need to allow emulator thread to access the node, temporarily.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2023-05-23 17:21:16 +02:00
Michal Privoznik
3ec6d586bc qemu: Start emulator thread with more generous cpuset.mems
Consider a domain with two guest NUMA nodes and the following
<numatune/> setting :

  <numatune>
    <memory mode="strict" nodeset="0"/>
    <memnode cellid="0" mode="strict" nodeset="1"/>
  </numatune>

What this means is the emulator thread is pinned onto host NUMA
node #0 (by setting corresponding cpuset.mems to "0"), and two
memory-backend-* objects are created:

  -object '{"qom-type":"memory-backend-ram","id":"ram-node0", .., "host-nodes":[1],"policy":"bind"}' \
  -numa node,nodeid=0,cpus=0-1,memdev=ram-node0 \
  -object '{"qom-type":"memory-backend-ram","id":"ram-node1", .., "host-nodes":[0],"policy":"bind"}' \
  -numa node,nodeid=1,cpus=2-3,memdev=ram-node1 \

Note, the emulator thread is pinned well before QEMU is even
exec()-ed.

Now, the way memory allocation works in QEMU is: the emulator
thread calls mmap() followed by mbind() (which is sane, that's
how everybody should do it). BUT, because the thread is already
restricted by CGroups to just NUMA node #0, calling:

  mbind(host-nodes:[1]); /* made up syntax (TM) */

fails. This is expected though. Kernel was instructed to place
the memory at NUMA node "0" and yet, process is trying to place
it elsewhere.

We used to solve this by not restricting emulator thread at all
initially, and only after it's done initializing (i.e. we got the
QMP greeting) we placed it onto desired nodes. But this had its
own problems (e.g. QEMU might have locked pieces of its memory
which were then unable to migrate onto different NUMA nodes).

Therefore, in v5.1.0-rc1~282 we've changed this and set cgroups
upfront (even before exec()-ing QEMU). And this used to work, but
something has changed (I can't really put my finger on it).

Therefore, for the initialization start the thread with union of
all configured host NUMA nodes ("0-1" in our example) and fix the
placement only after QEMU is started.

NB, the memory hotplug suffers the same problem, but that will
be fixed in the next commit.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2138150
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2023-05-23 17:21:16 +02:00
Michal Privoznik
c4a7f8007c qemuProcessSetupPid: Use @numatune variable more
Inside of qemuProcessSetupPid() there's @numatune variable which
is set to vm->def->numa, but it lives only in one block. In the
rest of places the expanded form (vm->def->numa) is used instead.
Move the variable declaration at the beginning of the function
and use it instead of the expanded form.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2023-05-23 17:21:16 +02:00
Martin Kletzander
1bb439e4b0 qemu: Use thread-context even with numatune's restrictive mode
We cannot use host-nodes attribute for it, but there is no reason for us
to skip the preallocation optimisation using thread-context in such
case.  Thankfully returning the proper nodemask from
qemuBuildMemoryBackendProps is enough to trigger this.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-05-23 17:04:08 +02:00
Tim Wiederhake
1716ec3d36 cpu-data.py: Filter out apic current logical processor
Commit 10b5e789c5 attempts to filter out the logical processor id
in the generated data to remove noise and irrelevant changes in the
output.

cpuid-leaf 0x0B may have more than two sub-leaves though. Filter out
logical processor id from all sub-leaves of 0x0B and 0x1F (superset
of the information in 0x0B).

Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2023-05-23 16:25:12 +02:00
Jiri Denemark
17e92b4305 NEWS: Mention support for compressing parallel migration
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2023-05-23 12:32:10 +02:00
Andrea Bolognani
3b6d69237f Revert "conf: Introduce MTE domain feature"
The QEMU interface is still in a state of flux, and KVM support
has been pulled shortly after having been merged. Let's not
commit to a stable interface in libvirt just yet.

Reverts: 720e8f13ff
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
2023-05-22 15:13:19 +02:00
Andrea Bolognani
4fd5f0d660 Revert "qemu:: Introduce QEMU_CAPS_MACHINE_VIRT_MTE capability"
The QEMU interface is still in a state of flux, and KVM support
has been pulled shortly after having been merged. Let's not
commit to a stable interface in libvirt just yet.

Reverts: 1347a19f75
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
2023-05-22 15:13:18 +02:00
Andrea Bolognani
178a66f9af Revert "qemu: Validate MTE feature"
The QEMU interface is still in a state of flux, and KVM support
has been pulled shortly after having been merged. Let's not
commit to a stable interface in libvirt just yet.

Reverts: c6c9b5d251
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
2023-05-22 15:13:17 +02:00
Andrea Bolognani
167138a525 Revert "qemu: Generate command line for MTE feature"
The QEMU interface is still in a state of flux, and KVM support
has been pulled shortly after having been merged. Let's not
commit to a stable interface in libvirt just yet.

Reverts: b10bc8f7ab
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
2023-05-22 15:12:51 +02:00
Andrea Bolognani
4850a9a39b rpm: Explain BuildRequires on qemu-img
It's not used as part of the build process or searched for at
build time, and the QEMU driver detects its path at runtime,
so one could think that the BuildRequires is unnecessary. But
we actually need it to be present at build time in order to
run the full test suite.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
2023-05-22 15:12:29 +02:00
Michal Privoznik
17c8a173b6 numa_conf: Deny other memory modes than 'restrictive' if a memnode is 'restrictive'
We already do check that if there's <memory mode='restrictive'/>
then all <memnode/> have to be of 'restrictive' mode too. But
what we are missing the reverse: if there is <memnode/> with
'restrictive' mode, then the <memory/> has to be of the same mode
too.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2208946
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2023-05-22 13:58:03 +02:00
Michal Privoznik
f6ba9fc12a numa_conf: Move memnode mode validation into virDomainNumaDefValidate()
When parsing a <memnode/> we also check whether the @mode
argument fulfills some requirements wrt 'restrictive' mode. This
is not the right place though. There's virDomainNumaDefValidate()
which contains other checks.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2023-05-22 13:57:44 +02:00
Michal Privoznik
a152d856c3 virDomainNumatuneNodeSpecified: Fix const correctness
The virDomainNumatuneNodeSpecified() function does not write into
passed @numatune pointer, it just reads from it. Therefore, the
argument should be const, which allows this function to be called
from places where virDomainNuma is already const (e.g. domain
validation code).

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2023-05-22 13:49:34 +02:00
Laszlo Ersek
90404c5368 docs: make isa-debugcon example more useful / directly applicable
The type='pty' attribute in the <serial> element causes a Pseudo TTY to be
allocated on the host side via "/dev/ptmx", which is meant to be
interacted with via "virsh console" or similar.

That's not how a firmware log is typically viewed or saved. Replace
type='pty' with type='file', and also provide an example <source> element
(with the pathname of the logfile), similarly to how the <serial> example
just above provides a <source> element too.

Cc: "Daniel P. Berrangé" <berrange@redhat.com>
Cc: Andrea Bolognani <abologna@redhat.com>
Updates: 654968381d
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-05-18 16:24:59 +02:00
Laszlo Ersek
f06d15b512 docs: fix typo in isa-debugcon example
The <serial> opening tag is paired with the </console> closing tag; that's
a mismatch. The question is then whether to modify the former to
<console>, or the latter to </serial>.

Per section "Relationship between serial ports and consoles", <serial> is
used for emulated (not paravirt) consoles, and it's the type that's
suitable for early debug output (such as from firmware). Thus, change
</console> to </serial>.

Cc: "Daniel P. Berrangé" <berrange@redhat.com>
Cc: Andrea Bolognani <abologna@redhat.com>
Fixes: 654968381d
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-05-18 16:24:59 +02:00
Jiang Jiacheng
ffa258a39d qemu: support set parallel migration compression method
Add new compress methods zlib and zstd for parallel migration,
these method should be used with migration option --comp-methods
and will be processed in 'qemuMigrationParamsSetCompression'.
Note that only one compress method could be chosen for parallel
migration and they cann't be used in compress migration.

Signed-off-by: Jiang Jiacheng <jiangjiacheng@huawei.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2023-05-18 15:47:30 +02:00
Jiang Jiacheng
4ab5591c95 virsh: Add migrate options to set parallel compress level
Add migrate options: --compression-zlib-level
                     --compression-zstd-level
These options are used to set compress level for "zlib"
or "zstd" during parallel migration if the compress method
is specified.

Signed-off-by: Jiang Jiacheng <jiangjiacheng@huawei.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2023-05-18 15:47:30 +02:00
Jiang Jiacheng
150ae3e62b Add public API for parallel compression method
Add description for VIR_MIGRATE_PARAM_COMPRESSION, it will
be reused in choosing compression method during parallel migration.
Add public API VIR_MIGRATE_PARAM_COMPRESSION_ZLIB_LEVEL,
VIR_MIGRATE_PARAM_COMPRESSION_ZSTD_LEVEL for migration APIs
to support set compress level during parallel migration.

Signed-off-by: Jiang Jiacheng <jiangjiacheng@huawei.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2023-05-18 15:47:30 +02:00
Peter Krempa
5ee27c37e6 docs: xsl: Simplify templating XSL
Wrap the auto-generated pages (API ref and hvsupport.html) in the proper
top level element similarly to what the pages generated from RST have to
remove the extra case when templating our web.

(Best viewed with 'git show -w')

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-05-18 10:06:51 +02:00
Peter Krempa
f11c773014 docs: newapi.xsl: Remove support for generating index page
Since we need to generate API docs for multiple input files the index
page is not useful for us and was replaced by a manual one. Drop the XSL
for generating it.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-05-18 10:06:51 +02:00
Peter Krempa
7aa2706d3b docs: html: Add a manually written index page
The auto-generated index contains only references to one run of the
generator but we in total run it 4 times missing the admin, lxc, and
qemu specific apis.

Rewrite it manually so that we can drop the generator for it.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-05-18 10:06:51 +02:00
Peter Krempa
02e7f8d709 css: Remove override of width for 'hvsupport' page
Now that the table is not so wide we can treat it as any other page.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-05-18 10:06:51 +02:00
Peter Krempa
dc9c6c5405 hvsupport: Split out common APIs from hypervisor API section
Common APIs such as virConnectOpen/Close and similar which are used by
the non-hypervisor drivers in libvirt are grouped together with
hypervisor drivers, which makes the table very wide.

Split them out into a separate group and clean up the list of hypervisor
drivers.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-05-18 10:06:51 +02:00
Peter Krempa
eca6846376 scripts: hvsupport: Properly register virConnectOpenAuth/virConnectOpenReadOnly APIs
Use the proper driver struct member names for the aforementioned APIs so
that the fixup of the versions works properly.

Currently we reported that no of the drivers supported the APIs despite
being only shims above 'open'.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-05-18 10:06:51 +02:00
Peter Krempa
ef01df4a5c docs: Remove XSLT table of contents generator
The only remaining page was 'hvsupport.html' which is generated by
'scripts/hvsupport.py'. The script already has all the data to generate
the table of contents internally so we can remove the whole complicated
template.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-05-18 10:06:51 +02:00
Peter Krempa
5ff58a0ce7 docs: index: Convert to 'rst'
Final piece of conversion of our non-generated pages to 'rst'.

Special raw HTML is used for adding the appropriate code to fetch the
blog planet.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-05-17 16:54:24 +02:00
Peter Krempa
c0a06c081c docs: acl: Convert to 'rst'
The only special bit about the 'acl' page was the inclusion of the
objects and permissions tables. We can do that by the '.. raw::'
directive.

One reference from 'aclpolkit.rst' needed to be updated to go with the
new header anchor naming.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-05-17 16:54:23 +02:00
Peter Krempa
0f1d6ef6e7 css: Fix styling of the "3 panel" pages
Use the same 'margin-bottom' bot for the normal and mobile layout fixing
one of the panels touching the footer.

Use same font size both for <h1> and <h2> used as the column titles as
rst2html5 based on version can generate either of them.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-05-17 16:54:23 +02:00
Peter Krempa
82db6fb765 css: mobile: Make colums in "3 column" mobile layout wider
Use the full width of the parent box and drop the unnecessarily bigger
margin.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-05-17 16:54:23 +02:00
Peter Krempa
2b9d96fcac css: mobile: Fix hiding of big logo in mobile layout
Use the '#index' id to select the proper page as the body element
doesn't have 'index' class.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-05-17 10:08:36 +02:00
Peter Krempa
0e8f61beba css: mobile: Fix responsive design of 'docs' and 'knowledgebase' pages
When the pages were converted to rST it required changes to how the
panels are created. This change was not reproduced in the specific media
override for narrow displays and thus made those pages unusable.

Note that two lines per document are needed as some rst2html5 versions
format a <div class='section'> and others do a <section> element
instead.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-05-17 10:08:36 +02:00
Peter Krempa
1a39a07879 css: mobile: Replace tabs with spaces
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-05-17 10:08:36 +02:00
Peter Krempa
e51922335c css: Drop styles for '.gitmirror' class
Last use was removed in 11850158bd

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-05-17 10:08:36 +02:00
Peter Krempa
e21b32ed4f css: Drop styles for '.mail' class
Use was removed in 5042a5def6

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-05-17 10:08:36 +02:00
Peter Krempa
08de356e1d css: Drop style for 'p.image' selector
Last use was removed in b51afd97e5

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-05-17 10:08:36 +02:00
Peter Krempa
79e1853186 css: Drop style for '#changelog' id
The corresponding element was removed in 5e0211e0d3

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-05-17 10:08:36 +02:00
Peter Krempa
e28fe28b04 css: Drop styles for '#projects' id
There's nothing with such element id. The last mention was removed in
2818359075

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-05-17 10:08:36 +02:00
Peter Krempa
af621caa6b conf: numa: Allow formatting 'none' values for 'associativity' and 'policy' of cache
The parser makes the values mandatory and also the qemu code implements
actions for those values. The formatter skips them though. Since
format+parse is used to copy the XML at startup a definition with those
values can't be started.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2203709
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2023-05-17 10:07:18 +02:00
Peter Krempa
0d5fc7219a virDomainNumaDefNodeCacheParseXML: Refactor parsing of cache XML
Use virXMLProp* helpers to simplify the code.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2023-05-17 10:07:18 +02:00
Peter Krempa
a8a63587ff qemuxml2xmltest: Modernize all 'audio-' cases
Use DO_TEST_CAPS_LATEST to run with the latest capapbilities.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-05-17 10:02:19 +02:00
Peter Krempa
c051fa874f qemuxml2argvtest: Use real caps instead of fake caps for 'audio-default-*' cases
Convert all of the 'audio-default-*' cases to use capabilities from
qemu-4.2 instead of the fake caps.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-05-17 10:02:19 +02:00
Peter Krempa
36d7d87c87 qemuxml2xmlout: Replace symlinks of all 'audio-' tests by real files
Symlinks are hard to maintain and especially un-cool when attempting to
test against real capapbilities.

Replace symlinks by real files first so that we can switch to real caps
and see the difference.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-05-17 10:02:19 +02:00
Michal Privoznik
b10bc8f7ab qemu: Generate command line for MTE feature
This is pretty trivial, just append "mte=on/off" to -machine
arguments.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2023-05-16 17:43:05 +02:00
Michal Privoznik
c6c9b5d251 qemu: Validate MTE feature
The MTE feature is not supported by all QEMUs, only those with
QEMU_CAPS_MACHINE_VIRT_MTE capability.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2023-05-16 17:43:03 +02:00
Michal Privoznik
1347a19f75 qemu:: Introduce QEMU_CAPS_MACHINE_VIRT_MTE capability
The MTE feature (introduced in QEMU commit of v5.1.0-rc1~8^2~11)
is detectable via 'qom-list-properties' for 'virt' machine type.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2023-05-16 17:43:00 +02:00
Michal Privoznik
720e8f13ff conf: Introduce MTE domain feature
The Memory Tagging Extensions are hardware acceleration present
in some ARM processors that allow memory error detection [1].
Introduce a domain XML knob that turns them on or off.

1: https://www.arm.com/blogs/blueprint/memory-safety-arm-memory-tagging-extension
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2023-05-16 17:42:58 +02:00
Michal Privoznik
37e41b7f16 qemu: Drop @forceVFIO argument of qemuDomainGetMemLockLimitBytes()
After previous cleanup, there's not a single caller that would
call qemuDomainGetMemLockLimitBytes() with @forceVFIO set. All
callers pass false.

Drop the unneeded argument from the function.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2023-05-16 14:43:43 +02:00
Michal Privoznik
4f355fa5b7 qemu: Drop @forceVFIO argument of qemuDomainAdjustMaxMemLock()
After previous cleanup, there's not a single caller that would
call qemuDomainAdjustMaxMemLock() with @forceVFIO set. All callers
pass false.

Drop the unneeded argument from the function.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2023-05-16 14:43:43 +02:00
Michal Privoznik
c925bb9273 qemu_domin: Account for NVMe disks when calculating memlock limit on hotplug
During hotplug of a NVMe disk we need to adjust the memlock
limit. The computation of the limit is handled by
qemuDomainGetMemLockLimitBytes() which looks at given domain
definition and accounts for various device types (as different
types require different amounts). But during disk hotplug the
disk is not added to domain definition until the very last
moment. Therefore, qemuDomainGetMemLockLimitBytes() has this
@forceVFIO argument which tells it to assume VFIO even if there
are no signs of VFIO in domain definition. And this kind of
works, until the amount needed for NVMe disks changed (in
v9.3.0-rc1~52). What's missing in the commit is making @forceVFIO
behave the same as if there was an NVMe disk present in the
domain definition.

But, we can do even better - just mimic whatever we're doing for
hostdevs. IOW - introduce qemuDomainAdjustMaxMemLockNVMe() that
behaves the same as qemuDomainAdjustMaxMemLockHostdev().

There are subtle differences though:

1) qemuDomainAdjustMaxMemLockHostdev() can afford placing hostdev
   right at the end of vm->def->hostdevs, because the array was
   already reallocated (at the beginning of
   qemuDomainAttachHostPCIDevice()). But
   qemuDomainAdjustMaxMemLockNVMe() doesn't have that luxury.

2) qemuDomainAdjustMaxMemLockHostdev() places a
   virDomainHostdevDef pointer into domain definition, while
   qemuDomainStorageSourceAccessModifyNVMe() (which calls
   qemuDomainAdjustMaxMemLock()) sees a virStorageSource pointer
   but domain definition contains virDomainDiskDef. But that's
   okay, we can create a dummy disk definition and append it into
   the domain definition.

After this, qemuDomainAdjustMaxMemLock() can be called with
@forceVFIO = false, as the disk is now part of domain definition
(when computing the new limit).

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2014030#c28
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2023-05-16 14:43:42 +02:00