Memory slots are required only for DIMM-like devices, but the maximum
memory address space is relevant also for other non-DIMM memory devices
such as virtio-mem. Allow configurations where no slots are added.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Memory slots are required only for DIMM-like devices, while other
devices defined via <memory> such as virtio-mem may use the PCI bus and
thus do not require/consume a memory slot.
Fix the validation code to calculate the required count of memory
devices only for DIMMs and NVDIMMs.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This is fairly trivial. Just set .memaddr attribute if a value
was set in the XML.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2180679
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
These are allegedly necessary to keep the output consistent,
but now that we're using a privileged config for the driver we
get the desired behavior out of the box, and as a bonus the
paths match what you would actually see on a regular host.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
QEMU deprecated the '-no-acpi' option, thus we should switch to the
modern way to use '-machine'.
Certain ARM machine types don't support ACPI. Given our historically
broken design of using '<acpi/>' without attribute to enable ACPI and
qemu's default of enabling it without '-no-acpi' such configurations
would not work.
Now when qemu reports whether given machine type supports ACPI we can do
a better decision and un-break those configs. Unfortunately not
retroactively.
Resolves: https://gitlab.com/libvirt/libvirt/-/issues/297
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Currently, the ThreadContext object is generated whenever we see
.host-nodes attribute for a memory-backend-* object. The idea was
that when the backend is pinned to a specific set of host NUMA
nodes, then the allocation could be happening on CPUs from those
nodes too. But this may not be always possible.
Users might configure their guests in such way that vCPUs and
corresponding guest NUMA nodes are on different host NUMA nodes
than emulator thread. In this case, ThreadContext won't work,
because ThreadContext objects live in context of the emulator
thread (vCPU threads are moved around by us later, when emulator
thread finished its setup and spawned vCPU threads - see
qemuProcessSetupVcpus()). Therefore, memory allocation is done by
emulator thread which is pinned to a subset of host NUMA nodes,
but tries to create a ThreadContext object with a disjoint subset
of host NUMA nodes, which fails.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2154750
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
When generating memory for memory devices memory-backend-* might
be used. This means, we may need to generate thread-context
objects too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
This is a perfectly valid configuration that we need to keep
working, so add test coverage for it.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Currently, memory device (def->mems) part of cmd line is
generated before any controller. In majority of cases it doesn't
matter because neither of memory devices live on a bus that's
created by an exposed controller (e.g. there's no DIMM
controller, at least not exposed). Except for virtio-mem and
virtio-pmem, which do have a PCI address. And if it so happens
that the device goes onto non-default bus (pci.0) starting such
guest fails, because the controller that creates the desired bus
wasn't processed yet. QEMU processes arguments in order.
For instance, if virtio-mem has address with bus='0x01' QEMU
refuses to start with the following message:
Bus 'pci.1' not found
Similarly for virtio-pmem. I've successfully tested migration and
changing the order does not affect migration stream.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2047271
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
There are a some scenarios in which we want to prealloc guest
memory (e.g. when requested in domain XML, when using hugepages,
etc.). With 'regular' <memory/> models (like 'dimm', 'nvdimm' or
'virtio-pmem') or regular guest memory it is corresponding
memory-backend-* object that ends up with .prealloc attribute
set. And that's desired because neither of those devices can
change its size on the fly. However, with virtio-mem model things
are a bit different. While one can set .prealloc attribute on
corresponding memory-backend-* object it doesn't make much sense,
because virtio-mem can inflate/deflate on the fly, i.e. change
how big of a portion of the memory-backend-* object is exposed to
the guest. For instance, from a say 4GiB module only a half can
be exposed to the guest. Therefore, it doesn't make much sense to
preallocate whole 4GiB and keep them allocated. But we still want
the part exposed to the guest preallocated (when conditions
described at the beginning are met).
Having said that, with new enough QEMU the virtio-mem-pci device
gained new attribute ".prealloc" which instructs the device to
talk to the memory backend object and allocate only the requested
portion of memory.
Now, that our algorithm for setting .prealloc was isolated in a
single function, the function can be called when constructing cmd
line for virtio-mem-pci device.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Now that qemu fixed device unplug when JSON syntax is used with -device
we can re-enable the feature.
Since the old capability string representation is condemned by
suggesting filtering it as a workaround we must introduce a new string.
To achieve this the original capability position is renamed to
X_QEMU_CAPS_DEVICE_JSON_BROKEN_HOTPLUG and a new position with the
original name QEMU_CAPS_DEVICE_JSON is introduced to prevent us having
to change the rest of the code.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
When -device is configured via JSON a bug [1] is triggered in qemu were
the DEVICE_DELETED event for the removal of the device frontend is no
longer delivered to libvirt. Without the DEVICE_DELETED event we don't
remove the corresponding entries in the VM XML.
Until qemu will be fixed we must stop using the JSON syntax for -device.
This patch removes the detection of the capability. The capability is
used only during startup of a fresh VM so we don't need to consider any
compaitibility steps for existing VMs.
For users who wish to use 'libvirt-7.9' and 'libvirt-7.10' with
'qemu-6.2' there are two possible workarounds:
- filter out the 'device.json' qemu capability '/etc/libvirt/qemu.conf':
capability_filters = [ "device.json" ]
- filter out the 'device.json' qemu capability via qemu namespace XML:
<domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
[...]
<qemu:capabilities>
<qemu:del capability='device.json'/>
</qemu:capabilities>
</domain>
We must never again use the same capability name as we are now
instructing users to filter it as a workaround so once qemu is fixed
we'll need to pick a new capability value for it.
[1] https://bugzilla.redhat.com/show_bug.cgi?id=2036669
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2035237
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ani Sinha <ani@anisinha.ca>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Upcoming patches will modify how we populate the capability cache in
tests to be more saner. This also means that the emulator binary and
architecture used in the test files using real capabilities must match
what the real capabilities have.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
We currently use -machine accel=XXX which is just a syntax sugar
for -accel XXX. The former doesn't allow specifying arguments for
accelerator, because all arguments passed to -machine are
treated as arguments of machine itself.
The -accel argument was introduced in QEMU commit
v2.9.0-rc0~70^2~19 and since our minimum required version is
newer (2.11.0) we can safely assume its existence and use it
without any capability.
Resolves: https://gitlab.com/libvirt/libvirt/-/issues/233
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Tested-by: Kashyap Chamarthy <kchamart@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
'-audiodev' as a modern implementation based on QAPI already takes JSON
as the argument. Convert our code to use it directly.
The declaration of the QAPI types can be found in
'qemu.git/qapi/audio.json'.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Starting with QEMU-6.2 started accepting a JSON object as argument for
'-device' which will also become the only syntax considered stable by
qemu in the future.
Since libvirt was recently converted to generate the properties via JSON
to begin wit we can start using it on the commandline as well, by simply
enabling the QEMU_CAPS_DEVICE_JSON capability, which we do by probing
for the 'json-cli' feature flag of 'device_add'.
Normally a change which changes a commandline output should be happening
only after the impacted real-caps test files are forked in the version
preceding the change, but in this case it's not necessary as the logic
for generating the device properties stays identical and we just change
the output format (avoid conversion). Additionally we still have a lot
of tests validating the conversion to the old commandline options.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Nothing special is happening here. All important changes were
done when for 'virtio-pmem' (adjusting the code to put virtio
memory on PCI bus, generating alias using
qemuDomainDeviceAliasIndex(). The only bit that might look
suspicious is no prealloc for virtio-mem. But if you think about
it, the whole purpose of this device is to change amount of
memory exposed to guest on the fly. There is no point in locking
the whole backend in memory.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>