qemuInterfaceVDPAConnect() was a helper function for connecting to the
vdpa device file. But in order to support other vdpa devices besides
network interfaces (e.g. vdpa block devices) make this function a bit
more generic.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The function doesn't modify it. Fix the argument declaration so that the
function can be used in a context where we have a 'const' disk
definition.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
When a thread-context object is specified on the cmd line, then
QEMU spawns a thread and sets its affinity to the list of NUMA
nodes specified in .node-affinity attribute. And this works just
fine, until the main QEMU thread itself is not restricted.
Because of v5.3.0-rc1~18 we restrict the main emulator thread
even before QEMU is executed and thus then it tries to set
affinity of a thread-context thread, it inevitably fails with:
Setting CPU affinity failed: Invalid argument
Now, we could lift the pinning temporarily, let QEMU spawn all
thread-context threads, and enforce pinning again, but that would
require some form of communication with QEMU (maybe -preconfig?).
But that would still be wrong, because it would circumvent
<emulatorpin/>.
Technically speaking, thread-context is an internal
implementation detail of QEMU, and if it weren't for it, the main
emulator thread would be doing the allocation. Therefore, we
should honor the pinning and prune the list of node so that
inaccessible ones are dropped.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2154750
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
When building a thread-context object (inside of
qemuBuildThreadContextProps()) we look at given memory-backend-*
object and look for .host-nodes attribute. This works, as long as
we need to just copy the attribute value into another
thread-context attribute. But soon we will need to adjust it.
That's the point where having the value in virBitmap comes handy.
Utilize the previous commit, which made
qemuBuildMemoryBackendProps() set the argument and pass it into
qemuBuildThreadContextProps().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
While it's true that anybody who's interested in getting
.host-nodes attribute value can just use
virJSONValueObjectGetArray() (and that's exactly what
qemuBuildThreadContextProps() is doing, btw), if somebody is
interested in getting the actual virBitmap, they would have to
parse the JSON array.
Instead, introduce an argument to qemuBuildMemoryBackendProps()
which is set to corresponding value used when formatting the
attribute.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
This consists of (1) adding the necessary args to the qemu commandline
netdev option, and (2) starting a passt process prior to starting
qemu, and making sure that it is terminated when it's no longer
needed. Under normal circumstances, passt will terminate itself as
soon as qemu closes its socket, but in case of some error where qemu
is never started, or fails to startup completely, we need to terminate
passt manually.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The aim of thread-context object is to set affinity on threads
that allocate memory for a memory-backend-* object. For instance:
-object '{"qom-type":"thread-context","id":"tc-ram-node0","node-affinity":[3]}' \
-object '{"qom-type":"memory-backend-memfd","id":"ram-node0","hugetlb":true,\
"hugetlbsize":2097152,"share":true,"prealloc":true,"prealloc-threads":8,\
"size":15032385536,"host-nodes":[3],"policy":"preferred",\
"prealloc-context":"tc-ram-node0"}' \
allocates 14GiB worth of memory, backed by 2MiB hugepages from
host NUMA node 3, using 8 threads. If it weren't for
thread-context these threads wouldn't have any affinity and thus
theoretically could be scheduled to run on CPUs of different NUMA
node (which is what I saw occasionally).
Therefore, whenever we are pinning memory (IOW setting host-nodes
attribute), we can generate thread-context object with the same
affinity.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
The G_GNUC_NO_INLINE macro will eventually be marked as
deprecated [1] and we are recommended to use G_NO_INLINE instead.
Do the switch now, rather than waiting for compile time warning
to occur.
1: 15cd0f0461
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The commandline generated from our XML->native convertor is the majority
of cases not usable without libvirt anyways and the situation will not
improve any more.
As of such there's no much utility of avoiding the use of stopped CPUs
flag in such case.
Remove the QEMU_BUILD_COMMAND_LINE_CPUS_RUNNING flag and the associated
logic.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Now that we store the state of the host FIPS mode setting in the qemu
driver object, we don't need to outsource the logic into
'qemuCheckFips'.
Additionally since we no longer support very old qemu's which would not
yet have --enable-fips we can drop the part of the comment about very
old qemus.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Introduce 'qemuBuildCommandLineFlags' and use it instead of specific
flag booleans.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
We already format a commandline using FD passing for the tap devices so
formatting the 'vhost' file descriptors won't make it any less usable
directly.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Populate the 'slirpfd' qemuFDPass structure inside the private data for
passing the fd to qemu rather than using out-of-band variables.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Use the new infrastructure which stores the fds inside 'qemuFDPass'
objects in the private data.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
All callers effectively pass 'net->driver.virtio.queues'. In case of the
code in 'qemu_hotplug.c' this value was set to '1' if it was 0 before.
Since 'qemuBuildNicDevProps' only uses it if it's greater than 1 we can
remove all the extra complexity.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Move the setup of the 'vdpa' netdev into the new helper shared between
commandline and hotplug code.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The helper will aggregate code that is used to connect the network
backend to the corresponding host portion.
This will be used to refactor the duplicated code between the cold-start
and hotplug helper functions.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
All supported QEMUs now accept werror/rerror as argument for the
frontend disk device, so we can remove the old code.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Since 'cancel_path' is constructed from the 'tpmdev' argument, we can
push it down into the function opening the FDs.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This mostly overlaps with virDomainAudioType, but in a couple of
cases the string representations are different.
Right now we're doing that in a somewhat sketchy way, in that we
store values of one enumeration and then convert them to strings
using TypeToString() implementation for the other enumeration;
when converting from string, we open-code the handling of the
special values mentioned above.
Drop the second enumeration and introduce two helpers to deal
with conversion. Most calling sites don't need to be changed, and
one can even be simplified significantly.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
There are a some scenarios in which we want to prealloc guest
memory (e.g. when requested in domain XML, when using hugepages,
etc.). With 'regular' <memory/> models (like 'dimm', 'nvdimm' or
'virtio-pmem') or regular guest memory it is corresponding
memory-backend-* object that ends up with .prealloc attribute
set. And that's desired because neither of those devices can
change its size on the fly. However, with virtio-mem model things
are a bit different. While one can set .prealloc attribute on
corresponding memory-backend-* object it doesn't make much sense,
because virtio-mem can inflate/deflate on the fly, i.e. change
how big of a portion of the memory-backend-* object is exposed to
the guest. For instance, from a say 4GiB module only a half can
be exposed to the guest. Therefore, it doesn't make much sense to
preallocate whole 4GiB and keep them allocated. But we still want
the part exposed to the guest preallocated (when conditions
described at the beginning are met).
Having said that, with new enough QEMU the virtio-mem-pci device
gained new attribute ".prealloc" which instructs the device to
talk to the memory backend object and allocate only the requested
portion of memory.
Now, that our algorithm for setting .prealloc was isolated in a
single function, the function can be called when constructing cmd
line for virtio-mem-pci device.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The @mem agrument of qemuBuildMemoryDeviceProps() function is
only read from. Make this fact obvious from the function
declaration too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The function is already returning JSON properties, rename it
accordingly.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The 'vhost-user-fs-pci' has following properties we control:
chardev=<str> - ID of a chardev to use as a backend
queue-size=<uint16> - (default: 128)
tag=<str>
bootindex=<int32>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Build the properties of 'vhost-vsock' device via JSON. In comparison to
previous similar refactors this also modifies the hotplug code to attach
the vhost fd handle explicitly rather than using
'qemuMonitorAddDeviceWithFd'.
The properties of vhost-vsock have the following types according to
QEMU:
guest-cid=<uint64> - (default: 0)
vhostfd=<str>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Build the properties of 'vhost-scsi' device via JSON. In comparison to
previous similar refactors this also modifies the hotplug code to attach
the vhost fd handle explicitly rather than using
'qemuMonitorAddDeviceWithFd'.
The 'vhost-scsi' device doesn't have any special (non-string) properties.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Build commandlines for character devices via JSON.
For devices using 'VIR_DOMAIN_DEVICE_ADDRESS_TYPE_VIRTIO_SERIAL' address
type 'qemuBuildDeviceAddressProps' will now generate the address. The
only special property is 'nr'. QEMU declares it as:
nr=<uint32> - (default: 4294967295)
The test fallout is caused by formatting addresses as decimal numbers
instead of hex as described in the commit which added
'qemuBuildDeviceAddressProps'.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The handlers for PCI, SCSI and USB controllers already use JSON
internally. This patch converts 'virtio-serial', 'ccid' and 'sata' to do
the same and passes out the JSON directly so that it can be used in
monitor code to avoid conversion.
From the controllers converted in this patch only 'virtio-serial' has
special properties. QEMU thinks they have the following types:
max_ports=<uint32> - (default: 31)
vectors=<uint32> - (default: 2)
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The types for the special fields of the 'virtio-blk-pci' according to
QEMU are:
iothread=<link<iothread>>
ioeventfd=<bool> - on/off (default: true)
event_idx=<bool> - on/off (default: true)
scsi=<bool> - on/off (default: false)
num-queues=<uint16> - (default: 65535)
queue-size=<uint16> - (default: 256)
For all disks we also use the following properties (based on 'scsi-hd'):
device_id=<str>
share-rw=<bool> - (default: false)
drive=<str> - Node name or ID of a block device to use as a backend
chardev=<str> - ID of a chardev to use as a backend <- vhost-user-blk-pci
bootindex=<int32>
logical_block_size=<size> - A power of two between 512 B and 2 MiB (default: 0)
physical_block_size=<size> - A power of two between 512 B and 2 MiB (default: 0)
wwn=<uint64> - (default: 0)
rotation_rate=<uint16> - (default: 0)
vendor=<str>
product=<str>
removable=<bool> - on/off (default: false)
write-cache=<OnOffAuto> - on/off/auto (default: "auto")
cyls=<uint32> - (default: 0)
heads=<uint32> - (default: 0)
secs=<uint32> - (default: 0)
bios-chs-trans=<BiosAtaTranslation> - Logical CHS translation algorithm, auto/none/lba/large/rechs (default: "auto") <- ide-hd
serial=<str>
werror=<BlockdevOnError> - Error handling policy, report/ignore/enospc/stop/auto (default: "auto")
rerror=<BlockdevOnError> - Error handling policy, report/ignore/enospc/stop/auto (default: "auto")
The 'wwn' field is changed from a hex string to a number since qemu
actually treats it as a number.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Since 'qemuBuildDeviceAddressProps' now also builds 'drive' addresses
the generator is way simpler and doesn't use any special fields.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
For properties we use these are the QEMU types:
host=<str> - Address (bus/device/function) of the host device, example: 04:10.0
bootindex=<int32>
failover_pair_id=<str>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
For 'usb-mouse'/'usb-tablet'/'usb-kbd' we don't use any special
property.
For 'virtio-input-pci' we only use the 'evdev' argument which is a
string so this conversion doesn't impact anything.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The 'usb-redir' device has the following types according to QEMU for
properties we control:
chardev=<str> - ID of a chardev to use as a backend
filter=<str>
bootindex=<int32>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The 'usb-host' device has the following types according to QEMU for
properties we control:
hostdevice=<str>
hostbus=<uint32> - (default: 0)
hostaddr=<uint32> - (default: 0)
bootindex=<int32>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The 'vfio-pci-nohotplug' device has the following property types
according to QEMU:
display=<OnOffAuto> - on/off/auto (default: "off")
sysfsdev=<str>
ramfb=<bool>
bootindex=<int32>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The 'virtio-rng' has the following property types according to QEMU:
rng=<link<rng-backend>>
max-bytes=<uint64> - (default: 9223372036854775807)
period=<uint32> - (default: 65536)
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Note that the legacy 'ivshmem' device was already removed upstream, but
it's converted so that the code is identical.
For the two modern devices QEMU considers the properties being of
following types:
'ivshmem-doorbell'
chardev=<str> - ID of a chardev to use as a backend
ioeventfd=<bool> - on/off (default: true)
master=<OnOffAuto> - on/off/auto (default: "off")
vectors=<uint32> - (default: 1)
'ivshmem-plain'
master=<OnOffAuto> - on/off/auto (default: "off")
memdev=<link<memory-backend>>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The watchdog doesn't have any special properties.
Convert the command line generator and hotplug code.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Remove the now unused boot-index related attributes and the code which
is assigning it.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>