When a thread-context object is specified on the cmd line, then
QEMU spawns a thread and sets its affinity to the list of NUMA
nodes specified in .node-affinity attribute. And this works just
fine, as long as the main QEMU thread itself is not restricted.
Because of v5.3.0-rc1~18 we restrict the main emulator thread
even before QEMU is executed and thus when it tries to set
affinity of a thread-context thread, it inevitably fails with:
  Setting CPU affinity failed: Invalid argument
Now, we could lift the pinning temporarily, let QEMU spawn all
thread-context threads, and enforce pinning again, but that would
require some form of communication with QEMU (maybe -preconfig?).
But that would still be wrong, because it would circumvent
<emulatorpin/>.
Technically speaking, thread-context is an internal
implementation detail of QEMU, and if it weren't for it, the main
emulator thread would be doing the allocation. Therefore, we
should honor the pinning and prune the list of nodes so that
inaccessible ones are dropped.
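The pruning described above amounts to a set intersection. A minimal
sketch, assuming node sets are kept as plain 64-bit masks (the helper
names are hypothetical and not libvirt API; libvirt itself works with
virBitmap):

```c
#include <assert.h>
#include <stdint.h>

/* Bit n set means NUMA node n. Hypothetical helper, not libvirt API. */
#define NODE_BIT(n) (UINT64_C(1) << (n))

/* Keep only those requested nodes that the emulator thread is allowed to
 * use; inaccessible nodes are silently dropped. */
static uint64_t
prune_node_affinity(uint64_t requested, uint64_t emulator_nodes)
{
    return requested & emulator_nodes;
}
```

An empty result means none of the requested nodes is reachable from
the emulator thread; how to react to that is left to the caller here.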
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2154750
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
When building a thread-context object (inside of
qemuBuildThreadContextProps()) we look at given memory-backend-*
object and look for .host-nodes attribute. This works, as long as
we need to just copy the attribute value into another
thread-context attribute. But soon we will need to adjust it.
That's the point where having the value in virBitmap comes in handy.
Utilize the previous commit, which made
qemuBuildMemoryBackendProps() set the argument and pass it into
qemuBuildThreadContextProps().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
While it's true that anybody who's interested in getting
.host-nodes attribute value can just use
virJSONValueObjectGetArray() (and that's exactly what
qemuBuildThreadContextProps() is doing, btw), if somebody is
interested in getting the actual virBitmap, they would have to
parse the JSON array.
Instead, introduce an argument to qemuBuildMemoryBackendProps()
which is set to corresponding value used when formatting the
attribute.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
There are two compound conditions in
qemuBuildMemoryBackendProps() and each one checks for nodemask
for NULL first. Join them into one bigger block.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
The order of pinning priority (at least for emulator thread) was
set by v1.2.15-rc1~58 (for cgroup code). But later, when
automatic placement was implemented into
qemuDomainGetEmulatorPinInfo(), the priority was not honored.
Now that we have this priority code in a separate function, we
can just call that and avoid this type of error.
Fixes: 776924e376
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
The set of if()-s that determines the preference in cpumask used
for setting things like emulatorpin, vcpupin, etc. is going to be
re-used. Separate it out into a function.
You may think that this changes behaviour, but
qemuProcessPrepareDomainNUMAPlacement() ensures that
priv->autoCpuset is set for VIR_DOMAIN_CPU_PLACEMENT_MODE_AUTO.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Since qemuxml2argvtest is now using virnumamock, there's no need
for qemuxml2argvmock to offer reimplementation of virNuma*()
functions. Also, the comment about Clang and FreeBSD (introduced
in v4.3.0-40-g77ac204d14) is no longer true. Looks like the noinline
attribute was the missing piece.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
So far, the memory-hotplug-dimm-addr.xml test case pins its vCPUs
onto CPUs 0-1 which correspond to NUMA node #0 (per
tests/vircaps2xmldata/linux-basic/system/node/node0). Place vCPUs
onto nodes #1 and #2 too so that DIMM <memory/> device can
continue using thread-context after future patches. This
configuration, as-is currently, would make QEMU error out anyway.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
We have a couple of qemuxml2argvtest cases where up to 8 NUMA nodes
are assumed. These are used to check whether disjoint ranges of
host-nodes= are generated properly. Without loss of
generality, we can rewrite corresponding XML files to use up to 4
NUMA nodes and still have disjoint ranges.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
While no part of cmd line building process currently depends on a
host NUMA configuration, this will change soon. Use freshly
changed virnumamock from qemuxml2argvtest and make the mock read
NUMA data from vircaps2xmldata which seems to have the richest
NUMA configuration.
This also means, we have to start building virnumamock
unconditionally. But this is not a problem, since nothing inside
of the mock relies on Linux specificity. The whole mock is merely
reading files and parsing them.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Introduce a mock of virNumaGetNodeOfCPU() because soon we will
need virNumaCPUSetToNodeset() to return predictable results.
Also, fill in missing symlinks in vircaps2xmldata/.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
So far, we have a function that expands given list of NUMA nodes
into list of CPUs. But soon, we are going to need the inverse -
expand list of CPUs into list of NUMA nodes. Introduce
virNumaCPUSetToNodeset() for that.
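As a rough illustration of the inverse mapping, here is a minimal
sketch that assumes a fixed CPU-to-node table and 64-bit masks instead
of virBitmap. The table and function names are hypothetical; real code
would ask the host which node each CPU belongs to:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a per-CPU node lookup: a fixed table mapping
 * CPU index to NUMA node. Real code would query the host instead. */
static const int cpu_to_node[] = { 0, 0, 1, 1, 2, 2, 3, 3 };
#define N_CPUS (sizeof(cpu_to_node) / sizeof(cpu_to_node[0]))

/* Expand a bitmask of CPUs into a bitmask of the NUMA nodes they belong
 * to. Bit i of cpuset means CPU i; bit n of the result means node n.
 * CPUs beyond the table are ignored. */
static uint64_t
cpuset_to_nodeset(uint64_t cpuset)
{
    uint64_t nodeset = 0;
    size_t cpu;

    for (cpu = 0; cpu < N_CPUS; cpu++) {
        if (cpuset & (UINT64_C(1) << cpu))
            nodeset |= UINT64_C(1) << cpu_to_node[cpu];
    }
    return nodeset;
}
```

With this table, CPUs 0-1 map to node 0 only, while CPUs 2 and 4 map
to nodes 1 and 2.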
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Technically, there's nothing libnuma specific about
virNumaNodesetToCPUset(). It just implements a generic algorithm
over virNumaGetNodeCPUs() (which is then libnuma dependent).
Nevertheless, there's no need to have this function living inside
WITH_NUMACTL block. Any error returned from virNumaGetNodeCPUs()
(including the one that !WITH_NUMACTL stub returns) is propagated
properly.
Move the function out of the block into a generic one and drop
the !WITH_NUMACTL stub.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
We have this crazy backwards compatibility when it comes to
serial and console devices. Basically, in some cases the very
first <console/> is just an alias to the very first <serial/>
device. This is to be seen at various places:
1) virDomainDefFormatInternalSetRootName() - when generating
domain XML, the <console/> configuration is basically ignored
and corresponding <serial/> config is formatted,
2) virDomainDefAddConsoleCompat() - which adds a copy of
<serial/> or <console/> into virDomainDef in post parse.
And when talking to QEMU we need a special handling too, because
while <serial/> is generated on the cmd line, the <console/> is
not. And in a lot of places we get it right. Except for generating
device aliases. On domain startup the 'expected' happens and
devices get "serial0" and "console0" aliases, correspondingly.
This ends up in the status XML too. But due to aforementioned
trick when formatting domain XML, "serial0" ends up in both
'virsh dumpxml' and the status XML. But internally, the two devices
have different aliases. Therefore, detaching the device using
<console/> fails as qemuDomainDetachDeviceChr() tries to detach
"console0".
After the daemon is restarted and status XML is parsed,
everything suddenly works. This is because in the status XML both
devices have the same alias.
Let's generate correct alias from the beginning.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2156300
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Other APIs that internally use QEMU migration and need to temporarily
suspend a domain already report failure to resume vCPUs by setting
VIR_DOMAIN_PAUSED_API_ERROR state reason and emitting
VIR_DOMAIN_EVENT_SUSPENDED event with
VIR_DOMAIN_EVENT_SUSPENDED_API_ERROR.
Let's do the same in qemuMigrationSrcRestoreDomainState for consistent
behavior.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Some APIs (migration, save/restore, snapshot, ...) require a domain to
be suspended temporarily. In case resuming the domain fails, the domain
will be unexpectedly left paused when the API finishes. This situation
is reported via VIR_DOMAIN_EVENT_SUSPENDED event with
VIR_DOMAIN_EVENT_SUSPENDED_API_ERROR detail. But we do not have a
corresponding reason for VIR_DOMAIN_PAUSED state and the reason would
remain set to the value used when the domain was paused. So the state
reason would suggest the operation is still running.
This patch changes the state reason to a new VIR_DOMAIN_PAUSED_API_ERROR
to make it clear the API that paused the domain already finished, but
failed to resume the domain.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
For some vhostuser daemons, we validate that the guest memory is shared
with the host.
With earlier versions of QEMU, it was only possible to mark memory
as shared by defining an explicit NUMA topology. Later, QEMU exposed
the name of the default memory backend (defaultRAMid) so we can mark
that memory as shared.
Since libvirt commit:
commit bff2ad5d6b
qemu: Relax validation for mem->access if guest has no NUMA
we already check for the case when user requests shared memory,
but QEMU did not expose defaultRAMid.
Drop the duplicate check from vhostuser device validation, to make
it pass on hotplug even after libvirtd restart.
This avoids the need to store the defaultRAMid, since we don't really
need it for anything after the VM has been already started.
https://bugzilla.redhat.com/show_bug.cgi?id=2078693
https://bugzilla.redhat.com/show_bug.cgi?id=2177701
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Now that we have SELinux support for passt, we want things to
work out of the box and that requires having the passt-specific
SELinux bits installed.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
Currently the 'Releases' column pointed to the generic page about the
specific go module. Change the link to point to the respective
pkg.go.dev page for the module.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
The releases directory is empty. Don't advertise it on our downloads
page.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
The directory doesn't exist. The project also doesn't have any releases
on gitlab so there's nothing to replace it with.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
We split off the downloads into a new subdomain. Link directly to it
instead of relying on redirects.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
- drop the link to the FTP server which doesn't exist any more
- change links to libvirt.org/source to download.libvirt.org
- change link to the maven repository to point to download.libvirt.org
- change link to javadoc to the documentation generated via gitlab job
in the libvirt-java project
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Conversion of the wiki to static pages means that the integrated search
no longer functions. Use the same approach we have for other searches to
simply defer to google.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
The platform check which determines when to apply the fixups mentions
all officially supported build targets (per docs/platforms.rst) thus
it's not really necessary.
Additionally, while it is not explicitly listed as supported, building
with the MinGW toolchain on Windows does not work properly with the
check, as the needed transformations are not applied. They are necessary
there the same way as with MinGW on Linux.
https://gitlab.com/libvirt/libvirt/-/issues/453
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
In a few places we still use the good old:
  sizeof(var) / sizeof(var[0])
  sizeof(var) / sizeof(int)
The G_N_ELEMENTS() macro is preferred though. In a few places we
don't link with glib, so provide the macro definition.
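For code that is not linked with glib, a fallback along these lines
can be used. This is a minimal sketch; the array and the summing
function are made up for illustration only:

```c
#include <assert.h>
#include <stddef.h>

/* Fallback for code that is not linked with glib; same shape as the
 * usual glib definition. */
#ifndef G_N_ELEMENTS
# define G_N_ELEMENTS(arr) (sizeof(arr) / sizeof((arr)[0]))
#endif

static const int sample_values[] = { 10, 20, 30, 40 };

/* Sum an array without hard-coding its length or its element type. */
static int
sum_sample_values(void)
{
    int sum = 0;
    size_t i;

    for (i = 0; i < G_N_ELEMENTS(sample_values); i++)
        sum += sample_values[i];
    return sum;
}
```

Unlike sizeof(var) / sizeof(int), the macro keeps working if the
element type of the array changes later.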
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Kristina Hanicova <khanicov@redhat.com>
Set useBinarySpecificLabel = true when calling qemuSecurityCommandRun
for the passt process, so that the new process context will include
the binary-specific label that should be used for passt (passt_t)
rather than svirt_t (as would happen if useBinarySpecificLabel was
false). (The MCS part of the label, which is common to all child
processes related to a particular qemu domain instance, is also set).
Resolves: https://bugzilla.redhat.com/2172267
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Normally when a child process is started by libvirt, the SELinux label
of that process is set to virtd_t (plus an MCS range). In at least one
case (passt) we need the SELinux label of a child process to
match the label that the binary would have transitioned to
automatically if it had been run standalone (in the case of passt,
that label is passt_t).
This patch modifies virSecuritySELinuxSetChildProcessLabel() (and all
the functions above it in the call chain) so that the toplevel
function can set a new argument "useBinarySpecificLabel" to true. If
it is true, then virSecuritySELinuxSetChildProcessLabel() will call
the new function virSecuritySELinuxContextSetFromFile(), which uses
the selinux library function security_compute_create() to determine
what would be the label of the new process if it had been run
standalone (rather than being run by libvirt) - the MCS range from the
normally-used label is added to this newly derived label, and that is
what is used for the new process rather than whatever is in the
domain's security label (which will usually be virtd_t).
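To illustrate how the MCS range is carried over, here is a minimal
string-based sketch. It assumes the usual user:role:type:range layout
of an SELinux context and does not use libselinux at all; the helper
names are hypothetical and this is not how the patch itself is
implemented:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Return a pointer just past the n-th ':' in s, or NULL if s has fewer
 * than n colons. */
static const char *
after_nth_colon(const char *s, int n)
{
    while (n > 0 && (s = strchr(s, ':')) != NULL) {
        s++;
        n--;
    }
    return s;
}

/* Build "user:role:type" from the computed context plus the range taken
 * from the base label. Returns 0 on success, -1 on malformed input or if
 * out is too small. */
static int
context_with_base_range(const char *computed, const char *base,
                        char *out, size_t outlen)
{
    const char *crange = after_nth_colon(computed, 3);
    const char *brange = after_nth_colon(base, 3);
    int n;

    if (!crange || !brange)
        return -1;

    /* the prefix length includes the trailing ':' before the range */
    n = snprintf(out, outlen, "%.*s%s",
                 (int)(crange - computed), computed, brange);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

For example, combining a computed "system_u:system_r:passt_t:s0" with
a base label ending in "s0:c123,c456" gives a passt_t context that
carries the domain's categories.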
In order to easily verify that nothing was broken by these changes to
the call chain, all callers currently set useBinarySpecificLabel =
false, so all behavior should be completely unchanged. (The next
patch will set it to true only for the case of running passt.)
https://bugzilla.redhat.com/2172267
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Neither of these is modified anywhere in the function, and the
function will soon be called with an arg that actually is a const.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The binary to be exec'ed by virExec() is stored in
virCommand::args[0], and is resolved to a full absolute path (stored
in a local of virExec()) just prior to execve().
Since we will have another use for the full absolute path, let's make
an API to resolve/retrieve the absolute path, and cache it in
virCommand::binaryPath so we only have to do the resolution once.
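The resolve-once-and-cache idea can be sketched as follows. The struct
and function names are hypothetical, simplified stand-ins (not the
virCommand API), and only absolute paths and bare names looked up in
$PATH are handled:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical, simplified stand-in for virCommand. */
struct cmd {
    const char *arg0;   /* binary name as given by the caller */
    char *binary_path;  /* cached absolute path, NULL until resolved */
    int lookups;        /* how many times resolution actually ran */
};

/* Return the absolute path of cmd->arg0, resolving it at most once on
 * success. Returns NULL if the binary cannot be found. */
static const char *
cmd_get_binary_path(struct cmd *cmd)
{
    const char *path_env;
    char *dirs, *dir, *save = NULL;
    char candidate[4096];

    if (cmd->binary_path)
        return cmd->binary_path;   /* cached from an earlier call */

    cmd->lookups++;

    if (cmd->arg0[0] == '/') {
        /* already absolute: no $PATH search needed */
        cmd->binary_path = strdup(cmd->arg0);
        return cmd->binary_path;
    }

    path_env = getenv("PATH");
    if (!path_env || !(dirs = strdup(path_env)))
        return NULL;

    /* try each $PATH entry until an executable candidate is found */
    for (dir = strtok_r(dirs, ":", &save); dir;
         dir = strtok_r(NULL, ":", &save)) {
        snprintf(candidate, sizeof(candidate), "%s/%s", dir, cmd->arg0);
        if (access(candidate, X_OK) == 0) {
            cmd->binary_path = strdup(candidate);
            break;
        }
    }
    free(dirs);
    return cmd->binary_path;
}
```

A second call on the same struct returns the cached pointer without
searching again, which is the point of keeping the path in the
command object.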
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
libxl added support for specifying custom firmware paths long ago. The
functionality exists in all Xen versions supported by libvirt. This patch
adds support for user-specified efi firmware paths in the libxl driver.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
passt provides an AppArmor abstraction that covers all the
inner details of its operation, so we can simply import that
and add the libvirt-specific parts on top: namely, passt
needs to be able to create a socket and pid file, while
the libvirt daemon needs to be able to kill passt.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Currently it's only possible to set this parameter during domain
creation via QEMU commandline passthrough feature.
With the new delay attribute it's also possible to set this
parameter if you want to attach a new NBD disk
using "virsh attach-device domain device.xml" e.g.:
  <disk type='network' device='disk'>
    <driver name='qemu' type='raw'/>
    <source protocol='nbd' name='foo'>
      <host name='example.org' port='6000'/>
      <reconnect delay='10'/>
    </source>
    <target dev='vdb' bus='virtio'/>
  </disk>
Signed-off-by: Christian Nautze <christian.nautze@exoscale.ch>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Commit 54fa1b44af ("conf: Add loadparm boot option for a boot device")
added the ability to specify a loadparm parameter on a <boot/> tag, while
commit 29ba41c2d4 ("qemu: Add loadparm to qemu command line string")
added that value to the QEMU "-machine" command line parameters.
Unfortunately, the latter commit only looked at disks and network
devices for boot information, even though anything with
VIR_DOMAIN_DEF_FORMAT_ALLOW_BOOT could potentially have this tag.
In practice, a <hostdev> tag pointing to a passthrough (SCSI or DASD)
disk device can be used in this way, which means the loadparm is
accepted, but not given to QEMU.
Correct this, and add some XML/argv tests.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Good to have for debugging in case something goes wrong during
incoming migration.
Signed-off-by: Eiichi Tsukata <eiichi.tsukata@nutanix.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
For shutoff VMs we don't have the storage source backing chain
populated so it will fail this check and error out. Move it to
the part that is done only when VM is running.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>