Links between NUMA nodes can have different latencies and
bandwidths. This information was introduced in ACPI 6.2 in the
Heterogeneous Memory Attribute Table (HMAT). The Linux kernel
learned how to report these values under sysfs and thus we can
expose them in our capabilities XML. The sysfs interface is
documented in the kernel's Documentation/admin-guide/mm/numaperf.rst.
Long story short, two nodes can be in an initiator-target
relationship. A node can be an initiator if it has a CPU or a
device that's capable of initiating memory transfers. Therefore a
node that has just memory can only be a target. An initiator-target
link can then have any combination of {bandwidth, latency} -
{access, read, write} attributes (6 in total). However, the
standard says access is applicable iff read and write values are
the same. Therefore, we really have just four combinations of
attributes: bandwidth-read, bandwidth-write, latency-read,
latency-write. This is the combination that the kernel reports
anyway.
Then, under /sys/devices/system/node/nodeX/accessN/initiators we
find values for those 4 attributes and also symlinks named
"nodeN" which then represent initiators to nodeX. For instance:
/sys/devices/system/node/node1/access1/initiators/node0 -> ../../node0
/sys/devices/system/node/node1/access1/initiators/read_bandwidth
/sys/devices/system/node/node1/access1/initiators/read_latency
/sys/devices/system/node/node1/access1/initiators/write_bandwidth
/sys/devices/system/node/node1/access1/initiators/write_latency
This means that node0 is the initiator and node1 is the target,
and the values of the interconnect can be read.
In theory, there can be separate links to memory side caches too
(e.g. one link from node X to node Y's main memory, another from
node X to node Y's L1 cache, another one to L2 cache and so on).
But sysfs does not express this relationship just yet.
The "accessN" means either "access0" or "access1". The difference
is that while the former expresses the best interconnect between
two nodes including CPUs and I/O devices (such as GPUs and NICs),
the latter includes only CPUs and thus is what we need.
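As a rough illustration of the lookup described above, the following C sketch builds the path of one "access1" attribute for a target node and reads its value. The helper names numa_initiator_attr_path() and numa_read_attr() are hypothetical and are not libvirt functions; the path layout is assumed to follow the kernel's numaperf.rst.
```c
#include <stdio.h>

/* Hypothetical helper, not a libvirt function: build the sysfs path of
 * one interconnect attribute (e.g. "read_latency") for a target NUMA
 * node, using the "access1" (CPU-only initiators) class.
 * Returns 0 on success, -1 if the buffer is too small. */
int
numa_initiator_attr_path(char *buf, size_t buflen,
                         unsigned int target, const char *attr)
{
    int n = snprintf(buf, buflen,
                     "/sys/devices/system/node/node%u/access1/initiators/%s",
                     target, attr);

    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}

/* Hypothetical helper: read a single unsigned value from an attribute
 * file. Returns 0 on success, -1 if the file is missing or cannot be
 * parsed (for instance on a kernel that does not expose these files). */
int
numa_read_attr(const char *path, unsigned long long *value)
{
    FILE *fp = fopen(path, "r");
    int rc = -1;

    if (!fp)
        return -1;
    if (fscanf(fp, "%llu", value) == 1)
        rc = 0;
    fclose(fp);
    return rc;
}
```
A caller would presumably loop over the four attribute names for every target node and treat a missing file as "not reported" rather than as an error.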
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1786309
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Expose virNumaInterconnect XML formatter so that it can be
re-used by other parts of the code.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
There's nothing domain specific about NUMA interconnects. Rename
the virDomainNumaInterconnect* structures and enums to
virNumaInterconnect*.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Memory on a NUMA node can have side caches. Configuring these
for a domain was implemented in v6.6.0-rc1~249 and friends.
However, up until now mgmt applications did not really know what
values to pass because we were not exposing caches of the host.
With recent enough kernel these are exposed under sysfs and with
a bit of parsing we can extend our capabilities XML. The sysfs
structure is documented in kernel's
Documentation/admin-guide/mm/numaperf.rst and basically maps in
1:1 fashion to our virNumaCache structure.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Expose virNumaCache XML formatter so that it can be re-used by
other parts of the code.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
There's nothing domain specific about NUMA memory caches. Rename the
virDomainCache* structures and enums to virNumaCache*.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
The way we format <cpu/> element for capabilities is not ideal,
because if there are no CPUs, i.e. no child elements, we still
output opening and closing element. To solve this,
virXMLFormatElement() could be used but that would introduce more
variables into the loop. Therefore, move the formatter into a
separate function and use virXMLFormatElement().
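The behaviour relied upon here can be sketched in plain C. This is a simplified stand-in that works on C strings, not libvirt's virXMLFormatElement() implementation; the function name format_element() and the element names used in the usage note are illustrative assumptions.
```c
#include <stdio.h>

/* Simplified stand-in for the idea behind virXMLFormatElement(); not
 * libvirt code. @attrs, when present, is expected to start with a space.
 * Writes nothing when there are neither attributes nor children, a
 * self-closing element when there are only attributes, and a full
 * open/close pair otherwise. Returns the snprintf()-style length. */
int
format_element(char *out, size_t outlen, const char *name,
               const char *attrs, const char *children)
{
    int has_attrs = attrs && *attrs;
    int has_children = children && *children;

    if (!has_attrs && !has_children) {
        if (outlen > 0)
            out[0] = '\0';
        return 0;
    }

    if (!has_children)
        return snprintf(out, outlen, "<%s%s/>\n", name, attrs);

    return snprintf(out, outlen, "<%s%s>\n%s</%s>\n",
                    name, has_attrs ? attrs : "", children, name);
}
```
With this shape, a node without CPUs yields either no output or a single self-closing element instead of an empty open/close pair.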
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
It may happen that a NUMA node has no CPUs associated with it. We
allow this for domains since v6.6.0-rc1~250. Let's update our
capabilities schema to match that.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Ideally, turning pointers into g_auto* would be done in one step
and dropping cleanup label and unused @ret variable in second
step, but since this is a test we don't care that much, do we?
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
When using firmware auto-selection and the user enables AMD SEV-ES, we
need to pick the correct firmware that actually supports it. This can be
detected by having `amd-sev-es` in the firmware JSON description.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
In a few places we take 1 and shift it left repeatedly, so far
that the result no longer fits into a signed integer. The problem
is that this is undefined behaviour. Switching to 1U makes us stay
within boundaries.
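A minimal sketch of the difference, assuming a 32-bit int as on the common platforms; bit_mask() is an illustrative helper and not a libvirt function.
```c
/* Illustrative helper (not from libvirt): return a mask with only bit
 * @pos set. Using 1U keeps the shift in unsigned arithmetic, which is
 * well defined for any @pos smaller than the width of unsigned int.
 * With a plain (signed) 1, shifting into the sign bit, e.g. 1 << 31
 * with a 32-bit int, is undefined behaviour. */
unsigned int
bit_mask(unsigned int pos)
{
    return 1U << pos;
}
```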
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Tim Wiederhake <twiederh@redhat.com>
In a few places it may happen that the array we want to sort is
still NULL (e.g. because there were no leases found, no paths for
secdriver to lock or no cache banks). However, passing NULL to
qsort() is undefined and even though glibc plays nicely we
shouldn't rely on undefined behaviour.
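The guard can be sketched as follows. The names sort_ints() and cmp_int() are made up for this example; the actual call sites sort different element types.
```c
#include <stdlib.h>

/* Comparison callback for qsort(): ascending order of ints. */
int
cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;

    return (x > y) - (x < y);
}

/* Illustrative wrapper: only call qsort() when there is an array to
 * sort, so that a NULL pointer is never passed to it. */
void
sort_ints(int *items, size_t nitems)
{
    if (!items || nitems == 0)
        return;

    qsort(items, nitems, sizeof(*items), cmp_int);
}
```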
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Tim Wiederhake <twiederh@redhat.com>
meson supports the following sanitizers: "address" (e.g. out-of-bounds
memory access, use-after-free, etc.), "thread" (data races), "undefined"
(e.g. signed integer overflow), and "memory" (use of uninitialized
memory). Note that not all sanitizers are supported by all compilers,
and that more sanitizers exist.
Not all sanitizers can be enabled at the same time, but "address" and
"undefined" can. Both thread and memory sanitizers require an instrumented
build of all dependencies, including libc.
gcc and clang use different implementations of these sanitizers and
have proven to find different issues. Create CI jobs for both.
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
"virt-aa-helper" links, amongst others, against "datatypes.o" and
"libvirt.so". The latter links against "libvirt_driver.a" which in turn
also links against "datatypes.o", leading to a One-Definition-Rule
violation for "virConnectClass" et al. in "datatypes.c".
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
"openvzutilstest" links, amongst others, against "libvirt_openvz.a" and
"libvirt.so". The latter also links against "libvirt_openvz.a", leading
to a One-Definition-Rule violation for "openvzLocateConfFile" in
"openvz_conf.c".
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
When other preloaded libraries wrap and / or make calls to `realpath`
(e.g. LLVM's AddressSanitizer), the second parameter is no longer
guaranteed to be NULL.
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
When enabling sanitizers, clang adds some function symbols when
instrumenting the code. The exact names of those functions are an
implementation detail and should therefore not be added to any
syms file. This patch prevents build failures due to those symbols
not present in the syms file when building with sanitizers enabled.
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
When enabling sanitizers, gcc adds some instrumentation to the code
that may enlarge stack frames. Some functions' stack frames are already
close to the limit of 4096 and are enlarged past that threshold,
e.g. virLXCProcessStart which reaches a frame size of 4624 bytes.
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Now we have everything prepared so that @model doesn't have to be
rewritten. The correct model can be chosen right from the
beginning.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
We want to call qemuBuildVirtioDevStr() from
qemuBuildDeviceVideoStr() but only for some models (currently
"virtio-gpu" and "vhost-user-gpu"), not all of them. Move this
logic into qemuDeviceVideoGetModel() because this logic will be
refined.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This may look like a step backwards, but it isn't. The point is
that in near future the chosen model will depend on more than
just video type.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This may look like a step backwards, but it isn't. The point is
that in near future the chosen model will depend on more than
just video type.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
There is the same check written twice (whether given video card
is primary one and whether it supports VGA mode). Write it just
once and store it in a boolean variable.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The code that decides video card model is going to be reworked
and expanded. Separate it out into a function.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This function doesn't modify passed video definition. Make the
argument const.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
QEMU 6.1 will replace the virgl property of the virtio-vga device with
the virtio-vga-gl device. Adapt to that change.
Resolves: https://gitlab.com/libvirt/libvirt/-/issues/167
Signed-off-by: Han Han <hhan@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
QEMU 6.1 will add virtio-gpu-gl-pci device to replace the virgl property
of virtio-gpu-pci device. Adapt to that change.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1967356
Signed-off-by: Han Han <hhan@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The devices virtio-gpu-gl-pci and virtio-vga-gl, aimed to replace the
virgl property, are valid for 3d acceleration as well.
Signed-off-by: Han Han <hhan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This flag will be used for the device virtio-gpu-gl-pci, which was
introduced in QEMU 6.1.
Signed-off-by: Han Han <hhan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
"avx-vnni" was introduced to qemu in commit
c1826ea6a052084f2e6a0bae9dd5932a727df039, adding it to Cooperlake.
This feature is currently not used by any libvirt CPU models, but its
addition silences a warning from sync_qemu_i386.py:
```
warning: Unknown feature 'CPUID_7_1_EAX_AVX_VNNI'
warning: Feature unknown to libvirt: CPUID_7_1_EAX_AVX_VNNI
```
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Connecting a tap device to an Open vSwitch is done by adding a "port"
to the switch with the ovs-vsctl "add-port" command. The port will
have the same name as the tap device, but it is a separate entity, and
can survive beyond the destruction of the tap device (although under
normal circumstances the port will be deleted around the same time the
tap device is deleted).
This makes it possible for a port of a particular name to already
exist at the time libvirt calls ovs-vsctl to add that port. The
original commit of Open vSwitch support (commit df81004632, libvirt
0.9.10, Feb. 2012) used the "--may-exist" option to the add-port
command to indicate that a port of the desired name might already
exist, and that it was okay to simply re-use this port (rather than
failing with an error message).
Then in commit 33445ce844 (libvirt 1.2.7, April 2014) the command
was changed to use "--if-exists del-port blah" instead of
"--may-exist". The reason given was that there was a bug in OVS where
a stale port would be unusable even though it still existed; the
workaround was to forcibly delete any existing port prior to adding
the new port (of the same name). This is the ovs-vsctl command still
in use by libvirt today.
It recently came up in the discussion of a bug concerning guest packet
loss during OpenStack upgrades (https://bugzilla.redhat.com/1963164)
that the bug in OVS that necessitated the del-port workaround was
fixed quite a long time ago (August 2015):
e21c6643a0
thus rendering the workaround in libvirt unnecessary. The assertion in
that discussion is that this workaround is now the cause of the packet
loss being experienced during OpenStack upgrades. I'm not convinced
this is the case, but it does appear that there is no reason to carry
this workaround in libvirt any longer, so this patch reverts the code
back to the original behavior (using "--may-exist" instead of
"--if-exists del-port").
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
I've identified some places (mostly by looking for
virBufferUse()) that can use virXMLFormatElement() instead of
open coded version of it. I'm sure there are many more places
that could use the same treatment. Let's cure them some other
time.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The semicolon in question makes the pipeline fail over a style checker
complaint.
Introduced-in: 360b8eb2d2
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
virt-host-validate should print "Checking for device assignment IOMMU
support" for all architectures, not only for Intel / AMD.
This is the output without the patch:
```
[fidencio@dentola libvirt]$ virt-host-validate
QEMU: comprobando if device /dev/kvm exists : PASA
QEMU: comprobando if device /dev/kvm is accessible : PASA
QEMU: comprobando if device /dev/vhost-net exists : PASA
QEMU: comprobando if device /dev/net/tun exists : PASA
QEMU: comprobando for cgroup 'cpu' controller support : PASA
QEMU: comprobando for cgroup 'cpuacct' controller support : PASA
QEMU: comprobando for cgroup 'cpuset' controller support : PASA
QEMU: comprobando for cgroup 'memory' controller support : PASA
QEMU: comprobando for cgroup 'devices' controller support : ADVERTENCIA (Enable 'devices' in kernel Kconfig file or mount/enable cgroup controller in your system)
QEMU: comprobando for cgroup 'blkio' controller support : PASA
ADVERTENCIA (Unknown if this platform has IOMMU support)
QEMU: comprobando for secure guest support : ADVERTENCIA (Unknown if this platform has Secure Guest support)
```
This is the output with the patch:
```
[fidencio@dentola libvirt]$ ./build/tools/virt-host-validate
QEMU: Checking if device /dev/kvm exists : PASS
QEMU: Checking if device /dev/kvm is accessible : PASS
QEMU: Checking if device /dev/vhost-net exists : PASS
QEMU: Checking if device /dev/net/tun exists : PASS
QEMU: Checking for cgroup 'cpu' controller support : PASS
QEMU: Checking for cgroup 'cpuacct' controller support : PASS
QEMU: Checking for cgroup 'cpuset' controller support : PASS
QEMU: Checking for cgroup 'memory' controller support : PASS
QEMU: Checking for cgroup 'devices' controller support : WARN (Enable 'devices' in kernel Kconfig file or mount/enable cgroup controller in your system)
QEMU: Checking for cgroup 'blkio' controller support : PASS
QEMU: Checking for device assignment IOMMU support : WARN (Unknown if this platform has IOMMU support)
QEMU: Checking for secure guest support : WARN (Unknown if this platform has Secure Guest support)
```
Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This was introduced in qemu commit
e11fd68996fb27c040552320f01a7d30a15a7cc1.
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This member is unused (apart from being set in
virCHDriverConfigNew()) and is never actually freed (leading to a
memory leak).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
If the chStateInitialize method fails, we call chStateCleanup
which frees all global state. It fails to set the global
'ch_driver' to NULL, however, so a later attempt to open the
cloud hypervisor driver will succeed and then crash attempting
to access freed memory.
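The pattern can be sketched with hypothetical stand-ins. The demo_* names below are illustrative and do not correspond to the actual cloud hypervisor driver code; the point is only that cleanup resets the global pointer so that a later open attempt is refused instead of touching freed memory.
```c
#include <stdlib.h>

/* Hypothetical stand-in for the driver state; not the real structure. */
struct demo_driver {
    int initialized;
};

/* Global state pointer, playing the role of 'ch_driver'. */
struct demo_driver *demo_driver_state;

/* Free the global state and reset the pointer. */
int
demo_state_cleanup(void)
{
    if (!demo_driver_state)
        return -1;

    free(demo_driver_state);
    demo_driver_state = NULL;   /* the step that was missing */
    return 0;
}

/* Allocate the global state; @fail simulates an error that occurs
 * after the allocation, in which case everything is cleaned up. */
int
demo_state_initialize(int fail)
{
    demo_driver_state = calloc(1, sizeof(*demo_driver_state));
    if (!demo_driver_state)
        return -1;

    if (fail) {
        demo_state_cleanup();
        return -1;
    }

    demo_driver_state->initialized = 1;
    return 0;
}

/* Opening a connection is refused when there is no driver state. */
int
demo_connect_open(void)
{
    if (!demo_driver_state)
        return -1;

    return demo_driver_state->initialized ? 0 : -1;
}
```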
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>