Commit Graph

30676 Commits

Author SHA1 Message Date
Jonathon Jongsma
2a615af38f nodedev: Handle NULL command variable
In commit 68580a51, I removed the checks for NULL cmd variables because
virCommandRun() already handles the case where it is called with a NULL
cmd. Unfortunately, it handles this case by raising a generic error
which is both unhelpful and overwrites our existing error message. So
for example, when I attempt to create a mediated device with an invalid
parent, I get the following output:

    virsh # nodedev-create mdev-test.xml
    error: Failed to create node device from mdev-test.xml
    error: internal error: invalid use of command API

With this patch, I now get a useful error message again:

    virsh # nodedev-create mdev-test.xml
    error: Failed to create node device from mdev-test.xml
    error: internal error: unable to find parent device 'pci_0000_00_03_0'

Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
2021-07-01 16:34:03 +02:00
Jonathon Jongsma
a96df6424f nodedev: Remove useless device name from error message
At the point where the error message is emitted, the field def->name is
still set to "new device", so the error message becomes:

  Unable to start mediated device 'new device': ...

Since the name doesn't contain anything useful, just omit it from the
error message altogether.

Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
2021-07-01 16:34:03 +02:00
Jonathon Jongsma
e9b534905f nodedev: handle mdevs from multiple parents
Due to a rather unfortunate misunderstanding, we were parsing the list
of defined devices from mdevctl incorrectly. Since my primary
development machine only has a single device capable of mdevs, I
apparently neglected to test multiple parent devices and made some
assumptions based on reading the mdevctl code. These assumptions turned
out to be incorrect, so the parsing failed when devices from more than
one parent device were returned.

The details: mdevctl returns an array of objects representing the
defined devices. But instead of an array of multiple objects (with each
object representing a parent device), the array always contains only a
single object. That object has a separate property for each parent
device.

Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-07-01 16:34:03 +02:00
Shalini Chellathurai Saroja
9c3b6b7a82 conf: verify for duplicate hostdevs
It is possible to define/edit(in shut off state) a domain XML with
same hostdev device repeated more than once, as shown below. This
behavior is not expected. So, this patch fixes it.

vser1:
<domain type='kvm'>
[...]
  <devices>
 [...]
    <hostdev mode='subsystem' type='mdev' managed='no' model='vfio-ccw'>
      <source>
        <address uuid='8e782fea-e5f4-45fa-a0f9-024cf66e5009'/>
      </source>
      <address type='ccw' cssid='0xfe' ssid='0x0' devno='0x0005'/>
    </hostdev>
    <hostdev mode='subsystem' type='mdev' managed='no' model='vfio-ccw'>
      <source>
        <address uuid='8e782fea-e5f4-45fa-a0f9-024cf66e5009'/>
      </source>
      <address type='ccw' cssid='0xfe' ssid='0x0' devno='0x0006'/>
    </hostdev>
[...]
  </devices>
</domain>

$ virsh define vser1
Domain 'vser1' defined from vser1

Signed-off-by: Shalini Chellathurai Saroja <shalini@linux.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-07-01 16:34:03 +02:00
Andrea Bolognani
d8a1c059e0 qemu: Reject TPM 1.2 for ARM virt guests
We already reject TPM 1.2 in a number of scenarios; let's add
ARM virt guests to the list.

https://bugzilla.redhat.com/show_bug.cgi?id=1970310

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Tested-by: Liu Yiding <liuyd.fnst@fujitsu.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-07-01 16:15:05 +02:00
Andrea Bolognani
7ace0fd221 qemu: Default to TPM 2.0 for ARM virt guests
The TPM 2.0 specification predates ARM virtualization, and so
implementing TPM 1.2 support on ARM was not considered a useful
endeavor.

This is technically a breaking change, but TPM support on ARM was
only introduced fairly recently (libvirt 7.1.0) and the previous
default resulted in non working TPM devices; anyone who has a
working configuration is not going to be affected.

https://bugzilla.redhat.com/show_bug.cgi?id=1970310

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Tested-by: Liu Yiding <liuyd.fnst@fujitsu.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-07-01 16:15:05 +02:00
Daniel P. Berrangé
15073504db security: fix SELinux label generation logic
A process can access a file if the set of MCS categories
for the file is equal-to *or* a subset-of, the set of
MCS categories for the process.

If there are two VMs:

  a) svirt_t:s0:c117
  b) svirt_t:s0:c117,c720

Then VM (b) is able to access files labelled for VM (a).

IOW, we must discard case where the categories are equal
because that is a subset of many other valid category pairs.

Fixes: https://gitlab.com/libvirt/libvirt/-/issues/153
CVE-2021-3631
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2021-06-30 14:51:42 +01:00
Michal Privoznik
438b50dda8 virSetUIDGIDWithCaps: Don't drop CAP_SETPCAP right away
There are few cases where we execute a virCommand with all caps
cleared (virCommandClearCaps()). For instance
dnsmasqCapsRefreshInternal() does just that. This means, that
after fork() and before exec() the virSetUIDGIDWithCaps() is
called. But since the caller did not want to change anything,
just drop capabilities, these are the values of arguments:

  virSetUIDGIDWithCaps (uid=-1, gid=-1, groups=0x0, ngroups=0,
                        capBits=0, clearExistingCaps=true)

This means that indeed all capabilities will be dropped,
including CAP_SETPCAP. But this capability controls whether
capabilities can be set, IOW whether capng_apply() succeeds.

There are two calls of capng_apply() in the function. The
CAP_SETPCAP is dropped after the first call and thus the other
call (capng_apply(CAPNG_SELECT_BOUNDS);) fails.

The solution is to keep the capability for as long as needed
(just like CAP_SETGID and CAP_SETUID) and drop it only at the
very end (just like CAP_SETGID and CAP_SETUID).

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1949388
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2021-06-29 08:52:12 +02:00
Jim Fehlig
64ae7635e6 Apparmor: Allow reading /etc/ssl/openssl.cnf
I noticed the following denial when running confined VMs with the QEMU
driver

type=AVC msg=audit(1623865089.263:865): apparmor="DENIED" operation="open" \
profile="virt-aa-helper" name="/etc/ssl/openssl.cnf" pid=12503 \
comm="virt-aa-helper" requested_mask="r" denied_mask="r" fsuid=0 ouid=0

Allow reading the file by including the openssl abstraction in the
virt-aa-helper profile.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Reviewed-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
2021-06-24 13:54:47 -06:00
Jim Fehlig
f552e68d9f Apparmor: Allow reading libnl's classid file
I noticed the following denial messages from apparmor in audit.log when
starting confined VMs via the QEMU driver

type=AVC msg=audit(1623864006.370:837): apparmor="DENIED" operation="open" \
profile="virt-aa-helper" name="/etc/libnl/classid" pid=11265 \
comm="virt-aa-helper" requested_mask="r" denied_mask="r" fsuid=0 ouid=0

type=AVC msg=audit(1623864006.582:849): apparmor="DENIED" operation="open" \
profile="libvirt-0ca2720d-6cff-48bb-86c2-61ab9a79b6e9" \
name="/etc/libnl/classid" pid=11270 comm="qemu-system-x86" \
requested_mask="r" denied_mask="r" fsuid=107 ouid=0

It is possible for site admins to assign names to classids in this file,
which are then used by all libnl tools, possibly those used by libvirt.
To be on the safe side, allow read access to the file in the virt-aa-helper
profile and the libvirt-qemu abstraction.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Reviewed-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
2021-06-24 13:54:42 -06:00
Daniel P. Berrangé
fdaddd910e rpc: prefer SHA256 host key fingerprint with new libssh
The host key fingerprint for SSH servers is used in a scenario where
cryptographic strength is important. We should thus be defaulting to
use of SHA256 where available. We only need SHA1 for Ubuntu 18.04
which does not have libssh >= 0.8.1

Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2021-06-23 18:43:22 +01:00
Tim Wiederhake
b683978f1f virDomainFeaturesDefParse: Simplify APIC parsing
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-06-23 13:45:56 +02:00
Tim Wiederhake
f1a65a8163 virDomainFeaturesCapabilitiesDefParse: Remove ctxt
Iterating over all child elements of a node does not require xpath.
By doing away with xpath for this code, the code can be simplified.

Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-06-23 13:45:54 +02:00
Tim Wiederhake
2afc9fdc82 virDomainFeaturesDefParse: Factor out capabilities parsing into separate function
Cleanup to follow. This removes the last re-use of `nodes` in this function,
eliminating two VIR_FREEs.

Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-06-23 13:45:52 +02:00
Tim Wiederhake
2c2fe23bef virDomainFeaturesDefParse: Inline MSRS parsing
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-06-23 13:45:49 +02:00
Tim Wiederhake
eeb94215b0 virDomainFeaturesDefParse: Inline SMM parsing
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-06-23 13:45:47 +02:00
Tim Wiederhake
6e872ab3f4 virDomainFeaturesXENDefParse: Remove tautological "if"
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-06-23 13:45:45 +02:00
Tim Wiederhake
f1149b8d3a virDomainFeaturesXENDefParse: Remove ctxt
Iterating over all child elements of a node does not require xpath.
By doing away with xpath for this code, the code can be simplified.

Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-06-23 13:45:43 +02:00
Tim Wiederhake
6b45c61e88 virDomainFeaturesDefParse: Factor out XEN parsing into separate function
Only moving code, cleanup to follow.

Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-06-23 13:45:40 +02:00
Tim Wiederhake
b194a21a9e virDomainFeaturesKVMDefParse: Remove tautological "if"
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-06-23 13:45:38 +02:00
Tim Wiederhake
e2bce45829 virDomainFeaturesKVMDefParse: Remove tautological "switch"
`feature` is always one of the values listed in the switch,
ensured by `virDomainKVMTypeFromString` above.

Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-06-23 13:45:36 +02:00
Tim Wiederhake
3c5e607b24 virDomainFeaturesKVMDefParse: Remove ctxt
Iterating over all child elements of a node does not require xpath.
By doing away with xpath for this code, the code can be simplified.

Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-06-23 13:45:34 +02:00
Tim Wiederhake
947204c1a2 virDomainFeaturesDefParse: Factor out KVM parsing into separate function
Only moving code, cleanup to follow.

Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-06-23 13:45:32 +02:00
Tim Wiederhake
95ef93f2a3 virDomainFeaturesHyperVDefParse: Remove tautological "if"
Fix some line wrapping in the process.

Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-06-23 13:45:29 +02:00
Tim Wiederhake
70a4ac857c virDomainFeaturesHyperVDefParse: Remove ctxt
Iterating over all child elements of a node does not require xpath.
By doing away with xpath for this code, the code can be simplified.

Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-06-23 13:45:27 +02:00
Tim Wiederhake
7b82efcf46 virDomainFeaturesHyperVDefParse: Inline hyperv/stimer parsing
Iterating over all child elements of a node does not require xpath.
By doing away with xpath for this code, the code can be inlined and
simplified. This also removes the re-use of `nodes`, elimininating
two VIR_FREEs.

Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-06-23 13:45:24 +02:00
Tim Wiederhake
9489700da1 virDomainFeaturesDefParse: Factor out HyperV parsing into separate function
Only moving code, cleanup to follow.

Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-06-23 13:45:21 +02:00
Peter Krempa
73985cacf4 chValidateDomainDeviceDef: Remove per-device-type error messages
Vast majority of device types is not supported by the Cloud-Hypervisor
driver. Simplify the error reporting by using
virDomainDeviceTypeToString.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2021-06-22 13:25:23 +02:00
Jim Fehlig
e58004d70a Xen: Remove unneeded LIBXL_HAVE_* ifdefs
Now that the minimum supported Xen version has bumped to 4.9, all
uses of LIBXL_HAVE_* that are included in Xen 4.9 can be removed
from the libxl driver.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2021-06-21 10:43:04 -06:00
Pavel Hrdina
36d6da4ebf virresctrl: fix starting VMs with cputune.memorytune specified
When removing check for return value of VIR_EXPAND_N this place was
incorrectly modified causing failure to start a VM with cputune
memorytune configured with useless error message:

    error: Failed to start domain 'vm1'
    error: An error occurred, but the cause is unknown

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1973094
Fixes: 7d2fd6ef01
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-06-21 13:17:18 +02:00
Peter Krempa
71012d7164 virStorageBackendISCSIDirectFindPoolSources: Rework cleanup
virISCSIDirectScanTargets now returns a GStrv, so we can use automatic
cleanup for it and get rid of the cleanup section.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-06-21 10:46:35 +02:00
Peter Krempa
e51ffd2e33 virISCSIDirectUpdateTargets: Rework to simplify cleanup and return GStrv
Count the elements in advance rather than using VIR_APPEND_ELEMENT and
ensure that there's a NULL terminator for the string list so it's GStrv
compatible.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-06-21 10:46:35 +02:00
Peter Krempa
80b7e03ce5 virStorageBackendISCSIDirectFindPoolSources: Use allocated virStoragePoolSourceList
Using an allocated version together with copying the
host/initiator/device portions into it allows us to switch to automatic
clearing rather than open-coding it.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-06-21 10:46:35 +02:00
Peter Krempa
3776b6a93d conf: storage: Introduce virStoragePoolSourceListFree
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-06-21 10:46:35 +02:00
William Douglas
ff8557b433 ch_domain: Add handler for virDomainDeviceDefValidateCallback
Instead of trying to match devices passed in based on the monitor
detecting the number of devices that were used in the domain
definition, use the deviceValidateCallback to evaluate if
unsupported devices are used.

This allows the compiler to detect when new device types are added
that need to be checked.

Signed-off-by: William Douglas <william.douglas@intel.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-06-21 09:34:42 +02:00
Laine Stump
54b602019d qemu_hotplug: don't forget to add hostdev interfaces to the interface list
Originally qemuDomainAttachNetDevice() would wait until the cleanup at
the very end of the function to add newly hotplugged interfaces to the
domain's nets list. commit 7b8bec4560 modified it to add the new
interface to the nets list earlier (but not all the way at the
beginning of the function either, because there are some operations
(PCI address assignment in particular) that need the new device to not
yet be visible in the domaindef).

But hostdev interfaces short-circuit past most of the body of
qemuDomainAttachNetDevice() (since none of it applies to hostdev
interfaces). In the past that was okay, but since the line that adds
the new interface to the domaindef's nets list is in that "most of the
body", after that commit hotplugged hostdev interfaces are no longer
being properly added to the domaindef nets list, so they don't show up
in the status XML or the virsh domiflist output.

It really *is* important to add interfaces to the nets list earlier,
so we can't revert commit 7b8bec4560, and we also can't move the
insert to common code *earlier* in the function, so instead this patch
duplicates the VIR_APPEND_ELEMENT_COPY() just before the code path for
hostdev interfaces jumps to cleanup.

Resolves: https://bugzilla.redhat.com/1972468
Fixes: 7b8bec4560
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-06-18 19:04:40 -04:00
Daniel P. Berrangé
05bd8db60b remote: remove probing logic from virtproxyd dispatcher
Now that the remote driver itself can probe for listening sockets /
running daemons, virtproxyd doesn't need to probe URIs itself. Instead
it can just delegate to the remote driver.

Tested-by: Jim Fehlig <jfehlig@suse.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2021-06-18 17:13:11 +01:00
Daniel P. Berrangé
3e9b561139 remote: add support for probing drivers with modular daemons
With the traditional libvirtd, the virConnectOpen call will probe active
drivers server side to find which one to use when the URI is NULL/empty.

With the modular daemons though, the remote client does not know which
daemon to connect in the first place, so we can't rely on virConnectOpen
probing. Currently the virtproxyd daemon has code to probe for a
possible driver by looking at which sockets are listening or which
binaries are installed. The remote client can thus connect to virtproxyd
which in turn can connect to a real hypervisor driver.

The virtproxyd probing code though isn't something that needs to live in
virtproxyd. By moving it into the remote client we can get probing
client side in all scenarios and avoid the extra trip via virtproxyd in
the common case.

Tested-by: Jim Fehlig <jfehlig@suse.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2021-06-18 16:19:53 +01:00
Daniel P. Berrangé
191b3b81b1 remote: extract logic for probing for modular daemons
When virtproxyd gets a NULL URI, it needs to implement probing logic
similar to that found in virConnectOpen. The latter can't be used
directly since it relied on directly calling into the internal drivers
in libvirtd. virtproxyd approximates this behaviour by looking to see
what modular daemon sockets exist, or what daemon binaries are installed.

This same logic is also going to be needed when the regular libvirt
remote client switches to prefer modular daemons by default, as we
don't want to continue spawning libvirtd going forward.

Tested-by: Jim Fehlig <jfehlig@suse.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2021-06-18 16:19:42 +01:00
Daniel P. Berrangé
ce410b6ea9 remote: fix prefix for libxl Xen driver
The libxl driver supports xen:///system URLs and the daemon socket
uses 'virtxend' as the socket prefix.

Reported-by: Jim Fehlig <jfehlig@suse.com>
Reviewed-by: Jim Fehlig <jfehlig@suse.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2021-06-18 15:24:56 +01:00
Peter Krempa
b396e9dd9d qemuSnapshotCreateActiveExternal: Don't unlink memory snapshot image if it was existing before
When writing the memory snapshot into an existing file don't remove it
if the snapshot fails later.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-06-18 09:16:16 +02:00
Peter Krempa
b30a8ee67d conf: snapshot: rename variable holding memory snapshot file location
'file' is too generic to know what's going on. Rename it to
'memorysnapshotfile'.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-06-18 09:16:16 +02:00
Peter Krempa
308aafe289 qemuSnapshotPrepareDiskExternal: Refactor existing file check
Use the snapshot disk type from the definition now that we validate that
it matches.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-06-18 09:16:16 +02:00
Peter Krempa
919b129603 qemuSnapshotPrepareDiskExternal: Enforce match between snapshot type and existing file type
The code executed later when creating a snapshot makes all decisions
based on the configured type rather than the actual type of the existing
file, while the check whether the file exists is based solely on the
on-disk type.

Since a block device is allowed to exist even when not reusing existing
files in contrast to regular files this creates a potential for a block
device to squeak past the check but then be influenced by other code
executed later. Specifically this is a problem when creating a snapshot
with the following XML:

  <domainsnapshot>
    <disks>
      <disk name='vdb' type='file'>
        <source file='/dev/sdb'/>
      </disk>
    </disks>
  </domainsnapshot>

If the snapshot creation fails, '/dev/sdb' will be removed because it's
considered to be a regular file by the cleanup code.

Add a check that will force that the configured type matches the on-disk
state.

Additional supporting reason is that qemu stopped to accept block
devices with the 'file' backend, thus the above configuration will not
work any more. This allows us to fail sooner.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1972145
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-06-18 09:16:16 +02:00
Peter Krempa
66adff17a8 qemuSnapshotPrepareDiskExternal: Reject creation of block devices sooner
In case when the snapshot target is of VIR_STORAGE_TYPE_BLOCK type and
doesn't exist libvirt won't be able to create it. Reject such a config
sooner.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-06-18 09:16:16 +02:00
Peter Krempa
a96cc845d7 qemuSnapshotPrepareDiskExternal: Avoid condition squashing
Separate the 'else if' branches into nested conditions so that it's more
obvious when we'll be adding additional checks later.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-06-18 09:16:16 +02:00
Peter Krempa
006821a809 qemuSnapshotPrepareDiskExternal: Move temp variables into the block using them
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-06-18 09:16:16 +02:00
Peter Krempa
c3e578b2ef qemu: capabilities: Fill egl-headless graphics support only when it's really supported
virQEMUCapsFillDomainDeviceGraphicsCaps fills data needed both for
validation of the graphics type and also for correct display in the
(dom)capablities XML.

Signal the support for egl-headless only when qemu has the capability.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
2021-06-18 09:16:16 +02:00
Peter Krempa
4808323994 qemu: capabilities: Un-retire QEMU_CAPS_EGL_HEADLESS
egl-headless graphics can be compiled out in qemu so we need to be able
to know whether the given qemu version support it.

Base the capability on the presence of the 'egl-headless' member in
'query-display-options' or imply it if 'query-display-options' is not
supported as we implied it before for all versions.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
2021-06-18 09:16:16 +02:00
Michal Privoznik
70a2b618bb qemu: Deduplicate code in qemuSecurityChownCallback()
The DAC security driver has an option to register a callback that
is called instead of chown(). So far QEMU is the only user of
this feature and it's used to set labels on non-local disks (like
gluster), where exists notion of owners but regular chown() can't
be used.

However, this callback (if set) is called always, even for local
disks. And thus the QEMU's implementation duplicated parts of the
DAC driver to deal with chown().

If the DAC driver would call the callback only for non-local
disks then the QEMU's callback can be shorter.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-06-17 15:49:17 +02:00
Michal Privoznik
6fba030fed virSecurityDACSetOwnershipInternal: Fix WIN32 code
I must admit, I have no idea why we build such POSIX dependent
code as DAC driver for something such not POSIX as WIN32. Anyway,
the code which is supposed to set error is not doing that. The
proper way is to mimic what chown() does:

  On error, -1 is returned, and errno is set to indicate the error.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-06-17 15:49:12 +02:00
Michal Privoznik
b332c2cf89 virSecurityDACSetOwnershipInternal: Don't overwrite @path argument
As shown in the previous commit, @path can be NULL. However, in
that case @src->path is also NULL. Therefore, trying to "fix"
@path to be not NULL is not going to succeed. The real value of
NULLSTR() is in providing a non-NULL string for error reporting.
Well, that can be done in the error reporting without overwriting
argument.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-06-17 15:49:07 +02:00
Michal Privoznik
5cfb3369b1 virSecurityDACSetOwnershipInternal: Drop dead code
The virSecurityDACSetOwnershipInternal() function accepts two
arguments (among others): @path and @src. The idea being that in
some cases @path is NULL and @src is not and then @path is filled
from @src->path. However, this is done in both callers already
(because of seclabel remembering/recall). Therefore, this code in
virSecurityDACSetOwnershipInternal() is dead, effectively.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-06-17 15:49:02 +02:00
Michal Privoznik
4ac78b95d3 security_dac: Don't check for !priv in virSecurityDACSetOwnershipInternal()
The virSecurityDACSetOwnershipInternal() has two callers and in
both the private data (@priv) is obtained via
virSecurityManagerGetPrivateData(). But in case of DAC driver the
private data can never be NULL. This is because the private data
is allocated in virSecurityManagerNewDriver() according to
.privateDataLen attribute of secdriver. In case of DAC driver the
attribute is set to sizeof(virSecurityDACData).

NB, no other function within DAC driver checks for !priv.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-06-17 15:48:56 +02:00
Michal Privoznik
1740f33bc8 security_dac: Introduce g_autoptr for virSecurityDACChownList
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-06-17 15:48:52 +02:00
Michal Privoznik
0782c4dcb3 security_dac: Introduce virSecurityDACChownItemFree()
Introduce a function that frees individual items on the chown
list and declare and use g_autoptr() for it.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-06-17 15:48:44 +02:00
Michal Privoznik
91b5ced2f7 security_dac: Use g_autofree
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-06-17 15:48:10 +02:00
Ján Tomko
e8863b91fb conf: require target for external virtiofsd
When adding support for externally launched virtiofsd,
I was too liberal and did not require a target.

But the target is required, because it's passed to the
QEMU device, not to virtiofsd.

https://bugzilla.redhat.com/show_bug.cgi?id=1969232

Fixes: 12967c3e13
Fixes: 56dcdec1ac
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-06-17 10:51:24 +02:00
Ján Tomko
2dabd16588 conf: move filesystem target validation
Check the presence of the target in the validation phase.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-06-17 10:51:24 +02:00
Michal Privoznik
fb1289c155 qemu: Don't set NVRAM label when creating it
The NVRAM label is set in qemuSecuritySetAllLabel(). There's no
need to set its label upfront. In fact, setting it twice creates
an imbalance because it's unset only once which mangles seclabel
remembering. However, plain removal of the
qemuSecurityDomainSetPathLabel() undoes the fix for the original
bug (when dynamic ownership is off then the NVRAM is not created
with cfg->user and cfg->group but as root:root). Therefore, we
have to switch to virFileOpenAs() and pass cfg->user and
cfg->group and VIR_FILE_OPEN_FORCE_OWNER flag. There's no need to
pass VIR_FILE_OPEN_FORCE_MODE because the file will be created
with the proper mode.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1969347
Fixes: bcdaa91a27
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2021-06-17 09:15:09 +02:00
Ján Tomko
56dcdec1ac conf: reject duplicate virtiofs tags
https://gitlab.com/libvirt/libvirt/-/issues/178

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-06-16 16:57:57 +02:00
Lee Yarwood
b722f36e92 qemu_hotplug: Report VIR_ERR_DEVICE_MISSING when device is not found
126db34a81 had previously switched various
flows over to this from VIR_ERR_OPERATION_FAILED.

This change simply does the same for qemuDomainDetachPrepDisk,
qemuDomainDetachPrepInput and qemuDomainDetachPrepVsock to allow
management apps to centralise their error handling on just
VIR_ERR_DEVICE_MISSING for missing devices during a detach.

Signed-off-by: Lee Yarwood <lyarwood@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-06-16 13:01:36 +02:00
Michal Privoznik
9a51edebf8 virFindFileInPath: Don't pass NULL to g_canonicalize_filename()
If given file is not found in $PATH then g_find_program_in_path()
returns NULL. However, g_canonicalize_filename() does not accept
NULL as input.

Fixes: 65c2901906
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-06-15 21:14:03 +02:00
Peter Krempa
49d47342b3 virStorageBackendRBDGetVolNames: Refactor cleanup in 'rbd_list' version
Use automatic memory freeing for the string list so that we can remove
the cleanup section.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-06-15 16:58:23 +02:00
Peter Krempa
361a18f405 virStorageBackendRBDGetVolNames: Fix memory leak in 'rbd_list2' version
The 'rbd_image_spec_t' struct has two string members 'id' and
'name'. We only stole the 'name' members thus the 'id's as well as the
whole list would be leaked on success.

Restructure the code so that we copy out the image names and call
rbd_image_spec_list_cleanup on success rather than on error.

The error path is then handled by using g_autofree for 'images'.

Since we no longer have a error path after allocating the returned
string list we can completely remove its cleanup.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-06-15 16:58:23 +02:00
Peter Krempa
7d50abb805 qemuMonitorJSONGetStringListProperty: Don't return element count
The only caller doesn't care about the number of elements in the string
list so we don't have to calculate it.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-06-15 16:58:23 +02:00
Peter Krempa
99908b930d qemuMonitorJSONGetStringArray: Don't return element count
There's just one caller who cares (testQemuMonitorJSONGetTPMModels). Fix
it and remove the counting of elements.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-06-15 16:58:23 +02:00
Peter Krempa
a5bc5f0ecf virQEMUCapsProbeQMPTPM: Refactor handling of string lists
This refactors multiple aspects of the function:

1) Use automatic memory freeing
2) Remove need to check element count in the returned arrays
3) Fixes questionable code linebreaks
4) Removes reuse of variables

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-06-15 16:58:23 +02:00
Peter Krempa
b20ef5e6de virQEMUCapsProcessStringFlags: Don't require 'nvalues'
All callers pass in NULL-terminated string lists. Remove the 'nvalues'
argument and fix all callers.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-06-15 16:58:23 +02:00
Peter Krempa
651f77f898 qemu: capabilities: Use g_auto(GStrv) instead of virStringListFreeCount
All the capability getters which return a string list do in fact return
a NULL-terminated list so we can use g_auto(GStrv) to free it.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-06-15 16:58:23 +02:00
Peter Krempa
ed4c75c4da qemuMonitorJSONGetObjectTypes: Refactor cleanup
Use automatic memory clearing to simplify the control flow.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-06-15 16:58:23 +02:00
Peter Krempa
1a468c01a8 qemuMonitorJSONGetStringArray: Refactor cleanup
Use automatic memory clearing to simplify the control flow.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-06-15 16:58:23 +02:00
Peter Krempa
ea0b164367 qemuMonitorJSONGetCommands: Refactor cleanup
Use automatic memory freeing to simplify the control flow.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-06-15 16:58:23 +02:00
Peter Krempa
675755e044 qemuMonitorJSONGetMigrationCapabilities: Refactor cleanup
Use automatic memory clearing and remove the cleanup section.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-06-15 16:58:23 +02:00
Peter Krempa
d0f60b89f3 qemuMonitorJSONGetObjectProps: Refactor cleanup
Use 'g_autoptr' for the two temporary JSON objects and remove the
cleanup section.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-06-15 16:58:23 +02:00
Peter Krempa
b408580960 qemuMonitorJSONParsePropsList: Refactor cleanup
Use 'g_auto' for @proplist and remove @ret.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-06-15 16:58:23 +02:00
Peter Krempa
599b17d580 qemu: capabilities: Fill SDL graphics support only when it's really supported
virQEMUCapsFillDomainDeviceGraphicsCaps fills data needed both for
validation of the graphics type and also for correct display in the
(dom)capablities XML.

Signal the support for SDL only when qemu has the capability.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
2021-06-15 16:58:23 +02:00
Peter Krempa
f9dda2805f qemu: capabilities: Un-retire QEMU_CAPS_SDL
SDL graphics can be compiled out in qemu so we need to be able to know
whether the given qemu version support it.

Base the capability on the presence of the 'sdl' member in
'query-display-options' or imply it if 'query-display-options' is not
supported as we implied it before for all versions.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
2021-06-15 16:58:23 +02:00
Peter Krempa
55ead2333f qemu: capabilities: Introduce QEMU_CAPS_QUERY_DISPLAY_OPTIONS
The command allows to query various display-related options. The absence
of the command will be used to imply certain video-related capabilities
before we would be able to detect them.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
2021-06-15 16:58:23 +02:00
Peter Krempa
c29bb0fbb6 qemu: validate: Don't check bus type in qemuValidateDomainDeviceDefDiskIOThreads
IOThreads are supported with all 3 currently supported buses which can
have virtio devices (PCI, CCW, MMIO) , so there's no need for this check.

Additionally this check was buggy in the current location as on e.g.
hotplug cases the address may not yet be assigned for the disk and thus
a bogus error would be printed.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1970277
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-06-15 16:58:23 +02:00
Peter Krempa
e637d34277 qemuDomainCheckCCWS390AddressSupport: Remove duplicated checker
For validation of explicitly configured addresses we already ported the
same style of checks to qemuValidateDomainDeviceDefAddress and implicit
address assignment should do the right thing in the first place, thus
the function is redundant and can be removed.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-06-15 16:58:23 +02:00
Peter Krempa
7a8895463b qemuValidateDomainDeviceDefAddress: Add validation of CCW address
Base the check on the logic from qemuDomainCheckCCWS390AddressSupport,
which will be removed later.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-06-15 16:58:23 +02:00
Peter Krempa
1f645c10c1 qemu: Drop handling of devices with VIR_DOMAIN_DEVICE_ADDRESS_TYPE_VIRTIO_S390
We don't support any qemu which would support the 'virtio-s390'
addressing, thus we can drop all code related to it.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-06-15 16:58:23 +02:00
Peter Krempa
a6aab07787 qemu: capabilities: Retire QEMU_CAPS_VIRTIO_S390
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-06-15 16:58:23 +02:00
Peter Krempa
69da676aa3 qemu: Remove last uses of QEMU_CAPS_VIRTIO_S390
Modify the code in the last two instances in the code to behave as if
the flag is not asserted.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-06-15 16:58:23 +02:00
Peter Krempa
3dc7a0e934 qemu: Always reject 'virtio-s390' addresses
QEMU_CAPS_VIRTIO_S390 can never be asserted any more, add an explicit
check that will reject the 'virtio-s390' address type and remove the
code which would auto-fill them.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-06-15 16:58:23 +02:00
Peter Krempa
dde77d1cf6 qemu: capabilities: Don't probe device properties for 'virtio-*-s390' devices
The devices no longer exist in qemu since the 2.6 release. Drop the
probing of the device properties and fix the data for
qemucapabilitiestest.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-06-15 16:58:22 +02:00
Peter Krempa
b5a945209d qemu: capabilities: Remove probing of 'virtio-*-s390' devices
QEMU commit 7b3fdbd9a826791bd98e649cf44c0a6129a44179 released in 2.6
dropped the legacy s390 virtio machine and it's devices. Remove our
probing based on the devices.

The probing of properties of the appropriate devices will be removed
subsequently.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-06-15 16:58:22 +02:00
Peter Krempa
5d83508fe8 qemu: domain: Remove hack for 's390-virtio' machine
qemuDomainDefAddDefaultDevices skipped adding the memballoon for the
's390-virtio' machine type, but since it was removed in qemu 2.6 we can
remove the hack now.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-06-15 16:58:22 +02:00
Peter Krempa
e8a396682b bhyveConnectDomainXMLToNative: Fix memory leak in incorrect virCommandToString usage
virCommandToString returns an allocated buffer, so using it directly as
argument of virBufferAdd which doesn't consume the string causes it to
be leaked. Switch to virBufferToStringBuf since we are already using a
buffer.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-06-15 16:58:07 +02:00
Peter Krempa
2d018bf769 util: command: Introduce virCommandToStringBuf
The new version allows passing a virBuffer to format the string into.
This will be helpful in solving a memory lean in wrong usage of
virCommandToString and also in tests where we need to add a newline
after the command in certain cases.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-06-15 16:27:35 +02:00
Martin Kletzander
50261966fd syntax-check: Only prohibit empty first lines in non-empty files
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2021-06-15 14:15:42 +02:00
Luke Yue
69f469ea83 test_driver: Implement virDomainGetSecurityLabel
Signed-off-by: Luke Yue <lukedyue@gmail.com>
2021-06-15 14:15:13 +02:00
Luke Yue
0af05dffb8 test_driver: Implement virNodeGetSecurityModel
Signed-off-by: Luke Yue <lukedyue@gmail.com>
2021-06-15 14:15:13 +02:00
Luke Yue
65c2901906 virfile: Simplify virFindFileInPath() with g_find_program_in_path()
Signed-off-by: Luke Yue <lukedyue@gmail.com>
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2021-06-15 14:15:01 +02:00
Luke Yue
d2b6bab11c Replace virFileAbsPath() with g_canonicalize_filename()
Signed-off-by: Luke Yue <lukedyue@gmail.com>
2021-06-15 12:42:02 +02:00
Pavel Hrdina
241969d465 qemu_command: use confidential-guest-support if available
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-06-15 11:33:25 +02:00
Pavel Hrdina
b560d1c876 qemu_capabilities: detect if confidential-guest-support is available
virQEMUCapsProbeQMPMachineProps currently skips any not supported
machine type which includes `none` as well.

In order to start probing that machine type we need to add an exception
to not skip it when probing QEMU capabilities.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-06-15 11:33:17 +02:00
Pavel Hrdina
af5828bc91 qemu_capabilities: introduce confidential-guest-support capability
In libvirt we already use `query-command-line-options` QMP command but
that is useless as it doesn't provide correct data for `-machine`
option. So we need a new and better way to get that data.

We already use `qom-list-properties` to get options for specific machine
types so we can reuse it to get options for special `none` machine type
as a generic arch independent machine type.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-06-15 11:32:41 +02:00
Michal Privoznik
0cc6f8931f capabilities: Expose NUMA interconnects
Links between NUMA nodes can have different latencies and
bandwidths. This info is newly defined in ACPI 6.2 under
Heterogeneous Memory Attribute Table (HMAT) table. Linux kernel
learned how to report these values under sysfs and thus we can
expose them in our capabilities XML. The sysfs interface is
documented in kernel's Documentation/admin-guide/mm/numaperf.rst.

Long story short, two nodes can be in initiator-target
relationship. A node can be initiator if it has a CPU or a device
that's capable of initiating memory transfer. Therefore a node
that has just memory can only be target. An initiator-target link
can then have any combination of {bandwidth, latency} - {access,
read, write} attribute (6 in total). However, the standard says
access is applicable iff read and write values are the same.
Therefore, we really have just four combinations of attributes:
bandwidth-read, bandwidth-write, latency-read, latency-write.

This is the combination that kernel reports anyway.

Then, under /sys/system/devices/node/nodeX/acccessN/initiators we
find values for those 4 attributes and also symlinks named
"nodeN" which then represent initiators to nodeX. For instance:

  /sys/system/node/node1/access1/initiators/node0 -> ../../node0
  /sys/system/node/node1/access1/initiators/read_bandwidth
  /sys/system/node/node1/access1/initiators/read_latency
  /sys/system/node/node1/access1/initiators/write_bandwidth
  /sys/system/node/node1/access1/initiators/write_latency

This means that node0 is initiator and node1 is target and values
of the interconnect can be read.

In theory, there can be separate links to memory side caches too
(e.g. one link from node X to node Y's main memory, another from
node X to node Y's L1 cache, another one to L2 cache and so on).
But sysfs does not express this relationship just yet.

The "accessN" means either "access0" or "access1". The difference
is that while the former expresses the best interconnect between
two nodes including CPUS and I/O devices (such as GPUs and NICs),
the latter includes only CPUs and thus is what we need.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1786309
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2021-06-15 11:03:25 +02:00