Commit Graph

41639 Commits

Author SHA1 Message Date
Luyao Zhong
6213d52384 conf, docs, schema: Add support for 'restrictive' mode in numatune
This allows users to restrict memory nodes without setting any specific
memory policy, then 'restrictive' mode is useful.

Signed-off-by: Luyao Zhong <luyao.zhong@intel.com>
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-04-19 11:39:13 +02:00
Michal Privoznik
69a4cd9249 lxc: Format --handshakefd for controller cmd fully
The command line argument is called --hanshakefd (check out
lxc_controller.c:main()). But the command line builder puts only
--handshake. This works, because there is no other argument
sharing the prefix.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-04-19 11:21:40 +02:00
Michal Privoznik
ea7d0ca37c vircgroup: Fix virCgroupKillRecursive() wrt nested controllers
I've encountered the following bug, but only on Gentoo with
systemd and CGroupsV2. I've started an LXC container successfully
but destroying it reported the following error:

  error: Failed to destroy domain 'amd64'
  error: internal error: failed to get cgroup backend for 'pathOfController'

Debugging showed, that CGroup hierarchy is full of surprises:

/sys/fs/cgroup/machine.slice/machine-lxc\x2d861\x2damd64.scope/
└── libvirt
    ├── dev-hugepages.mount
    ├── dev-mqueue.mount
    ├── init.scope
    ├── sys-fs-fuse-connections.mount
    ├── sys-kernel-config.mount
    ├── sys-kernel-debug.mount
    ├── sys-kernel-tracing.mount
    ├── system.slice
    │   ├── console-getty.service
    │   ├── dbus.service
    │   ├── system-getty.slice
    │   ├── system-modprobe.slice
    │   ├── systemd-journald.service
    │   ├── systemd-logind.service
    │   └── tmp.mount
    └── user.slice

For comparison, here's the same container on recent Rawhide:

/sys/fs/cgroup/machine.slice/machine-lxc\x2d13550\x2damd64.scope/
└── libvirt

Anyway, those nested directories should not be a problem, because
virCgroupKillRecursiveInternal() removes them recursively, right?
Sort of. The function really does remove nested directories, but
it assumes that every directory has the same controller as the
rest. Just take a look at virCgroupV2KillRecursive() - it gets
'Any' controller (the first one it found in ".scope") and then
passes it to virCgroupKillRecursiveInternal().

This assumption is not true though. The controllers found in
".scope" are the following:

  cpuset cpu io memory pids

while "libvirt" has fewer:

  cpuset cpu io memory

Up until now it's not problem, because of how we order
controllers internally - "cpu" is the first and thus picking
"Any" controller returns just that. But the rest of directories
has no controllers, their "cgroup.controllers" is just empty.

What fixes the bug is dropping @controller argument from
virCgroupKillRecursiveInternal() and letting each iteration work
pick its own controller.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2021-04-19 11:21:40 +02:00
Michal Privoznik
a0815484b1 vircgroupbackend: Extend error messages in VIR_CGROUP_BACKEND_CALL()
The VIR_CGROUP_BACKEND_CALL() macro gets a backend for controller
and calls corresponding callback in it. If either is NULL then an
error message is printed out. However, the error message contains
only the intended callback func and not controller or backend
found.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2021-04-19 11:21:40 +02:00
Michal Privoznik
edce157f11 vircgroup: Debug print all arguments of virCgroupKillRecursiveInternal()
Currently, only a subset of virCgroupKillRecursiveInternal()
arguments is printed into debug logs. Print all of them.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2021-04-19 11:21:40 +02:00
Peter Krempa
c2558e78d4 cmdDomBlkError: Fix crash when initial call to virDomainGetDiskErrors fails
virDomainGetDiskErrors uses the weird semantics where we make the
caller query for the number of elements and then pass pre-allocated
structure.

The cleanup section errorneously used the 'count' variable to free the
allocated elements for the API but 'count' can be '-1' in cases when the
API returns failure, thus attempting to free beyond the end of the
array.

Resolves: https://gitlab.com/libvirt/libvirt/-/issues/155
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2021-04-19 11:04:53 +02:00
Peter Krempa
ac87f612ba conf: domain: Convert virDomainDiskDef's 'startupPolicy' to virDomainStartupPolicy
Use the appropriate type for the variable and refactor the XML parser to
parse it correctly using virXMLPropEnum.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Tim Wiederhake <twiederh@redhat.com>
2021-04-16 17:28:06 +02:00
Peter Krempa
56be92b473 conf: domain: Convert virDomainDiskDef's 'tray_status' to virDomainDiskTray
Use the appropriate type for the variable and refactor the XML parser to
parse it correctly using virXMLPropEnum.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Tim Wiederhake <twiederh@redhat.com>
2021-04-16 17:28:06 +02:00
Peter Krempa
f1c9fed2ca virsh: snapshot: Don't validate schema of XML generated by 'virsh snapshot-create-as'
Commit 95f8e3237e which introduced XML schema validation
for snapshot XMLs always asserted the validation for the XML generated
by 'virsh snapshot-create-as' on the basis that it's libvirt-generated,
thus valid.

This unfortunately isn't true as users can influence certain bits of the
XML such as the disk image path which must be a full path. Thus if a
user tries to invoke virsh as:

 $ virsh snapshot-create-as upstream --diskspec vda,file=relative.qcow2
 error: XML document failed to validate against schema: Unable to validate doc against /path/to/domainsnapshot.rng
 Extra element disks in interleave
 Element domainsnapshot failed to validate content

They get a rather useless error from the libxml2 RNG validator.

With this fix applied, we get to the XML parser in libvirtd which has a
more reasonable error:

 $ virsh snapshot-create-as upstream --diskspec vda,file=relative.qcow2
 error: XML error: disk snapshot image path 'relative.qcow2' must be absolute

Instead users can force validation of the XML generated by 'virsh
snapshot-create-as' by passing the '--validate' flag.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2021-04-16 17:27:39 +02:00
Tim Wiederhake
f0379bdd14 virCPUDefParseXML: Use virXMLProp*
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-04-16 13:22:35 +02:00
Tim Wiederhake
324f6f5826 virDomainIOThreadIDDefParseXML: Use virXMLProp*
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-04-16 13:22:35 +02:00
Tim Wiederhake
593140dabd virNetworkForwardNatDefParseXML: Use virXMLProp*
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-04-16 13:22:35 +02:00
Tim Wiederhake
ab5d2776c9 virxml: Add virXMLPropEnum
Convenience function to return the value of an enum XML attribute.

Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-04-16 13:22:32 +02:00
Tim Wiederhake
68cda45b57 virxml: Add virXMLPropUInt
Convenience function to return the value of an unsigned integer XML attribute.

Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-04-16 13:22:11 +02:00
Tim Wiederhake
de17e0d30d virxml: Add virXMLPropInt
Convenience function to return the value of an integer XML attribute.

Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-04-16 13:21:55 +02:00
Tim Wiederhake
8861d96c88 virxml: Add virXMLPropTristateSwitch
Convenience function to return the value of an on / off XML attribute.

Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-04-16 13:21:27 +02:00
Tim Wiederhake
c8726ede83 virxml: Add virXMLPropTristateBool
Convenience function to return the value of a yes / no XML attribute.

Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-04-16 13:21:07 +02:00
Peter Krempa
638007f916 virXMLParseHelper: Refactor cleanup
Switch @xml and @pctxt to g_autofree and get rid of the "error" and
"cleanup" labels.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Tim Wiederhake <twiederh@redhat.com>
2021-04-16 13:17:35 +02:00
Peter Krempa
e87eeefb3e virXMLParseHelper: Rework error reporting
Move the reporting of parsing error on the error path of the parser as
other code paths report their own errors already.

Additionally prefer printing the 'url' as document name if provided
instead of "[inline data]" as that usually gives a better hint at least
which kind of XML is being parsed.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Tim Wiederhake <twiederh@redhat.com>
2021-04-16 13:17:35 +02:00
Peter Krempa
5339ecf6b9 util: xml: Register autoptr cleanup function for 'xmlParserCtxt'
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Tim Wiederhake <twiederh@redhat.com>
2021-04-16 13:17:35 +02:00
Peter Krempa
7a77556e60 virXMLParseHelper: Sync argument names between declaration and definition
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Tim Wiederhake <twiederh@redhat.com>
2021-04-16 13:17:35 +02:00
Peter Krempa
6f29230a46 util: virxml: Fix formatting of virxml.h
Remove the "block" formatting of function declarations and use uniform
spacing.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Tim Wiederhake <twiederh@redhat.com>
2021-04-16 13:17:35 +02:00
Tim Wiederhake
876f994db1 conf: Use virTristateXXX in virPCIDeviceAddress
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-04-16 09:48:42 +02:00
Tim Wiederhake
975e2cb39d conf: Use virTristateXXX in virStoragePoolSourceDevice
Note that the comment for virStoragePoolSourceDevice::part_separator was wrong.

Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-04-16 09:48:42 +02:00
Tim Wiederhake
62f06ffe8a conf: Use virTristateXXX in virStorageAdapterFCHost
Note that the comment for virStorageAdapterFCHost::managed was wrong.

Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-04-16 09:48:42 +02:00
Tim Wiederhake
cc6557ae04 conf: Use virTristateXXX in virDomainDef
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-04-16 09:48:42 +02:00
Tim Wiederhake
2259b8d1fd conf: Use virTristateXXX in virDomainLoaderDef
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-04-16 09:48:42 +02:00
Tim Wiederhake
f940ec5f36 conf: Use virTristateXXX in virDomainMemballoonDef
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-04-16 09:48:42 +02:00
Tim Wiederhake
108ec08b1b conf: Use virTristateXXX in virDomainGraphicsDef
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-04-16 09:48:42 +02:00
Tim Wiederhake
b96527751f conf: Use virTristateXXX in virDomainChrSourceDef
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-04-16 09:48:42 +02:00
Tim Wiederhake
6609b64701 conf: Use virTristateXXX in virDomainNetDef
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-04-16 09:48:42 +02:00
Tim Wiederhake
f1d4cd5ab3 conf: Use virTristateXXX in virDomainActualNetDef
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-04-16 09:48:41 +02:00
Tim Wiederhake
a9ef3272c5 conf: Use virTristateXXX in virDomainDiskDef
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-04-16 09:48:41 +02:00
Tim Wiederhake
5cbc83774a conf: Use virTristateXXX in virDomainDeviceInfo
Note that the wrong "VIR_TRISTATE_*_ABSENT" was used in qemuDomainChangeNet.

Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-04-16 09:48:41 +02:00
Tim Wiederhake
e949edeec8 conf: Use virTristateXXX in virStorageSourceNVMeDef
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-04-16 09:48:41 +02:00
Tim Wiederhake
c33c482df4 conf: Use virTristateXXX in virStorageSource
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-04-16 09:48:41 +02:00
Andrea Bolognani
a0491637e1 ci: Refresh contents
Notable changes:

  * cross-building container images are smaller because they
    no longer include the native compilers;

  * ccache is enabled for clang builds.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
2021-04-15 19:07:16 +02:00
Jonathon Jongsma
5c4b2bf770 nodedev: handle null return from GetIOMMUGroupDev()
Coverity reported that this function can return NULL, so it should be
handled properly.

Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
2021-04-15 08:51:37 -05:00
Jonathon Jongsma
12850ed257 nodedev: refactor virMediatedDeviceGetIOMMUGroupNum()
Currently virMediatedDeviceGetIOMMUGroupDev() looks up the iommu group
number and uses that to construct a path to the iommu group device.
virMediatedDeviceGetIOMMUGroupNum() then uses that device path and takes
the basename to get the group number. That's unnecessary extra string
manipulation for *GroupNum(). Reverse the implementations and make
*GroupDev() call *GroupNum().

Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
2021-04-15 08:51:37 -05:00
Jonathon Jongsma
e8794b911c qemu: remove unnecessary null check
virMediatedDeviceGetSysfsPath() (via g_strdup_printf()) is guaranteed to
return a non-NULL value, so remove the unnecessary checks for NULL.

Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
2021-04-15 08:51:37 -05:00
Tim Wiederhake
e7a999364e virlog: Remove stray "todo" in comment
Fixes: 8fe30b2167
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2021-04-15 15:42:21 +02:00
Tim Wiederhake
5729d94917 Fix spelling
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2021-04-15 15:42:21 +02:00
Jim Fehlig
27e1779f08 libxl: Add debug statements
Over several years of debugging reports related to VM shutdown, destruction,
and cleanup, I've found that logging of all events received from libxl and
logging the entry of libxlDomainCleanup has proven useful. Add the these
debug messages upstream to aid in future debugging.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2021-04-14 10:10:26 -06:00
Michal Privoznik
3bf8dfd56f qemu: Expose disk serial in virDomainGetGuestInfo()
When querying guest info via virDomainGetGuestInfo() the
'guest-get-disks' agent command is called. It may report disk
serial number which we parse, but never report nor use for
anything else.

As it turns out, it may help management application find matching
disk in their internals.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-By: Tomáš Golembiovský <tgolembi@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2021-04-14 13:56:09 +02:00
Pavel Hrdina
07497fc6da vircgroupv2devices: refactor virCgroupV2DevicesRemoveProg
When running on systemd host the cgroup itself is removed by machined
so when we reach this code the directory no longer exist. If libvirtd
was running the whole time between starting and destroying VM the
detection is skipped because we still have both FD in memory. But if
libvirtd was restarted and no operation requiring cgroup devices
executed the FDs would be 0 and libvirt would try to detect them using
the cgroup directory. This results in reporting following errors:

    libvirtd[955]: unable to open '/sys/fs/cgroup/machine.slice/machine-qemu\x2d1\x2dguest.scope/': No such file or directory
    libvirtd[955]: Failed to remove cgroup for guest

When running on non-systemd host where we handle cgroups manually this
would not happen.

When destroying VM it is not necessary to detect the BPF prog and map
because the following code only closes the FDs without doing anything
else. We could run code that would try to detach the BPF prog from the
cgroup but that is not necessary as well. If the cgroup is removed and
there is no other FD open to the prog kernel will cleanup the prog and
map eventually.

Reported-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Tested-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-04-14 12:06:16 +02:00
Pavel Hrdina
6960a895ab vircgroupv2: properly free BPF prog and map FDs
When nested cgroup was introduced it did not properly free file
descriptors for BPF prog and map. With nested cgroups we create the BPF
bits in the nested cgroup instead of the VM root cgroup.

This would leak the FDs which would be the last reference to the prog
and map so kernel would not remove the resources as well. It would only
happen once libvirtd process exits.

Fixes: 184245f53b
Reported-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Tested-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2021-04-14 12:04:35 +02:00
Michal Privoznik
8674faaf32 nodedev: Don't fail device enumeration if MDEVCTL is missing
After all devices were enumerated, the enumeration thread call
nodeDeviceUpdateMediatedDevices() to refresh the state of
mediated devices. This means that 'mdevctl' will be executed. But
it may be missing on some systems (e.g. mine) in which case we
should just skip the update of mdevs instead of failing whole
device enumeration.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
2021-04-14 10:17:41 +02:00
Michal Privoznik
54d97f020b nodedev: Mark device initialization complete even in case of an error
To speed up nodedev driver initialization, the device enumeration
is done in a separate thread. Once finished, the thread sets a
boolean variable that allows public APIs to be called (instead of
waiting for the thread to finish).

However, if there's an error in the device enumeration thread
then the control jumps over at the 'error' label and the boolean
is never set. This means, that any virNodeDev*() API is stuck
forever. Mark the initialization as complete (the thread is
quitting anyway) and let the APIs proceed.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
2021-04-14 10:17:32 +02:00
Michal Privoznik
77a13eb9ac nodedev: Wait for device initialization in all public API callbacks
Although I have not experienced this in real life, there is a
possible race condition when creating new device, getting its XML
or parent or listing its capabilities.  If the nodedev driver is
still enumerating devices (in a separate thread) and one of
virNodeDeviceGetXMLDesc(), virNodeDeviceGetParent(),
virNodeDeviceNumOfCaps(), virNodeDeviceListCaps() or
virNodeDeviceCreate() is called then it can lead to spurious
results because the device enumeration thread is removing devices
from or adding them to the internal list of devices (among with
their states).

Therefore, wait for things to settle down before proceeding with
any of the APIs.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
2021-04-14 10:16:48 +02:00
Michal Privoznik
5b56a288ca nodedev: Signal initCond with driver locked
This is more academic dispute than a real bug, but this is taken
from pthread_cond_broadcast(3p) man:

  The pthread_cond_broadcast() or pthread_cond_signal() functions
  may be called by a thread whether or not it currently owns the
  mutex that threads calling pthread_cond_wait() or
  pthread_cond_timedwait() have associated with the condition
  variable during their waits; however, if predictable scheduling
  behavior is required, then that mutex shall be locked by the
  thread calling pthread_cond_broadcast() or
  pthread_cond_signal().

Therefore, broadcast the initCond while the nodedev driver is
still locked.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
2021-04-13 17:34:42 +02:00