And replace all calls with virObjectEventStateQueue such that:
qemuDomainEventQueue(driver, event);
becomes:
virObjectEventStateQueue(driver->domainEventState, event);
And remove NULL checking from all callers.
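Dropping the caller-side checks is only safe if the queueing function itself ignores a NULL event. A minimal sketch of that contract, using hypothetical stand-in types rather than the real libvirt structures:

```c
#include <stddef.h>

/* Hypothetical stand-ins for the libvirt event types, for illustration only. */
struct event { int id; };
struct event_state { size_t nqueued; };

/* Queue an event. A NULL event is silently ignored (assumed behaviour),
 * which is what lets callers drop their own "if (event)" checks. */
static void event_state_queue(struct event_state *state, struct event *event)
{
    if (!state || !event)
        return;
    state->nqueued++;
}
```

With this shape a caller can pass the result of an event constructor straight through, even when the constructor failed and returned NULL.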
Signed-off-by: Anya Harter <aharter@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
QEMU uses the /dev/sev device while creating an SEV guest, so add /dev/sev
to the list of devices that QEMU is allowed to access.
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Add the external swtpm to the emulator cgroup so that upper limits of CPU
usage can be enforced on the emulated TPM.
To enable this we need to have the swtpm write its process id (pid) into a
file. We then read it from the file to configure the emulator cgroup.
The PID file is created in /var/run/libvirt/qemu/swtpm:
[root@localhost swtpm]# ls -lZ /var/run/libvirt/qemu/swtpm/
total 4
-rw-r--r--. 1 tss tss system_u:object_r:qemu_var_run_t:s0 5 Apr 10 12:26 1-testvm-swtpm.pid
srw-rw----. 1 qemu qemu system_u:object_r:svirt_image_t:s0:c597,c632 0 Apr 10 12:26 1-testvm-swtpm.sock
The swtpm command line now looks as follows:
[root@localhost testvm]# ps auxZ | grep swtpm | grep socket | grep -v grep
system_u:system_r:virtd_t:s0:c597,c632 tss 18697 0.0 0.0 28172 3892 ? Ss 16:46 0:00 /usr/bin/swtpm socket --daemon --ctrl type=unixio,path=/var/run/libvirt/qemu/swtpm/1-testvm-swtpm.sock,mode=0600 --tpmstate dir=/var/lib/libvirt/swtpm/485d0004-a48f-436a-8457-8a3b73e28568/tpm1.2/ --log file=/var/log/swtpm/libvirt/qemu/testvm-swtpm.log --pid file=/var/run/libvirt/qemu/swtpm/1-testvm-swtpm.pid
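Reading the PID back from such a file could look roughly like the sketch below. The helper name read_pid_file is hypothetical and this is not the libvirt implementation; it only assumes the file holds a single decimal PID.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>

/* Read a decimal PID from a file.
 * Returns 0 and stores the PID on success, -1 on any failure. */
static int read_pid_file(const char *path, pid_t *pid)
{
    FILE *fp;
    long value = 0;
    int ok;

    assert(path != NULL && pid != NULL);

    fp = fopen(path, "r");
    if (!fp)
        return -1;

    /* Accept only a positive number at the start of the file. */
    ok = (fscanf(fp, "%ld", &value) == 1 && value > 0);
    fclose(fp);

    if (!ok)
        return -1;

    *pid = (pid_t)value;
    return 0;
}
```

The resulting PID is what would then be added to the emulator cgroup.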
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This patch adds support for an external swtpm TPM emulator. The XML for
this type of TPM looks as follows:
<tpm model='tpm-tis'>
<backend type='emulator'/>
</tpm>
The XML will currently only define a TPM 1.2.
Extend the documentation.
Add a test case testing the XML parser and formatter.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Just like in previous commit, qemu-pr-helper might want to open
/dev/mapper/control under certain circumstances. Therefore we
have to allow it in cgroups.
The change to virdevmapper.c might look spurious but it isn't. After
6dd84f6850ca437 any path that we're allowing in the devices CGroup is
subject to virDevMapperGetTargets() inspection. And libdevmapper
returns ENXIO for the path from the subject.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1557769
The problem with device mapper targets is that there can be several
other devices 'hidden' behind them. For instance, /dev/dm-1 can
consist of /dev/sda, /dev/sdb and /dev/sdc. Therefore, when
setting up devices CGroup and namespaces we have to take this
into account.
This bug was exposed after the Linux kernel was fixed. Initially,
the kernel used different functions for getting the block device in
open() and ioctl(). While CGroup permissions were checked in the
former case, due to a bug in the kernel they were not checked in the
latter case. This changed with the upstream commit
519049afead4f7c3e6446028c41e99fde958cc04 (v4.16-rc5~11^2~4).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The virresctrl will use this as well and we need to have that info after restart
to properly clean up /sys/fs/resctrl.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
All calls to virDomainAuditCgroupPath() were passing 'rc == 0' as
argument, when it was supposed to pass the 'rc' value directly.
As a consequence, the audit events that were supposed to be
logged (actual cgroup changes) were never being logged, and bogus
audit events were logged when using regular files as disk image.
Fix all calls to use the return value of
virCgroup{Allow,Deny}Device*() directly as the 'rc' argument.
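To see why 'rc == 0' inverts the result, here is a small model of the audit decision. It assumes the allow/deny helpers return 0 when a rule was applied, 1 when the path is not a device (nothing was done) and -1 on error, and that the audit function skips positive values. The function name is hypothetical.

```c
#include <stdbool.h>

/* Assumed return convention of the allow/deny helpers:
 *   0  = rule applied
 *   1  = path is not a device, nothing was changed
 *  -1  = error
 */

/* Returns true if an audit record would be emitted for this rc. */
static bool audit_would_log(int rc)
{
    if (rc > 0)     /* not a device: nothing to audit */
        return false;
    return true;    /* success (0) and failure (-1) are both audited */
}
```

Passing 'rc == 0' turns a successful 0 into 1 (so the real change is skipped) and turns the "not a device" 1 into 0 (so a bogus success is logged for a regular file).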
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Add helpers that will simplify checking if a backing file is valid or
whether it has backing store. The helper virStorageSourceIsBacking
returns true if the given virStorageSource is a valid backing store
member. virStorageSourceHasBacking returns true if the virStorageSource
has a backing store child.
Adding these functions creates central points for further refactors.
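A minimal sketch of what such helpers could look like, assuming a chain is terminated by an element whose type is "none". The struct and enum here are hypothetical stand-ins, not the real virStorageSource definition:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of a storage source chain. */
enum storage_type { STORAGE_TYPE_NONE = 0, STORAGE_TYPE_FILE };

struct storage_source {
    enum storage_type type;
    struct storage_source *backing;   /* next element of the chain */
};

/* True if src is a valid member of a backing chain. */
static bool source_is_backing(const struct storage_source *src)
{
    return src != NULL && src->type != STORAGE_TYPE_NONE;
}

/* True if src is valid and has a valid backing store child. */
static bool source_has_backing(const struct storage_source *src)
{
    return source_is_backing(src) && source_is_backing(src->backing);
}
```

A chain walk then becomes "for (n = top; source_is_backing(n); n = n->backing)", with the termination rule kept in one place.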
Since commit 2e6ecba1bcac, the pointer to the qemu driver is saved in
domain object's private data and hence does not have to be passed as
yet another parameter if domain object is already one of them.
This is a first (example) patch of this kind of clean up, others will
hopefully follow.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
It is more related to a domain as we might use it even when there is
no systemd and it does not use any dbus/systemd functions. In order
not to use code from conf/ in util/ pass machineName in cgroups code
as a parameter. That also fixes a leak of machineName in the lxc
driver and cleans up and de-duplicates some code.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Some users might want to pass a blockdev or a chardev as a
backend for NVDIMM. In fact, this is expected to be the most
commonly used configuration. Therefore libvirt should allow the
device in the devices CGroup.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When a domain needs access to some device (be it a disk, RNG,
chardev, whatever), we have to allow it in the devices CGroup (if
it is available), because by default we disallow all the devices.
But some of the functions that are responsible for setting up
the devices CGroup lack a check for whether any CGroup is
available. Thus users might be unable to hotplug some devices:
virsh # attach-device fedora rng.xml
error: Failed to attach device from rng.xml
error: internal error: Controller 'devices' is not mounted
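The missing check amounts to an early return when the devices controller is absent. A rough sketch, with a hypothetical struct standing in for the real cgroup object:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a domain's cgroup state. */
struct cgroup {
    bool has_devices_controller;
    int nallowed;                 /* number of rules added so far */
};

/* Allow a device path for the domain.
 * Returns 0 on success, including the case where there is no devices
 * controller and therefore nothing to configure. */
static int setup_device_cgroup(struct cgroup *cg, const char *path)
{
    if (!cg || !cg->has_devices_controller)
        return 0;   /* no devices CGroup: hotplug must still succeed */

    (void)path;     /* a real implementation would add a rule for path */
    cg->nallowed++;
    return 0;
}
```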
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When enabling virgl, qemu opens /dev/dri/render*. So far, we are
not allowing that in devices CGroup nor creating the file in
domain's namespace and thus requiring users to set the paths in
qemu.conf. This, however, is suboptimal as it allows access to
ALL qemu processes even those which don't have virgl configured.
Now that we have a way to specify the render node that qemu will
use, we can be more cautious and enable just that.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
So far, qemuDomainGetHostdevPath has no knowledge of the reason
it is called and thus reports /dev/vfio/vfio for every VFIO
backed device. This is suboptimal, as we want it to:
a) report /dev/vfio/vfio on every addition or domain startup
b) report /dev/vfio/vfio only on last VFIO device being unplugged
If a domain is being stopped then the namespace and CGroup die with
it, so there is no need to worry about that. Even when a domain
that's exiting has more than one VFIO device assigned to it,
this function does not clean /dev/vfio/vfio in the CGroup nor in
the namespace. But that doesn't matter.
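The two rules (a) and (b) can be modelled as follows. This is a sketch with a hypothetical helper, where the caller says whether it is tearing a device down and how many other VFIO devices the domain still has:

```c
#include <stdbool.h>
#include <stddef.h>

#define VFIO_CONTROL_PATH "/dev/vfio/vfio"

/* Decide whether the shared VFIO control node must be reported.
 *  - on addition or domain startup (teardown == false): always
 *  - on unplug (teardown == true): only if no other VFIO device remains
 * Returns the path, or NULL if it must not be touched. */
static const char *vfio_control_path(bool teardown, size_t other_vfio_devices)
{
    if (teardown && other_vfio_devices > 0)
        return NULL;
    return VFIO_CONTROL_PATH;
}
```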
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
So far, we are allowing /dev/vfio/vfio in the devices cgroup
unconditionally (and creating it in the namespace too), even if
the domain has no hostdev assignment configured. This is a potential
security hole. Therefore, when starting the domain (or
hotplugging a hostdev) create & allow /dev/vfio/vfio too (if
needed).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Since these two functions are nearly identical (with
qemuSetupHostdevCgroup actually calling virCgroupAllowDevicePath)
we can have one function call the other and thus de-duplicate
some code.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
There's no need for this function. Currently it is passed as a
callback to virSCSIVHostDeviceFileIterate(). However, SCSI host
devices have just one file path. Therefore we can mimic the approach
used in qemuDomainGetHostdevPath() to get the path and call
virCgroupAllowDevicePath() directly.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
There's no need for this function. Currently it is passed as a
callback to virSCSIDeviceFileIterate(). However, SCSI devices
have just one file path. Therefore we can mimic the approach used in
qemuDomainGetHostdevPath() to get the path and call
virCgroupAllowDevicePath() directly.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
There's no need for this function. Currently it is passed as a
callback to virUSBDeviceFileIterate(). However, USB devices have
just one file path. Therefore we can mimic the approach used in
qemuDomainGetHostdevPath() to get the path and call
virCgroupAllowDevicePath() directly.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
This is a list of devices that qemu needs for its run (apart from
what's configured for domain). The devices on the list are
enabled in the CGroups by default so they will be good candidates
for initial /dev for new qemu.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
If libvirt is compiled without NUMACTL support, starting libvirtd
reports a libvirt internal error "NUMA isn't available on this host"
without checking if NUMA support is compiled into the libvirt binaries.
This patch adds the missing NUMA support check to prevent the internal error.
It also includes a check whether the cgroup controller cpuset is available
before using it.
The error was noticed when libvirtd was restarted with running domains:
on libvirtd start, qemuConnectCgroup gets called during qemuProcessReconnect.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Open /dev/vhost-scsi, and record the resulting file descriptor, so that
the guest has access to the host device outside of the libvirt daemon.
Pass this information, along with data parsed from the XML file, to build
a device string for the qemu command line. That device string will be
for either a vhost-scsi-ccw device in the case of an s390 machine, or
vhost-scsi-pci for any others.
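The model selection could be sketched as below. The function name and the exact option layout are assumptions made for illustration; the real device string carries more properties, and the WWPN shown in the usage is a made-up value.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Build a vhost-scsi device string for the qemu command line.
 * The "wwpn" and "vhostfd" option names are assumed here.
 * Returns 0 on success, -1 on bad input or if buf is too small. */
static int format_vhost_scsi_device(char *buf, size_t len, bool is_s390,
                                    int vhostfd, const char *wwpn)
{
    const char *model = is_s390 ? "vhost-scsi-ccw" : "vhost-scsi-pci";
    int n;

    if (!wwpn || strlen(wwpn) == 0 || vhostfd < 0)
        return -1;

    n = snprintf(buf, len, "%s,wwpn=%s,vhostfd=%d", model, wwpn, vhostfd);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

The file descriptor passed in would be the one obtained by opening /dev/vhost-scsi.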
Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
We already have a "scsi" hostdev subsys type, which refers to a single
LUN that is passed through to a guest. But what of things where
multiple LUNs are passed through via a single SCSI HBA, such as with
the vhost-scsi target? Create a new hostdev subsys type that will
carry this.
Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Just like in the previous commit, we are not updating CGroups on
chardev hot(un-)plug and thus leaving qemu unable to access any
non-default device users are trying to hotplug.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
If users try to hotplug an RNG device with a backend other than
/dev/random or /dev/urandom the whole operation fails as qemu is
unable to access the device. The problem is we don't update
the devices CGroup during the operation.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
As was suggested in an earlier review comment[1], we can
catch some additional code points by cleaning up how we use the
hostdev subsystem type in some switch statements.
[1] End of https://www.redhat.com/archives/libvir-list/2016-September/msg00399.html
Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Signed-off-by: John Ferlan <jferlan@redhat.com>
Change the virDomainChrDef to use a pointer to 'source' and allocate
that pointer during virDomainChrDefNew.
This has tremendous "fallout" in the rest of the code which mainly
has to change source.$field to source->$field.
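The shape of the change, reduced to hypothetical miniature structs (the real virDomainChrDef has many more members):

```c
#include <stdlib.h>

/* Hypothetical, heavily simplified versions of the structures. */
struct chr_source { int type; };

struct chr_def {
    struct chr_source *source;   /* was an embedded struct before */
};

/* Allocate the definition together with its source, so that callers
 * can always dereference def->source. Returns NULL on allocation failure. */
static struct chr_def *chr_def_new(void)
{
    struct chr_def *def = calloc(1, sizeof(*def));

    if (!def)
        return NULL;

    def->source = calloc(1, sizeof(*def->source));
    if (!def->source) {
        free(def);
        return NULL;
    }
    return def;
}

/* Free the definition and the source it owns. */
static void chr_def_free(struct chr_def *def)
{
    if (!def)
        return;
    free(def->source);
    free(def);
}
```

Every former def->source.type access becomes def->source->type, which is the mechanical fallout described above.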
Signed-off-by: John Ferlan <jferlan@redhat.com>
Name it virNumaGetHostMemoryNodeset and return only NUMA nodes which
have memory installed. This is necessary as the kernel is not very happy
to set the memory cgroup setting for nodes which do not have any memory.
This would break vcpu hotplug with the following message on such a
configuration:
Invalid value '0,8' for 'cpuset.mems': Invalid argument
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1375268
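As an illustration of the filtering, the sketch below builds a cpuset.mems-style list from per-node memory sizes and skips nodes that have none. The helper name and the array-of-sizes input are assumptions; the real function works on libvirt bitmaps.

```c
#include <stdio.h>

/* Format the indices of all nodes with non-zero memory as a
 * comma-separated list (e.g. "0,2"). node_mem[i] is the amount of
 * memory on node i, 0 meaning the node has none.
 * Returns 0 on success, -1 if buf is too small. */
static int format_memory_nodeset(const unsigned long long *node_mem,
                                 size_t nnodes, char *buf, size_t len)
{
    size_t used = 0;

    if (len == 0)
        return -1;
    buf[0] = '\0';

    for (size_t i = 0; i < nnodes; i++) {
        int n;

        if (node_mem[i] == 0)
            continue;   /* memoryless node: the kernel would reject it */

        n = snprintf(buf + used, len - used, "%s%zu", used ? "," : "", i);
        if (n < 0 || (size_t)n >= len - used)
            return -1;
        used += (size_t)n;
    }
    return 0;
}
```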
When hot-adding vcpus qemu needs to allocate some structures in the DMA
zone which may be outside of the numa pinning. Extract the code doing
this in a set of helpers so that it can be reused.
QEMU needs access to the /dev/dri/render* device for
virgl to work.
Allow access to all /dev/dri/* devices for domains with
<video>
<model type='virtio' heads='1' primary='yes'>
<acceleration accel3d='yes'/>
</model>
</video>
https://bugzilla.redhat.com/show_bug.cgi?id=1337290
When adding disk images to ACL we may call those functions on NFS
shares. In that case we might get an EACCES, which isn't really relevant
since NFS would not hold a block device. This patch adds a flag that
suppresses error reporting on EACCES to avoid spamming the logs.
Currently there's no functional change.
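A sketch of how such a flag could behave, assuming the convention that 0 means "is a device", 1 means "nothing to do" and -1 means error. The helper name is hypothetical:

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/stat.h>

/* Check whether path is a block or character device.
 * Returns 0 if it is, 1 if there is nothing to do (not a device, or
 * EACCES while ignore_eacces is set), -1 on any other error. */
static int check_device_path(const char *path, bool ignore_eacces)
{
    struct stat sb;

    if (stat(path, &sb) < 0) {
        if (errno == EACCES && ignore_eacces)
            return 1;   /* e.g. root-squashed NFS: cannot be a device */
        return -1;
    }

    if (!S_ISCHR(sb.st_mode) && !S_ISBLK(sb.st_mode))
        return 1;

    return 0;
}
```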
Since commit 47e5b5ae virCgroupAllowDevice allows passing -1 as either
the minor or major device number and automatically uses '*' in its
place. Reuse the new approach throughout the code and drop the duplicated
functions.
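The wildcard handling can be sketched as follows. This is a hypothetical formatter that produces a cgroup v1 devices rule ("type major:minor perms"), not the libvirt code itself:

```c
#include <stdio.h>
#include <string.h>

/* Format a devices cgroup rule such as "c 136:* rw".
 * A negative major or minor is written as the '*' wildcard.
 * Returns 0 on success, -1 if buf is too small. */
static int format_device_rule(char *buf, size_t len, char type,
                              int major, int minor, const char *perms)
{
    char majorstr[16];
    char minorstr[16];
    int n;

    if (major < 0)
        strcpy(majorstr, "*");
    else
        snprintf(majorstr, sizeof(majorstr), "%d", major);

    if (minor < 0)
        strcpy(minorstr, "*");
    else
        snprintf(minorstr, sizeof(minorstr), "%d", minor);

    n = snprintf(buf, len, "%c %s:%s %s", type, majorstr, minorstr, perms);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

One function thus covers the exact-device case and the "all minors of a major" case that previously needed a separate helper.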
Rather than iterating 3 times for various settings this function
aggregates all the code into a single place. One of the other advantages
is that it can then be reused for properly setting IOThread info on
hotplug.
Rather than iterating 3 times for various settings this function
aggregates all the code into a single place. One of the other advantages
is that it can then be reused for properly setting vCPU info on hotplug.
With this approach autoCpuset is also used when setting the process
affinity rather than just via cgroups.