Commit Graph

35477 Commits

Author SHA1 Message Date
Pavel Hrdina
4c0398b528 qemu_process: fix starting VMs if machine group has limited cpuset.cpus
Commit <f136b83139c63f20de0df3285d9e82df2fb97bfc> reworked process
affinity setting but did not take cgroups into account, which introduced
an issue when starting a VM with custom cpuset.cpus for the whole machine
group.

If the machine group is limited to some pCPUs, libvirt should not try to
set a VM to run on all pCPUs as it will result in permission denied when
writing to cpuset.cpus.

To fix this, the affinity has to be set separately from cgroups cpuset.

Resolves: <https://bugzilla.redhat.com/show_bug.cgi?id=1746517>

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2019-11-18 10:41:44 +01:00
Michal Privoznik
02bf7cc68b virbpf: Fix typecast to __aligned_u64 type
The functions implemented here fill the attr union (of type
bpf_attr) and pass it to syscall(2). Some of the union members
are of type __aligned_u64, which is not a regular uint64_t: it is
explicitly aligned to 8 bytes, while uint64_t can be aligned to
4 bytes (on 32-bit architectures). We've used an explicit typecast
to uint64_t to silence the compiler, which would otherwise
complain about assigning a pointer to an integer. Well, we have
uintptr_t just for that.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2019-11-18 08:59:13 +01:00
Michal Privoznik
c10b78370d vircgroupv2devices: Fix format string for size_t variable
In virCgroupV2DevicesReallocMap() we debug-print both
arguments passed to the function. However, the @size argument is
of type size_t but '%lu' is used to format it.
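
For reference, C99 provides the 'z' length modifier for size_t. A
minimal sketch follows; the helper name and the "size=" prefix are made
up for illustration and are not the actual debug message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format a size_t portably with "%zu".
 * "%lu" is only correct where size_t happens to be unsigned long. */
static int example_format_size(char *buf, size_t buflen, size_t size)
{
    assert(buf != NULL && buflen > 0);
    return snprintf(buf, buflen, "size=%zu", size);
}
```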

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2019-11-18 08:53:30 +01:00
Jonathon Jongsma
2de5e131b9 news: mention 'ramfb' mdev attribute
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
2019-11-17 20:13:52 -05:00
Michal Privoznik
c07a33bef9 virbpf: Check if syscall() is available
Some OSes have neither syscall() nor <sys/syscall.h>. We already
check for the header file in the configure phase, so we just need
to add a check for HAVE_SYS_SYSCALL_H to
HAVE_DECL_BPF_PROG_QUERY.

While I'm at it, some header files we are including are not
needed, so their includes can be safely dropped.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2019-11-16 06:39:23 +01:00
Jim Fehlig
5a5e92000d spec: Remove build-time list of edk2 firmwares
Fedora now advertises supported firmwares via descriptor files.
Since the upstream spec file assumes recent Fedora, remove the
build-time list of firmwares, which can produce a warning after
commit 75597f022a.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2019-11-15 16:49:30 -07:00
Jonathon Jongsma
889cd827ae conf: validate video resolution
Ensure that both x and y are non-zero when resolution is specified for a
video device.
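
The rule can be sketched as a small predicate. The struct and function
names are hypothetical, and treating a NULL pointer as "no resolution
specified" is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a parsed video resolution. */
struct example_video_res {
    unsigned int x;
    unsigned int y;
};

/* A resolution is optional (NULL means "not specified" here); when it
 * is present, both dimensions must be non-zero. */
static bool example_resolution_is_valid(const struct example_video_res *res)
{
    if (res == NULL)
        return true;
    return res->x != 0 && res->y != 0;
}
```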

Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
2019-11-15 13:30:56 -05:00
Jonathon Jongsma
026c2ffb50 conf: report errors when parsing video acceleration
Since this function is now only called when an 'acceleration' element is
present in the xml, any failure to parse the element will be considered
an error.

Previously, we detected some types of errors but only logged them
(virReportError()) and still returned a partially-specified accel
object to the caller. This patch returns NULL for all parsing errors and
reports that error back up to the caller.

Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
2019-11-15 13:30:56 -05:00
Jonathon Jongsma
754e4c24ec conf: report errors when parsing video resolution
The current code doesn't properly handle errors when parsing a video
device's resolution.  We were returning a NULL structure for the case
where 'x' or 'y' were missing. But for the other error cases, we were
logging an error (virReportError()), but still returning an
under-specified structure. That under-specified structure was used by
the calling function rather than properly reporting an error.

This patch changes the parse function to return NULL on any parsing
error and changes the calling function to report an error when NULL is
returned.
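
The "NULL on any parsing error" contract could look roughly like the
sketch below. It works on plain strings instead of XML nodes, and all
names are hypothetical; it is not the libvirt parser.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

struct example_res {
    unsigned int x;
    unsigned int y;
};

/* Parse a decimal unsigned int; return 0 on success, -1 on any error
 * (missing string, non-digit characters, out-of-range value). */
static int example_parse_uint(const char *str, unsigned int *out)
{
    char *end = NULL;
    unsigned long val;

    if (str == NULL || *str < '0' || *str > '9')
        return -1;
    errno = 0;
    val = strtoul(str, &end, 10);
    if (errno != 0 || *end != '\0' || val > UINT_MAX)
        return -1;
    *out = (unsigned int)val;
    return 0;
}

/* Return a heap-allocated resolution, or NULL on ANY parsing error, so
 * the caller never sees a half-filled structure.  The caller frees it. */
static struct example_res *example_parse_resolution(const char *x,
                                                    const char *y)
{
    struct example_res *res;
    unsigned int px, py;

    if (example_parse_uint(x, &px) < 0 || example_parse_uint(y, &py) < 0)
        return NULL;
    res = malloc(sizeof(*res));
    if (res == NULL)
        return NULL;
    res->x = px;
    res->y = py;
    return res;
}
```

A caller then only has to check for NULL and report the error, rather
than inspecting individual fields.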

Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
2019-11-15 13:30:56 -05:00
Jonathon Jongsma
333cca0bfc conf: iterate video model children in parent function
Previously, we were passing the video "model" node to the "acceleration"
and "resolution" parsing functions and requiring them to iterate over
the children to discover and parse the appropriate node. It makes more
sense to move this responsibility up to the parent function and just
pass these functions the node that needs to be parsed.

Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
2019-11-15 13:30:55 -05:00
Miguel Ángel Arruga Vivas
a74df786a2 vircgroup: Ensure /machine group is associated with its parent
In virCgroupNewPartition, first call virCgroupNew on the parent group,
if it is available, before creating the child group.  This ensures
that a first-level group can be created on the unified architecture,
because the check in virCgroupV2ParseControllersFile passes when the
parent file is there.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1760233

Signed-off-by: Miguel Ángel Arruga Vivas <rosen644835@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2019-11-15 15:45:25 +01:00
Miguel Ángel Arruga Vivas
ddcb33bdc0 doc: cgroups: Remove unwanted references to systemd
The non-systemd configurations create neither system nor user
control groups.  The title of the diagram referenced systemd too.

Signed-off-by: Miguel Ángel Arruga Vivas <rosen644835@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2019-11-15 15:45:20 +01:00
Gregor Kopka
98f931de7c Allow a zfs pool or dataset as source for zfs storage backend
Enables hosting a pool on an existing zfs pool without affecting
other datasets there.
Specify dataset instead of pool as source to use.
Parent of dataset must exist for pool-build to succeed.
Beware that pool-delete destroys the source dataset and all children.

Solves: https://www.redhat.com/archives/libvirt-users/2017-April/msg00041.html

Signed-off-by: Gregor Kopka <gregor@kopka.net>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2019-11-15 15:25:53 +01:00
Pavel Hrdina
43b01ef2d6 replace use of gnulib snprintf by g_snprintf
Glib implementation follows the ISO C99 standard so it's safe to replace
the gnulib implementation.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2019-11-15 15:07:40 +01:00
Pavel Hrdina
8addef2bef vircgroupmock: mock virCgroupV2DevicesAvailable
We need to mock virCgroupV2DevicesAvailable() in order to remove any
dependency on kernel as BPF devices might not be available.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-11-15 12:58:43 +01:00
Pavel Hrdina
c359cb9aee vircgroup: workaround devices in hybrid mode
The issue here is that you can end up with a configuration where
you have cgroup v1 and v2 enabled at the same time and the devices
controller is enabled for cgroup v1.

In cgroup v2 there is no devices controller; device access is
controlled using BPF. Since BPF is not a cgroup controller, both
of them can exist at the same time and both of them are applied while
resolving access to devices.

In order to avoid configuring both BPF and cgroup v1 devices we will
use BPF if possible and otherwise fall back to cgroup v1 devices.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-11-15 12:58:43 +01:00
Pavel Hrdina
884479b42b vircgroup: introduce virCgroupV2DenyAllDevices
If we want to deny all devices we just need to replace any existing
program with a new program with an empty map.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-11-15 12:58:42 +01:00
Pavel Hrdina
285aefb31c vircgroup: introduce virCgroupV2AllowAllDevices
If we want to allow all devices with all permissions we need to replace
any existing program that has any rule configured. Otherwise we just
need to add a new rule which will, for example, allow read access to all
devices.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-11-15 12:58:41 +01:00
Pavel Hrdina
d5b09ce5d9 vircgroup: introduce virCgroupV2DenyDevice
In order to deny a device we need to check if there is any entry in the
BPF map, and we need to load the current value from the map if there is
already an entry for that device.  If both values are the same we can
remove that entry, but if they are different we need to update the entry
because we don't have to deny all access, but for example only write
access.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-11-15 12:58:40 +01:00
Pavel Hrdina
5d49651912 vircgroup: introduce virCgroupV2AllowDevice
In order to allow a device we need to create a key and value which will
be used to update the BPF map.  virBPFUpdateElem() can override existing
entries in the BPF map so we need to check if that entry exists in order
to track the number of entries in our map.

This can add a rule for a specific device, but major and minor can both
be -1, which follows the same behavior as in cgroup v1.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-11-15 12:58:39 +01:00
Pavel Hrdina
b18b0ce609 vircgroup: introduce virCgroupV2DevicesGetKey
Device rules are stored in a BPF map of the hash type. This function
will create a key based on the major and minor ID of the device.
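
One plausible way to build such a key is to pack both IDs into a single
64-bit value. The layout below (major in the upper half, minor in the
lower half) is an assumption made for illustration; the commit message
does not spell out the layout.

```c
#include <assert.h>
#include <stdint.h>

/* Pack major into the upper 32 bits and minor into the lower 32 bits.
 * A wildcard value of -1 becomes 0xffffffff in its half of the key. */
static uint64_t example_device_key(int major, int minor)
{
    return ((uint64_t)(uint32_t)major << 32) | (uint64_t)(uint32_t)minor;
}
```

Casting through uint32_t first keeps a negative minor from
sign-extending into the bits reserved for major.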

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-11-15 12:58:38 +01:00
Pavel Hrdina
63cfe7b84d vircgroup: introduce virCgroupV2DeviceGetPerms
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-11-15 12:58:37 +01:00
Pavel Hrdina
6a24bd75ed vircgroup: introduce virCgroupV2DevicesRemoveProg
We need to close the FDs that we hold for the BPF program and map in
order to let the kernel remove all resources once the cgroup is removed
as well.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-11-15 12:58:34 +01:00
Pavel Hrdina
ef747499a5 vircgroup: introduce virCgroupV2DevicesPrepareProg
This function will be called for every virCgroup(Allow|Deny)* API in
order to prepare the BPF program for a guest.  Since libvirtd can be
restarted at any point, we will first try to detect an existing program.
If there is none, we will create a new empty BPF program. Lastly, if we
don't have any space left in the existing BPF map, we will create a new
copy of the BPF map with more space and attach a new program with that
map to the guest cgroup.

This solution allows us to start with a reasonably small BPF map
consuming only a small amount of memory and, if needed, we can easily
extend the BPF map if a lot of host devices are used in the guest or if
the user wants to hot-plug a lot of devices once the guest is running.

Since there is no way to reallocate an existing BPF map, we need to
create a new copy if we run out of space in the current BPF map.

This overcomes all the limitations in BPF:

    - map used in program has to be created before the program is loaded
      into kernel

    - once map is created you cannot change its size

    - you cannot replace map in existing program

    - you cannot use an array of maps because it can store FD to maps
      of one specific size so we would not be able to use it to overcome
      the second issue
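
The grow-by-copy approach described above can be sketched in plain C
with an ordinary array standing in for the BPF map. All names are
hypothetical, and doubling the capacity is an assumption; the commit
message only says the new map has "more space".

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a BPF map: capacity is fixed at creation. */
struct example_map {
    size_t capacity;
    size_t count;
    uint64_t *keys;
};

static struct example_map *example_map_new(size_t capacity)
{
    struct example_map *map;

    assert(capacity > 0);
    map = calloc(1, sizeof(*map));
    if (map == NULL)
        return NULL;
    map->keys = calloc(capacity, sizeof(*map->keys));
    if (map->keys == NULL) {
        free(map);
        return NULL;
    }
    map->capacity = capacity;
    return map;
}

static void example_map_free(struct example_map *map)
{
    if (map == NULL)
        return;
    free(map->keys);
    free(map);
}

/* Make sure *mapp has room for one more entry.  A full map is never
 * resized in place: a larger copy is created, the entries are copied,
 * the copy is swapped in and the old map is released.  Returns 0 on
 * success and -1 on allocation failure (leaving *mapp untouched). */
static int example_map_prepare(struct example_map **mapp)
{
    struct example_map *old = *mapp;
    struct example_map *bigger;

    if (old->count < old->capacity)
        return 0;
    bigger = example_map_new(old->capacity * 2);
    if (bigger == NULL)
        return -1;
    memcpy(bigger->keys, old->keys, old->count * sizeof(*old->keys));
    bigger->count = old->count;
    *mapp = bigger;
    example_map_free(old);
    return 0;
}
```

In the real setting the swap step also involves attaching a new program
that uses the new map, since a map cannot be replaced in an existing
program.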

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-11-15 12:58:33 +01:00
Pavel Hrdina
afa2788662 vircgroup: introduce virCgroupV2DevicesCreateProg
This function creates a new BPF program with a new empty BPF map of the
default size and attaches it to the guest cgroup.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-11-15 12:58:32 +01:00
Pavel Hrdina
ce11a5c59f vircgroup: introduce virCgroupV2DevicesDetectProg
This function will be called if libvirtd was restarted while some
domains were running.  It will try to detect existing programs attached
to the guest cgroup.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-11-15 12:58:31 +01:00
Pavel Hrdina
48423a0b5d vircgroup: introduce virCgroupV2DevicesAttachProg
This function loads the BPF program with the prepared map into the
kernel and attaches it to the guest cgroup.  It can also be used to
replace an existing program in the cgroup if we need to resize the BPF
map to store more rules for devices. The old program will be closed and
removed from the kernel.

There are two possible ways to create a BPF program:

    - One way is to write simple C-like code which can be compiled into
      a BPF object file which can be loaded into the kernel using
      elfutils.

    - The second way is to define macros which look like assembler
      instructions and can be used directly to create a BPF program that
      can be directly loaded into the kernel.

Since the program is not too complex we can use the second option.

If there is no program, all devices are allowed. If there is a
program, it is executed and access is decided by its exit status:
denied for 0 and allowed for 1.

Our program will follow these rules:

    - first it will try to look for the specific key using major and
      minor to see if there is any rule for that specific device

    - if there is no specific rule it will try to look for any rule that
      matches only major of the device

    - if there is no match with major it will try the same but with
      minor of the device

    - as the last attempt it will try to look for rule for all devices
      and if there is no match it will return 0 to deny that access
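
The lookup order in the list above can be sketched in ordinary C. This
is not the BPF program itself: a linear search over an array stands in
for the BPF map, permission masks and device types are left out, and
all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define EXAMPLE_ANY (-1) /* wildcard for major or minor */

struct example_rule {
    int major;
    int minor;
};

/* Return 1 if a rule with exactly this (major, minor) pair exists. */
static int example_has_rule(const struct example_rule *rules, size_t n,
                            int major, int minor)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (rules[i].major == major && rules[i].minor == minor)
            return 1;
    }
    return 0;
}

/* Return 1 to allow access and 0 to deny it, trying the most specific
 * rule first and the catch-all rule last. */
static int example_check_access(const struct example_rule *rules, size_t n,
                                int major, int minor)
{
    if (example_has_rule(rules, n, major, minor))             /* exact device */
        return 1;
    if (example_has_rule(rules, n, major, EXAMPLE_ANY))       /* major only */
        return 1;
    if (example_has_rule(rules, n, EXAMPLE_ANY, minor))       /* minor only */
        return 1;
    if (example_has_rule(rules, n, EXAMPLE_ANY, EXAMPLE_ANY)) /* all devices */
        return 1;
    return 0;                                                 /* deny */
}
```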

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-11-15 12:58:05 +01:00
Pavel Hrdina
30b6ddc44c vircgroup: introduce virCgroupV2DevicesAvailable
There is no exact way to figure out whether BPF devices support is
compiled into the kernel.  One way is to check the kernel configure
options, but this is not reliable as they may not be available.  Let's
try a syscall which will list BPF cgroup device programs.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-11-15 12:58:04 +01:00
Pavel Hrdina
07946d6e39 util: introduce virbpf helpers
In order to implement the devices controller with cgroup v2 we need to
add support for BPF programs, because cgroup v2 doesn't have a devices
controller.

This introduces the required helpers wrapping Linux syscalls.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-11-15 12:58:00 +01:00
Michal Privoznik
9e4445ebc3 tests: Mock access to /dev/kvm
Some of our tests try to validate the domain XMLs they are working
with (not intentionally, simply because they call the top-level
domain XML parse function). This implies that we also build
domain capabilities - see
virQEMUDriverGetDomainCapabilities(). And since some domain XMLs
are of type 'kvm', control passes through
virQEMUCapsFillDomainCaps() and virHostCPUGetKVMMaxVCPUs() to
opening /dev/kvm, which may be missing on the machine where we're
running 'make check'.

Previously, we did not see this issue, because it was masked. If
building domain capabilities failed for whatever reason, we
ignored the failure. Only v5.9.0-207-gc69e6edea3 uncovered the
problem (it changed retval from 0 to -1 if
virQEMUDriverGetDomainCapabilities() fails). Since the referenced
commit is correct, we need to mock access to /dev/kvm in our
tests.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2019-11-15 11:56:46 +01:00
Jiri Denemark
7bd41cb62c virsh: Fix typo in the man page
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2019-11-15 09:34:20 +01:00
Jonathon Jongsma
95f5ac9ae5 Add API to change qemu agent response timeout
Some layered products such as oVirt have requested a way to avoid being
blocked by guest agent commands when querying a loaded vm. For example,
many guest agent commands are polled periodically to monitor changes,
and rather than blocking the calling process, they'd prefer to simply
time out when an agent query is taking too long.

This patch adds a way for the user to specify a custom agent timeout
that is applied to all agent commands.

One special case to note here is the 'guest-sync' command. 'guest-sync'
is issued internally prior to calling any other command. (For example,
when libvirt wants to call 'guest-get-fsinfo', we first call
'guest-sync' and then call 'guest-get-fsinfo').

Previously, the 'guest-sync' command used a 5-second timeout
(VIR_DOMAIN_QEMU_AGENT_COMMAND_DEFAULT), whereas the actual command that
followed always blocked indefinitely
(VIR_DOMAIN_QEMU_AGENT_COMMAND_BLOCK). As part of this patch, if a
custom timeout is specified that is shorter than 5 seconds, this new
timeout is also used for 'guest-sync'. If there is no custom timeout or
if the custom timeout is longer than 5 seconds, we will continue to use
the 5-second timeout.
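
The 'guest-sync' rule above boils down to a small selection function.
The names below are hypothetical, and encoding "no custom timeout" as a
negative value is an assumption of this sketch.

```c
#include <assert.h>

#define EXAMPLE_SYNC_DEFAULT_TIMEOUT 5 /* seconds */

/* Pick the timeout for 'guest-sync'.  'custom_timeout' is in seconds;
 * a negative value means that no custom timeout was configured. */
static int example_sync_timeout(int custom_timeout)
{
    if (custom_timeout >= 0 && custom_timeout < EXAMPLE_SYNC_DEFAULT_TIMEOUT)
        return custom_timeout;
    return EXAMPLE_SYNC_DEFAULT_TIMEOUT;
}
```

So a 2-second custom timeout applies to 'guest-sync' as well, while a
30-second one leaves 'guest-sync' at 5 seconds.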

Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2019-11-14 19:10:01 +01:00
Ján Tomko
954f36e078 syntax-check: prefer g_mkstemp_full and g_mkdtemp
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2019-11-14 19:02:31 +01:00
Ján Tomko
ef88698668 Use g_mkdtemp instead of mkdtemp
Prefer the GLib version to the one from gnulib.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2019-11-14 19:02:31 +01:00
Ján Tomko
4ac4773040 Use g_mkstemp_full instead of mkostemp(s)
With g_mkstemp_full, there is no need to distinguish between
mkostemp and mkostemps (no suffix vs. a suffix of a fixed length),
because the GLib function looks for the XXXXXX pattern everywhere
in the string.

Use S_IRUSR | S_IWUSR for the permissions and do not pass O_RDWR
in flags since it's implied.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2019-11-14 19:02:31 +01:00
Ján Tomko
c4ae19d1ec tests: use GRegex in vboxsnapshotxmltest
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2019-11-14 17:45:40 +01:00
Ján Tomko
b96e0dbba9 util: use GRegex in virStringMatch
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2019-11-14 17:45:40 +01:00
Ján Tomko
9c76dd3a2e util: use GRegex in virStringSearch
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2019-11-14 17:45:40 +01:00
Ján Tomko
514b2b272b util: use GRegex for virLogRegex
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2019-11-14 17:45:40 +01:00
Ján Tomko
039d26fcb0 util: use GRegex in virCommandRunRegex
This saves us from allocating vars upfront, since GLib deals with
that for us.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2019-11-14 17:45:40 +01:00
Ján Tomko
70d6994679 storage: use GRegex virStorageBackendLogicalParseVolExtents
Using GRegex simplifies the code since g_match_info_fetch will
copy the matched substring for us.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2019-11-14 17:45:40 +01:00
Ján Tomko
815db3ea58 libxl: remove 'ret' from xenParseSxprVifRate
Now that the cleanup section is empty, the ret variable is no longer
necessary.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2019-11-14 17:45:40 +01:00
Ján Tomko
c4ac8e4168 libxl: use GRegex in xenParseSxprVifRate
Use GRegex from GLib instead of regcomp.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2019-11-14 17:45:40 +01:00
Ján Tomko
5c98d442df libxl: use g_autofree in xenParseSxprVifRate
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2019-11-14 17:45:40 +01:00
Ján Tomko
77d228468d libxl: use GRegex in libxlGetAutoballoonConf
Replace the use of regcomp with GRegex.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2019-11-14 17:45:40 +01:00
Ján Tomko
5c89468ff2 remove unused regex.h includes
The code using regexes got moved, but the include stayed.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2019-11-14 17:45:40 +01:00
Ján Tomko
8aa0f8e6dc libxl: do not use G_REGEX_EXTENDED
This flag is not needed to use extended regular expression syntax
with GRegex and it makes GRegex ignore whitespace in the regex.

Remove the unintended usage, even though it should not matter in this
case.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2019-11-14 17:45:40 +01:00
Jonathon Jongsma
4b95738c8f qemu: add 'ramfb' attribute for mediated devices
The 'ramfb' attribute provides a framebuffer to the guest that can be
used as a boot display for the vgpu.

For example, the following configuration can be used to provide a vgpu
with a boot display:

    <hostdev mode='subsystem' type='mdev' model='vfio-pci' display='on' ramfb='on'>
        <source>
            <address uuid='$UUID'/>
        </source>
    </hostdev>

Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
2019-11-14 11:37:50 -05:00
Jonathon Jongsma
c66f2be6f1 qemu: use domain caps to validate video device model
As suggested by Cole, this patch uses the domain capabilities to
validate the supported video model types. This allows us to remove the
model type validation from qemu_process.c and qemu_domain.c and
consolidates it all in a single place that will automatically adjust
when new domain capabilities are added.

Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
2019-11-14 11:37:50 -05:00
Jonathon Jongsma
42cc3eb912 qemu: move validation of video accel to qemu_domain.c
Continue consolidation of video device validation started in previous
patch.

Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
2019-11-14 11:37:50 -05:00