Commit Graph

90 Commits

Author SHA1 Message Date
Pavel Hrdina
c359cb9aee vircgroup: workaround devices in hybrid mode
So the issue here is that you can end up with configuration where
you have cgroup v1 and v2 enabled at the same time and the devices
controllers is enabled for cgroup v1.

In cgroup v2 there is no devices controller, the device access is
controlled using BPF and since it is not a cgroup controller both
of them can exists at the same time and both of them are applied while
resolving access to devices.

In order to avoid configuring both BPF and cgroup v1 devices we will
use BPF if possible and otherwise fallback to cgroup v1 devices.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-11-15 12:58:43 +01:00
Pavel Hrdina
884479b42b vircgroup: introduce virCgroupV2DenyAllDevices
If we want to deny all devices we just need to replace any existing
program with new program with empty map.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-11-15 12:58:42 +01:00
Pavel Hrdina
285aefb31c vircgroup: introduce virCgroupV2AllowAllDevices
If we want to allow all devices with all permissions we need to replace
any existing program that has any rule configured, otherwise we just
need to add new rule which will for example allow read access to all
devices.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-11-15 12:58:41 +01:00
Pavel Hrdina
d5b09ce5d9 vircgroup: introduce virCgroupV2DenyDevice
In order to deny device we need to check if there is any entry in BPF
map and we need to load the current value from map if there is already
entry for that device.  If both values are same we can remove that entry
but if they are different we need to update the entry because we don't
have to deny all access, but for example only write access.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-11-15 12:58:40 +01:00
Pavel Hrdina
5d49651912 vircgroup: introduce virCgroupV2AllowDevice
In order to allow device we need to create key and value which will be
used to update BPF map.  virBPFUpdateElem() can override existing
entries in BPF map so we need to check if that entry exists in order to
track number of entries in our map.

This can add rule for specific device but major and minor can be both
-1 which follows the same behavior as in cgroup v1.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-11-15 12:58:39 +01:00
Pavel Hrdina
6a24bd75ed vircgroup: introduce virCgroupV2DevicesRemoveProg
We need to close our FD that we have for BPF program and map in order
to let kernel remove all resources once the cgroup is removed as well.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-11-15 12:58:34 +01:00
Pavel Hrdina
30b6ddc44c vircgroup: introduce virCgroupV2DevicesAvailable
There is no exact way how to figure out whether BPF devices support is
compiled into kernel.  One way is to check kernel configure options but
this is not reliable as it may not be available.  Let's try to do
syscall to which will list BPF cgroup device programs.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-11-15 12:58:04 +01:00
Michal Privoznik
91d88aaf23 util: Use g_strdup_printf() instead of virAsprintf()
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2019-11-12 16:15:58 +01:00
Michal Privoznik
3b4df5d350 Drop needless ret variable
In few places we have the following code pattern:

  int ret;
  ... /* @ret is not accessed here */
  ret = f(...);
  return ret;

This pattern can be written less verbose:

  ...
  return f(...);

This patch was generated with following coccinelle spatch:

  @@
  type T;
  constant C;
  expression f;
  identifier ret;
  @@
  -T ret = C;
   ... when != ret
  -ret = f;
  -return ret;
  +return f;

Afterwards I needed to fix a few places, e.g. comment in
virDomainNetIPParseXML() was removed too because coccinelle
thinks it refers to @ret while in fact it doesn't. Also in few
places it replaced @ret declaration with a few spaces instead of
removing the line. But nothing terribly wrong.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2019-10-24 08:10:37 +02:00
Ján Tomko
8212e5e4ab vircgroup: use g_strdup instead of VIR_STRDUP
Replace all occurrences of
  if (VIR_STRDUP(a, b) < 0)
     /* effectively dead code */
with:
  a = g_strdup(b);

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2019-10-21 12:51:58 +02:00
Ján Tomko
3cbd4351de Use g_strdup where VIR_STRDUP's return value was propagated
All the callers of these functions only check for a negative
return value.

However, virNetDevOpenvswitchGetVhostuserIfname is documented
as returning 1 for openvswitch interfaces so preserve that.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2019-10-21 12:51:55 +02:00
Ján Tomko
a3931b4996 util: use g_steal_pointer instead of VIR_STEAL_PTR
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2019-10-16 15:59:42 +02:00
Ján Tomko
1e2ae2e311 Use g_autofree instead of VIR_AUTOFREE
Since commit 44e7f02915
    util: rewrite auto cleanup macros to use glib's equivalent

VIR_AUTOFREE is just an alias for g_autofree. Use the GLib macros
directly instead of our custom aliases.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2019-10-16 12:06:43 +02:00
Ján Tomko
67e72053c1 Use G_N_ELEMENTS instead of ARRAY_CARDINALITY
Prefer the GLib version of the macro.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2019-10-15 16:14:19 +02:00
Ján Tomko
679f8b3994 util: use G_GNUC_UNUSED
Use G_GNUC_UNUSED from GLib instead of ATTRIBUTE_UNUSED.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2019-10-15 11:25:25 +02:00
Daniel P. Berrangé
d5d6dbcfb5 build: remove all gnulib bit manipulation modules
We're using gnulib to get ffs, ffsl, rotl32, count_one_bits,
and count_leading_zeros. Except for rotl32 they can all be
replaced with gcc/clangs builtins. rotl32 is a one-line
trivial function.

Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2019-10-07 13:39:26 +01:00
Cole Robinson
a95e585e13 vircgroup: Add some VIR_DEBUG statements
These helped with debugging
https://bugzilla.redhat.com/show_bug.cgi?id=1612383

Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Cole Robinson <crobinso@redhat.com>
2019-09-27 16:45:23 -04:00
Cole Robinson
b5290a6e6e vircgroupv2: Fix VM startup when legacy cgroups are defined
On Fedora 31, starting a 'mock' build alters /proc/$pid/cgroup,
probably due to usage of systemd-nspawn.

Before:
$ cat /proc/self/cgroup
0::/user.slice/user-1000.slice/...

After:
$ cat /proc/self/cgroup
1:name=systemd:/
0::/user.slice/user-1000.slice/...

The cgroupv2 code mishandles that first line in the second case, which
causes VM startup to fail with: Unable to read from
'/sys/fs/cgroup/machine/cgroup.controllers': No such file or directory

The kernel docs[1] say that the cgroupv2 path will always start with
'0::', which in the code here controllers="". Only set the v2 placement
path when we see that cgroup file entry.

[1] https://www.kernel.org/doc/html/v5.3/admin-guide/cgroup-v2.html#processes

https://bugzilla.redhat.com/show_bug.cgi?id=1751120

Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Cole Robinson <crobinso@redhat.com>
2019-09-27 16:45:23 -04:00
Pavel Hrdina
0bd4ad193d vircgroupv2: fix setting cpu.max period
When we set cpu.max period we need to parse the cpu.max file first as
it contains both quota and period values separated by space.  When only
a single number is written to that file it will set quota.  However,
in order to change period we need to write both values.

The code was prepared for that but mistakenly used new line to end the
string with the first value.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1749227

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
2019-09-06 09:24:43 +02:00
Pavel Hrdina
9a99b01f8d vircgroupv2: fix abort in VIR_AUTOFREE
Introduced by commit <c854e0bd33c7a5afb04a36465bf04f861b2efef5> that
tried to fix an issue where we would fail to parse values from files.

We cannot change the original pointer that is going to be used by
VIR_AUTOFREE.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1747440

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Acked-by: Peter Krempa <pkrempa@redhat.com>
2019-08-30 16:31:28 +02:00
Pavel Hrdina
23689cddd4 vircgroupv2: fix virCgroupV2GetCpuCfsQuota for "max" value
If the first value in cpu.max is "max" return from function.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1741837

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2019-08-20 17:37:37 +02:00
Pavel Hrdina
c854e0bd33 vircgroupv2: fix parsing multiple values in single file
Our virStrToLong* helpers converts string to integers where it wraps
strtol standard function.  After the conversion happens and there are
some remaining invalid characters our helpers will fail if the second
argument is NULL.

We need to pass pointer to string in cases where there are multiple
values in a single file.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1741825

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2019-08-20 17:37:37 +02:00
Pavel Hrdina
759bf903a6 vircgroupv2: remove ATTRIBUTE_UNUSED for used attribute
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Acked-by: Peter Krempa <pkrempa@redhat.com>
2019-07-25 10:51:56 +02:00
Pavel Hrdina
56fdf3f025 vircgroupv2: store enabled controllers
In cgroups v2 when a new group is created by default no controller is
enabled so the detection code will not detect any controllers.

When enabling the controllers we should also store them for the group.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Acked-by: Peter Krempa <pkrempa@redhat.com>
2019-07-25 10:51:53 +02:00
Pavel Hrdina
7b77f3a11e vircgroup: fix cgroups v2 controllers detection
When creating new group for cgroups v2 the we cannot check
cgroups.controllers for that cgroup because the directory is created
later.  In that case we should check cgroups.subtree_control of parent
group to get list of controllers enabled for child cgroups.

In order to achieve that we will prefer the parent group if it exists,
the current group will be used only for root group.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Acked-by: Peter Krempa <pkrempa@redhat.com>
2019-07-25 10:50:44 +02:00
Pavel Hrdina
62dd4d25a2 util: vircgroupv2: stop enabling missing controllers with systemd
Because of a systemd delegation policy [1] we should not write to any
cgroups files owned by systemd which in case of cgroups v2 includes
'cgroups.subtree_control'.

systemd will enable controllers automatically for us to have them
available for VM cgroups.

[1] <https://github.com/systemd/systemd/blob/master/docs/CGROUP_DELEGATION.md>

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-06-28 15:17:37 +02:00
Pavel Hrdina
d117431143 Revert "util: vircgroup: pass parent cgroup into virCgroupDetectControllersCB"
This reverts commit 7bca1c9bdc.

As it turns out it's not a good idea on systemd hosts.  The root
cgroup can have all controllers enabled but they don't have to be
enabled for sub-cgroups.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-06-28 15:17:26 +02:00
Pavel Hrdina
05807e5d42 util: vircgroupv2: mark only requested controllers as available
When detecting available controllers on host we can be limited by list
of controllers from qemu.conf file.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2019-06-26 13:34:01 +02:00
Pavel Hrdina
1d49cdcd11 util: vircgroupv2: don't error out if enabling controller fails
Currently CPU controller cannot be enabled if there is any real-time
task running and is assigned to non-root cgroup which is the case on
several distributions with graphical environment.

Instead of erroring out treat it as the controller is not available.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2019-06-26 13:34:01 +02:00
Pavel Hrdina
29a94a3fef util: vircgroupv2: separate return values of virCgroupV2EnableController
In order to skip controllers that we are not able to activate we need
to return different return value so the caller can decide what to do.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2019-06-26 13:34:01 +02:00
Pavel Hrdina
f9d1c08557 util: vircgroupv2: enable CPU controller only if it's available
It might happen that we are not able to enable CPU controller so we
can enable it for thread sub-cgroups only if it's available in parent
cgroup.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-06-26 13:34:01 +02:00
Pavel Hrdina
535bdf83c0 util: vircgroupv2: use any controller to create thread directory
The assumption that CPU controller would be always enabled is wrong, we
should use any available controller to create a new sub-cgroup.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-06-26 13:34:01 +02:00
Pavel Hrdina
d3007c844d util: vircgroup: improve controller detection
This affects only cgroups v2 where enabled controllers are not based on
available mount points but on the list provided in cgroup.controllers
file.  However, moving it will fill in placement as well, so it needs
to be freed together with mount point if we don't need that controller.

Before this patch we were assuming that all controllers available in
root cgroup where available in all other sub-cgroups which was wrong.

In order to fix it we need to move the cgroup controllers detection
after cgroup placement was prepared in order to build correct path for
cgroup.controllers file.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2019-06-26 13:34:01 +02:00
Pavel Hrdina
7bca1c9bdc util: vircgroup: pass parent cgroup into virCgroupDetectControllersCB
In cgroups v2 we don't have to detect available controllers every single
time if we are creating a new cgroup based on parent cgroup.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-06-26 13:34:01 +02:00
Pavel Hrdina
7e8a1a6e21 util: vircgroupv2: add support for BFQ files
In kernel 4.12 there was introduced new BFQ scheduler and in kernel
5.0 the old CFQ scheduler was removed.  This has an implication on
the cgroups file names.

If the CFQ controller is enabled we use one file:

    io.weight

The new BFQ controller expose one file with different name:

    io.bfq.weight

Except for different name they have different syntax.

io.weight:

    default $val
    major:minor $val

io.bfq.weight:

    $val

The difference is that BFQ doesn't support per-device weight.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-06-21 14:36:02 +02:00
Pavel Hrdina
c23829f18a util: vircgroup: move virCgroupGetValueStr out of virCgroupGetValueForBlkDev
If we need to get a path of specific file and we need to check its
existence before we use it then we can reuse that path to get value
for specific device.  This way we will not build the path again in
virCgroupGetValueForBlkDev.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-06-21 14:35:57 +02:00
Cole Robinson
1d31526b52 Always put _LAST enums on second line of VIR_ENUM_IMPL
Standardize on putting the _LAST enum value on the second line
of VIR_ENUM_IMPL invocations. Later patches that add string labels
to VIR_ENUM_IMPL will push most of these to the second line anyways,
so this saves some noise.

Signed-off-by: Cole Robinson <crobinso@redhat.com>
2019-04-11 12:47:23 -04:00
Peter Krempa
f785318187 Revert "Include unistd.h directly by files using it"
This reverts commit a5e1602090.

Getting rid of unistd.h from our headers will require more work than
just fixing the broken mingw build. Revert it until I have a more
complete proposal.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
2019-04-10 12:26:32 +02:00
Peter Krempa
a5e1602090 Include unistd.h directly by files using it
util/virutil.h bogously included unistd.h. Drop it and replace it by
including it directly where needed.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-04-10 09:12:04 +02:00
Peter Krempa
c0abcca417 util: Don't include 'viralloc.h' into other header files
'viralloc.h' does not provide any type or macro which would be necessary
in headers. Prevent leakage of the inclusion.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-04-10 09:12:04 +02:00
Pavel Hrdina
a6aedcf39b util: enable cgroups v2 cpuset controller for threads
When we create cgroup for qemu threads we need to enable cpuset
controller in order to use it.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2019-03-07 13:44:52 +01:00
Pavel Hrdina
3b72c84ff1 util: implement virCgroupV2(Set|Get)CpusetCpus
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2019-03-07 13:44:52 +01:00
Pavel Hrdina
77c1cf4da2 util: implement virCgroupV2(Set|Get)CpusetMemoryMigrate
Cgroups v2 don't have memory_migrate interface and the migration is
enabled by default.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2019-03-07 13:44:52 +01:00
Pavel Hrdina
74e7da0605 util: implement virCgroupV2(Set|Get)CpusetMems
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2019-03-07 13:44:52 +01:00
Ján Tomko
0f110d5ac8 Use NULLSTR_EMPTY
Instead of repetitive:
  s ? s : ""
use NULLSTR_EMPTY.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2019-02-14 14:09:38 +01:00
Cole Robinson
6a4d938dd3 Require a semicolon for VIR_ENUM_IMPL calls
Missing semicolon at the end of macros can confuse some analyzers
(like cppcheck <filename>), and we have a mix of semicolon and
non-semicolon usage through the code. Let's standardize on using
a semicolon for VIR_ENUM_IMPL calls.

Move the verify() statement to the end of the macro and drop
the semicolon, so the compiler will require callers to add a
semicolon.

While we are touching these call sites, standardize on putting
the closing parenth on its own line, as discussed here:
https://www.redhat.com/archives/libvir-list/2019-January/msg00750.html

Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Cole Robinson <crobinso@redhat.com>
2019-02-03 17:46:29 -05:00
Daniel P. Berrangé
568a417224 Enforce a standard header file guard symbol name
Require that all headers are guarded by a symbol named

  LIBVIRT_$FILENAME

where $FILENAME is the uppercased filename, with all characters
outside a-z changed into '_'.

Note we do not use a leading __ because that is technically a
namespace reserved for the toolchain.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2018-12-14 10:47:13 +00:00
Pavel Hrdina
634bd528cb vircgroupv2: fix virCgroupV2ValidateMachineGroup
When libvirt is reconnecting to running domain that uses cgroup v2
the QEMU process reports cgroup for the emulator directory because the
main thread is in that cgroup.  We need to remove the "/emulator" part
in order to match with the root cgroup directory name for that domain.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2018-12-14 11:30:20 +08:00
Pavel Hrdina
b532546823 vircgroup: introduce virCgroupKillRecursiveCB
The rewrite to support cgroup v2 missed this function.  In cgroup v2
we have different files to track tasks.

We would fail to remove cgroup on non-systemd OSes if there is any
extra process assigned to guest cgroup because we would not kill any
process form the guest cgroup.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2018-12-14 11:27:18 +08:00
Peter Chubb
b8176d6eaa util: Fix virCgroupGetMemoryStat
Commit 901d2b9c introduced virCgroupGetMemoryStat and replaced
the LXC virLXCCgroupGetMemStat logic in commit e634c7cd0. However,
in doing so the replacement wasn't exact as the LXC logic used
getline() to process the cgroup controller data, while the new
virCgroupGetMemoryStat used "memory.stat" manual buffer read/
processing which neglected to forward through @line in order
to read each line in the output.

To fix that, we should be sure to carry forward the @line value
for each line read updating it beyond that current @newLine value
once we've calculated the values that we want.

Signed-off-by: Peter Chubb <peter.chubb@data61.csiro.au>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2018-11-14 14:45:02 -05:00