Commit Graph

98 Commits

Author SHA1 Message Date
Pavel Hrdina
c88b3712ca vircgroup: remove useless cgroup->path variable
It is only used for debug and error purposes which can be easily
replaced by @placement.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-11-03 21:26:32 +01:00
Pavel Hrdina
9d312af357 vircgroupv2: detect controllers enabled in parent cgroup
With cgroups v2 working with controllers is a bit more complicated then
with cgroups v1 where the controller had to be mounted.

There are two files, cgroups.controllers and cgroup.subtree_control.
The file cgroup.controllers lists all controllers enabled in the current
cgroup and cgroups.subtree_control, as the name suggest, controls which
controllers are enabled for a subtree of cgroups.

Now the issue here is that the current code doesn't make any difference
if the @parent variable is NULL or not because ../cgroup.subtree_control
will list the same controllers as ./cgroup.controllers.

The whole point of the @parent variable is when we are building the
cgroup topology ourselves without systemd help we need to detect which
controllers are enabled in the parent cgroup in order to enable them for
the current cgroup as well and for that we need to check
cgroup.controllers of the parent group.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-11-03 21:26:32 +01:00
Pavel Hrdina
902c6644a8 vircgroupv2: properly detect placement of running VM
When libvirtd starts a VM it internally stores a path to the main
cgroup. When we restart libvirtd we should get to the same state.

When we start a VM on host with systemd the cgroup is created for us and
the process is already placed into that cgroup and we detect the path
created by systemd using /proc/$PID/cgroup. After that we create
sub-cgroups and move all threads there.

Once libvirtd is restarted we again detect the cgroup path using
/proc/$PID/cgroup, but in this case we will get a different path because
the main thread was moved to a "emulator" cgroup.

Instead of ignoring the "emulator" directory when validating cgroups
remove it completely when detecting cgroup otherwise cgroups will not
work properly when libvirtd is restarted.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-11-03 21:26:32 +01:00
Pavel Hrdina
e85cfb095a vircgroupv2: properly detect empty tasks
With cgroups v2 the file cgroup.procs will never be empty if threading
is enabled as it will always have ID of all processes even if all
threads of the processes are moved to sub-cgroups. If that happens the
file cgroup.threads will be empty.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-11-03 21:26:32 +01:00
Peter Krempa
b16629f00c util: cgroup: Use GHashTable instead of virHashTable
Rewrite using GHashTable which already has interfaces for using a number
as hash key. Glib's implementation doesn't copy the key by default, so
we need to allocate it, but overal the interface is more suited for this
case.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2020-10-22 15:02:46 +02:00
Ján Tomko
ef87d60120 util: cgroup: remove unused opts in virCgroupV2BindMount
In virCgroupV2BindMount there is an unused variable containing
what seem to be tmpfs mount options.

Delete it. Unlike with cgroups v1, we do not create a tmpfs
here.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2020-08-03 15:52:09 +02:00
Ján Tomko
ee247e1d3f Use g_strfeev instead of virStringFreeList
Both accept a NULL value gracefully and virStringFreeList
does not zero the pointer afterwards, so a straight replace
is safe.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2020-08-03 15:37:36 +02:00
Ján Tomko
33f6260352 util: vircgroup: include unistd.h rather than virutil.h
There is nothing in the vircgroup.h header file
requiring virutil.h.

Remove it and include unistd.h in the C files.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2020-02-24 23:15:49 +01:00
Pavel Hrdina
c359cb9aee vircgroup: workaround devices in hybrid mode
So the issue here is that you can end up with configuration where
you have cgroup v1 and v2 enabled at the same time and the devices
controllers is enabled for cgroup v1.

In cgroup v2 there is no devices controller, the device access is
controlled using BPF and since it is not a cgroup controller both
of them can exists at the same time and both of them are applied while
resolving access to devices.

In order to avoid configuring both BPF and cgroup v1 devices we will
use BPF if possible and otherwise fallback to cgroup v1 devices.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-11-15 12:58:43 +01:00
Pavel Hrdina
884479b42b vircgroup: introduce virCgroupV2DenyAllDevices
If we want to deny all devices we just need to replace any existing
program with new program with empty map.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-11-15 12:58:42 +01:00
Pavel Hrdina
285aefb31c vircgroup: introduce virCgroupV2AllowAllDevices
If we want to allow all devices with all permissions we need to replace
any existing program that has any rule configured, otherwise we just
need to add new rule which will for example allow read access to all
devices.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-11-15 12:58:41 +01:00
Pavel Hrdina
d5b09ce5d9 vircgroup: introduce virCgroupV2DenyDevice
In order to deny device we need to check if there is any entry in BPF
map and we need to load the current value from map if there is already
entry for that device.  If both values are same we can remove that entry
but if they are different we need to update the entry because we don't
have to deny all access, but for example only write access.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-11-15 12:58:40 +01:00
Pavel Hrdina
5d49651912 vircgroup: introduce virCgroupV2AllowDevice
In order to allow device we need to create key and value which will be
used to update BPF map.  virBPFUpdateElem() can override existing
entries in BPF map so we need to check if that entry exists in order to
track number of entries in our map.

This can add rule for specific device but major and minor can be both
-1 which follows the same behavior as in cgroup v1.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-11-15 12:58:39 +01:00
Pavel Hrdina
6a24bd75ed vircgroup: introduce virCgroupV2DevicesRemoveProg
We need to close our FD that we have for BPF program and map in order
to let kernel remove all resources once the cgroup is removed as well.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-11-15 12:58:34 +01:00
Pavel Hrdina
30b6ddc44c vircgroup: introduce virCgroupV2DevicesAvailable
There is no exact way how to figure out whether BPF devices support is
compiled into kernel.  One way is to check kernel configure options but
this is not reliable as it may not be available.  Let's try to do
syscall to which will list BPF cgroup device programs.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-11-15 12:58:04 +01:00
Michal Privoznik
91d88aaf23 util: Use g_strdup_printf() instead of virAsprintf()
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2019-11-12 16:15:58 +01:00
Michal Privoznik
3b4df5d350 Drop needless ret variable
In few places we have the following code pattern:

  int ret;
  ... /* @ret is not accessed here */
  ret = f(...);
  return ret;

This pattern can be written less verbose:

  ...
  return f(...);

This patch was generated with following coccinelle spatch:

  @@
  type T;
  constant C;
  expression f;
  identifier ret;
  @@
  -T ret = C;
   ... when != ret
  -ret = f;
  -return ret;
  +return f;

Afterwards I needed to fix a few places, e.g. comment in
virDomainNetIPParseXML() was removed too because coccinelle
thinks it refers to @ret while in fact it doesn't. Also in few
places it replaced @ret declaration with a few spaces instead of
removing the line. But nothing terribly wrong.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2019-10-24 08:10:37 +02:00
Ján Tomko
8212e5e4ab vircgroup: use g_strdup instead of VIR_STRDUP
Replace all occurrences of
  if (VIR_STRDUP(a, b) < 0)
     /* effectively dead code */
with:
  a = g_strdup(b);

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2019-10-21 12:51:58 +02:00
Ján Tomko
3cbd4351de Use g_strdup where VIR_STRDUP's return value was propagated
All the callers of these functions only check for a negative
return value.

However, virNetDevOpenvswitchGetVhostuserIfname is documented
as returning 1 for openvswitch interfaces so preserve that.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2019-10-21 12:51:55 +02:00
Ján Tomko
a3931b4996 util: use g_steal_pointer instead of VIR_STEAL_PTR
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2019-10-16 15:59:42 +02:00
Ján Tomko
1e2ae2e311 Use g_autofree instead of VIR_AUTOFREE
Since commit 44e7f02915
    util: rewrite auto cleanup macros to use glib's equivalent

VIR_AUTOFREE is just an alias for g_autofree. Use the GLib macros
directly instead of our custom aliases.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2019-10-16 12:06:43 +02:00
Ján Tomko
67e72053c1 Use G_N_ELEMENTS instead of ARRAY_CARDINALITY
Prefer the GLib version of the macro.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2019-10-15 16:14:19 +02:00
Ján Tomko
679f8b3994 util: use G_GNUC_UNUSED
Use G_GNUC_UNUSED from GLib instead of ATTRIBUTE_UNUSED.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2019-10-15 11:25:25 +02:00
Daniel P. Berrangé
d5d6dbcfb5 build: remove all gnulib bit manipulation modules
We're using gnulib to get ffs, ffsl, rotl32, count_one_bits,
and count_leading_zeros. Except for rotl32 they can all be
replaced with gcc/clangs builtins. rotl32 is a one-line
trivial function.

Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2019-10-07 13:39:26 +01:00
Cole Robinson
a95e585e13 vircgroup: Add some VIR_DEBUG statements
These helped with debugging
https://bugzilla.redhat.com/show_bug.cgi?id=1612383

Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Cole Robinson <crobinso@redhat.com>
2019-09-27 16:45:23 -04:00
Cole Robinson
b5290a6e6e vircgroupv2: Fix VM startup when legacy cgroups are defined
On Fedora 31, starting a 'mock' build alters /proc/$pid/cgroup,
probably due to usage of systemd-nspawn.

Before:
$ cat /proc/self/cgroup
0::/user.slice/user-1000.slice/...

After:
$ cat /proc/self/cgroup
1:name=systemd:/
0::/user.slice/user-1000.slice/...

The cgroupv2 code mishandles that first line in the second case, which
causes VM startup to fail with: Unable to read from
'/sys/fs/cgroup/machine/cgroup.controllers': No such file or directory

The kernel docs[1] say that the cgroupv2 path will always start with
'0::', which in the code here controllers="". Only set the v2 placement
path when we see that cgroup file entry.

[1] https://www.kernel.org/doc/html/v5.3/admin-guide/cgroup-v2.html#processes

https://bugzilla.redhat.com/show_bug.cgi?id=1751120

Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Cole Robinson <crobinso@redhat.com>
2019-09-27 16:45:23 -04:00
Pavel Hrdina
0bd4ad193d vircgroupv2: fix setting cpu.max period
When we set cpu.max period we need to parse the cpu.max file first as
it contains both quota and period values separated by space.  When only
a single number is written to that file it will set quota.  However,
in order to change period we need to write both values.

The code was prepared for that but mistakenly used new line to end the
string with the first value.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1749227

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
2019-09-06 09:24:43 +02:00
Pavel Hrdina
9a99b01f8d vircgroupv2: fix abort in VIR_AUTOFREE
Introduced by commit <c854e0bd33c7a5afb04a36465bf04f861b2efef5> that
tried to fix an issue where we would fail to parse values from files.

We cannot change the original pointer that is going to be used by
VIR_AUTOFREE.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1747440

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Acked-by: Peter Krempa <pkrempa@redhat.com>
2019-08-30 16:31:28 +02:00
Pavel Hrdina
23689cddd4 vircgroupv2: fix virCgroupV2GetCpuCfsQuota for "max" value
If the first value in cpu.max is "max" return from function.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1741837

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2019-08-20 17:37:37 +02:00
Pavel Hrdina
c854e0bd33 vircgroupv2: fix parsing multiple values in single file
Our virStrToLong* helpers converts string to integers where it wraps
strtol standard function.  After the conversion happens and there are
some remaining invalid characters our helpers will fail if the second
argument is NULL.

We need to pass pointer to string in cases where there are multiple
values in a single file.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1741825

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2019-08-20 17:37:37 +02:00
Pavel Hrdina
759bf903a6 vircgroupv2: remove ATTRIBUTE_UNUSED for used attribute
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Acked-by: Peter Krempa <pkrempa@redhat.com>
2019-07-25 10:51:56 +02:00
Pavel Hrdina
56fdf3f025 vircgroupv2: store enabled controllers
In cgroups v2 when a new group is created by default no controller is
enabled so the detection code will not detect any controllers.

When enabling the controllers we should also store them for the group.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Acked-by: Peter Krempa <pkrempa@redhat.com>
2019-07-25 10:51:53 +02:00
Pavel Hrdina
7b77f3a11e vircgroup: fix cgroups v2 controllers detection
When creating new group for cgroups v2 the we cannot check
cgroups.controllers for that cgroup because the directory is created
later.  In that case we should check cgroups.subtree_control of parent
group to get list of controllers enabled for child cgroups.

In order to achieve that we will prefer the parent group if it exists,
the current group will be used only for root group.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Acked-by: Peter Krempa <pkrempa@redhat.com>
2019-07-25 10:50:44 +02:00
Pavel Hrdina
62dd4d25a2 util: vircgroupv2: stop enabling missing controllers with systemd
Because of a systemd delegation policy [1] we should not write to any
cgroups files owned by systemd which in case of cgroups v2 includes
'cgroups.subtree_control'.

systemd will enable controllers automatically for us to have them
available for VM cgroups.

[1] <https://github.com/systemd/systemd/blob/master/docs/CGROUP_DELEGATION.md>

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-06-28 15:17:37 +02:00
Pavel Hrdina
d117431143 Revert "util: vircgroup: pass parent cgroup into virCgroupDetectControllersCB"
This reverts commit 7bca1c9bdc.

As it turns out it's not a good idea on systemd hosts.  The root
cgroup can have all controllers enabled but they don't have to be
enabled for sub-cgroups.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-06-28 15:17:26 +02:00
Pavel Hrdina
05807e5d42 util: vircgroupv2: mark only requested controllers as available
When detecting available controllers on host we can be limited by list
of controllers from qemu.conf file.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2019-06-26 13:34:01 +02:00
Pavel Hrdina
1d49cdcd11 util: vircgroupv2: don't error out if enabling controller fails
Currently CPU controller cannot be enabled if there is any real-time
task running and is assigned to non-root cgroup which is the case on
several distributions with graphical environment.

Instead of erroring out treat it as the controller is not available.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2019-06-26 13:34:01 +02:00
Pavel Hrdina
29a94a3fef util: vircgroupv2: separate return values of virCgroupV2EnableController
In order to skip controllers that we are not able to activate we need
to return different return value so the caller can decide what to do.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2019-06-26 13:34:01 +02:00
Pavel Hrdina
f9d1c08557 util: vircgroupv2: enable CPU controller only if it's available
It might happen that we are not able to enable CPU controller so we
can enable it for thread sub-cgroups only if it's available in parent
cgroup.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-06-26 13:34:01 +02:00
Pavel Hrdina
535bdf83c0 util: vircgroupv2: use any controller to create thread directory
The assumption that CPU controller would be always enabled is wrong, we
should use any available controller to create a new sub-cgroup.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-06-26 13:34:01 +02:00
Pavel Hrdina
d3007c844d util: vircgroup: improve controller detection
This affects only cgroups v2 where enabled controllers are not based on
available mount points but on the list provided in cgroup.controllers
file.  However, moving it will fill in placement as well, so it needs
to be freed together with mount point if we don't need that controller.

Before this patch we were assuming that all controllers available in
root cgroup where available in all other sub-cgroups which was wrong.

In order to fix it we need to move the cgroup controllers detection
after cgroup placement was prepared in order to build correct path for
cgroup.controllers file.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2019-06-26 13:34:01 +02:00
Pavel Hrdina
7bca1c9bdc util: vircgroup: pass parent cgroup into virCgroupDetectControllersCB
In cgroups v2 we don't have to detect available controllers every single
time if we are creating a new cgroup based on parent cgroup.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-06-26 13:34:01 +02:00
Pavel Hrdina
7e8a1a6e21 util: vircgroupv2: add support for BFQ files
In kernel 4.12 there was introduced new BFQ scheduler and in kernel
5.0 the old CFQ scheduler was removed.  This has an implication on
the cgroups file names.

If the CFQ controller is enabled we use one file:

    io.weight

The new BFQ controller expose one file with different name:

    io.bfq.weight

Except for different name they have different syntax.

io.weight:

    default $val
    major:minor $val

io.bfq.weight:

    $val

The difference is that BFQ doesn't support per-device weight.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-06-21 14:36:02 +02:00
Pavel Hrdina
c23829f18a util: vircgroup: move virCgroupGetValueStr out of virCgroupGetValueForBlkDev
If we need to get a path of specific file and we need to check its
existence before we use it then we can reuse that path to get value
for specific device.  This way we will not build the path again in
virCgroupGetValueForBlkDev.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-06-21 14:35:57 +02:00
Cole Robinson
1d31526b52 Always put _LAST enums on second line of VIR_ENUM_IMPL
Standardize on putting the _LAST enum value on the second line
of VIR_ENUM_IMPL invocations. Later patches that add string labels
to VIR_ENUM_IMPL will push most of these to the second line anyways,
so this saves some noise.

Signed-off-by: Cole Robinson <crobinso@redhat.com>
2019-04-11 12:47:23 -04:00
Peter Krempa
f785318187 Revert "Include unistd.h directly by files using it"
This reverts commit a5e1602090.

Getting rid of unistd.h from our headers will require more work than
just fixing the broken mingw build. Revert it until I have a more
complete proposal.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
2019-04-10 12:26:32 +02:00
Peter Krempa
a5e1602090 Include unistd.h directly by files using it
util/virutil.h bogously included unistd.h. Drop it and replace it by
including it directly where needed.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-04-10 09:12:04 +02:00
Peter Krempa
c0abcca417 util: Don't include 'viralloc.h' into other header files
'viralloc.h' does not provide any type or macro which would be necessary
in headers. Prevent leakage of the inclusion.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-04-10 09:12:04 +02:00
Pavel Hrdina
a6aedcf39b util: enable cgroups v2 cpuset controller for threads
When we create cgroup for qemu threads we need to enable cpuset
controller in order to use it.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2019-03-07 13:44:52 +01:00
Pavel Hrdina
3b72c84ff1 util: implement virCgroupV2(Set|Get)CpusetCpus
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2019-03-07 13:44:52 +01:00