Commit Graph

192 Commits

Author SHA1 Message Date
Peter Krempa
cf113e8d54 util: cgroup: Allow ignoring EACCES in virCgroup(Allow|Deny)DevicePath
When adding disk images to ACL we may call those functions on NFS
shares. In that case we might get an EACCES, which isn't really relevant
since NFS would not hold a block device. This patch adds a flag that
suppresses the error report on EACCES to avoid spamming the logs.

Currently there's no functional change.
2016-02-17 10:54:05 +01:00
Peter Krempa
9cd5da710e util: cgroup: Drop virCgroup(Allow|Deny)DeviceMajor
Since commit 47e5b5ae, virCgroupAllowDevice accepts -1 as either the
minor or major device number and automatically uses '*' in its place.
Reuse the new approach throughout the code and drop the duplicated
functions.
2016-02-17 10:54:05 +01:00
Peter Krempa
f42b5c327f util: cgroup: Instrument virCgroupDenyDevice to handle -1 device number as *
Similarly to commit 47e5b5ae, virCgroupDenyDevice now handles -1 as '*'.
2016-02-17 10:54:05 +01:00
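
The "-1 means '*'" convention from commits 9cd5da710e and f42b5c327f can be illustrated with a small formatter. This is a sketch under assumptions: format_device_entry is a hypothetical name, and the "type major:minor perms" layout follows the "b -1:-1 rw" entries quoted elsewhere in this log.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not libvirt's API: builds a devices-cgroup entry
 * such as "c 1:3 rw". A major or minor of -1 is written as '*'. */
int
format_device_entry(char *buf, size_t buflen, char type,
                    int major, int minor, const char *perms)
{
    char majorstr[16];
    char minorstr[16];
    int n;

    assert(buf != NULL && perms != NULL);

    if (major < 0)
        snprintf(majorstr, sizeof(majorstr), "*");
    else
        snprintf(majorstr, sizeof(majorstr), "%d", major);

    if (minor < 0)
        snprintf(minorstr, sizeof(minorstr), "*");
    else
        snprintf(minorstr, sizeof(minorstr), "%d", minor);

    n = snprintf(buf, buflen, "%c %s:%s %s", type, majorstr, minorstr, perms);

    /* Treat truncation as an error rather than writing a partial entry. */
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

Folding the wildcard into one function is what makes separate "DeviceMajor" variants unnecessary.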
Michal Privoznik
a0aa92a24b vircgroup: Update virCgroupGetPercpuStats stump
In commit 7938b533 we changed the function signature but
forgot to update the stub that's used on systems without
CGroups, causing a build failure.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-02-08 14:06:30 +01:00
Peter Krempa
7938b533d5 cgroup: Prepare for sparse vCPU topologies in virCgroupGetPercpuStats
Pass a bitmap of enabled guest vCPUs to virCgroupGetPercpuStats so that
non-continuous vCPU topologies can be used.
2016-02-08 09:51:34 +01:00
Martin Kletzander
c3bd0019c0 systemd: Modernize machine naming
So, systemd-machined has this philosophy that machine names are like
hostnames and hence should follow the same rules.  But we always allowed
international characters in domain names.  Thus we need to modify the
machine name we are passing to systemd.

In order to change some machine names that we will be passing to systemd,
we also need to call TerminateMachine at the end of a lifetime of a
domain.  Even for domains that were started with older libvirt.  That
can be achieved thanks to virSystemdGetMachineNameByPID().  And because
we can change machine names, we can get rid of the inconsistent and
pointless escaping of domain names when creating machine names.

So this patch modifies the naming in the following way.  It creates the
name as <drivername>-<id>-<name> where invalid hostname characters are
stripped out of the name, and if the resulting name is longer than 64
characters, it is truncated to 64 characters.  That way we can start
domains we couldn't start before.  Well, at least on systemd.

To make it work all together, the machineName (which is needed only with
systemd) is saved in domain's private data.  That way the generation is
moved to the driver and we don't need to pass various unnecessary
arguments to cgroup functions.

The only thing this complicates a bit is the scope generation when
validating a cgroup where we must check both old and new naming, so a
slight modification was needed there.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1282846

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2016-02-05 16:11:50 +01:00
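
The naming scheme from commit c3bd0019c0 (<drivername>-<id>-<name>, invalid hostname characters stripped, result capped at 64 characters) might look roughly like the sketch below. The function names are hypothetical, and the set of valid characters (ASCII letters, digits and '-') is an assumption for illustration; the commit message does not spell it out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MACHINE_NAME_MAX 64   /* limit quoted in the commit message */

/* Assumption: only ASCII letters, digits and '-' count as valid. */
static int
is_hostname_char(char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
           (c >= '0' && c <= '9') || c == '-';
}

/* Hypothetical helper: writes "<driver>-<id>-<name>" into buf, dropping
 * invalid characters from name and truncating to MACHINE_NAME_MAX. */
int
make_machine_name(char *buf, size_t buflen,
                  const char *driver, int id, const char *name)
{
    size_t len;
    int n;

    assert(buf != NULL && driver != NULL && name != NULL);

    if (buflen < MACHINE_NAME_MAX + 1)
        return -1;

    n = snprintf(buf, buflen, "%s-%d-", driver, id);
    if (n < 0 || n >= MACHINE_NAME_MAX)
        return -1;

    len = (size_t)n;
    for (; *name != '\0' && len < MACHINE_NAME_MAX; name++) {
        if (is_hostname_char(*name))
            buf[len++] = *name;
    }
    buf[len] = '\0';
    return 0;
}
```

Because the domain id is part of the name, two domains whose names differ only in stripped characters still get distinct machine names.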
Peter Krempa
58578f83bc cgroup: Clean up virCgroupGetPercpuStats
Use 'ret' for return variable name, clarify use of 'param_idx' and avoid
unnecessary 'success' label. No functional changes. Also document the
function.
2016-02-03 13:10:04 +01:00
Michal Privoznik
c7f5e26b5f vircgroup: Finish renaming of virCgroupIsolateMount
In dc576025c3 we renamed virCgroupIsolateMount function to
virCgroupBindMount. However, we forgot about one occurrence in
section of the code which provides stubs for platforms without
support for CGroups like *BSD for instance.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-01-26 17:39:47 +01:00
Daniel P. Berrange
dc576025c3 lxc: don't try to hide parent cgroups inside container
On the host when we start a container, it will be
placed in a cgroup path of

   /machine.slice/machine-lxc\x2ddemo.scope

under /sys/fs/cgroup/*

Inside the container's namespace we need to set up
/sys/fs/cgroup mounts, and currently we bind
mount /machine.slice/machine-lxc\x2ddemo.scope on
the host to appear as / in the container.

While this may sound nice, it confuses applications
dealing with cgroups, because /proc/$PID/cgroup
now does not match the directory in /sys/fs/cgroup

This particularly causes problems for systemd and
will make it create repeated path components in
the cgroup for apps run in the container, e.g.

  /machine.slice/machine-lxc\x2ddemo.scope/machine.slice/machine-lxc\x2ddemo.scope/user.slice/user-0.slice/session-61.scope

This also causes any systemd service that uses
sd-notify to fail to start, because when systemd
receives the notification it won't be able to
identify the corresponding unit it came from.
In particular this breaks rabbitmq-server startup.

Future kernels will provide proper cgroup namespacing
which will handle this problem, but until that time
we should not try to play games with hiding parent
cgroups.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-01-26 16:11:32 +00:00
John Ferlan
d41bd09596 Revert "util: cgroups do not implicitly add task to new machine cgroup"
This reverts commit 71ce475967.

Since commit id 'a41c00b47' has been reverted, this no longer is
necessary
2016-01-14 11:00:25 -05:00
Jasper Lievisse Adriaanse
1b60f1b401 cgroup: don't include sys/mount.h if not needed
As cgroup implementation only works on Linux, it does not
make much sense to include sys/mount.h if other requirements are
not met, such as HAVE_MNTENT_H and HAVE_GETMNTENT_R.

Also, this fixes the build on OpenBSD, which requires including
sys/param.h along with sys/mount.h.

Signed-off-by: Roman Bogorodskiy <bogorodskiy@gmail.com>
2016-01-11 19:56:06 +03:00
Michal Privoznik
f55d1316ad sysconf: Include unistd.h
The manpage for sysconf() suggests including unistd.h as the
function is declared there. Even though we are not hitting any
compile issues currently, let's include the correct header file
instead of relying on some hidden include chain.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2015-12-24 18:03:50 +01:00
Henning Schild
71ce475967 util: cgroups do not implicitly add task to new machine cgroup
virCgroupNewMachine used to add the pidleader to the newly created
machine cgroup. Do not do this implicitly anymore.

Signed-off-by: Henning Schild <henning.schild@siemens.com>
2015-12-14 15:43:29 -05:00
Roman Bogorodskiy
46550cde0f util: fix build without cgroup
Commit 89c509a0 added getters for cgroup block device I/O throttling,
however the stub versions of these functions do not have matching
function prototypes, which results in a compilation failure on
platforms not supporting cgroup.

Fix build by correcting prototypes of the stubbed functions.

Pushing under build-breaker rule.
2015-08-20 09:42:56 +03:00
Martin Kletzander
89c509a0c1 util: Add getters for cgroup block device I/O throttling
Until now they were not needed, but I sense they will be in a short
while.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2015-08-18 16:25:16 -07:00
Martin Kletzander
ea9db906fc util: Add virCgroupGetBlockDevString
This function translates device paths to "major:minor " string, and all
virCgroupSetBlkioDevice* functions are modified to use it.  It's a
cleanup with no functional change.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2015-08-18 16:16:38 -07:00
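
The translation described in commit ea9db906fc (device path to a "major:minor " string) can be sketched with stat() and the major()/minor() macros. This is an illustration only: both function names are hypothetical, and the code assumes a Linux/glibc system where <sys/sysmacros.h> is available.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/* Format a device number as "major:minor " (note the trailing space,
 * so a value can be appended directly after it). */
int
format_dev_string(dev_t rdev, char *buf, size_t buflen)
{
    int n;

    assert(buf != NULL);

    n = snprintf(buf, buflen, "%u:%u ",
                 (unsigned int)major(rdev), (unsigned int)minor(rdev));
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}

/* Hypothetical wrapper: resolve path and refuse anything that is not
 * a block device. */
int
block_dev_string(const char *path, char *buf, size_t buflen)
{
    struct stat sb;

    assert(path != NULL);

    if (stat(path, &sb) < 0 || !S_ISBLK(sb.st_mode))
        return -1;

    return format_dev_string(sb.st_rdev, buf, buflen);
}
```

Keeping the string construction in one place is what lets every virCgroupSetBlkioDevice* setter share it.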
Peter Krempa
88f6c007c3 cgroup: Drop resource partition from virSystemdMakeScopeName
The scope name, even according to our docs, is
"machine-$DRIVER\x2d$VMNAME.scope". virSystemdMakeScopeName would use the
resource partition name instead of "machine-" if it was specified, thus
creating invalid scope paths.

This makes libvirt drop cgroups for a VM that uses custom resource
partition upon reconnecting since the detected scope name would not
match the expected name generated by virSystemdMakeScopeName.

The error is exposed by the following log entry:

debug : virCgroupValidateMachineGroup:302 : Name 'machine-qemu\x2dtestvm.scope' for controller 'cpu' does not match 'testvm', 'testvm.libvirt-qemu' or 'machine-test-qemu\x2dtestvm.scope'

for a "/machine/test" resource and "testvm" vm.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1238570
2015-07-22 07:12:56 +02:00
John Ferlan
51281dcb90 nodeinfo: Add sysfs_prefix to nodeGetPresentCPUBitmap
Add the sysfs_prefix argument to the call to allow for setting the
path for tests to something other than SYSFS_SYSTEM_PATH.
2015-07-13 15:59:32 -04:00
John Ferlan
0456eda317 cgroup: Use virCgroupNewThread
Replace the virCgroupNew{Vcpu|Emulator|IOThread} calls with the common
virCgroupNewThread API

Signed-off-by: John Ferlan <jferlan@redhat.com>
2015-04-09 19:27:08 -04:00
John Ferlan
2cd3a980dc cgroup: Introduce virCgroupNewThread
Create a new common API to replace the virCgroupNew{Vcpu|Emulator|IOThread}
APIs, using an enum to generate the cgroup name

Signed-off-by: John Ferlan <jferlan@redhat.com>
2015-04-09 19:27:08 -04:00
Michal Privoznik
d65acbde35 vircgroup: Introduce virCgroupControllerAvailable
This new internal API checks if a given CGroup controller is
available.  It is going to be needed later when we need to make a
decision whether to pin domain memory onto NUMA nodes using the cpuset
CGroup controller or using numa_set_membind().

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2015-04-08 11:54:24 +02:00
Michal Privoznik
149a62bc83 virCgroupNew: Enhance debug message
When creating new internal representation of cgroups, all passed
arguments are logged. Well, except for two: pid and pointer for
return value. Let's log them too.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2015-03-30 15:20:24 +02:00
Michal Privoznik
0a09bcdc7f virCgroupNewPartition: Fix comment
The function has no argument named @name; the argument is called
@path.  The comment, however, referred to @name where it
should have referred to @path.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2015-03-30 15:20:20 +02:00
John Ferlan
cf6ab17e45 vircgroup: Fix build issue mingw cross compile
Commit id '2dbfa716' exposed virCgroupDetectMountsFromFile, but did not
add the corresponding entry in the "#else /* !VIR_CGROUP_SUPPORTED */"
section of the module.
2015-03-27 18:09:07 -04:00
John Ferlan
38efd52584 vircgroup: Fix build issue on mingw cross compile
Commit id 'ba1dfc5' added virCgroupSetCpusetMemoryMigrate and
virCgroupGetCpusetMemoryMigrate, but did not add the corresponding
entry points into the "#else /* !VIR_CGROUP_SUPPORTED */" section
2015-03-27 18:09:07 -04:00
Martin Kletzander
ba1dfc5b6a cgroup: Add accessors for cpuset.memory_migrate
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2015-03-20 13:40:02 +01:00
Jiri Denemark
2dbfa716e8 tests: Add tests for virCgroupDetectMounts
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2015-03-18 09:53:24 +01:00
Ján Tomko
22fd3ac38f Introduce virBitmapIsBitSet
A helper that never returns an error and treats bits out of bitmap range
as false.

Use it everywhere we use ignore_value on virBitmapGetBit, or loop over
the bitmap size.
2015-03-13 15:31:33 +01:00
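
The behaviour described for virBitmapIsBitSet in commit 22fd3ac38f (never fails, out-of-range bits read as false) can be sketched as follows. The struct layout and the name bitmap_is_bit_set are assumptions for this example and are not libvirt's actual virBitmap definition.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bitmap type for illustration only. */
struct bitmap {
    size_t nbits;          /* number of valid bits */
    unsigned long *map;    /* backing storage */
};

#define BITS_PER_UNIT (sizeof(unsigned long) * CHAR_BIT)

/* Never reports an error: a NULL bitmap or a bit beyond nbits is
 * simply treated as "not set". */
bool
bitmap_is_bit_set(const struct bitmap *bm, size_t bit)
{
    if (bm == NULL || bit >= bm->nbits)
        return false;

    assert(bm->map != NULL);

    return (bm->map[bit / BITS_PER_UNIT] >> (bit % BITS_PER_UNIT)) & 1UL;
}
```

A boolean-only helper like this lets callers drop the error handling they would otherwise need around a getter that can fail.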
Ján Tomko
b54f48812d Fix a memory leak in virCgroupGetPercpuStats
Coverity reports that my commit af1c98e introduced
two memory leaks:
the cpumap if ncpus == 0 in virCgroupGetPercpuStats
and the params array in the test of the function.
2015-01-26 16:13:06 +01:00
Ján Tomko
af1c98e406 Fix virCgroupGetPercpuStats with non-continuous present CPUs
Per-cpu stats are only shown for present CPUs in the cgroups,
but we were only parsing the largest CPU number from
/sys/devices/system/cpu/present and looking for stats even for
non-present CPUs.
This resulted in:
internal error: cpuacct parse error
2015-01-22 17:01:11 +01:00
Ján Tomko
c803c070c4 Fix virCgroupNewMachine prototype on non-Linux
Commit 318df5a changed the prototype of virCgroupNewMachine
without adjusting the stub function for platforms without
cgroups.
2015-01-20 10:02:53 +01:00
Daniel P. Berrange
318df5a05f Add support for systemd-machined CreateMachineWithNetwork
systemd-machined introduced a new method CreateMachineWithNetwork
that obsoletes CreateMachine. It expects to be given a list of
VETH/TAP device indexes for the host side device(s) associated
with a container/machine.

This falls back to the old CreateMachine method when the new
one is not supported.
2015-01-15 11:07:07 +00:00
Martin Kletzander
3b0f05573f util: Fix possible NULL dereference
Commit 1a80b97d, which added the virCgroupHasEmptyTasks() function,
forgot that the parameter @cgroup may be NULL and did not check that.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-12-21 10:30:49 +01:00
Martin Kletzander
1a80b97ddf util: Add function virCgroupHasEmptyTasks
That function helps check whether there's a task in that cgroup.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-12-16 11:15:27 +01:00
Cédric Bosdonnat
5acbb8f99e Avoid getting '-1:-1' in devices cgroup list
When calling virCgroupAllowAllDevices we get these invalid entries
in the device cgroup config.
    b -1:-1 rw
    c -1:-1 rw
Check for positive values before outputting the major and minor to
avoid that.
2014-12-12 17:25:00 +01:00
Eric Blake
eb9093763f maint: forbid 'int foo = true'
I noticed this while working on qemuDomainGetBlockInfo.  Assigning
a bool value to an int variable compiles fine, but raises red flags
on the maintenance front as it becomes too easy to assign -1 or 2
or any other non-bool value to the same variable.

* cfg.mk (sc_prohibit_int_assign_bool): New rule.
* src/conf/snapshot_conf.c (virDomainSnapshotRedefinePrep): Fix
offenders.
* src/qemu/qemu_driver.c (qemuDomainGetBlockInfo)
(qemuDomainSnapshotCreateXML): Likewise.
* src/test/test_driver.c (testDomainSnapshotAlignDisks):
Likewise.
* src/util/vircgroup.c (virCgroupSupportsCpuBW): Likewise.
* src/util/virpci.c (virPCIDeviceBindToStub): Likewise.
* src/util/virutil.c (virIsCapableVport): Likewise.
* tools/virsh-domain-monitor.c (cmdDomMemStat): Likewise.
* tools/virsh-domain.c (cmdBlockResize, cmdScreenshot)
(cmdInjectNMI, cmdSendKey, cmdSendProcessSignal)
(cmdDetachInterface): Likewise.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-11-19 08:20:39 -07:00
Ján Tomko
99b2b4571d Add virCgroupTerminateMachine stub
Fix the build on FreeBSD, broken by commit 4882618.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
2014-10-02 11:11:10 +02:00
Guido Günther
4882618ed1 qemu: use systemd's TerminateMachine to kill all processes
If we don't properly clean up all processes in the
machine-<vmname>.scope systemd won't remove the cgroup and subsequent vm
starts fail with

  'CreateMachine: File exists'

Additional processes can e.g. be added via

  echo $PID > /sys/fs/cgroup/systemd/machine.slice/machine-${VMNAME}.scope/tasks

but there are other cases like

  http://bugs.debian.org/761521

Invoke TerminateMachine to be on the safe side since systemd tracks the
cgroup anyway. This is a noop if all processes have terminated already.
2014-10-01 20:17:46 +02:00
John Ferlan
e45f0d057e vircgroup: Fix broken builds without cgroups
I missed adding virCgroupNewIOThread to the !VIR_CGROUP_SUPPORTED section.

Pushing as build breaker
2014-09-15 14:48:52 -04:00
John Ferlan
3abb95cad4 vircgroup: Introduce virCgroupNewIOThread
Add virCgroupNewIOThread() to mimic virCgroupNewVcpu() except the naming
scheme will use "iothread" rather than "vcpu".
2014-09-15 13:18:56 -04:00
Wang Rui
d01a062be6 vircgroup: Resolve Coverity RESOURCE_LEAK
Need to free 'root' and 'opts' before 'return -1' if symlink fails.

Signed-off-by: Wang Rui <moon.wangrui@huawei.com>
2014-09-03 15:00:19 -04:00
Cédric Bosdonnat
47e5b5ae32 lxc: allow to keep or drop capabilities
Added <capabilities> in the <features> section of LXC domains
configuration. This section can contain elements named after the
capabilities like:

  <mknod state="on"/>, keep CAP_MKNOD capability
  <sys_chroot state="off"/> drop CAP_SYS_CHROOT capability

Users can restrict or give more capabilities than the default using
this mechanism.
2014-07-23 15:12:37 +08:00
Peter Krempa
464f7678d9 util: cgroup: Fix build on non-cgroup platforms
Commit a48f445100 introduced a helper
function to convert cgroup device mode to string. The function was only
conditionally compiled on platforms that support cgroup. This broke the
build when attempting to export the symbol:

  CCLD     libvirt.la
  Cannot export virCgroupGetDevicePermsString: symbol not defined

Move the function out of the ifdef, as it doesn't really depend on the
cgroup code being present.
2014-07-09 09:45:36 +02:00
Peter Krempa
a48f445100 util: cgroup: Add helper to convert device mode to string
Cgroups code uses VIR_CGROUP_DEVICE_* flags to specify the mode but in
the end it needs to be converted to a string. Add a helper to do it and
use it in the cgroup code before introducing it into the rest of the
code.
2014-07-08 14:34:05 +02:00
Chen Hanxiao
d18aa70416 util: fix memory leak in failure path of virCgroupKillRecursiveInternal
Don't leak keypath when we fail to kill a process

Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com>
2014-05-16 14:11:07 +03:00
Eric Blake
ac1d42ac72 util: use virDirRead API
In making the conversion to the new API, I fixed a couple bugs:
virSCSIDeviceGetSgName would leak memory if a directory
unexpectedly contained multiple entries;
virNetDevTapGetRealDeviceName could report a spurious error
from a stale errno inherited before starting the readdir search.

The decision on whether to store the result of virDirRead into
a variable is based on whether the end of the loop falls through
to cleanup code automatically.  In some cases, we have loops that
are documented to return NULL on failure, and which raise an
error on most failure paths but not in the case where the directory
was unexpectedly empty; it may be worth a followup patch to
explicitly report an error if readdir was successful but the
directory was empty, so that a NULL return always has an error set.

* src/util/vircgroup.c (virCgroupRemoveRecursively): Use new
interface.
(virCgroupKillRecursiveInternal, virCgroupSetOwner): Report
readdir failures.
* src/util/virfile.c (virFileLoopDeviceOpenSearch)
(virFileNBDDeviceFindUnused, virFileDeleteTree): Use new
interface.
* src/util/virnetdevtap.c (virNetDevTapGetRealDeviceName):
Properly check readdir errors.
* src/util/virpci.c (virPCIDeviceIterDevices)
(virPCIDeviceFileIterate, virPCIGetNetName): Report readdir
failures.
(virPCIDeviceAddressIOMMUGroupIterate): Use new interface.
* src/util/virscsi.c (virSCSIDeviceGetSgName): Report readdir
failures, and avoid memory leak.
(virSCSIDeviceGetDevName): Report readdir failures.
* src/util/virusb.c (virUSBDeviceSearch): Report readdir
failures.
* src/util/virutil.c (virGetFCHostNameByWWN)
(virFindFCHostCapableVport): Report readdir failures.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-04-28 17:52:45 -06:00
Ján Tomko
5dfcd6fbc6 Fix build on mingw32
My commit 897808e added a parameter to virCgroupGetPercpuStats,
but didn't change the stub for systems where cgroups are not supported.
2014-04-09 16:47:26 +02:00
Ján Tomko
2adf59ebde Clean up virCgroupGetPercpuStats
The iterator is checked for being less than or equal to need_cpus.
The 'n' variable is incremented need_cpus + 1 times.

Simplify the computation of need_cpus and make its value one larger,
to let it be used instead of 'n' and compared without the equal sign
in loop conditions.

Just index the sum_cpu_time array instead of using a helper variable.

Start the loop at start_cpu instead of continuing for all lower values.
2014-04-09 16:24:08 +02:00
Ján Tomko
9fe5267ade Check maximum startcpu value correctly
The cpus are indexed from 0, so a startcpu value equal
to the number of CPUs is invalid.

https://bugzilla.redhat.com/show_bug.cgi?id=1070680
2014-04-09 16:24:08 +02:00
Ján Tomko
dd74ab4e82 Rename id, max_id to need_cpus, total_cpus
total_cpus is the total number of CPUs on the host
need_cpus is the number of CPUs we need to look at

(need_cpus can be larger than ncpus, because we need to look
 at CPUs before the startcpu too, even if we aren't reporting
 their stats)
2014-04-09 16:24:08 +02:00
Ján Tomko
897808e74f Extend virCgroupGetPercpuStats to fill in vcputime too
Currently, virCgroupGetPercpuStats is only used by the LXC driver,
filling out the CPUTIME stats. qemuDomainGetPercpuStats does this
and also fills out VCPUTIME stats.

Extend virCgroupGetPercpuStats to also report VCPUTIME stats if
nvcpupids is non-zero. In the LXC driver, we don't have cpupids.
In the QEMU driver, there is at least one cpupid for a running domain,
so the behavior shouldn't change for QEMU either.

Also rename getSumVcpuPercpuStats to virCgroupGetPercpuVcpuSum.
2014-04-09 16:24:08 +02:00
Ján Tomko
23d2d863b7 Fix return value of virCgroupGetPercpuStats
We need to return the number of successfully populated stats,
not the nparams supplied by the user.
2014-04-09 16:24:08 +02:00
Hongwei Bi
4ef09c4690 util: remove useless comment for virCgroupMoveTask in vircgroup.c
Signed-off-by: Hongwei Bi <hwbi2008@gmail.com>
2014-03-31 14:16:05 +02:00
Ján Tomko
bada4222e5 Indent top-level labels by one space in src/util/
2014-03-25 14:58:40 +01:00
Wang Yufei
bfb29654c8 cgroup: Fix start VMs coincidently failed
When multiple VMs are started concurrently and the cgroup directory
named machine doesn't exist yet, there's a chance that a VM fails to
start because creating the directory fails:
Unable to initialize /machine cgroup: File exists
When the errno returned by mkdir in virCgroupMakeGroup is EEXIST,
we should ignore it and continue to start the VM.
Signed-off-by: Wang Yufei <james.wangyufei@huawei.com>
2014-03-21 13:27:28 +01:00
Daniel P. Berrange
2835c1e730 Add virLogSource variables to all source files
Any source file which calls the logging APIs now needs
to have a VIR_LOG_INIT("source.name") declaration at
the start of the file. This provides a static variable
of the virLogSource type.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2014-03-18 14:29:22 +00:00
Martin Kletzander
cc9c62fef9 Require spaces around equality comparisons
Commit a1cbe4b5 added a check for spaces around assignments and this
patch extends it to checks for spaces around '=='.  One exception is
virAssertCmpInt where comma after '==' is acceptable (since it is a
macro and '==' is its argument).

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-03-18 11:29:44 +01:00
Eric Blake
fa2e4dbfd6 build: fix cgroups on non-Linux
Running ./autobuild.sh detected a mingw failure:

  CCLD     libvirt.la
Cannot export virCgroupGetPercpuStats: symbol not defined
Cannot export virCgroupSetOwner: symbol not defined

* src/util/vircgroup.c (virCgroupGetPercpuStats)
(virCgroupSetOwner): Implement stubs.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-02-25 17:38:46 -07:00
Richard Weinberger
6fb42d7cdc Ensure systemd cgroup ownership is delegated to container with userns
This function is needed for user namespaces, where we need to chown()
the cgroup to the initial uid/gid such that systemd is allowed to
use the cgroup.

Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2014-02-24 15:35:47 +00:00
Ján Tomko
abf1daf0d7 Add a stub for virCgroupGetDomainTotalCpuStats
Commit 6515889 broke the build on FreeBSD:
In function `qemuDomainGetCPUStats':
/../../src/qemu/qemu_driver.c:16102:
undefined reference to `virCgroupGetDomainTotalCpuStats'
2014-02-21 09:10:48 +01:00
Thorsten Behrens
4b3b2f6ceb Implement domainGetCPUStats for lxc driver.
2014-02-20 16:20:09 +01:00
Thorsten Behrens
65158899b7 Make qemuGetDomainTotalCPUStats a virCgroup function.
To reuse this from other drivers, like lxc.
2014-02-20 16:20:09 +01:00
Thorsten Behrens
a2bb187c7e Add util virCgroupGetBlkioIo*Serviced methods.
This reads blkio stats from blkio.throttle.io_service_bytes and
blkio.throttle.io_serviced.
2014-02-20 16:20:09 +01:00
Gao feng
3b431929a2 blkio: Setting throttle blkio cgroup for domain
This patch introduces virCgroupSetBlkioDeviceReadIops,
virCgroupSetBlkioDeviceWriteIops,
virCgroupSetBlkioDeviceReadBps and
virCgroupSetBlkioDeviceWriteBps,

we can use these interfaces to set up throttle
blkio cgroup for domain.

This patch also adds the new throttle blkio cgroup
elements to the test xml.

Signed-off-by: Guan Qiang <hzguanqiang@corp.netease.com>
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
2014-01-20 10:52:44 +08:00
Martin Kletzander
231656bbeb cgroups: Redefine what "unlimited" means wrt memory limits
Since kernel 3.12 (commit 34ff8dc08956098563989d8599840b130be81252 in
linux-stable.git in particular) the value for 'unlimited' in cgroup
memory limits changed from LLONG_MAX to ULLONG_MAX.  Due to rather
unfortunate choice of our VIR_DOMAIN_MEMORY_PARAM_UNLIMITED constant
(which we transfer as an unsigned long long in Kibibytes), we ended up
with the situation described below (applies to x86_64):

 - 2^64-1 (ULLONG_MAX) -- "unlimited" in kernel >= 3.12

 - 2^63-1 (LLONG_MAX) -- "unlimited" in kernel < 3.12
 - 2^63-1024 -- our PARAM_UNLIMITED scaled to Bytes

 - 2^53-1 -- our PARAM_UNLIMITED unscaled (in Kibibytes)

This means that when any number within (2^63-1, 2^64-1] is read from
memory cgroup, we are transferring that number instead of "unlimited".
Unfortunately, changing VIR_DOMAIN_MEMORY_PARAM_UNLIMITED would break
ABI compatibility and thus we have to resort to a different solution.

With this patch every value greater than PARAM_UNLIMITED means
"unlimited".  Even though this may seem misleading, we are already in
such an unclear situation when running a 3.12 kernel with memory limits
set to 2^63.

One example showing most of the problems at once (with kernel 3.12.2):
 # virsh memtune asdf --hard-limit 9007199254740991 --swap-hard-limit -1
 # echo 12345678901234567890 >\
/sys/fs/cgroup/memory/machine/asdf.libvirt-qemu/memory.soft_limit_in_bytes
 # virsh memtune asdf
 hard_limit     : 18014398509481983
 soft_limit     : 12056327051986884
 swap_hard_limit: 18014398509481983

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2013-12-10 08:38:46 +01:00
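
The rule from commit 231656bbeb (any value above PARAM_UNLIMITED is reported as "unlimited") can be expressed as a small clamp. This is a sketch: limit_bytes_to_kib is a hypothetical name, and the constant reuses the 9007199254740991 (2^53-1) KiB value that appears in the commit's own virsh example.

```c
#include <assert.h>

/* 2^53 - 1 KiB, the value used with --hard-limit in the example above. */
#define PARAM_UNLIMITED 9007199254740991ULL

/* Hypothetical helper: convert a byte value read from the memory cgroup
 * into KiB, reporting everything above PARAM_UNLIMITED as unlimited. */
unsigned long long
limit_bytes_to_kib(unsigned long long bytes)
{
    unsigned long long kib = bytes >> 10;

    return kib > PARAM_UNLIMITED ? PARAM_UNLIMITED : kib;
}
```

Both kernel conventions then map to the same result: LLONG_MAX scales to exactly 2^53-1 KiB, while ULLONG_MAX scales to 2^54-1 KiB and is clamped down.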
Zhou Yimin
036aeca721 Cgroup: Replace 'newpath' with 'newPath'
Unify coding style: replace 'newpath' with 'newPath'.

From: Zhou Yimin <zhouyimin@huawei.com>
2013-12-06 16:18:14 +01:00
Chen Hanxiao
521cec2aab cgroup: leave blkio cgroup value checking to kernel
The range of valid values for cgroup tunables has
changed in the past and may change again in future
kernels. Avoid hardcoding range checks in libvirt
code, delegating range checking to the kernel itself.

Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com>
2013-10-15 12:22:07 +01:00
Chen Hanxiao
501476fccf cgroup: show error when EINVAL is returned
When EINVAL is returned while changing a cgroup value, tell
the user which values are invalid for the field.

Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com>
2013-10-15 12:18:47 +01:00
Chen Hanxiao
fc9a416df7 cgroup: fix a comment typo in vircgroup.c
s/shoule/should

Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com>
2013-10-09 17:16:58 +02:00
Peter Krempa
d79fe8b50b cgroup: Move [qemu|lxc]GetCpuBWStatus to vicgroup.c and refactor it
The function existed in two identical instances in lxc and qemu. Move it
to vircgroup.c and simplify it. Refactor the callers too.
2013-09-16 11:32:49 +02:00
Peter Krempa
4baa8d7637 cleanup: Kill usage of access(PATH, F_OK) in favor of virFileExists()
The semantics of the libvirt helper are clearer. This change also
allows cleaning up some pieces of code.
2013-09-16 10:37:39 +02:00
Daniel P. Berrange
a48838ad2e Fix launching of VMs on when only logind part of systemd is present
Debian systems may run the 'systemd-logind' daemon, which causes the
/sys/fs/cgroup/systemd  mount to be setup, but no other cgroup
controllers are created. While the LXC driver considers cgroups to
be mandatory, the QEMU driver is supposed to accept them as optional.

We detect whether they are present by looking in /proc/mounts for
any mounts of type 'cgroup', but this is not sufficient. We need to
skip any named mounts (as seen by a name=XXX string in the mount
options), so that we only detect actual resource controllers.

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=721979

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-09-12 11:32:36 +01:00
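
The filtering described in commit a48838ad2e (ignore cgroup mounts that carry a name=XXX option) can be sketched as a check on one /proc/mounts entry. Both function names are hypothetical, and the code only assumes the usual comma-separated mount option format.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Returns true when a comma-separated option list contains an option
 * that starts with "name=", e.g. "rw,nosuid,name=systemd". */
bool
mount_opts_have_name(const char *opts)
{
    const char *p = opts;

    assert(opts != NULL);

    while (*p != '\0') {
        if (strncmp(p, "name=", 5) == 0)
            return true;
        p = strchr(p, ',');      /* advance to the next option */
        if (p == NULL)
            break;
        p++;
    }
    return false;
}

/* Hypothetical predicate: a mount counts as a resource controller only
 * if it is of type "cgroup" and is not a named hierarchy. */
bool
is_resource_controller_mount(const char *fstype, const char *opts)
{
    assert(fstype != NULL);

    return strcmp(fstype, "cgroup") == 0 && !mount_opts_have_name(opts);
}
```

On a host where only systemd-logind is running, the single name=systemd mount is then skipped and the QEMU driver sees no cgroup controllers.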
Daniel P. Berrange
f0b6d8d472 Fix cgroups when all are mounted on /sys/fs/cgroup
Some users in Ubuntu/Debian seem to have a setup where all the
cgroup controllers are mounted on /sys/fs/cgroup rather than
any /sys/fs/cgroup/<controller> name. In the loop which detects
which controllers are present for a mount point we were modifying
'mnt_dir' field in the 'struct mntent' var, but not always restoring
the original value. This caused detection to break in the all-in-one
mount setup.

Fix that logic bug and add test case coverage for this mount
setup.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-09-11 11:45:38 +01:00
Roman Bogorodskiy
81b1915773 cgroup macros refactoring, part 5
Complete the refactoring by adding missing stubs so it compiles on
platform without cgroup support.

Signed-off-by: Eric Blake <eblake@redhat.com>
2013-08-12 16:58:54 -06:00
Roman Bogorodskiy
2d795df3f0 cgroup macros refactoring, part 4
Complete moving to VIR_CGROUP_SUPPORTED

Signed-off-by: Eric Blake <eblake@redhat.com>
2013-08-12 16:58:54 -06:00
Roman Bogorodskiy
7f5f270d5f cgroup macros refactoring, part 3
Continue converting to VIR_CGROUP_SUPPORTED

Signed-off-by: Eric Blake <eblake@redhat.com>
2013-08-12 16:58:54 -06:00
Roman Bogorodskiy
c419e9b51c cgroup macros refactoring, part 2
- Convert virCgroupGet* to VIR_CGROUP_SUPPORTED
- Convert virCgroup(Get|Set)FreezerState to VIR_CGROUP_SUPPORTED

Signed-off-by: Eric Blake <eblake@redhat.com>
2013-08-12 16:58:47 -06:00
Roman Bogorodskiy
02f1fd41f6 cgroup macros refactoring, part 1
- Introduce VIR_CGROUP_SUPPORTED conditional
- Convert virCgroupKill* to use it
- Convert virCgroupIsolateMount() to use it
- Convert virCgroupRemoveRecursively to VIR_CGROUP_SUPPORTED

Signed-off-by: Eric Blake <eblake@redhat.com>
2013-08-12 16:15:58 -06:00
Eric Blake
2ff9e54cbf cgroup: functional sort
Make future patches smaller by matching a sane header listing in
the first place.  No semantic change.

* src/util/vircgroup.h: Move free next to new, and controller
functions next to each other.
* src/util/vircgroup.c (virCgroupFree, virCgroupHasController)
(virCgroupPathOfController, virCgroupRemoveRecursively)
(virCgroupRemove): Sort implementation to be closer to header.

Signed-off-by: Eric Blake <eblake@redhat.com>
2013-08-12 16:08:18 -06:00
Eric Blake
7ccd322b20 cgroup: topological sort
Avoid a forward declaration of a static function.

* src/util/vircgroup.c (virCgroupPartitionNeedsEscaping)
(virCgroupPartitionEscape): Move up.

Signed-off-by: Eric Blake <eblake@redhat.com>
2013-08-12 15:38:37 -06:00
Eric Blake
a91929053c cgroup: use consistent formatting
Format all functions with two blank lines between, and return type
on separate line from function name.  Also break some lines longer
than 80 columns.  This makes the subsequent macro refactoring
less noisy.

* src/util/vircgroup.c: Match prevailing style.

Signed-off-by: Eric Blake <eblake@redhat.com>
2013-08-12 15:36:35 -06:00
Daniel P. Berrange
2fe2470181 Enable support for systemd-machined in cgroups creation
Make the virCgroupNewMachine method try to use systemd-machined
first. If that fails, then fallback to using the traditional
cgroup setup code path.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-07-31 19:29:19 +01:00
Daniel P. Berrange
75304eaa1a Cope with races while killing processes
When systemd is involved in managing processes, it may start
killing off & tearing down cgroups associated with the process
while we're still doing virCgroupKillPainfully. We must
explicitly check for ENOENT and treat it as if we had finished
killing processes.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-07-31 19:27:28 +01:00
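A minimal sketch of the ENOENT handling described in 75304eaa1a, assuming a cgroup v1 style `tasks` file with one PID per line. The helper name `kill_tasks_once` and its return convention are hypothetical, not the body of virCgroupKillPainfully.

```c
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: send signum to every PID listed in tasks_path.
 * Returns 1 if at least one process was signalled, 0 if there was
 * nothing left to kill, -1 on any other error. */
static int
kill_tasks_once(const char *tasks_path, int signum)
{
    FILE *fp = fopen(tasks_path, "r");
    long pid;
    int killed = 0;

    if (!fp) {
        /* The cgroup was torn down underneath us (e.g. by systemd):
         * treat that as "all processes are already gone". */
        if (errno == ENOENT)
            return 0;
        return -1;
    }

    while (fscanf(fp, "%ld", &pid) == 1) {
        if (kill((pid_t)pid, signum) == 0) {
            killed = 1;
        } else if (errno != ESRCH) {
            /* ESRCH means the process exited on its own; anything
             * else is a real failure. */
            fclose(fp);
            return -1;
        }
    }

    fclose(fp);
    return killed;
}
```

A caller looping until the group is empty would stop as soon as this returns 0, whether the file was empty or had vanished.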
Daniel P. Berrange
aedd46e7e3 Add support for systemd cgroup mount
Systemd uses a named cgroup mount for tracking processes. Add
it as another type of controller, albeit one which we have to
special case in a number of places. In particular we must
never create/delete directories there, nor add tasks. Essentially
the systemd mount is to be considered read-only for libvirt.

With this change both the virCgroupDetectPlacement and
virCgroupCopyPlacement methods must be invoked. The copy
placement method will copy setup for resource controllers
only. The detect placement method will probe for any
named controllers, or resource controllers not already
setup.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-07-31 19:27:19 +01:00
Eric Blake
a2d0c3f553 build: fix vircgroup build on mingw
The previous patch was incomplete.

  CC       libvirt_util_la-vircgroup.lo
../../src/util/vircgroup.c:70:12: error: 'virCgroupPartitionEscape' declared 'static' but never defined [-Werror=unused-function]
 static int virCgroupPartitionEscape(char **path);
            ^

* src/util/vircgroup.c (virCgroupPartitionEscape): Move forward
declaration inside conditional.

Signed-off-by: Eric Blake <eblake@redhat.com>
2013-07-29 08:56:20 -06:00
Daniel P. Berrange
7cf81fa175 Conditionalize build of virCgroupValidateMachineGroup
The virCgroupValidateMachineGroup method calls some functions
which are only conditionally compiled, thus it too must be
made conditional. This fixes the build on non-Linux hosts.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-07-29 14:36:44 +01:00
Daniel P. Berrange
56b54173ed Skip detecting placement if controller is disabled
If the app has provided a whitelist of controllers to be used,
we skip detecting the mount point of any controller not on it.
We still, however, fill in the placement info, which later
confuses the machine name validation code. Skip detecting
placement if the controller mount point is not set.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-07-25 19:55:51 +01:00
Daniel P. Berrange
5ec5a22493 Add 'controllers' arg to virCgroupNewDetect
When detecting cgroups we must honour any controllers
whitelist the driver may have.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-07-25 19:55:47 +01:00
Daniel P. Berrange
c101b851c1 Fix detection of 'emulator' cgroup
When a VM has an 'emulator' child cgroup present, we must
strip off that suffix when detecting the cgroup for a
machine.

Rename the virCgroupIsValidMachineGroup method to
virCgroupValidateMachineGroup to make it a bit clearer
that this isn't simply a boolean check; it will make
changes to the object.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-07-25 19:55:46 +01:00
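The suffix stripping described in c101b851c1 can be illustrated with a small string helper. This is a sketch only: `strip_emulator_suffix` is a hypothetical name, and the assumption that the child group appears as a trailing `/emulator` path component is inferred from the commit message, not taken from the code.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: if path ends in "/emulator", truncate it in
 * place so it names the machine's own cgroup. Returns true when the
 * path was modified. */
static bool
strip_emulator_suffix(char *path)
{
    static const char suffix[] = "/emulator";
    size_t plen = strlen(path);
    size_t slen = sizeof(suffix) - 1;

    if (plen >= slen && strcmp(path + plen - slen, suffix) == 0) {
        path[plen - slen] = '\0';
        return true;
    }
    return false;
}
```

Because the helper mutates its argument, it mirrors the point of the rename: validation here is not a pure boolean check.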
Daniel P. Berrange
525c9d5a49 Make virCgroupIsValidMachine static
The virCgroupIsValidMachine method does not need to be called
from outside the cgroups file now, so make it static.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-07-25 19:55:29 +01:00
Daniel P. Berrange
a45b99ead9 Introduce a more convenient virCgroupNewDetectMachine
Instead of requiring drivers to use a combination of calls
to virCgroupNewDetect and virCgroupIsValidMachine, combine
the two into virCgroupNewDetectMachine

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-07-25 19:47:30 +01:00
Daniel P. Berrange
3068244e85 Protection against doing bad stuff to the root group
Add protection such that the virCgroupRemove and
virCgroupKill* do not do anything to the root cgroup.

Killing all PIDs in the root cgroup does not end well.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-07-25 11:42:48 +01:00
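One way to picture the guard from 3068244e85 is a wrapper that refuses to run a destructive operation on the root group. The names `placement_is_root`, `cgroup_op`, and `guarded_op` are hypothetical, and treating an empty string or "/" as the root placement is an assumption made for this sketch.

```c
#include <stdbool.h>
#include <string.h>

/* Assumption: the root cgroup is identified by an empty placement
 * string or by "/". */
static bool
placement_is_root(const char *placement)
{
    return placement[0] == '\0' || strcmp(placement, "/") == 0;
}

/* Hypothetical type for a destructive operation (remove, kill, ...). */
typedef int (*cgroup_op)(const char *placement);

/* Run op on the group unless it is the root group, in which case do
 * nothing and report success. */
static int
guarded_op(const char *placement, cgroup_op op)
{
    if (placement_is_root(placement))
        return 0;
    return op(placement);
}
```

With this shape, the check lives in one place instead of being repeated in every remove and kill function.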
Daniel P. Berrange
b333330aa5 New cgroups API for atomically creating machine cgroups
Instead of requiring one API call to create a cgroup and
another to add a task to it, introduce a new API
virCgroupNewMachine which does both jobs at once. This
will make it easier for later code to talk to systemd,
which also performs this job atomically.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-07-25 11:42:47 +01:00
Roman Bogorodskiy
fa6805e55e Fix virCgroupAvailable() w/o HAVE_GETMNTENT_R defined
virCgroupAvailable() implementation calls getmntent_r
without checking if HAVE_GETMNTENT_R is defined, so it fails
to build on platforms without getmntent_r support.

Make virCgroupAvailable() just return false without
HAVE_GETMNTENT_R.
2013-07-24 15:31:34 +02:00
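The shape of the fix in fa6805e55e is a compile-time guard around the getmntent_r call. The sketch below assumes a build system that defines HAVE_GETMNTENT_R where the function exists; the function name `cgroup_available` and the exact mount-type test are illustrative assumptions rather than libvirt's implementation.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#ifdef HAVE_GETMNTENT_R
# include <mntent.h>
#endif

/* Report whether any cgroup filesystem is mounted. On platforms
 * without getmntent_r, just report that cgroups are unavailable. */
static bool
cgroup_available(void)
{
#ifdef HAVE_GETMNTENT_R
    FILE *mounts = setmntent("/proc/mounts", "r");
    struct mntent entry;
    char buf[1024];
    bool found = false;

    if (!mounts)
        return false;

    while (getmntent_r(mounts, &entry, buf, sizeof(buf)) != NULL) {
        /* Assumption: a mount of type "cgroup" counts as available. */
        if (strcmp(entry.mnt_type, "cgroup") == 0) {
            found = true;
            break;
        }
    }

    endmntent(mounts);
    return found;
#else
    return false;
#endif
}
```

Keeping the `#ifdef` inside the function body means callers need no conditional code of their own.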
Daniel P. Berrange
d64e852b5a Remove obsolete cgroups creation apis
The virCgroupNewDomainDriver and virCgroupNewDriver methods
are obsolete now that we can auto-detect existing cgroup
placement. Delete them to reduce code bloat.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-07-23 22:46:31 +01:00
Daniel P. Berrange
e638778eb3 Add API for checking if a cgroup is valid for a domain
Add virCgroupIsValidMachine API to check whether an
auto-detected cgroup is valid for a machine. This lets us
check if a VM has just been placed into some generic
shared cgroup, or worse, the root cgroup.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-07-23 22:46:31 +01:00
Daniel P. Berrange
66a7f857f3 Add a virCgroupNewDetect API for finding cgroup placement
Add a virCgroupNewDetect API which is used to initialize a
cgroup object with the placement of an arbitrary process.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-07-23 22:35:26 +01:00
Daniel P. Berrange
0d7f45aea7 Convert remainder of cgroups code to report errors
Convert the remaining methods in vircgroup.c to report errors
instead of returning errno values.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-07-22 13:09:58 +01:00
Daniel P. Berrange
3260fdfab0 Convert the virCgroupKill* APIs to report errors
Instead of returning errno values, change the virCgroupKill*
APIs to fully report errors.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-07-22 13:09:58 +01:00
Daniel P. Berrange
b64dabff27 Report full errors from virCgroupNew*
Instead of returning raw errno values, report full libvirt
errors in virCgroupNew* functions.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-07-22 13:09:58 +01:00