libvirt

mirror of https://gitlab.com/libvirt/libvirt.git synced 2024-12-29 00:55:22 +00:00

Author	SHA1	Message	Date
Daniel P. Berrange	44f79a0bd0	lxc: ensure libvirt_lxc and qemu-nbd move into systemd machine slice Currently when spawning containers with systemd, the container PID 1 will get moved into the systemd machine slice. Libvirt then manually moves the libvirt_lxc and qemu-nbd processes into the cgroups associated with the slice, but skips the systemd controller cgroup. This means that from systemd's POV, libvirt_lxc and qemu-nbd are still part of the libvirtd.service unit. On systemctl daemon-reload, it will notice that libvirt_lxc & qemu-nbd are in the libvirtd.service unit for the systemd controller, but in the machine cgroups for resources. Systemd will thus move them back into the libvirtd.service resource cgroups next time libvirtd is restarted. This causes libvirtd to kill off the container due to incorrect cgroup placement. The solution is to ensure that when moving libvirt_lxc & qemu-nbd, we also move the systemd cgroup controller placement. Normally this is not something we ever want to do, but this is a special case as we are intentionally wanting to move them to a different systemd unit. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2017-01-09 12:46:52 +00:00
Boris Fiuczynski	dbeaa7e666	cgroup: reduce complexity of controller disabling This patch reduces the complexity of the filtering algorithm in virCgroupDetect by first correcting the controller mask and then checking for potential co-mounts without any correlating controller mask modifications. If you agree that this patch removes complexity and improves readability it could simply be squashed into the first patch of this series. Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com> Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com> Reviewed-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>	2016-12-20 11:18:09 +01:00
Boris Fiuczynski	dfcfe0bb9c	cgroup: unavailable controller prevents controller disabling The cgroup controller filtering in virCgroupDetect does not work properly if the following conditions are met: 1) the host system does not have a cgroup controller which libvirt requests (unavailable controller) and 2) libvirt is configured to disable a controller (disabled controller) and 3) the disabled controller is located before the unavailable controller in virCgroupController. As an example: The memory controller is unavailable and the cpuset controller is configured to be disabled. In this scenario trying to start a domain results in the error error: Controller 'cpuset' is not wanted, but 'memory' is co-mounted: Invalid argument This error occurs when virCgroupDetect is called with a valid parent group. The resulting group created by virCgroupCopyMounts holds for cpuset and memory controller empty mount points. The filtering of disabled controllers checks for co-mounts by comparing the mount points. The cpuset controller causes the filtering to occur before the memory controller is marked as to be ignored by modifying the controller mask since it is unavailable. Therefore the co-mount detection logic compares the cpuset and memory controller mount points and since both are empty the memory controller is regarded erroneously as being co-mounted. Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com> Reviewed-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com> Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2016-12-20 11:17:22 +01:00
Viktor Mihajlovski	ac8ac9e052	cgroup: Use system reported "unlimited" value for comparison With kernel 3.18 (since commit 3e32cb2e0a12b6915056ff04601cf1bb9b44f967) the "unlimited" value for cgroup memory limits has changed once again as its byte value is now computed from a page counter. The new "unlimited" value reported by the cgroup fs is therefore 2**51-1 pages which is (VIR_DOMAIN_MEMORY_PARAM_UNLIMITED - 3072). This results e.g. in virsh memtune displaying 9007199254740988 instead of unlimited for the limits. This patch uses the value of memory.limit_in_bytes from the cgroup memory root which is the system's "real" unlimited value for comparison. See also libvirt commit `231656bbeb` for the history for kernel 3.12 and before. Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>	2016-12-06 16:25:20 +01:00
Michal Privoznik	c2a5a4e7ea	virstring: Unify string list function names We have a couple of functions that operate over NULL-terminated lists of strings. However, our naming sucks: virStringJoin virStringFreeList virStringFreeListCount virStringArrayHasString virStringGetFirstWithPrefix We can do better: virStringListJoin virStringListFree virStringListFreeCount virStringListHasString virStringListGetFirstWithPrefix Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2016-11-25 13:54:05 +01:00
Nitesh Konkar	d276da48bc	Fix typos and grammar Signed-off-by: Nitesh Konkar <nitkon12@linux.vnet.ibm.com>	2016-11-23 12:08:15 -05:00
Michal Privoznik	b7d2d4af2b	src: Treat PID as signed This initially started as a fix of some debug printing in virCgroupDetect. However it turned out that other places suffer from the similar problem. While dealing with pids, esp. in cases where we cannot use pid_t for ABI stability reasons, we often chose an unsigned integer type. This makes no sense as pid_t is signed. Also, new syntax-check rule is introduced so we won't repeat this mistake. Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2016-10-13 17:58:56 +08:00
Michal Privoznik	f3f15cc240	Make sure sys/types.h is included after sys/sysmacros.h In the latest glibc, major() and minor() functions are marked as deprecated (glibc commit dbab6577): CC util/libvirt_util_la-vircgroup.lo util/vircgroup.c: In function 'virCgroupGetBlockDevString': util/vircgroup.c:768:5: error: '__major_from_sys_types' is deprecated: In the GNU C Library, `major' is defined by <sys/sysmacros.h>. For historical compatibility, it is currently defined by <sys/types.h> as well, but we plan to remove this soon. To use `major', include <sys/sysmacros.h> directly. If you did not intend to use a system-defined macro `major', you should #undef it after including <sys/types.h>. [-Werror=deprecated-declarations] if (virAsprintf(&ret, "%d:%d ", major(sb.st_rdev), minor(sb.st_rdev)) < 0) ^~ In file included from /usr/include/features.h:397:0, from /usr/include/bits/libc-header-start.h:33, from /usr/include/stdio.h:28, from ../gnulib/lib/stdio.h:43, from util/vircgroup.c:26: /usr/include/sys/sysmacros.h:87:1: note: declared here __SYSMACROS_DEFINE_MAJOR (__SYSMACROS_FST_IMPL_TEMPL) ^ Moreover, in the glibc commit, there's suggestion to keep ordering of including of header files as implemented here. Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2016-09-06 17:49:36 +02:00
Peter Krempa	c84c2cb389	util: Extract and rename qemuDomainDelCgroupForThread to virCgroupDelThread	2016-08-24 15:44:47 -04:00
Ján Tomko	cd6e4e5fe4	cgroup: drop INSERT_ELEMENT usage virCgroupPartitionEscape Use virAsprintf to prepend an underscore to make the code more readable.	2016-07-26 10:41:26 +02:00
Ján Tomko	994b024624	Use virDirOpenQuiet Remove all the remaining usage of opendir.	2016-06-24 14:20:57 +02:00
Ján Tomko	42b4a37d68	Use virDirOpenIfExists Use it instead of opendir everywhere we need to check for ENOENT.	2016-06-24 14:20:57 +02:00
Ján Tomko	e81de04c10	Use virDirOpen Switch from opendir to virDirOpen everywhere we need to report an error.	2016-06-24 14:20:57 +02:00
Ján Tomko	70a033ab42	Do not ignore hidden files in /sys and /proc The directories we iterate over are unlikely to contain any entries starting with a dot, other than '.' and '..' which is already skipped by virDirRead.	2016-06-23 21:58:38 +02:00
Ján Tomko	fe79c3f2c1	Do not check for '.' and '..' after virDirRead It skips those directory entries.	2016-06-23 21:58:38 +02:00
Ján Tomko	a4e6f1eb9c	Introduce VIR_DIR_CLOSE Introduce a helper that only calls closedir if DIR* is non-NULL and sets it to NULL afterwards.	2016-06-23 21:58:33 +02:00
Daniel P. Berrange	eaf18f4c2b	nodeinfo: move host CPU APIs out into virhostcpu.c file Move all APIs with a virHostCPU name prefix out into new util/virhostcpu.h & util/virhostcpu.c files Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2016-06-09 18:31:11 +01:00
Daniel P. Berrange	4053350bfe	nodeinfo: rename all CPU APIs to have a virHostCPU prefix In preparation for moving all the CPU related APIs out of the nodeinfo file, give them a virHostCPU name prefix. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2016-06-09 18:08:15 +01:00
Daniel P. Berrange	08ea852c25	nodeinfo: remove sysfs_prefix from all methods Nearly all the methods in the nodeinfo file are given a 'const char *sysfs_prefix' parameter to override the default sysfs path (/sys/devices/system). Every single caller passes in NULL for this, except one use in the unit tests. Furthermore this parameter is totally Linux-specific, when the APIs are intended to be cross platform portable. This removes the sysfs_prefix parameter and instead gives a new method linuxNodeInfoSetSysFSSystemPath for use by the test suite. For two of the methods this hardcodes use of the constant SYSFS_SYSTEM_PATH, since the test suite does not need to override the path for those methods. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2016-06-09 18:00:18 +01:00
Michal Privoznik	fb377701f2	virCgroupValidateMachineGroup: Reflect change in CGroup struct naming From `c3bd0019c0` on instead of creating the following path for cgroups: /sys/fs/cgroupX/$name.libvirt-$driver we generate rather more verbose one: /sys/fs/cgroupX/$driver-$id-$name.libvirt-$driver where $name is optional and included iff it contains allowed chars. See original commit for more reasoning. Now, problem with the original commit is that we are unable to start any LXC domain after it. Because when starting LXC container, the CGroup layout is created by our lxc_controller process and then detected and validated by libvirtd. The validation is done by trying to match detected layout against all the possible patterns for cgroup paths that we've ever had. And the commit in question forgot to update this part of the code. Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2016-05-06 12:51:06 +02:00
Martin Kletzander	aca4d72b2a	Include sysmacros.h where needed So in glibc-2.23 sys/sysmacros.h is no longer included from sys/types.h and we don't build because of the usage of major/minor/makedev macros. Autoconf already has AC_HEADER_MAJOR macro that check where exactly these functions/macros are defined, so let's use that. Signed-off-by: Martin Kletzander <mkletzan@redhat.com>	2016-04-18 20:36:57 +02:00
Henning Schild	ff16bde100	qemu_cgroup: use virCgroupAddTask instead of virCgroupMoveTask qemuProcessSetupEmulator runs at a point in time where there is only the qemu main thread. Use virCgroupAddTask to put just that one task into the emulator cgroup. That patch makes virCgroupMoveTask and virCgroupAddTaskStrController obsolete. Signed-off-by: Henning Schild <henning.schild@siemens.com>	2016-03-01 14:07:27 +00:00
Henning Schild	85d7480654	vircgroup: one central point for adding tasks to cgroups Use virCgroupAddTaskController in virCgroupAddTask so we have one single point where we add tasks to cgroups. Signed-off-by: Henning Schild <henning.schild@siemens.com>	2016-03-01 11:20:56 +00:00
Michal Privoznik	6bfb03ae15	vircgroup: Update virCgroupDenyDevicePath stub In `cf113e8d` we changed the declaration of virCgroupAllowDevicePath() and virCgroupDenyDevicePath(). However, while updating the stub for non-cgroup platforms for the former we forgot to update the latter too causing a build failure. Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2016-02-17 14:25:35 +01:00
Peter Krempa	cf113e8d54	util: cgroup: Allow ignoring EACCES in virCgroup(Allow\|Deny)DevicePath When adding disk images to ACL we may call those functions on NFS shares. In that case we might get an EACCES, which isn't really relevant since NFS would not hold a block device. This patch adds a flag that allows to stop reporting an error on EACCES to avoid spamming logs. Currently there's no functional change.	2016-02-17 10:54:05 +01:00
Peter Krempa	9cd5da710e	util: cgroup: Drop virCgroup(Allow\|Deny)DeviceMajor Since commit `47e5b5ae` virCgroupAllowDevice allows to pass -1 as either the minor or major device number and it automatically uses '*' in place of that. Reuse the new approach through the code and drop the duplicated functions.	2016-02-17 10:54:05 +01:00
Peter Krempa	f42b5c327f	util: cgroup: Instrument virCgroupDenyDevice to handle -1 device number as * Similarly to commit `47e5b5ae` virCgroupDenyDevice will handle -1 as *.	2016-02-17 10:54:05 +01:00
Michal Privoznik	a0aa92a24b	vircgroup: Update virCgroupGetPercpuStats stub In the commit `7938b533` we've changed the function signature, however forgot to update the stub that's used on systems without CGroups causing a build failure. Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2016-02-08 14:06:30 +01:00
Peter Krempa	7938b533d5	cgroup: Prepare for sparse vCPU topologies in virCgroupGetPercpuStats Pass a bitmap of enabled guest vCPUs to virCgroupGetPercpuStats so that non-continuous vCPU topologies can be used.	2016-02-08 09:51:34 +01:00
Martin Kletzander	c3bd0019c0	systemd: Modernize machine naming So, systemd-machined has this philosophy that machine names are like hostnames and hence should follow the same rules. But we always allowed international characters in domain names. Thus we need to modify the machine name we are passing to systemd. In order to change some machine names that we will be passing to systemd, we also need to call TerminateMachine at the end of a lifetime of a domain. Even for domains that were started with older libvirt. That can be achieved thanks to virSystemdGetMachineNameByPID(). And because we can change machine names, we can get rid of the inconsistent and pointless escaping of domain names when creating machine names. So this patch modifies the naming in the following way. It creates the name as <drivername>-<id>-<name> where invalid hostname characters are stripped out of the name and if the resulting name is longer, it truncates it to 64 characters. That way we can start domains we couldn't start before. Well, at least on systemd. To make it work all together, the machineName (which is needed only with systemd) is saved in domain's private data. That way the generation is moved to the driver and we don't need to pass various unnecessary arguments to cgroup functions. The only thing this complicates a bit is the scope generation when validating a cgroup where we must check both old and new naming, so a slight modification was needed there. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1282846 Signed-off-by: Martin Kletzander <mkletzan@redhat.com>	2016-02-05 16:11:50 +01:00
Peter Krempa	58578f83bc	cgroup: Clean up virCgroupGetPercpuStats Use 'ret' for return variable name, clarify use of 'param_idx' and avoid unnecessary 'success' label. No functional changes. Also document the function.	2016-02-03 13:10:04 +01:00
Michal Privoznik	c7f5e26b5f	vircgroup: Finish renaming of virCgroupIsolateMount In `dc576025c3` we renamed virCgroupIsolateMount function to virCgroupBindMount. However, we forgot about one occurrence in section of the code which provides stubs for platforms without support for CGroups like *BSD for instance. Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2016-01-26 17:39:47 +01:00
Daniel P. Berrange	dc576025c3	lxc: don't try to hide parent cgroups inside container On the host when we start a container, it will be placed in a cgroup path of /machine.slice/machine-lxc\x2ddemo.scope under /sys/fs/cgroup/* Inside the containers' namespace we need to setup /sys/fs/cgroup mounts, and currently will bind mount /machine.slice/machine-lxc\x2ddemo.scope on the host to appear as / in the container. While this may sound nice, it confuses applications dealing with cgroups, because /proc/$PID/cgroup now does not match the directory in /sys/fs/cgroup This particularly causes problems for systemd and will make it create repeated path components in the cgroup for apps run in the container eg /machine.slice/machine-lxc\x2ddemo.scope/machine.slice/machine-lxc\x2ddemo.scope/user.slice/user-0.slice/session-61.scope This also causes any systemd service that uses sd-notify to fail to start, because when systemd receives the notification it won't be able to identify the corresponding unit it came from. In particular this breaks rabbitmq-server startup Future kernels will provide proper cgroup namespacing which will handle this problem, but until that time we should not try to play games with hiding parent cgroups. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2016-01-26 16:11:32 +00:00
John Ferlan	d41bd09596	Revert "util: cgroups do not implicitly add task to new machine cgroup" This reverts commit `71ce475967`. Since commit id 'a41c00b47' has been reverted, this no longer is necessary	2016-01-14 11:00:25 -05:00
Jasper Lievisse Adriaanse	1b60f1b401	cgroup: don't include sys/mount.h if not needed As cgroup implementation only works on Linux, it does not make much sense to include sys/mount.h if other requirements are not met, such as HAVE_MNTENT_H and HAVE_GETMNTENT_R. Also, it fixes build on OpenBSD that requires to include sys/param.h along with sys/mount.h. Signed-off-by: Roman Bogorodskiy <bogorodskiy@gmail.com>	2016-01-11 19:56:06 +03:00
Michal Privoznik	f55d1316ad	sysconf: Include unistd.h The manpage for sysconf() suggest including unistd.h as the function is declared there. Even though we are not hitting any compile issues currently, let's include the correct header file instead of relying on some hidden include chain. Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2015-12-24 18:03:50 +01:00
Henning Schild	71ce475967	util: cgroups do not implicitly add task to new machine cgroup virCgroupNewMachine used to add the pidleader to the newly created machine cgroup. Do not do this implicit anymore. Signed-off-by: Henning Schild <henning.schild@siemens.com>	2015-12-14 15:43:29 -05:00
Roman Bogorodskiy	46550cde0f	util: fix build without cgroup Commit `89c509a0` added getters for cgroup block device I/O throttling, however stub versions of these functions have not matching function prototypes that result in compilation fail on platforms not supporting cgroup. Fix build by correcting prototypes of the stubbed functions. Pushing under build-breaker rule.	2015-08-20 09:42:56 +03:00
Martin Kletzander	89c509a0c1	util: Add getters for cgroup block device I/O throttling Since now they were not needed, but I sense they will be in a short while. Signed-off-by: Martin Kletzander <mkletzan@redhat.com>	2015-08-18 16:25:16 -07:00
Martin Kletzander	ea9db906fc	util: Add virCgroupGetBlockDevString This function translates device paths to "major:minor " string, and all virCgroupSetBlkioDevice* functions are modified to use it. It's a cleanup with no functional change. Signed-off-by: Martin Kletzander <mkletzan@redhat.com>	2015-08-18 16:16:38 -07:00
Peter Krempa	88f6c007c3	cgroup: Drop resource partition from virSystemdMakeScopeName The scope name, even according to our docs is "machine-$DRIVER\x2d$VMNAME.scope" virSystemdMakeScopeName would use the resource partition name instead of "machine-" if it was specified thus creating invalid scope paths. This makes libvirt drop cgroups for a VM that uses custom resource partition upon reconnecting since the detected scope name would not match the expected name generated by virSystemdMakeScopeName. The error is exposed by the following log entry: debug : virCgroupValidateMachineGroup:302 : Name 'machine-qemu\x2dtestvm.scope' for controller 'cpu' does not match 'testvm', 'testvm.libvirt-qemu' or 'machine-test-qemu\x2dtestvm.scope' for a "/machine/test" resource and "testvm" vm. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1238570	2015-07-22 07:12:56 +02:00
John Ferlan	51281dcb90	nodeinfo: Add sysfs_prefix to nodeGetPresentCPUBitmap Add the sysfs_prefix argument to the call to allow for setting the path for tests to something other than SYSFS_SYSTEM_PATH.	2015-07-13 15:59:32 -04:00
John Ferlan	0456eda317	cgroup: Use virCgroupNewThread Replace the virCgroupNew{Vcpu\|Emulator\|IOThread} calls with the common virCgroupNewThread API Signed-off-by: John Ferlan <jferlan@redhat.com>	2015-04-09 19:27:08 -04:00
John Ferlan	2cd3a980dc	cgroup: Introduce virCgroupNewThread Create a new common API to replace the virCgroupNew{Vcpu\|Emulator\|IOThread} APIs using an enum to generate the cgroup name Signed-off-by: John Ferlan <jferlan@redhat.com>	2015-04-09 19:27:08 -04:00
Michal Privoznik	d65acbde35	vircgroup: Introduce virCgroupControllerAvailable This new internal API checks if given CGroup controller is available. It is going to be needed later when we need to make a decision whether pin domain memory onto NUMA nodes using cpuset CGroup controller or using numa_set_membind(). Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2015-04-08 11:54:24 +02:00
Michal Privoznik	149a62bc83	virCgroupNew: Enhance debug message When creating new internal representation of cgroups, all passed arguments are logged. Well, except for two: pid and pointer for return value. Lets log them too. Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2015-03-30 15:20:24 +02:00
Michal Privoznik	0a09bcdc7f	virCgroupNewPartition: Fix comment The function has no argument named @name rather than @path instead. The comment is, however, referring to @name while it should have been referring to @path really. Signed-off-by: Michal Privoznik <mprivozn@redhat.com>	2015-03-30 15:20:20 +02:00
John Ferlan	cf6ab17e45	vircgroup: Fix build issue mingw cross compile Commit id '2dbfa716' exposed virCgroupDetectMountsFromFile, but did not add the corresponding entry in the "#else /* !VIR_CGROUP_SUPPORTED */" section of the module.	2015-03-27 18:09:07 -04:00
John Ferlan	38efd52584	vircgroup: Fix build issue on mingw cross compile Commit id 'ba1dfc5' added virCgroupSetCpusetMemoryMigrate and virCgroupGetCpusetMemoryMigrate, but did not add the corresponding entry points into the "#else /* !VIR_CGROUP_SUPPORTED */" section	2015-03-27 18:09:07 -04:00
Martin Kletzander	ba1dfc5b6a	cgroup: Add accessors for cpuset.memory_migrate Signed-off-by: Martin Kletzander <mkletzan@redhat.com>	2015-03-20 13:40:02 +01:00
Jiri Denemark	2dbfa716e8	tests: Add tests for virCgroupDetectMounts Signed-off-by: Jiri Denemark <jdenemar@redhat.com>	2015-03-18 09:53:24 +01:00
Ján Tomko	22fd3ac38f	Introduce virBitmapIsBitSet A helper that never returns an error and treats bits out of bitmap range as false. Use it everywhere we use ignore_value on virBitmapGetBit, or loop over the bitmap size.	2015-03-13 15:31:33 +01:00
Ján Tomko	b54f48812d	Fix a memory leak in virCgroupGetPercpuStats Coverity reports that my commit `af1c98e` introduced two memory leaks: the cpumap if ncpus == 0 in virCgroupGetPercpuStats and the params array in the test of the function.	2015-01-26 16:13:06 +01:00
Ján Tomko	af1c98e406	Fix virCgroupGetPercpuStats with non-continuous present CPUs Per-cpu stats are only shown for present CPUs in the cgroups, but we were only parsing the largest CPU number from /sys/devices/system/cpu/present and looking for stats even for non-present CPUs. This resulted in: internal error: cpuacct parse error	2015-01-22 17:01:11 +01:00
Ján Tomko	c803c070c4	Fix virCgroupNewMachine prototype on non-Linux Commit `318df5a` changed the prototype of virCgroupNewMachine without adjusting the stub function for platforms without cgroups.	2015-01-20 10:02:53 +01:00
Daniel P. Berrange	318df5a05f	Add support for systemd-machined CreateMachineWithNetwork systemd-machined introduced a new method CreateMachineWithNetwork that obsoletes CreateMachine. It expects to be given a list of VETH/TAP device indexes for the host side device(s) associated with a container/machine. This falls back to the old CreateMachine method when the new one is not supported.	2015-01-15 11:07:07 +00:00
Martin Kletzander	3b0f05573f	util: Fix possible NULL dereference Commit `1a80b97d`, which added the virCgroupHasEmptyTasks() function forgot that the parameter @cgroup may be NULL and did not check that. Signed-off-by: Martin Kletzander <mkletzan@redhat.com>	2014-12-21 10:30:49 +01:00
Martin Kletzander	1a80b97ddf	util: Add function virCgroupHasEmptyTasks That function helps checking whether there's a task in that cgroup. Signed-off-by: Martin Kletzander <mkletzan@redhat.com>	2014-12-16 11:15:27 +01:00
Cédric Bosdonnat	5acbb8f99e	Avoid getting '-1:-1' in devices cgroup list When calling virCgroupAllowAllDevices we get these invalid entries in the device cgroup config. b -1:-1 rw c -1:-1 rw Check for positive values before outputting the major and minor to avoid that.	2014-12-12 17:25:00 +01:00
Eric Blake	eb9093763f	maint: forbid 'int foo = true' I noticed this while working on qemuDomainGetBlockInfo. Assigning a bool value to an int variable compiles fine, but raises red flags on the maintenance front as it becomes too easy to assign -1 or 2 or any other non-bool value to the same variable. * cfg.mk (sc_prohibit_int_assign_bool): New rule. * src/conf/snapshot_conf.c (virDomainSnapshotRedefinePrep): Fix offenders. * src/qemu/qemu_driver.c (qemuDomainGetBlockInfo) (qemuDomainSnapshotCreateXML): Likewise. * src/test/test_driver.c (testDomainSnapshotAlignDisks): Likewise. * src/util/vircgroup.c (virCgroupSupportsCpuBW): Likewise. * src/util/virpci.c (virPCIDeviceBindToStub): Likewise. * src/util/virutil.c (virIsCapableVport): Likewise. * tools/virsh-domain-monitor.c (cmdDomMemStat): Likewise. * tools/virsh-domain.c (cmdBlockResize, cmdScreenshot) (cmdInjectNMI, cmdSendKey, cmdSendProcessSignal) (cmdDetachInterface): Likewise. Signed-off-by: Eric Blake <eblake@redhat.com>	2014-11-19 08:20:39 -07:00
Ján Tomko	99b2b4571d	Add virCgroupTerminateMachine stub Fix the build on FreeBSD, broken by commit `4882618`. Signed-off-by: Ján Tomko <jtomko@redhat.com>	2014-10-02 11:11:10 +02:00
Guido Günther	4882618ed1	qemu: use systemd's TerminateMachine to kill all processes If we don't properly clean up all processes in the machine-<vmname>.scope systemd won't remove the cgroup and subsequent vm starts fail with 'CreateMachine: File exists' Additional processes can e.g. be added via echo $PID > /sys/fs/cgroup/systemd/machine.slice/machine-${VMNAME}.scope/tasks but there are other cases like http://bugs.debian.org/761521 Invoke TerminateMachine to be on the safe side since systemd tracks the cgroup anyway. This is a noop if all processes have terminated already.	2014-10-01 20:17:46 +02:00
John Ferlan	e45f0d057e	vircgroup: Fix broken builds without cgroups I missed adding virCgroupNewIOThread to the !VIR_CGROUP_SUPPORTED Pushing as build breaker	2014-09-15 14:48:52 -04:00
John Ferlan	3abb95cad4	vircgroup: Introduce virCgroupNewIOThread Add virCgroupNewIOThread() to mimic virCgroupNewVcpu() except the naming scheme will use "iothread" rather than "vcpu".	2014-09-15 13:18:56 -04:00
Wang Rui	d01a062be6	vircgroup: Resolve Coverity RESOURCE_LEAK Need to free 'root' and 'opts' before 'return -1' if symlink fails. Signed-off-by: Wang Rui <moon.wangrui@huawei.com>	2014-09-03 15:00:19 -04:00
Cédric Bosdonnat	47e5b5ae32	lxc: allow to keep or drop capabilities Added <capabilities> in the <features> section of LXC domains configuration. This section can contain elements named after the capabilities like: <mknod state="on"/>, keep CAP_MKNOD capability <sys_chroot state="off"/> drop CAP_SYS_CHROOT capability Users can restrict or give more capabilities than the default using this mechanism.	2014-07-23 15:12:37 +08:00
Peter Krempa	464f7678d9	util: cgroup: Fix build on non-cgroup platforms Commit `a48f445100` introduced a helper function to convert cgroup device mode to string. The function was only conditionally compiled on platforms that support cgroup. This broke the build when attempting to export the symbol: CCLD libvirt.la Cannot export virCgroupGetDevicePermsString: symbol not defined Move the function out of the ifdef, as it doesn't really depend on the cgroup code being present.	2014-07-09 09:45:36 +02:00
Peter Krempa	a48f445100	util: cgroup: Add helper to convert device mode to string Cgroups code uses VIR_CGROUP_DEVICE_* flags to specify the mode but in the end it needs to be converted to a string. Add a helper to do it and use it in the cgroup code before introducing it into the rest of the code.	2014-07-08 14:34:05 +02:00
Chen Hanxiao	d18aa70416	util: fix memory leak in failure path of virCgroupKillRecursiveInternal Don't leak keypath when we fail to kill a process Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com>	2014-05-16 14:11:07 +03:00
Eric Blake	ac1d42ac72	util: use virDirRead API In making the conversion to the new API, I fixed a couple bugs: virSCSIDeviceGetSgName would leak memory if a directory unexpectedly contained multiple entries; virNetDevTapGetRealDeviceName could report a spurious error from a stale errno inherited before starting the readdir search. The decision on whether to store the result of virDirRead into a variable is based on whether the end of the loop falls through to cleanup code automatically. In some cases, we have loops that are documented to return NULL on failure, and which raise an error on most failure paths but not in the case where the directory was unexpectedly empty; it may be worth a followup patch to explicitly report an error if readdir was successful but the directory was empty, so that a NULL return always has an error set. * src/util/vircgroup.c (virCgroupRemoveRecursively): Use new interface. (virCgroupKillRecursiveInternal, virCgroupSetOwner): Report readdir failures. * src/util/virfile.c (virFileLoopDeviceOpenSearch) (virFileNBDDeviceFindUnused, virFileDeleteTree): Use new interface. * src/util/virnetdevtap.c (virNetDevTapGetRealDeviceName): Properly check readdir errors. * src/util/virpci.c (virPCIDeviceIterDevices) (virPCIDeviceFileIterate, virPCIGetNetName): Report readdir failures. (virPCIDeviceAddressIOMMUGroupIterate): Use new interface. * src/util/virscsi.c (virSCSIDeviceGetSgName): Report readdir failures, and avoid memory leak. (virSCSIDeviceGetDevName): Report readdir failures. * src/util/virusb.c (virUSBDeviceSearch): Report readdir failures. * src/util/virutil.c (virGetFCHostNameByWWN) (virFindFCHostCapableVport): Report readdir failures. Signed-off-by: Eric Blake <eblake@redhat.com>	2014-04-28 17:52:45 -06:00
Ján Tomko	5dfcd6fbc6	Fix build on mingw32 My commit `897808e` added a parameter to virCgroupGetPercpuStats, but didn't change the stub for systems where cgroups are not supported.	2014-04-09 16:47:26 +02:00
Ján Tomko	2adf59ebde	Clean up virCgroupGetPercpuStats The iterator is checked for being less than or equal to need_cpus. The 'n' variable is incremented need_cpus + 1 times. Simplify the computation of need_cpus and make its value one larger, to let it be used instead of 'n' and compared without the equal sign in loop conditions. Just index the sum_cpu_time array instead of using a helper variable. Start the loop at start_cpu instead of continuing for all lower values.	2014-04-09 16:24:08 +02:00
Ján Tomko	9fe5267ade	Check maximum startcpu value correctly The cpus are indexed from 0, so a startcpu value equal to the number of CPUs is invalid. https://bugzilla.redhat.com/show_bug.cgi?id=1070680	2014-04-09 16:24:08 +02:00
Ján Tomko	dd74ab4e82	Rename id, max_id to need_cpus, total_cpus total_cpus is the total number of CPUs on the host need_cpus is the number of CPUs we need to look at (need_cpus can be larger than ncpus, because we need to look at CPUs before the startcpu too, even if we aren't reporting their stats)	2014-04-09 16:24:08 +02:00
Ján Tomko	897808e74f	Extend virCgroupGetPercpuStats to fill in vcputime too Currently, virCgroupGetPercpuStats is only used by the LXC driver, filling out the CPUTIME stats. qemuDomainGetPercpuStats does this and also fills out VCPUTIME stats. Extend virCgroupGetPercpuStats to also report VCPUTIME stats if nvcpupids is non-zero. In the LXC driver, we don't have cpupids. In the QEMU driver, there is at least one cpupid for a running domain, so the behavior shouldn't change for QEMU either. Also rename getSumVcpuPercpuStats to virCgroupGetPercpuVcpuSum.	2014-04-09 16:24:08 +02:00
Ján Tomko	23d2d863b7	Fix return value of virCgroupGetPercpuStats We need to return the number of successfully populated stats, not the nparams supplied by the user.	2014-04-09 16:24:08 +02:00
Hongwei Bi	4ef09c4690	util: remove useless comment for virCgroupMoveTask in vircgroup.c Signed-off-by: Hongwei Bi <hwbi2008@gmail.com>	2014-03-31 14:16:05 +02:00
Ján Tomko	bada4222e5	Indent top-level labels by one space in src/util/	2014-03-25 14:58:40 +01:00
Wang Yufei	bfb29654c8	cgroup: Fix start VMs coincidently failed When I start multiple VMs concurrently and any of the cgroup directories named machine doesn't exist, there's a chance that a VM fails to start because creating the directory failed: Unable to initialize /machine cgroup: File exists When the errno returned by mkdir in virCgroupMakeGroup is EEXIST, we should pass it through and continue to start the VM. Signed-off-by: Wang Yufei <james.wangyufei@huawei.com>	2014-03-21 13:27:28 +01:00
Daniel P. Berrange	2835c1e730	Add virLogSource variables to all source files Any source file which calls the logging APIs now needs to have a VIR_LOG_INIT("source.name") declaration at the start of the file. This provides a static variable of the virLogSource type. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2014-03-18 14:29:22 +00:00
Martin Kletzander	cc9c62fef9	Require spaces around equality comparisons Commit `a1cbe4b5` added a check for spaces around assignments and this patch extends it to checks for spaces around '=='. One exception is virAssertCmpInt where comma after '==' is acceptable (since it is a macro and '==' is its argument). Signed-off-by: Martin Kletzander <mkletzan@redhat.com>	2014-03-18 11:29:44 +01:00
Eric Blake	fa2e4dbfd6	build: fix cgroups on non-Linux Running ./autobuild.sh detected a mingw failure: CCLD libvirt.la Cannot export virCgroupGetPercpuStats: symbol not defined Cannot export virCgroupSetOwner: symbol not defined * src/util/vircgroup.c (virCgroupGetPercpuStats) (virCgroupSetOwner): Implement stubs. Signed-off-by: Eric Blake <eblake@redhat.com>	2014-02-25 17:38:46 -07:00
Richard Weinberger	6fb42d7cdc	Ensure systemd cgroup ownership is delegated to container with userns This function is needed for user namespaces, where we need to chmod() the cgroup to the initial uid/gid such that systemd is allowed to use the cgroup. Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2014-02-24 15:35:47 +00:00
Ján Tomko	abf1daf0d7	Add a stub for virCgroupGetDomainTotalCpuStats Commit `6515889` broke the build on FreeBSD: In function `qemuDomainGetCPUStats': /../../src/qemu/qemu_driver.c:16102: undefined reference to `virCgroupGetDomainTotalCpuStats'	2014-02-21 09:10:48 +01:00
Thorsten Behrens	4b3b2f6ceb	Implement domainGetCPUStats for lxc driver.	2014-02-20 16:20:09 +01:00
Thorsten Behrens	65158899b7	Make qemuGetDomainTotalCPUStats a virCgroup function. To reuse this from other drivers, like lxc.	2014-02-20 16:20:09 +01:00
Thorsten Behrens	a2bb187c7e	Add util virCgroupGetBlkioIo*Serviced methods. This reads blkio stats from blkio.throttle.io_service_bytes and blkio.throttle.io_serviced.	2014-02-20 16:20:09 +01:00
Gao feng	3b431929a2	blkio: Setting throttle blkio cgroup for domain This patch introduces virCgroupSetBlkioDeviceReadIops, virCgroupSetBlkioDeviceWriteIops, virCgroupSetBlkioDeviceReadBps and virCgroupSetBlkioDeviceWriteBps, we can use these interfaces to set up throttle blkio cgroup for domain. This patch also adds the new throttle blkio cgroup elements to the test xml. Signed-off-by: Guan Qiang <hzguanqiang@corp.netease.com> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>	2014-01-20 10:52:44 +08:00
Martin Kletzander	231656bbeb	cgroups: Redefine what "unlimited" means wrt memory limits Since kernel 3.12 (commit 34ff8dc08956098563989d8599840b130be81252 in linux-stable.git in particular) the value for 'unlimited' in cgroup memory limits changed from LLONG_MAX to ULLONG_MAX. Due to a rather unfortunate choice of our VIR_DOMAIN_MEMORY_PARAM_UNLIMITED constant (which we transfer as an unsigned long long in Kibibytes), we ended up with the situation described below (applies to x86_64): - 2^64-1 (ULLONG_MAX) -- "unlimited" in kernel >= 3.12 - 2^63-1 (LLONG_MAX) -- "unlimited" in kernel < 3.12 - 2^63-1024 -- our PARAM_UNLIMITED scaled to Bytes - 2^53-1 -- our PARAM_UNLIMITED unscaled (in Kibibytes) This means that when any number within (2^63-1, 2^64-1] is read from memory cgroup, we are transferring that number instead of "unlimited". Unfortunately, changing VIR_DOMAIN_MEMORY_PARAM_UNLIMITED would break ABI compatibility and thus we have to resort to a different solution. With this patch every value greater than PARAM_UNLIMITED means "unlimited". Even though this may seem misleading, we are already in such an unclear situation when running 3.12 kernel with memory limits set to 2^63. One example showing most of the problems at once (with kernel 3.12.2): # virsh memtune asdf --hard-limit 9007199254740991 --swap-hard-limit -1 # echo 12345678901234567890 >\ /sys/fs/cgroup/memory/machine/asdf.libvirt-qemu/memory.soft_limit_in_bytes # virsh memtune asdf hard_limit : 18014398509481983 soft_limit : 12056327051986884 swap_hard_limit: 18014398509481983 Signed-off-by: Martin Kletzander <mkletzan@redhat.com>	2013-12-10 08:38:46 +01:00
Zhou Yimin	036aeca721	Cgroup: Replace 'newpath' with 'newPath' Unifying coding style, replace 'newpath' with 'newPath'. From: Zhou Yimin <zhouyimin@huawei.com>	2013-12-06 16:18:14 +01:00
Chen Hanxiao	521cec2aab	cgroup: leave blkio cgroup value checking to kernel The range of valid values for cgroup tunables has changed in the past and may change again in future kernels. Avoid hardcoding range checks in libvirt code, delegating range checking to the kernel itself. Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com>	2013-10-15 12:22:07 +01:00
Chen Hanxiao	501476fccf	cgroup: show error when EINVAL is returned When EINVAL is returned while changing a cgroups value, tell the user which values are invalid for the field. Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com>	2013-10-15 12:18:47 +01:00
Chen Hanxiao	fc9a416df7	cgroup: fix a comment typo in vircgroup.c s/shoule/should Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com>	2013-10-09 17:16:58 +02:00
Peter Krempa	d79fe8b50b	cgroup: Move [qemu|lxc]GetCpuBWStatus to vircgroup.c and refactor it The function existed in two identical instances in lxc and qemu. Move it to vircgroup.c and simplify it. Refactor the callers too.	2013-09-16 11:32:49 +02:00
Peter Krempa	4baa8d7637	cleanup: Kill usage of access(PATH, F_OK) in favor of virFileExists() Semantics of the libvirt helper are more clear. This change also allows to clean up some pieces of code.	2013-09-16 10:37:39 +02:00
Daniel P. Berrange	a48838ad2e	Fix launching of VMs when only the logind part of systemd is present Debian systems may run the 'systemd-logind' daemon, which causes the /sys/fs/cgroup/systemd mount to be setup, but no other cgroup controllers are created. While the LXC driver considers cgroups to be mandatory, the QEMU driver is supposed to accept them as optional. We detect whether they are present by looking in /proc/mounts for any mounts of type 'cgroups', but this is not sufficient. We need to skip any named mounts (as seen by a name=XXX string in the mount options), so that we only detect actual resource controllers. http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=721979 Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-09-12 11:32:36 +01:00
Daniel P. Berrange	f0b6d8d472	Fix cgroups when all are mounted on /sys/fs/cgroup Some users in Ubuntu/Debian seem to have a setup where all the cgroup controllers are mounted on /sys/fs/cgroup rather than any /sys/fs/cgroup/<controller> name. In the loop which detects which controllers are present for a mount point we were modifying 'mnt_dir' field in the 'struct mntent' var, but not always restoring the original value. This caused detection to break in the all-in-one mount setup. Fix that logic bug and add test case coverage for this mount setup. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-09-11 11:45:38 +01:00
Roman Bogorodskiy	81b1915773	cgroup macros refactoring, part 5 Complete the refactoring by adding missing stubs so it compiles on platform without cgroup support. Signed-off-by: Eric Blake <eblake@redhat.com>	2013-08-12 16:58:54 -06:00
Roman Bogorodskiy	2d795df3f0	cgroup macros refactoring, part 4 Complete moving to VIR_CGROUP_SUPPORTED Signed-off-by: Eric Blake <eblake@redhat.com>	2013-08-12 16:58:54 -06:00
Roman Bogorodskiy	7f5f270d5f	cgroup macros refactoring, part 3 Continue converting to VIR_CGROUP_SUPPORTED Signed-off-by: Eric Blake <eblake@redhat.com>	2013-08-12 16:58:54 -06:00
Roman Bogorodskiy	c419e9b51c	cgroup macros refactoring, part 2 - Convert virCgroupGet* to VIR_CGROUP_SUPPORTED - Convert virCgroup(Get|Set)FreezerState to VIR_CGROUP_SUPPORTED Signed-off-by: Eric Blake <eblake@redhat.com>	2013-08-12 16:58:47 -06:00
Roman Bogorodskiy	02f1fd41f6	cgroup macros refactoring, part 1 - Introduce VIR_CGROUP_SUPPORTED conditional - Convert virCgroupKill* to use it - Convert virCgroupIsolateMount() to use it - Convert virCgroupRemoveRecursively to VIR_CGROUP_SUPPORTED Signed-off-by: Eric Blake <eblake@redhat.com>	2013-08-12 16:15:58 -06:00
Eric Blake	2ff9e54cbf	cgroup: functional sort Make future patches smaller by matching a sane header listing in the first place. No semantic change. * src/util/vircgroup.h: Move free next to new, and controller functions next to each other. * src/util/vircgroup.c (virCgroupFree, virCgroupHasController) (virCgroupPathOfController, virCgroupRemoveRecursively) (virCgroupRemove): Sort implementation to be closer to header. Signed-off-by: Eric Blake <eblake@redhat.com>	2013-08-12 16:08:18 -06:00
Eric Blake	7ccd322b20	cgroup: topological sort Avoid a forward declaration of a static function. * src/util/vircgroup.c (virCgroupPartitionNeedsEscaping) (virCgroupPartitionEscape): Move up. Signed-off-by: Eric Blake <eblake@redhat.com>	2013-08-12 15:38:37 -06:00
Eric Blake	a91929053c	cgroup: use consistent formatting Format all functions with two blank lines between, and return type on separate line from function name. Also break some lines longer than 80 columns. This makes the subsequent macro refactoring less noisy. * src/util/vircgroup.c: Match prevailing style. Signed-off-by: Eric Blake <eblake@redhat.com>	2013-08-12 15:36:35 -06:00
Daniel P. Berrange	2fe2470181	Enable support for systemd-machined in cgroups creation Make the virCgroupNewMachine method try to use systemd-machined first. If that fails, then fallback to using the traditional cgroup setup code path. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-07-31 19:29:19 +01:00
Daniel P. Berrange	75304eaa1a	Cope with races while killing processes When systemd is involved in managing processes, it may start killing off & tearing down cgroups associated with the process while we're still doing virCgroupKillPainfully. We must explicitly check for ENOENT and treat it as if we had finished killing processes. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-07-31 19:27:28 +01:00
Daniel P. Berrange	aedd46e7e3	Add support for systemd cgroup mount Systemd uses a named cgroup mount for tracking processes. Add it as another type of controller, albeit one which we have to special case in a number of places. In particular we must never create/delete directories there, nor add tasks. Essentially the systemd mount is to be considered read-only for libvirt. With this change both the virCgroupDetectPlacement and virCgroupCopyPlacement methods must be invoked. The copy placement method will copy setup for resource controllers only. The detect placement method will probe for any named controllers, or resource controllers not already setup. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-07-31 19:27:19 +01:00
Eric Blake	a2d0c3f553	build: fix vircgroup build on mingw The previous patch was incomplete. CC libvirt_util_la-vircgroup.lo ../../src/util/vircgroup.c:70:12: error: 'virCgroupPartitionEscape' declared 'static' but never defined [-Werror=unused-function] static int virCgroupPartitionEscape(char *path); ^ src/util/vircgroup.c (virCgroupPartitionEscape): Move forward declaration inside conditional. Signed-off-by: Eric Blake <eblake@redhat.com>	2013-07-29 08:56:20 -06:00
Daniel P. Berrange	7cf81fa175	Conditionalize build of virCgroupValidateMachineGroup The virCgroupValidateMachineGroup method calls some functions which are only conditionally compiled, thus it too must be made conditional. This fixes the build on non-Linux hosts. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-07-29 14:36:44 +01:00
Daniel P. Berrange	56b54173ed	Skip detecting placement if controller is disabled If the app has provided a whitelist of controllers to be used, we skip detecting its mount point. We still, however, fill in the placement info which later confuses the machine name validation code. Skip detecting placement if the controller mount point is not set Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-07-25 19:55:51 +01:00
Daniel P. Berrange	5ec5a22493	Add 'controllers' arg to virCgroupNewDetect When detecting cgroups we must honour any controllers whitelist the driver may have. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-07-25 19:55:47 +01:00
Daniel P. Berrange	c101b851c1	Fix detection of 'emulator' cgroup When a VM has an 'emulator' child cgroup present, we must strip off that suffix when detecting the cgroup for a machine Rename the virCgroupIsValidMachineGroup method to virCgroupValidateMachineGroup to make a bit clearer that this isn't simply a boolean check, it will make changes to the object. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-07-25 19:55:46 +01:00
Daniel P. Berrange	525c9d5a49	Make virCgroupIsValidMachine static The virCgroupIsValidMachine does not need to be called from outside the cgroups file now, so make it static. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-07-25 19:55:29 +01:00
Daniel P. Berrange	a45b99ead9	Introduce a more convenient virCgroupNewDetectMachine Instead of requiring drivers to use a combination of calls to virCgroupNewDetect and virCgroupIsValidMachine, combine the two into virCgroupNewDetectMachine Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-07-25 19:47:30 +01:00
Daniel P. Berrange	3068244e85	Protection against doing bad stuff to the root group Add protection such that the virCgroupRemove and virCgroupKill* do not do anything to the root cgroup. Killing all PIDs in the root cgroup does not end well. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-07-25 11:42:48 +01:00
Daniel P. Berrange	b333330aa5	New cgroups API for atomically creating machine cgroups Instead of requiring one API call to create a cgroup and another to add a task to it, introduce a new API virCgroupNewMachine which does both jobs at once. This will facilitate the later code to talk to systemd to achieve this job which is also atomic. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-07-25 11:42:47 +01:00
Roman Bogorodskiy	fa6805e55e	Fix virCgroupAvailable() w/o HAVE_GETMNTENT_R defined virCgroupAvailable() implementation calls getmntent_r without checking if HAVE_GETMNTENT_R is defined, so it fails to build on platforms without getmntent_r support. Make virCgroupAvailable() just return false without HAVE_GETMNTENT_R.	2013-07-24 15:31:34 +02:00
Daniel P. Berrange	d64e852b5a	Remove obsolete cgroups creation apis The virCgroupNewDomainDriver and virCgroupNewDriver methods are obsolete now that we can auto-detect existing cgroup placement. Delete them to reduce code bloat. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-07-23 22:46:31 +01:00
Daniel P. Berrange	e638778eb3	Add API for checking if a cgroup is valid for a domain Add virCgroupIsValidMachine API to check whether an auto detected cgroup is valid for a machine. This lets us check if a VM has just been placed into some generic shared cgroup, or worse, the root cgroup Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-07-23 22:46:31 +01:00
Daniel P. Berrange	66a7f857f3	Add a virCgroupNewDetect API for finding cgroup placement Add a virCgroupNewDetect API which is used to initialize a cgroup object with the placement of an arbitrary process. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-07-23 22:35:26 +01:00
Daniel P. Berrange	0d7f45aea7	Convert remainder of cgroups code to report errors Convert the remaining methods in vircgroup.c to report errors instead of returning errno values. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-07-22 13:09:58 +01:00
Daniel P. Berrange	3260fdfab0	Convert the virCgroupKill* APIs to report errors Instead of returning errno values, change the virCgroupKill* APIs to fully report errors. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-07-22 13:09:58 +01:00
Daniel P. Berrange	b64dabff27	Report full errors from virCgroupNew* Instead of returning raw errno values, report full libvirt errors in virCgroupNew* functions. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-07-22 13:09:58 +01:00
Ján Tomko	cc7329317f	cgroup: reuse buffer for getline Reuse the buffer for getline and track buffer allocation separately from the string length to prevent unlikely out-of-bounds memory access. This fixes the following leak that happened when zero bytes were read: ==404== 120 bytes in 1 blocks are definitely lost in loss record 1,344 of 1,671 ==404== at 0x4C2C71B: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so) ==404== by 0x906F862: getdelim (iogetdelim.c:68) ==404== by 0x52A48FB: virCgroupPartitionNeedsEscaping (vircgroup.c:1136) ==404== by 0x52A0FB4: virCgroupPartitionEscape (vircgroup.c:1171) ==404== by 0x52A0EA4: virCgroupNewDomainPartition (vircgroup.c:1450)	2013-07-17 14:08:11 +02:00
Daniel P. Berrange	f8b42f3224	Convert 'int i' to 'size_t i' in src/util/ files Convert the type of loop iterators named 'i', 'j', 'k', 'ii', 'jj', 'kk', to be 'size_t' instead of 'int' or 'unsigned int', also sanitizing 'ii', 'jj', 'kk' to use the normal 'i', 'j', 'k' naming Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-07-10 17:40:13 +01:00
Michal Privoznik	a2f8babc7d	Adapt to VIR_ALLOC and virAsprintf in src/util/*	2013-07-10 11:07:33 +02:00
Michal Privoznik	8290cbbc38	viralloc: Report OOM error on failure Similarly to VIR_STRDUP, we want the OOM error to be reported in VIR_ALLOC and friends.	2013-07-10 11:07:31 +02:00
Michal Privoznik	bc13222185	virCgroupNewPartition: Don't leak @newpath The @newpath variable is allocated in virCgroupSetPartitionSuffix(). But it's never freed.	2013-07-03 09:42:11 +02:00
Ján Tomko	5bc8ecb8d1	Plug leak in virCgroupMoveTask We only break out of the while loop if *content is an empty string. However the buffer has been allocated to BUFSIZ + 1 (8193 in my case), but it gets overwritten in the next for iteration. Move VIR_FREE right before we overwrite it to avoid the leak. ==5777== 16,386 bytes in 2 blocks are definitely lost in loss record 1,022 of 1,027 ==5777== by 0x5296E28: virReallocN (viralloc.c:184) ==5777== by 0x52B0C66: virFileReadLimFD (virfile.c:1137) ==5777== by 0x52B0E1A: virFileReadAll (virfile.c:1199) ==5777== by 0x529B092: virCgroupGetValueStr (vircgroup.c:534) ==5777== by 0x529AF64: virCgroupMoveTask (vircgroup.c:1079) Introduced by `83e4c77`. https://bugzilla.redhat.com/show_bug.cgi?id=978352	2013-06-26 15:38:01 +02:00
Ján Tomko	306c49ffd5	Fix invalid read in virCgroupGetValueStr Don't check for '\n' at the end of file if zero bytes were read. Found by valgrind: ==404== Invalid read of size 1 ==404== at 0x529B09F: virCgroupGetValueStr (vircgroup.c:540) ==404== by 0x529AF64: virCgroupMoveTask (vircgroup.c:1079) ==404== by 0x1EB475: qemuSetupCgroupForEmulator (qemu_cgroup.c:1061) ==404== by 0x1D9489: qemuProcessStart (qemu_process.c:3801) ==404== by 0x18557E: qemuDomainObjStart (qemu_driver.c:5787) ==404== by 0x190FA4: qemuDomainCreateWithFlags (qemu_driver.c:5839) Introduced by `0d0b409`. https://bugzilla.redhat.com/show_bug.cgi?id=978356	2013-06-26 15:05:43 +02:00
Ján Tomko	e557766c3b	Replace two-state local integers with bool Found with 'git grep "= 1"'.	2013-06-06 17:22:53 +02:00
Viktor Mihajlovski	eb21408f44	cgroups: Do not enforce nonexistent controllers Currently, the controllers argument to virCgroupDetect acts both as a result filter and a required controller specification, which is a bit overloaded. If both functionalities are needed, it would be better to have them separated into a filter and a requirement mask. The only situation where it is used today is to ensure that only CPU related controllers are used for the VCPU directories. But here we clearly do not want to enforce the existence of cpu, cpuacct and specifically not cpuset at the same time. This commit changes the semantics of controllers to "filter only". Should a required mask ever be needed, more work will have to be done. Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>	2013-05-24 12:11:24 +02:00
Michal Privoznik	eb8e5e8774	Adapt to VIR_STRDUP and VIR_STRNDUP in src/util/vircgroup.c This commit is separate due to unusual paradigm compared to the most source files.	2013-05-24 10:10:03 +02:00
Michal Privoznik	b43bb98a31	virCgroupAddTaskStrController: s/-1/-ENOMEM/ Within the whole of vircgroup.c we 'return -errno', e.g. 'return -ENOMEM'. However, in this specific function virCgroupAddTaskStrController we weren't returning -ENOMEM but -1, despite the fact that later in the function we do return errno values.	2013-05-24 10:03:22 +02:00
Eric Blake	83e4c77547	cgroup: be robust against cgroup movement races https://bugzilla.redhat.com/show_bug.cgi?id=965169 documents a problem starting domains when cgroups are enabled; I was able to reliably reproduce the race about 5% of the time when I added hooks to domain startup by 3 seconds (as that seemed to be about the length of time that qemu created and then closed a temporary thread, probably related to aio handling of initially opening a disk image). The problem has existed since we introduced virCgroupMoveTask in commit `9102829` (v0.10.0). There are some inherent TOCTTOU races when moving tasks between kernel cgroups, precisely because threads can be created or completed in the window between when we read a thread id from the source and when we write to the destination. As the goal of virCgroupMoveTask is merely to move ALL tasks into the new cgroup, it is sufficient to iterate until no more threads are being created in the old group, and ignoring any threads that die before we can move them. It would be nicer to start the threads in the right cgroup to begin with, but by default, all child threads are created in the same cgroup as their parent, and we don't want vcpu child threads in the emulator cgroup, so I don't see any good way of avoiding the move. It would also be nice if the kernel were to implement something like rename() as a way to atomically move a group of threads from one cgroup to another, instead of forcing a window where we have to read and parse the source, then format and write back into the destination. * src/util/vircgroup.c (virCgroupAddTaskStrController): Ignore ESRCH, because a thread ended between read and write attempts. (virCgroupMoveTask): Loop until all threads have moved. Signed-off-by: Eric Blake <eblake@redhat.com>	2013-05-21 11:33:56 -06:00
Osier Yang	3fcc1df2f8	src/utils: Remove the whitespace before ";"	2013-05-21 23:41:45 +08:00
Daniel P. Berrange	c2cf5f1c2a	Fix failure to detect missing cgroup partitions Change `bbe97ae968` caused the QEMU driver to ignore ENOENT errors from cgroups, in order to cope with missing /proc/cgroups. This is not good though because many other things can cause ENOENT and should not be ignored. The callers expect to see ENXIO when cgroups are not present, so adjust the code to report that errno when /proc/cgroups is missing Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-05-17 10:25:15 +01:00
Jim Fehlig	bbe97ae968	Fix starting domains when kernel has no cgroups support Found that I was unable to start existing domains after updating to a kernel with no cgroups support # zgrep CGROUP /proc/config.gz # CONFIG_CGROUPS is not set # virsh start test error: Failed to start domain test error: Unable to initialize /machine cgroup: Cannot allocate memory virCgroupPartitionNeedsEscaping() correctly returns errno (ENOENT) when attempting to open /proc/cgroups on such a system, but it was being dropped in virCgroupSetPartitionSuffix(). Change virCgroupSetPartitionSuffix() to propagate errors returned by its callees. Also check for ENOENT in qemuInitCgroup() when determining if cgroups support is available.	2013-05-13 09:27:46 -06:00
Daniel P. Berrange	0ced83dcfb	Escaping leading '.' in cgroup names Escaping a leading '.' with '_' in the cgroup names Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-05-13 14:28:46 +01:00
Eric Blake	25ae3d3015	build: avoid useless virAsprintf virAsprintf(&foo, "%s", bar) is wasteful compared to foo = strdup(bar) (or eventually, VIR_STRDUP(foo, bar), but one thing at a time...). Noticed while reviewing Laine's attempt to clean up broken qemu:///session. * cfg.mk (sc_prohibit_asprintf): Enhance rule. * src/esx/esx_storage_backend_vmfs.c (esxStorageBackendVMFSVolumeLookupByKey): Fix offender. * src/network/bridge_driver.c (networkStateInitialize): Likewise. * src/nwfilter/nwfilter_dhcpsnoop.c (virNWFilterSnoopDHCPOpen): Likewise. * src/storage/storage_backend_sheepdog.c (virStorageBackendSheepdogRefreshVol): Likewise. * src/util/vircgroup.c (virCgroupAddTaskStrController): Likewise. * src/util/virdnsmasq.c (addnhostsAdd): Likewise. * src/xen/block_stats.c (xenLinuxDomainDeviceID): Likewise. * src/xen/xen_driver.c (xenUnifiedConnectOpen): Likewise. * tools/virsh.c (vshGetTypedParamValue): Likewise. Signed-off-by: Eric Blake <eblake@redhat.com>	2013-05-02 13:35:26 -06:00
Daniel P. Berrange	f3662737b1	Do proper escaping of cgroup resource partitions If a user cgroup name begins with "cgroup.", "_" or with any of the controllers from /proc/cgroups followed by a dot, then they need to be prefixed with a single underscore. eg if there is an object "cpu.service", then this would end up as "_cpu.service" in the cgroup filesystem tree, however, "waldo.service" would stay "waldo.service", at least as long as nobody comes up with a cgroup controller called "waldo". Since we require a '.XXXX' suffix on all partitions, there is no scope for clashing with the kernel 'tasks' and 'release_agent' files. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-04-26 13:52:02 +01:00
Daniel P. Berrange	9ddfe7eea6	Ensure all cgroup partitions have a suffix of ".partition" If the partition named passed in the XML does not already have a suffix, ensure it gets a '.partition' added to each component. The exceptions are /machine, /user and /system which do not need to have a suffix, since they are fixed partitions at the top level. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-04-26 13:52:02 +01:00
Daniel P. Berrange	824e86e723	Change VM cgroup suffix from '{lxc,qemu}.libvirt' to 'libvirt-{lxc,qemu}' Recently we changed to create VM cgroups with the naming pattern $VMNAME.$DRIVER.libvirt. Following discussions with the systemd community it was decided that only having a single '.' in the names is preferable. So this changes the naming scheme to be $VMNAME.libvirt-$DRIVER. eg for LXC 'mycontainer.libvirt-lxc' or for KVM 'myvm.libvirt-qemu'. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-04-26 13:52:02 +01:00
Eric Blake	1fbf190554	build: avoid unsafe functions in libgen.h POSIX says that both basename() and dirname() may return static storage (aka they need not be thread-safe); and that they may but not must modify their input argument. Furthermore, <libgen.h> is not available on all platforms. For these reasons, you should never use these functions in a multi-threaded library. Gnulib instead recommends a way to avoid the portability nightmare: gnulib's "dirname.h" provides useful thread-safe counterparts. The obvious dir_name() and base_name() are GPL (because they malloc(), but call exit() on failure) so we can't use them; but the LGPL variants mdir_name() (malloc's or returns NULL) and last_component (always points into the incoming string without modifying it, differing from basename semantics only on corner cases like the empty string that we shouldn't be hitting in the first place) are already in use in libvirt. This finishes the swap over to the safe functions. * cfg.mk (sc_prohibit_libgen): New rule. * src/util/vircgroup.c: Fix offenders. * src/parallels/parallels_storage.c (parallelsPoolAddByDomain): Likewise. * src/parallels/parallels_network.c (parallelsGetBridgedNetInfo): Likewise. * src/node_device/node_device_udev.c (udevProcessSCSIHost) (udevProcessSCSIDevice): Likewise. * src/storage/storage_backend_disk.c (virStorageBackendDiskDeleteVol): Likewise. * src/util/virpci.c (virPCIGetDeviceAddressFromSysfsLink): Likewise. * src/util/virstoragefile.h (_virStorageFileMetadata): Avoid false positive. Signed-off-by: Eric Blake <eblake@redhat.com>	2013-04-25 14:47:01 -06:00
Stefan Berger	0cb171f60f	Fix compilation error in util/vircgroup.c Fix the error util/vircgroup.c: In function 'virCgroupNewDomainPartition': util/vircgroup.c:1299:11: error: declaration of 'dirname' shadows a global declaration [-Werror=shadow] Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>	2013-04-16 08:16:37 -04:00
Daniel P. Berrange	e7d8ab016b	Add support for perf_event and net_cls cgroup controllers Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-04-15 17:35:32 +01:00
Daniel P. Berrange	1da631ecf3	Add an API for re-mounting cgroups, to isolate the process location Add a virCgroupIsolateMount method which looks at where the current process is placed in the cgroups (eg /system/demo.lxc.libvirt) and then remounts the cgroups such that this sub-directory becomes the root directory from the current process' POV. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-04-15 17:35:32 +01:00
Daniel P. Berrange	83336118db	Track symlinks for co-mounted cgroup controllers If a cgroup controller is co-mounted with another, eg /sys/fs/cgroup/cpu,cpuacct Then it is a requirement that there exist symlinks at /sys/fs/cgroup/cpu /sys/fs/cgroup/cpuacct pointing to the real mount point. Add support to virCgroupPtr to detect and track these symlinks Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-04-15 17:35:32 +01:00
Daniel P. Berrange	767596bdb4	Remove non-functional code for setting up non-root cgroups The virCgroupNewDriver method had a 'bool privileged' param. If a false value was ever passed in, it would simply not work, since non-root users don't have any privileges to create new cgroups. Just delete this broken code entirely and make the QEMU driver skip cgroup setup in non-privileged mode Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-04-15 17:35:31 +01:00

266 Commits