libvirt

mirror of https://gitlab.com/libvirt/libvirt.git synced 2025-01-03 19:45:21 +00:00

Author	SHA1	Message	Date
Osier Yang	2b66504ded	util: Add "shareable" field for virSCSIDevice struct Unlike the host devices of other types, SCSI host device XML supports "shareable" tag. This patch introduces it for the virSCSIDevice struct for a later patch use (to detect if the SCSI device is shareable when preparing the SCSI host device in QEMU driver).	2014-01-23 17:52:33 +08:00
Gao feng	3b431929a2	blkio: Setting throttle blkio cgroup for domain This patch introduces virCgroupSetBlkioDeviceReadIops, virCgroupSetBlkioDeviceWriteIops, virCgroupSetBlkioDeviceReadBps and virCgroupSetBlkioDeviceWriteBps, we can use these interfaces to set up throttle blkio cgroup for domain. This patch also adds the new throttle blkio cgroup elements to the test xml. Signed-off-by: Guan Qiang <hzguanqiang@corp.netease.com> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>	2014-01-20 10:52:44 +08:00
Gao feng	b9ce5d388f	rename virBlkioDeviceWeightPtr to virBlkioDevicePtr The throttle blkio cgroup will reuse this struct. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>	2013-12-12 12:29:59 +00:00
Eric Blake	5d509e9ee2	maint: fix comma style issues: qemu Most of our code base uses space after comma but not before; fix the remaining uses before adding a syntax check. * src/qemu/qemu_cgroup.c: Consistently use commas. * src/qemu/qemu_command.c: Likewise. * src/qemu/qemu_conf.c: Likewise. * src/qemu/qemu_driver.c: Likewise. * src/qemu/qemu_monitor.c: Likewise. Signed-off-by: Eric Blake <eblake@redhat.com>	2013-11-20 09:14:55 -07:00
Cole Robinson	a924d9d083	qemu: cgroup: Fix crash if starting nographics guest We can dereference graphics[0] even if guest has no graphics device configured. I screwed this up in `a216e64872` https://bugzilla.redhat.com/show_bug.cgi?id=1014088	2013-10-01 11:22:18 -04:00
Peter Krempa	4baa8d7637	cleanup: Kill usage of access(PATH, F_OK) in favor of virFileExists() Semantics of the libvirt helper are more clear. This change also allows to clean up some pieces of code.	2013-09-16 10:37:39 +02:00
Cole Robinson	a216e64872	qemu: Set QEMU_AUDIO_DRV=none with -nographic On my machine, a guest fails to boot if it has a sound card but no graphical device/display is configured, because pulseaudio fails to initialize since it can't access $HOME. A workaround is removing the audio device, however on ARM boards there isn't any option to do that, so -nographic always fails. Set QEMU_AUDIO_DRV=none if no <graphics> are configured. Unfortunately this has massive test suite fallout. Add a qemu.conf parameter nographics_allow_host_audio, that if enabled will pass through QEMU_AUDIO_DRV from sysconfig (similar to vnc_allow_host_audio)	2013-09-02 16:53:39 -04:00
Michal Privoznik	94a24dd3a9	qemuSetupMemoryCgroup: Handle hard_limit properly Since 16bcb3 we have a regression. The hard_limit is set unconditionally. By default the limit is zero. Hence, if user hasn't configured any, we set the zero in cgroup subsystem making the kernel kill the corresponding qemu process immediately. The proper fix is to set hard_limit iff user has configured any.	2013-08-20 15:03:17 +02:00
Michal Privoznik	16bcb3b616	qemu: Drop qemuDomainMemoryLimit This function is to guess the correct limit for maximal memory usage by qemu for given domain. This can never be guessed correctly, not to mention all the pains and sleepless nights this code has caused. Once somebody discovers algorithm to solve the Halting Problem, we can compute the limit algorithmically. But till then, this code should never see the light of the release again.	2013-08-19 11:16:58 +02:00
Daniel P. Berrange	1166eeba61	Fix crashing upgrading from older libvirts with running guests If upgrading from a libvirt that is older than 1.0.5, we can not assume that vm->def->resource is non-NULL. This bogus assumption caused libvirtd to crash Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-08-02 15:32:26 +01:00
Daniel P. Berrange	2fe2470181	Enable support for systemd-machined in cgroups creation Make the virCgroupNewMachine method try to use systemd-machined first. If that fails, then fallback to using the traditional cgroup setup code path. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-07-31 19:29:19 +01:00
Daniel P. Berrange	5ec5a22493	Add 'controllers' arg to virCgroupNewDetect When detecting cgroups we must honour any controllers whitelist the driver may have. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-07-25 19:55:47 +01:00
Daniel P. Berrange	a45b99ead9	Introduce a more convenient virCgroupNewDetectMachine Instead of requiring drivers to use a combination of calls to virCgroupNewDetect and virCgroupIsValidMachine, combine the two into virCgroupNewDetectMachine Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-07-25 19:47:30 +01:00
Daniel P. Berrange	02098ac260	Convert QEMU driver to use virCgroupNewMachine Convert the QEMU driver code to use the new atomic API for setup of cgroups Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-07-25 11:42:47 +01:00
Daniel P. Berrange	2049ef9942	Create + setup cgroups atomically for QEMU process Currently the QEMU driver creates the VM's cgroup prior to forking, and then uses a virCommand hook to move the child into the cgroup. This won't work with systemd whose APIs do the creation of cgroups + attachment of processes atomically. Fortunately we have a handshake taking place between the QEMU driver and the child process prior to QEMU being exec()d, which was introduced to allow setup of disk locking. By good fortune this synchronization point can be used to enable the QEMU driver to do atomic setup of cgroups removing the use of the hook script. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-07-23 22:46:31 +01:00
Daniel P. Berrange	87b2e6fa84	Auto-detect existing cgroup placement Use the new virCgroupNewDetect function to determine cgroup placement of existing running VMs. This will allow the legacy cgroups creation APIs to be removed entirely Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-07-23 22:46:31 +01:00
Daniel P. Berrange	0d7f45aea7	Convert remainder of cgroups code to report errors Convert the remaining methods in vircgroup.c to report errors instead of returning errno values. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-07-22 13:09:58 +01:00
Daniel P. Berrange	b64dabff27	Report full errors from virCgroupNew* Instead of returning raw errno values, report full libvirt errors in virCgroupNew* functions. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-07-22 13:09:58 +01:00
Peter Krempa	bac2182041	qemu: Cleanup coding style nits in qemu_cgroup.c	2013-07-18 14:58:12 +02:00
Osier Yang	a39f69d2bb	qemu: Set cpuset.cpus for domain process When either "cpuset" of <vcpu> is specified, or the "placement" of <vcpu> is "auto", only setting the cpuset.mems might cause the guest to fail to start. E.g. ("placement" of both <vcpu> and <numatune> is "auto"): 1) Related XMLs <vcpu placement='auto'>4</vcpu> <numatune> <memory mode='strict' placement='auto'/> </numatune> 2) Host NUMA topology % numactl --hardware available: 8 nodes (0-7) node 0 cpus: 0 4 8 12 16 20 24 28 node 0 size: 16374 MB node 0 free: 11899 MB node 1 cpus: 32 36 40 44 48 52 56 60 node 1 size: 16384 MB node 1 free: 15318 MB node 2 cpus: 2 6 10 14 18 22 26 30 node 2 size: 16384 MB node 2 free: 15766 MB node 3 cpus: 34 38 42 46 50 54 58 62 node 3 size: 16384 MB node 3 free: 15347 MB node 4 cpus: 3 7 11 15 19 23 27 31 node 4 size: 16384 MB node 4 free: 15041 MB node 5 cpus: 35 39 43 47 51 55 59 63 node 5 size: 16384 MB node 5 free: 15202 MB node 6 cpus: 1 5 9 13 17 21 25 29 node 6 size: 16384 MB node 6 free: 15197 MB node 7 cpus: 33 37 41 45 49 53 57 61 node 7 size: 16368 MB node 7 free: 15669 MB 4) cpuset.cpus will be set as: (from debug log) 2013-05-09 16:50:17.296+0000: 417: debug : virCgroupSetValueStr:331 : Set value '/sys/fs/cgroup/cpuset/libvirt/qemu/toy/cpuset.cpus' to '0-63' 5) The advisory nodeset got from querying numad (from debug log) 2013-05-09 16:50:17.295+0000: 417: debug : qemuProcessStart:3614 : Nodeset returned from numad: 1 6) cpuset.mems will be set as: (from debug log) 2013-05-09 16:50:17.296+0000: 417: debug : virCgroupSetValueStr:331 : Set value '/sys/fs/cgroup/cpuset/libvirt/qemu/toy/cpuset.mems' to '0-7' I.e., the domain process's memory is restricted to the first NUMA node; however, it can use all of the CPUs, which will likely cause the domain process to fail to start because the kernel fails to allocate memory with the memory policy set to "strict". % tail -n 20 /var/log/libvirt/qemu/toy.log ... 2013-05-09 05:53:32.972+0000: 7318: debug : virCommandHandshakeChild:377 : Handshake with parent is done char device redirected to /dev/pts/2 (label charserial0) kvm_init_vcpu failed: Cannot allocate memory ... Signed-off-by: Peter Krempa <pkrempa@redhat.com>	2013-07-18 14:57:57 +02:00
Daniel P. Berrange	50760e2a8a	Convert 'int i' to 'size_t i' in src/qemu files Convert the type of loop iterators named 'i', 'j', 'k', 'ii', 'jj', 'kk', to be 'size_t' instead of 'int' or 'unsigned int', also sanitizing 'ii', 'jj', 'kk' to use the normal 'i', 'j', 'k' naming Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-07-10 17:55:15 +01:00
Michal Privoznik	e987a30dfa	Adapt to VIR_ALLOC and virAsprintf in src/qemu/*	2013-07-10 11:07:32 +02:00
Jiri Denemark	e0e438af00	qemu: Move memory limit computation to a reusable function	2013-07-08 12:35:27 +02:00
Laine Stump	1d829e1306	pci: rename virPCIDeviceGetVFIOGroupDev to virPCIDeviceGetIOMMUGroupDev I realized after the fact that it's probably better in the long run to give this function a name that matches the name of the link used in sysfs to hold the group (iommu_group). I'm changing it now because I'm about to add several more functions that deal with iommu groups.	2013-06-25 18:07:38 -04:00
Osier Yang	8da9516a84	qemu: Abstract code for the cpu controller setting into a helper	2013-06-05 19:25:48 +08:00
Michal Privoznik	a88fb3009f	Adapt to VIR_STRDUP and VIR_STRNDUP in src/qemu/*	2013-05-23 09:56:38 +02:00
Osier Yang	66194f71df	src/qemu: Remove the whitespace before ';'	2013-05-21 23:41:44 +08:00
Osier Yang	58f8e0cd58	qemu: Don't remove the "return 0" Commit `f60a50c795` was intended to remove only the warning, not the "return 0" along with it.	2013-05-21 23:08:57 +08:00
Osier Yang	479d5991cd	qemu: Abstract code for cpuset controller setting into a helper	2013-05-20 19:57:00 +08:00
Osier Yang	9f2455d359	qemu: Abstract code for devices controller setting into a helper	2013-05-20 19:52:35 +08:00
Osier Yang	f60a50c795	qemu: Abstract code for memory controller setting into a helper	2013-05-20 19:39:54 +08:00
Osier Yang	2fd16df7b5	qemu: Abstract the code for blkio controller setting into a helper	2013-05-20 19:24:45 +08:00
Daniel P. Berrange	c2cf5f1c2a	Fix failure to detect missing cgroup partitions Change `bbe97ae968` caused the QEMU driver to ignore ENOENT errors from cgroups, in order to cope with missing /proc/cgroups. This is not good though because many other things can cause ENOENT and should not be ignored. The callers expect to see ENXIO when cgroups are not present, so adjust the code to report that errno when /proc/cgroups is missing Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-05-17 10:25:15 +01:00
Jim Fehlig	bbe97ae968	Fix starting domains when kernel has no cgroups support Found that I was unable to start existing domains after updating to a kernel with no cgroups support # zgrep CGROUP /proc/config.gz # CONFIG_CGROUPS is not set # virsh start test error: Failed to start domain test error: Unable to initialize /machine cgroup: Cannot allocate memory virCgroupPartitionNeedsEscaping() correctly returns errno (ENOENT) when attempting to open /proc/cgroups on such a system, but it was being dropped in virCgroupSetPartitionSuffix(). Change virCgroupSetPartitionSuffix() to propagate errors returned by its callees. Also check for ENOENT in qemuInitCgroup() when determining if cgroups support is available.	2013-05-13 09:27:46 -06:00
Han Cheng	6eb42e38e8	qemu: Allow the scsi-generic device in cgroup This adds the scsi-generic device into the device controller's whitelist, so that it's allowed to be used by the qemu process. Signed-off-by: Han Cheng <hanc.fnst@cn.fujitsu.com> Signed-off-by: Osier Yang <jyang@redhat.com>	2013-05-13 19:08:34 +08:00
Laine Stump	52ba0f6e1c	qemu: fix stupid typos in VFIO cgroup setup/teardown I must have looked at this a couple dozen times before I noticed it had "!=" instead of "==". Not doing this setup prevented qemu from doing anything with the vfio group device.	2013-05-03 14:32:54 -04:00
Michal Privoznik	7c9a2d88cd	virutil: Move string related functions to virstring.c The source code base needs to be adapted as well. Some files include virutil.h just for the string related functions (here, the include is substituted to match the new file), some include virutil.h without any need (here, the include is removed), and some require both.	2013-05-02 16:56:55 +02:00
Laine Stump	811143c0b6	qemu: put usb cgroup setup in common function The USB-specific cgroup setup had been inserted inline in qemuDomainAttachHostUsbDevice and qemuSetupCgroup, but now there is a common cgroup setup function called for all hostdevs, so it makes sense to put the usb-specific setup there and just rely on that function being called. The one thing I'm uncertain of here (and a reason for not pushing until after release) is that previously hostdev->missing was checked only when starting a domain (and cgroup setup for the device skipped if missing was true), but with this consolidation, it is now checked in the case of hotplug as well. I don't know if this will have any practical effect (does it make sense to hotplug a "missing" usb device?)	2013-04-29 21:52:28 -04:00
Laine Stump	6e13860cb4	qemu: add vfio devices to cgroup ACL when appropriate PCI device assignment using VFIO requires read/write access by the qemu process to /dev/vfio/vfio, and /dev/vfio/nn, where "nn" is the VFIO group number that the assigned device belongs to (and can be found with the function virPCIDeviceGetVFIOGroupDev). /dev/vfio/vfio can be accessible to any guest without danger (according to vfio developers), so it is added to the static ACL. The group device must be dynamically added to the cgroup ACL for each vfio hostdev in two places: 1) for any devices in the persistent config when the domain is started (done during qemuSetupCgroup()) 2) at device attach time for any hotplug devices (done in qemuDomainAttachHostDevice). The group device must be removed from the ACL when a device is "hot-unplugged" (in qemuDomainDetachHostDevice()). Note that USB devices are already doing their own cgroup setup and teardown in the hostdev-usb specific function. I chose to make the new functions generic and call them in a common location though. We can then move the USB-specific code (which is duplicated in two locations) to this single location. I'll be posting a followup patch to do that.	2013-04-29 21:52:28 -04:00
Daniel P. Berrange	1e05073fbb	Replace more cases of /system with /machine The change in commit `aed4986322` was incomplete, missing a couple of cases of /system. This caused failure to start VMs. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-04-22 17:11:36 +01:00
Daniel P. Berrange	aed4986322	Change default resource partition to /machine After discussions with systemd developers it was decided that a better default policy for resource partitions is to have 3 default partitions at the top level /system - system services /machine - virtual machines / containers /user - user login session This ensures that the default policy isolates guest from user login sessions & system services, so a mis-behaving guest can't consume 100% of CPU usage if other things are contending for it. Thus we change the default partition from /system to /machine Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-04-22 12:10:12 +01:00
Daniel P. Berrange	767596bdb4	Remove non-functional code for setting up non-root cgroups The virCgroupNewDriver method had a 'bool privileged' param. If a false value was ever passed in, it would simply not work, since non-root users don't have any privileges to create new cgroups. Just delete this broken code entirely and make the QEMU driver skip cgroup setup in non-privileged mode Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-04-15 17:35:31 +01:00
Daniel P. Berrange	db44eb1b5f	Change default cgroup layout for QEMU/LXC and honour XML config Historically QEMU/LXC guests have been placed in a cgroup layout that is $LOCATION-OF-LIBVIRTD/libvirt/{qemu,lxc}/$VMNAME This is bad for a number of reasons - The cgroup hierarchy gets very deep which seriously impacts kernel performance due to cgroups scalability limitations. - It is hard to setup cgroup policies which apply across services and virtual machines, since all VMs are underneath the libvirtd service. To address this the default cgroup location is changed to be /system/$VMNAME.{lxc,qemu}.libvirt This puts virtual machines at the same level in the hierarchy as system services, allowing consistent policy to be setup across all of them. This also honours the new resource partition location from the XML configuration, for example <resource> <partition>/virtualmachines/production</partition> </resource> will result in the VM being placed at /virtualmachines/production/$VMNAME.{lxc,qemu}.libvirt NB, with the exception of the default, /system, path which is intended to always exist, libvirt will not attempt to auto-create the partitions in the XML. It is the responsibility of the admin/app to configure the partitions. Later libvirt APIs will provide a way to do this. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-04-15 17:35:31 +01:00
Daniel P. Berrange	aa8604dd45	Add a new virCgroupNewPartition for setting up resource partitions A resource partition is an absolute cgroup path, ignoring the current process placement. Expose a virCgroupNewPartition API for constructing such cgroups Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-04-15 17:35:31 +01:00
Daniel P. Berrange	04c18d25f1	Rename virCgroupForXXX to virCgroupNewXXX Rename all the virCgroupForXXX methods to use the form virCgroupNewXXX since they are all constructors. Also make sure the output parameter is the last one in the list, and annotate all pointers as non-null. Fix up all callers, and make sure they use true/false not 0/1 for the boolean parameters Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-04-15 17:35:31 +01:00
Daniel P. Berrange	632f78caaf	Store a virCgroupPtr instance in qemuDomainObjPrivatePtr Instead of calling virCgroupForDomain every time we need the virCgroupPtr instance, just do it once at VM startup and cache a reference to the object in qemuDomainObjPrivatePtr until shutdown of the VM. Removing the virCgroupPtr from the QEMU driver state also means we don't have stale mount info, if someone mounts the cgroups filesystem after libvirtd has been started Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-04-15 17:35:31 +01:00
Stefan Berger	22feb0d3e7	QEMU Cgroup support for TPM passthrough Some refactoring for virDomainChrSourceDef type of devices so we can use common code. Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com> Reviewed-by: Corey Bryant <coreyb@linux.vnet.ibm.com> Tested-by: Corey Bryant <coreyb@linux.vnet.ibm.com>	2013-04-12 16:55:46 -04:00
Daniel P. Berrange	dca927c82f	Rename virCgroupMounted to virCgroupHasController & make it more robust The virCgroupMounted method is badly named, since a controller can be mounted, but disabled in the current object. Rename the method to be virCgroupHasController. Also make it tolerant to a NULL virCgroupPtr and out-of-range controller index, to avoid duplication of these checks in all callers Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-04-08 14:49:12 +01:00
Daniel P. Berrange	56f27b3bbc	Don't create dirs in cgroup controllers we don't want to use Currently when getting an instance of virCgroupPtr we will create the path in all cgroup controllers. Only at the virt driver layer are we attempting to filter controllers. This is bad because the mere act of creating the dirs in the controllers can have a functional impact on the kernel, particularly for performance. Update the virCgroupForDriver() method to accept a bitmask of controllers to use. Only create dirs in the controllers that are requested. When creating cgroups for domains, respect the active controller list from the parent cgroup Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-04-05 10:41:54 +01:00
Gao feng	45e9d27ad8	NUMA: cleanup for numa related codes Intend to reduce the redundant code, use virNumaSetupMemoryPolicy to replace virLXCControllerSetupNUMAPolicy and qemuProcessInitNumaMemoryPolicy. This patch also moves the numa related codes to the file virnuma.c and virnuma.h Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>	2013-03-20 19:37:00 +08:00
Daniel P. Berrange	7f544a4c8f	Don't try to add non-existent devices to ACL The QEMU driver has a list of device nodes that are whitelisted for all guests. The kernel has recently started returning an error if you try to whitelist a device which does not exist. This causes a warning in libvirt logs and an audit error for any missing devices. eg 2013-02-27 16:08:26.515+0000: 29625: warning : virDomainAuditCgroup:451 : success=no virt=kvm resrc=cgroup reason=allow vm="vm031714" uuid=9d8f1de0-44f4-a0b1-7d50-e41ee6cd897b cgroup="/sys/fs/cgroup/devices/libvirt/qemu/vm031714/" class=path path=/dev/kqemu rdev=? acl=rw Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-02-27 22:51:24 +00:00
Daniel P. Berrange	279336c5d8	Avoid spamming logs with cgroups warnings The code for putting the emulator threads in a separate cgroup would spam the logs with warnings 2013-02-27 16:08:26.731+0000: 29624: warning : virCgroupMoveTask:887 : no vm cgroup in controller 3 2013-02-27 16:08:26.731+0000: 29624: warning : virCgroupMoveTask:887 : no vm cgroup in controller 4 2013-02-27 16:08:26.732+0000: 29624: warning : virCgroupMoveTask:887 : no vm cgroup in controller 6 This is because it has only created child cgroups for 3 of the controllers, but was trying to move the processes from all the controllers. The fix is to only try to move threads in the controllers we actually created. Also remove the warning and make it return a hard error to avoid such lazy callers in the future. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-02-27 22:51:24 +00:00
Eric Blake	82d5fe5437	qemu: check backing chains even when cgroup is omitted https://bugzilla.redhat.com/show_bug.cgi?id=896685 points out a regression caused by commit `38c4a9c` - libvirt only labels the backing chain if the backing chain cache is populated, but the code to populate the cache was only conditionally performed if cgroup labeling was necessary. * src/qemu/qemu_cgroup.c (qemuSetupCgroup): Hoist cache setup... * src/qemu/qemu_process.c (qemuProcessStart): ...earlier into caller, where it is now unconditional.	2013-02-21 12:32:56 -07:00
Daniel P. Berrange	77c3015f9c	Rename all USB device functions to have a standard name prefix Rename all the usbDeviceXXX and usbXXXDevice APIs to have a fixed virUSBDevice name prefix	2013-02-05 19:22:25 +00:00
Daniel P. Berrange	3e86e8f327	Fix leak of usbDevice struct when initializing cgroups When iterating over USB host devices to setup cgroups, the usbDevice object was leaked in both LXC and QEMU drivers Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-02-05 19:22:25 +00:00
Daniel P. Berrange	b090aa7d55	Introduce a virQEMUDriverConfigPtr object Currently the virQEMUDriverPtr struct contains a wide variety of data with varying access needs. Move all the static config data into a dedicated virQEMUDriverConfigPtr object. The only locking requirement is to hold the driver lock, while obtaining an instance of virQEMUDriverConfigPtr. Once a reference is held on the config object, it can be used completely lockless since it is immutable. NB, not all APIs correctly hold the driver lock while getting a reference to the config object in this patch. This is safe for now since the config is never updated on the fly. Later patches will address this fully. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-02-05 15:49:25 +00:00
Eric Blake	7034531814	maint: fix comment typo While OOM can have knock-on effects that trash a system, generally the first symptom is one of memory thrashing. * src/qemu/qemu_cgroup.c (qemuSetupCgroup): Reword slightly.	2013-01-09 16:45:59 -07:00
Michal Privoznik	3c83df679e	qemu: Relax hard RSS limit Currently, if there's no hard memory limit defined for a domain, libvirt tries to calculate one, based on domain definition and magic equation and set it upon the domain startup. The rationale behind was, if there's a memory leak or exploit in qemu, we should prevent the host system thrashing. However, the equation was too tight, as it didn't reflect what the kernel counts into the memory used by a process. Since many hosts do have a swap, nobody has noticed anything, because if hard memory limit is reached, process can continue allocating memory on a swap. However, if there is no swap on the host, the process gets killed by OOM killer. In our case, that is the qemu process. To prevent this, we need to relax the hard RSS limit. Moreover, we should reflect more precisely the kernel way of accounting the memory for process. That is, even the kernel caches are counted within the memory used by a process (within cgroups at least). Hence the magic equation has to be changed: limit = 1.5 * (domain memory + total video memory) + (32MB for cache per each disk) + 200MB	2013-01-08 16:32:11 +01:00
Daniel P. Berrange	f24404a324	Rename virterror.c virterror_internal.h to virerror.{c,h}	2012-12-21 11:19:50 +00:00
Daniel P. Berrange	44f6ae27fe	Rename util.{c,h} to virutil.{c,h}	2012-12-21 11:19:49 +00:00
Daniel P. Berrange	ab9b7ec2f6	Rename memory.{c,h} to viralloc.{c,h}	2012-12-21 11:17:14 +00:00
Daniel P. Berrange	936d95d347	Rename logging.{c,h} to virlog.{c,h}	2012-12-21 11:17:14 +00:00
Daniel P. Berrange	f9c7020c1f	Rename cgroup.{h,c} to vircgroup.{h,c} To bring in line with new naming practice, rename the src/util/cgroup.{h,c} files to vircgroup.{h,c} Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2012-12-21 11:17:12 +00:00
Daniel P. Berrange	df5928ea56	Allow passing a vroot into security manager hostdev labelling When LXC labels USB devices during hotplug, it is running in host context, so it needs to pass in a vroot path to the container root. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2012-12-17 17:50:51 +00:00
Daniel P. Berrange	4738c2a7e7	Replace 'struct qemud_driver *' with virQEMUDriverPtr Remove the obsolete 'qemud' naming prefix and underscore based type name. Introduce virQEMUDriverPtr as the replacement, in common with LXC driver naming style	2012-11-28 18:17:25 +00:00
Daniel P. Berrange	1c04f99970	Remove spurious whitespace between function name & open brackets The libvirt coding standard is to use 'function(...args...)' instead of 'function (...args...)'. A non-trivial number of places did not follow this rule and are fixed in this patch. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2012-11-02 13:36:49 +00:00
Osier Yang	bb81021bfe	qemu: Keep the affinity when creating cgroup for emulator thread When the cpu placement model is "auto", it sets the affinity for domain process with the advisory nodeset from numad, however, creating cgroup for the domain process (called emulator thread in some contexts) later overrides that with pinning it to all available pCPUs. How to reproduce: * Configure the domain with "auto" placement for <vcpu>, e.g. <vcpu placement='auto'>4</vcpu> * % virsh start dom * % cat /proc/$dompid/status Though the emulator cgroup causes conflicts, we can't simply prohibit creating it, as other tunables are still useful, such as "emulator_period", which is used by API virDomainSetSchedulerParameter. So this patch doesn't prohibit creating the emulator cgroup, but inherits the nodeset from numad, and resets the affinity for domain process. * src/qemu/qemu_cgroup.h: Modify definition of qemuSetupCgroupForEmulator to accept the passed nodeset * src/qemu/qemu_cgroup.c: Set the affinity with the passed nodeset	2012-10-24 21:46:24 +08:00
Eric Blake	67aea3fb78	blockjob: remove unused parameters after previous patch Minor cleanup made possible by previous simplifications. * src/qemu/qemu_cgroup.h (qemuSetupDiskCgroup) (qemuTeardownDiskCgroup): Alter signature. * src/qemu/qemu_cgroup.c (qemuSetupDiskCgroup) (qemuTeardownDiskCgroup, qemuSetupCgroup): Update all uses. * src/qemu/qemu_hotplug.c (qemuDomainDetachPciDiskDevice) (qemuDomainDetachDiskDevice): Likewise. * src/qemu/qemu_driver.c (qemuDomainAttachDeviceDiskLive) (qemuDomainChangeDiskMediaLive) (qemuDomainSnapshotCreateSingleDiskActive) (qemuDomainSnapshotUndoSingleDiskActive): Likewise.	2012-10-19 17:35:11 -06:00
Eric Blake	38c4a9cc40	storage: use cache to walk backing chain We used to walk the backing file chain at least twice per disk, once to set up cgroup device whitelisting, and once to set up security labeling. Rather than walk the chain every iteration, which possibly includes calls to fork() in order to open root-squashed NFS files, we can exploit the cache of the previous patch. * src/conf/domain_conf.h (virDomainDiskDefForeachPath): Alter signature. * src/conf/domain_conf.c (virDomainDiskDefForeachPath): Require caller to supply backing chain via disk, if recursion is desired. * src/security/security_dac.c (virSecurityDACSetSecurityImageLabel): Adjust caller. * src/security/security_selinux.c (virSecuritySELinuxSetSecurityImageLabel): Likewise. * src/security/virt-aa-helper.c (get_files): Likewise. * src/qemu/qemu_cgroup.c (qemuSetupDiskCgroup) (qemuTeardownDiskCgroup): Likewise. (qemuSetupCgroup): Pre-populate chain.	2012-10-19 17:35:11 -06:00
Martin Kletzander	ba63d8f7d8	qemu: Pin the emulator when only cpuset is specified According to our recent changes (clarifications), we should be pinning qemu's emulator processes using the <vcpu> 'cpuset' attribute in case there is no <emulatorpin> specified. This however doesn't work entirely as expected and this patch should resolve all the remaining issues.	2012-10-17 17:37:10 +02:00
Jiri Denemark	edc9269a2a	qemu: Implement startupPolicy for USB passed through devices	2012-10-11 15:11:42 +02:00
Eric Blake	4ecb723b9e	maint: fix up copyright notice inconsistencies https://www.gnu.org/licenses/gpl-howto.html recommends that the 'If not, see <url>.' phrase be a separate sentence. * tests/securityselinuxhelper.c: Remove doubled line. * tests/securityselinuxtest.c: Likewise. * globally: s/; If/. If/	2012-09-20 16:30:55 -06:00
Hu Tao	75b198b3e7	use virBitmap to store numa nodemask info.	2012-09-17 14:59:37 -04:00
Hu Tao	f970d8481e	use virBitmap to store cpupin info	2012-09-17 14:59:36 -04:00
Hu Tao	f7e1a546f2	fix bug in qemuSetupCgroupForEmulator Should not return 0 when failed to setup cgroup.	2012-09-11 16:08:41 -06:00
Martin Kletzander	9f86fb9326	qemu: don't pin all the cpus This is another fix for the emulator-pin series. When going through the cputune pinning settings, the current code is trying to pin all the CPUs, even when not all of them are specified. This causes error in the subsequent function which, of course, cannot find the cpu to pin. Since it's enough to pass the correct VCPU ID to the function, the fix is trivial.	2012-09-05 19:25:10 +02:00
Jiri Denemark	774eb45be6	qemu: Don't ignore CPU tuning config if required cgroups are missing When domain XML contains any of the elements for setting up CPU scheduling parameters (period, quota, emulator_period, or emulator_quota) we need cpu cgroup to enforce the configuration. However, the existing code would just silently ignore such settings if either cgroups were not available at all or cpu cgroup was not available. Moreover, APIs for manipulating CPU scheduler parameters were already failing if cpu cgroup was not available. This patch makes cpu cgroup mandatory for all domains that use CPU scheduling elements in their XML.	2012-08-31 13:24:02 +02:00
Jiri Denemark	0c7cca36e7	qemu: Fix starting domains with no cpu cgroup If cgroups are enabled in general but cpu cgroup is disabled in qemu.conf or not mounted at all, libvirt would refuse to start any domain even though scheduler parameters are not set in domain XML. This patch makes cpu cgroup mandatory only for domains that actually want to use it.	2012-08-29 16:13:38 +02:00
Martin Kletzander	16ebec2b7c	qemu: fix regression with pinning Commit `4b03d59167` changed the pinning behavior in a way that makes some machines non-startable. The comment mentioning that we cannot control each vcpu when there is no VCPU <-> PID mapping available is true; however, this isn't necessarily an error, because it can be caused by an old QEMU without support for the "query-cpus" command as well as by software-emulated machines that don't create more than one process.	2012-08-27 10:20:42 +02:00
Hu Tao	b65dafa812	qemu: introduce period/quota tuning for emulator This patch introduces support of setting emulator's period and quota to limit cpu bandwidth when the vm starts. Also updates XML Schema for new entries and docs.	2012-08-22 16:52:22 +08:00
Hu Tao	1d4395eb47	limit cpu bandwidth only for vcpus This patch changes the behaviour of xml element cputune.period and cputune.quota to limit cpu bandwidth only for vcpus, and no longer limit cpu bandwidth for the whole guest. The reasons to do this are: - This matches docs of cputune.period and cputune.quota. - The other parts excepting vcpus are treated as "emulator", and there are separate period/quota settings for emulator in the subsequent patches	2012-08-22 16:50:41 +08:00
Tang Chen	a1249489ce	qemu: synchronize emulatorpin info to cgroup Introduce qemuSetupCgroupEmulatorPin() function to add emulator threads pin info to cpuset cgroup, the same as vcpupin. Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>	2012-08-22 16:09:26 +08:00
Hu Tao	fe1d32596c	Enable cpuset cgroup and synchronous vcpupin info to cgroup. vcpu threads pin are implemented using sched_setaffinity(), but not controlled by cgroup. This patch does the following things: 1) enable cpuset cgroup 2) reflect all the vcpu threads pin info to cgroup Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>	2012-08-22 15:12:22 +08:00
Wen Congyang	4b03d59167	create a new cgroup and move all emulator threads to the new cgroup Create a new cgroup and move all emulator threads to the new cgroup. And then we can do the other things: 1. limit only vcpu usage rather than the whole qemu 2. limit for emulator threads(include vhost-net threads) Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>	2012-08-22 14:33:59 +08:00
Michal Privoznik	addeb7cd05	qemu: Set reasonable RSS limit on domain startup If there's a memory leak in qemu or qemu is exploited, the host system will sooner or later start thrashing instead of killing the bad process. This however has impact on performance and other guests as well. Therefore we should set a reasonable RSS limit even when user hasn't set any. It's better to be secure by default.	2012-08-06 08:06:44 +02:00
Eric Blake	768007aedc	maint: don't permit format strings without % Any time we have a string with no % passed through gettext, a translator can inject a % to cause a stack overread. When there is nothing to format, it's easier to ask for a string that cannot be used as a formatter, by using a trivial "%s" format instead. In the past, we have used --disable-nls to catch some of the offenders, but that doesn't get run very often, and many more uses have crept in. Syntax check to the rescue! The syntax check can catch uses such as virReportError(code, _("split " "string")); by using a sed script to fold context lines into one pattern space before checking for a string without %. This patch is just mechanical insertion of %s; there are probably several messages touched by this patch where we would be better off giving the user more information than a fixed string. * cfg.mk (sc_prohibit_diagnostic_without_format): New rule. * src/datatypes.c (virUnrefConnect, virGetDomain) (virUnrefDomain, virGetNetwork, virUnrefNetwork, virGetInterface) (virUnrefInterface, virGetStoragePool, virUnrefStoragePool) (virGetStorageVol, virUnrefStorageVol, virGetNodeDevice) (virGetSecret, virUnrefSecret, virGetNWFilter, virUnrefNWFilter) (virGetDomainSnapshot, virUnrefDomainSnapshot): Add %s wrapper. * src/lxc/lxc_driver.c (lxcDomainSetBlkioParameters) (lxcDomainGetBlkioParameters): Likewise. * src/conf/domain_conf.c (virSecurityDeviceLabelDefParseXML) (virDomainDiskDefParseXML, virDomainGraphicsDefParseXML): Likewise. * src/conf/network_conf.c (virNetworkDNSHostsDefParseXML) (virNetworkDefParseXML): Likewise. * src/conf/nwfilter_conf.c (virNWFilterIsValidChainName): Likewise. * src/conf/nwfilter_params.c (virNWFilterVarValueCreateSimple) (virNWFilterVarAccessParse): Likewise. 
* src/libvirt.c (virDomainSave, virDomainSaveFlags) (virDomainRestore, virDomainRestoreFlags) (virDomainSaveImageGetXMLDesc, virDomainSaveImageDefineXML) (virDomainCoreDump, virDomainGetXMLDesc) (virDomainMigrateVersion1, virDomainMigrateVersion2) (virDomainMigrateVersion3, virDomainMigrate, virDomainMigrate2) (virStreamSendAll, virStreamRecvAll) (virDomainSnapshotGetXMLDesc): Likewise. * src/nwfilter/nwfilter_dhcpsnoop.c (virNWFilterSnoopReqLeaseDel) (virNWFilterDHCPSnoopReq): Likewise. * src/openvz/openvz_driver.c (openvzUpdateDevice): Likewise. * src/openvz/openvz_util.c (openvzKBPerPages): Likewise. * src/qemu/qemu_cgroup.c (qemuSetupCgroup): Likewise. * src/qemu/qemu_command.c (qemuBuildHubDevStr, qemuBuildChrChardevStr) (qemuBuildCommandLine): Likewise. * src/qemu/qemu_driver.c (qemuDomainGetPercpuStats): Likewise. * src/qemu/qemu_hotplug.c (qemuDomainAttachNetDevice): Likewise. * src/rpc/virnetsaslcontext.c (virNetSASLSessionGetIdentity): Likewise. * src/rpc/virnetsocket.c (virNetSocketNewConnectUNIX) (virNetSocketSendFD, virNetSocketRecvFD): Likewise. * src/storage/storage_backend_disk.c (virStorageBackendDiskBuildPool): Likewise. * src/storage/storage_backend_fs.c (virStorageBackendFileSystemProbe) (virStorageBackendFileSystemBuild): Likewise. * src/storage/storage_backend_rbd.c (virStorageBackendRBDOpenRADOSConn): Likewise. * src/storage/storage_driver.c (storageVolumeResize): Likewise. * src/test/test_driver.c (testInterfaceChangeBegin) (testInterfaceChangeCommit, testInterfaceChangeRollback): Likewise. * src/vbox/vbox_tmpl.c (vboxListAllDomains): Likewise. * src/xenxs/xen_sxpr.c (xenFormatSxprDisk, xenFormatSxpr): Likewise. * src/xenxs/xen_xm.c (xenXMConfigGetUUID, xenFormatXMDisk) (xenFormatXM): Likewise.	2012-07-26 14:32:30 -06:00
Osier Yang	f9ce7dad60	Desert the FSF address in copyright Because the FSF address could change from time to time, GNU now recommends the following: (http://www.gnu.org/licenses/gpl-howto.html) You should have received a copy of the GNU General Public License along with Foobar. If not, see <http://www.gnu.org/licenses/>. This patch removes the explicit FSF address, and uses the above instead (of course, with inserting 'Lesser' before 'General'). Except for a bunch of files for the security driver, all others are changed automatically; the copyright for the security files is not complete, which is why they are done manually: src/security/security_selinux.h src/security/security_driver.h src/security/security_selinux.c src/security/security_apparmor.h src/security/security_apparmor.c src/security/security_driver.c	2012-07-23 10:50:50 +08:00
Daniel P. Berrange	3b7399b5c9	Replace use of qemuReportError with virReportError Update the QEMU driver to use virReportError instead of the qemuReportError custom macro Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2012-07-19 14:42:28 +01:00
Peter Krempa	4e532f2e3d	qemu: Add missing "%s" before translation macros This patch cleans up some missing "%s" before translation macros, for strings which are const without format specifiers	2012-07-19 14:41:55 +01:00
Eric Blake	0867a87721	build: detect all improper uses of _("%s") The only useful translation of "%s" as a format string is "%s" (I suppose you could claim "%1$s" is also valid, but why bother). So it is not worth translating; fixing this exposes some instances where we were failing to translate real error messages. This makes the fix of commit `097da1ab` more generic, as well as ensuring no future regressions. * cfg.mk (sc_prohibit_useless_translation): New rule. * src/lxc/lxc_driver.c (lxcSetVcpuBWLive): Fix offender. * src/openvz/openvz_conf.c (openvzReadFSConf): Likewise. * src/qemu/qemu_cgroup.c (qemuSetupCgroupForVcpu): Likewise. * src/qemu/qemu_driver.c (qemuSetVcpusBWLive): Likewise. * src/xenapi/xenapi_utils.c (xenapiSessionErrorHandle): Likewise.	2012-07-10 15:49:41 -06:00
tangchen	097da1abbd	Fix a string format bug in qemu_cgroup.c Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>	2012-07-10 17:06:56 +08:00
Osier Yang	be9f6ecb28	qemu: Set memory policy using cgroup if placement is auto Like for 'static' placement, when the memory policy mode is 'strict', set the memory policy by writing the advisory nodeset returned from numad to cgroup file cpuset.mems.	2012-05-15 10:11:14 +08:00
Laine Stump	c18a88ac48	qemu: eliminate "Ignoring open failure" when using root-squash NFS This eliminates the warning message reported in: https://bugzilla.redhat.com/show_bug.cgi?id=624447 It was caused by a failure to open an image file that is not accessible by root (the uid libvirtd is running as) because it's on a root-squash NFS share, owned by a different user, with permissions of 660 (or maybe 600). The solution is to use virFileOpenAs() rather than open(). The codepath that generates the error is during qemuSetupDiskCGroup(), but the actual open() is in a lower-level generic function called from many places (virDomainDiskDefForeachPath), so some other pieces of the code were touched just to add dummy (or possibly useful) uid and gid arguments. Eliminating this warning message has the nice side effect that the requested operation may even succeed (which in this case isn't necessary, but shouldn't hurt anything either).	2012-02-03 16:47:43 -05:00
Hu Tao	9d3a721ad5	use cpuset to manage numa This patch also sets cgroup cpuset parameters for numatune.	2011-12-20 09:32:23 -07:00
Hu Tao	25a5f07c69	qemu: filter blkio 0-device-weight at two other places filter 0-device-weight when: - getting blkio parameters with --config - starting up a domain When testing with blkio, I found these issues: (dom is down) virsh blkiotune dom --device-weights /dev/sda,300,/dev/sdb,500 virsh blkiotune dom --device-weights /dev/sda,300,/dev/sdb,0 virsh blkiotune dom weight : 800 device_weight : /dev/sda,200,/dev/sdb,0 # issue 1: shows 0 device weight of /dev/sdb that may confuse user (continued) virsh start dom # issue 2: If /dev/sdb doesn't exist, libvirt refuses to bring the # dom up because it wants to set the device weight to 0 of a # non-existing device. Since 0 means no weight-limit, we really don't # have to set it.	2011-11-30 12:34:30 -07:00
Hu Tao	93ab58595d	blkiotune: add qemu support for blkiotune.device_weight Implement setting/getting per-device blkio weights in qemu, using the cgroups blkio.weight_device tunable.	2011-11-29 12:26:21 -07:00
Wen Congyang	652e55b7a5	set cpu bandwidth for the vm The cpu bandwidth is applied at the vcpu group level. We should apply it at the vm group level too, because the vm may do heavy I/O, and it will affect the other vm. We apply cpu bandwidth at the vcpu and the vm group level, so we must ensure that max(child_quota) <= parent_quota when we modify cpu bandwidth.	2011-07-26 22:12:57 +08:00
Wen Congyang	d6fa4967bc	fix make syntax-check error	2011-07-21 17:42:44 +08:00
Wen Congyang	c4441fee10	qemu: Implement period and quota tunable XML configuration and parsing This patch implements period and quota tunable XML configuration and parsing. A quota or period of zero will be simply ignored.	2011-07-21 17:11:12 +08:00
Daniel P. Berrange	b43070ebfc	Move qemu_audit.h helpers into shared code The LXC and UML drivers can both make use of auditing. Move the qemu_audit.{c,h} files to src/conf/domain_audit.{c,h} * src/conf/domain_audit.c: Rename from src/qemu/qemu_audit.c * src/conf/domain_audit.h: Rename from src/qemu/qemu_audit.h * src/Makefile.am: Remove qemu_audit.{c,h}, add domain_audit.{c,h} * src/qemu/qemu_audit.h, src/qemu/qemu_cgroup.c, src/qemu/qemu_command.c, src/qemu/qemu_driver.c, src/qemu/qemu_hotplug.c, src/qemu/qemu_migration.c, src/qemu/qemu_process.c: Update for changed audit API names	2011-07-12 17:05:25 +01:00
Eric Blake	4eb17d642e	qemu: reorder checks for safety Detected by Coverity. All existing callers happen to be in range, so this isn't too serious. * src/qemu/qemu_cgroup.c (qemuCgroupControllerActive): Check bounds before dereference.	2011-06-08 05:28:20 -06:00
Lai Jiangshan	b65f37a4a1	libvirt,logging: cleanup VIR_XXX0() These VIR_XXXX0 APIs make us confused, use the non-0-suffix APIs instead. How do these conversions work? The magic is using the gcc extension of ##. When __VA_ARGS__ is empty, "##" will swallow the "," in "fmt," to avoid compile error. example: origin after CPP high_level_api("%d", a_int) low_level_api("%d", a_int) high_level_api("a string") low_level_api("a string") About 400 conversions. 8 special conversions: VIR_XXXX0("") -> VIR_XXXX("msg") (avoid empty format) 2 conversions VIR_XXXX0(string_literal_with_%) -> VIR_XXXX(%->%%) 0 conversions VIR_XXXX0(non_string_literal) -> VIR_XXXX("%s", non_string_literal) (for security) 6 conversions Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>	2011-05-11 12:41:14 -06:00
Eric Blake	29e131dec2	qemu: update qemuCgroupControllerActive signature Clang warned about a dead assignment. In the process, I noticed that we are only using the function for a bool value. I audited all other callers in qemu_{migration,cgroup,driver,hotplug}, and all were making the call in a bool context. Also, do bounds checking on the argument. * src/qemu/qemu_cgroup.c (qemuSetupCgroup): Delete dead assignment. (qemuCgroupControllerActive): Change return type to bool. * src/qemu/qemu_cgroup.h (qemuCgroupControllerActive): Likewise.	2011-05-04 09:35:47 -06:00
Osier Yang	0ca16a78af	qemu: Fix improper logic of qemuCgroupSetup It throws errors as long as the cgroup controller is not available, regardless of whether we really want to use it to do setup or not, which is not what we want, fixing it with throwing error when need to use the controller. And change "VIR_WARN" to "qemuReportError" for memory controller incidentally.	2011-04-01 11:41:33 +08:00
Osier Yang	1cc4d0259c	cputune: Support cputune for qemu driver When domain startup, setting cpu affinity and cpu shares according to the cputune xml specified in domain xml. Modify "qemudDomainPinVcpu" to update domain config for vcpupin, and modify "qemuSetSchedulerParameters" to update domain config for cpu shares. v1 - v2: * Use "VIR_ALLOC_N" instead of "VIR_ALLOC_VAR" * But keep raising error when it fails on adding vcpupin xml entry, as I still don't have a better idea yet.	2011-03-29 22:13:46 +08:00
Nikunj A. Dadhania	78ba748ef1	virsh: fix memtune's help message for swap_hard_limit * Correct the documentation for cgroup: the swap_hard_limit indicates mem+swap_hard_limit. * Change cgroup private apis to: virCgroupGet/SetMemSwapHardLimit Signed-off-by: Nikunj A. Dadhania <nikunj@linux.vnet.ibm.com>	2011-03-17 16:45:06 -06:00
Eric Blake	c52cbe487c	qemu: don't request cgroup ACL access for /dev/net/tun Since libvirt always passes /dev/net/tun to qemu via fd, we should never trigger the cases where qemu tries to directly open the device. Therefore, it is safer to deny the cgroup device ACL. * src/qemu/qemu_cgroup.c (defaultDeviceACL): Remove /dev/net/tun. * src/qemu/qemu.conf (cgroup_device_acl): Reflect this change.	2011-03-10 08:32:43 -07:00
Eric Blake	340ab27dd2	audit: also audit cgroup ACL permissions * src/qemu/qemu_audit.h (qemuAuditCgroupMajor) (qemuAuditCgroupPath): Add parameter. * src/qemu/qemu_audit.c (qemuAuditCgroupMajor) (qemuAuditCgroupPath): Add 'acl=rwm' to cgroup audit entries. * src/qemu/qemu_cgroup.c: Update clients. * src/qemu/qemu_driver.c (qemudDomainSaveFlag): Likewise.	2011-03-09 11:36:59 -07:00
Eric Blake	5564c57528	cgroup: allow fine-tuning of device ACL permissions Adding audit points showed that we were granting too much privilege to qemu; it should not need any mknod rights to recreate any devices. On the other hand, lxc should have all device privileges. The solution is adding a flag parameter. This also lets us restrict write access to read-only disks. * src/util/cgroup.h (virCgroupDevice): Adjust prototypes. * src/util/cgroup.c (virCgroupAllowDevice) (virCgroupAllowDeviceMajor, virCgroupAllowDevicePath) (virCgroupDenyDevice, virCgroupDenyDeviceMajor) (virCgroupDenyDevicePath): Add parameter. * src/qemu/qemu_driver.c (qemudDomainSaveFlag): Update clients. * src/lxc/lxc_controller.c (lxcSetContainerResources): Likewise. * src/qemu/qemu_cgroup.c: Likewise. (qemuSetupDiskPathAllow): Also, honor read-only disks.	2011-03-09 11:35:36 -07:00
Eric Blake	d04916faae	audit: split cgroup audit types to allow more information Device names can be manipulated, so it is better to also log the major/minor device number corresponding to the cgroup ACL changes that libvirt made. This required some refactoring of the relatively new qemu cgroup audit code. Also, qemuSetupChardevCgroup was only auditing on failure, not success. * src/qemu/qemu_audit.h (qemuDomainCgroupAudit): Delete. (qemuAuditCgroup, qemuAuditCgroupMajor, qemuAuditCgroupPath): New prototypes. * src/qemu/qemu_audit.c (qemuDomainCgroupAudit): Rename... (qemuAuditCgroup): ...and drop a parameter. (qemuAuditCgroupMajor, qemuAuditCgroupPath): New functions, to allow listing device major/minor in audit. (qemuAuditGetRdev): New helper function. * src/qemu/qemu_driver.c (qemudDomainSaveFlag): Adjust callers. * src/qemu/qemu_cgroup.c (qemuSetupDiskPathAllow) (qemuSetupHostUsbDeviceCgroup, qemuSetupCgroup) (qemuTeardownDiskPathDeny): Likewise. (qemuSetupChardevCgroup): Likewise, fixing missing audit.	2011-03-09 09:08:10 -07:00
Eric Blake	7c6b22c4d5	qemu: only request sound cgroup ACL when required When a SPICE or VNC graphics controller is present, and sound is piggybacked over a channel to the graphics device rather than directly accessing host hardware, then there is no need to grant host hardware access to that qemu process. * src/qemu/qemu_cgroup.c (qemuSetupCgroup): Prevent sound with spice, and with vnc when vnc_allow_host_audio is 0. Reported by Daniel Berrange.	2011-02-28 09:42:25 -07:00
Eric Blake	6bb98d419f	audit: add qemu hooks for auditing cgroup events * src/qemu/qemu_audit.h (qemuDomainCgroupAudit): New prototype. * src/qemu/qemu_audit.c (qemuDomainCgroupAudit): Implement it. * src/qemu/qemu_driver.c (qemudDomainSaveFlag): Add audit. * src/qemu/qemu_cgroup.c (qemuSetupDiskPathAllow) (qemuSetupChardevCgroup, qemuSetupHostUsbDeviceCgroup) (qemuSetupCgroup, qemuTeardownDiskPathDeny): Likewise.	2011-02-24 13:32:15 -07:00
Eric Blake	b4d3434fc2	audit: prepare qemu for listing vm in cgroup audits * src/qemu/qemu_cgroup.h (struct qemuCgroupData): New helper type. (qemuSetupDiskPathAllow, qemuSetupChardevCgroup) (qemuTeardownDiskPathDeny): Drop unneeded prototypes. (qemuSetupDiskCgroup, qemuTeardownDiskCgroup): Adjust prototype. * src/qemu/qemu_cgroup.c (qemuSetupDiskPathAllow, qemuSetupChardevCgroup) (qemuTeardownDiskPathDeny): Mark static and use new type. (qemuSetupHostUsbDeviceCgroup): Use new type. (qemuSetupDiskCgroup): Alter signature. (qemuSetupCgroup): Adjust caller. * src/qemu/qemu_hotplug.c (qemuDomainAttachHostUsbDevice) (qemuDomainDetachPciDiskDevice, qemuDomainDetachSCSIDiskDevice): Likewise. * src/qemu/qemu_driver.c (qemudDomainAttachDevice) (qemuDomainUpdateDeviceFlags): Likewise.	2011-02-24 13:31:05 -07:00
Eric Blake	061738764d	cgroup: determine when skipping non-devices * src/util/cgroup.c (virCgroupAllowDevicePath) (virCgroupDenyDevicePath): Don't fail with EINVAL for non-devices. * src/qemu/qemu_driver.c (qemudDomainSaveFlag): Update caller. * src/qemu/qemu_cgroup.c (qemuSetupDiskPathAllow) (qemuSetupChardevCgroup, qemuSetupHostUsbDeviceCgroup) (qemuSetupCgroup, qemuTeardownDiskPathDeny): Likewise.	2011-02-24 13:31:05 -07:00
Gui Jianfeng	d9b28a319a	qemu: Implement blkio tunable XML configuration and parsing. Implement blkio tunable XML configuration and parsing. Reviewed-by: "Nikunj A. Dadhania" <nikunj@linux.vnet.ibm.com> Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>	2011-02-08 11:43:45 -07:00
Eric Blake	b96b6f4723	qemu: fix error messages Regression in commit `caa805ea` let a lot of bad messages slip in. * cfg.mk (msg_gen_function): Fix function name. * src/qemu/qemu_cgroup.c (qemuRemoveCgroup): Fix fallout from 'make syntax-check'. * src/qemu/qemu_driver.c (qemudDomainGetInfo) (qemuDomainWaitForMigrationComplete, qemudStartVMDaemon) (qemudDomainSaveFlag, qemudDomainAttachDevice) (qemuDomainUpdateDeviceFlags): Likewise. * src/qemu/qemu_hotplug.c (qemuDomainAttachHostUsbDevice) (qemuDomainDetachPciDiskDevice, qemuDomainDetachSCSIDiskDevice): Likewise.	2011-01-27 20:41:26 -07:00
Eric Blake	98334e7c3a	domain_conf: split source data out from ChrDef This opens up the possibility of reusing the smaller ChrSourceDef for both qemu monitor and a passthrough smartcard device. * src/conf/domain_conf.h (_virDomainChrDef): Factor host details... (_virDomainChrSourceDef): ...into new struct. (virDomainChrSourceDefFree): New prototype. * src/conf/domain_conf.c (virDomainChrDefFree) (virDomainChrDefParseXML, virDomainChrDefFormat): Split... (virDomainChrSourceDefClear, virDomainChrSourceDefFree) (virDomainChrSourceDefParseXML, virDomainChrSourceDefFormat): ...into new functions. (virDomainChrDefParseTargetXML): Update clients to reflect type split. * src/vmx/vmx.c (virVMXParseSerial, virVMXParseParallel) (virVMXFormatSerial, virVMXFormatParallel): Likewise. * src/xen/xen_driver.c (xenUnifiedDomainOpenConsole): Likewise. * src/xen/xend_internal.c (xenDaemonParseSxprChar) (xenDaemonFormatSxprChr): Likewise. * src/vbox/vbox_tmpl.c (vboxDomainDumpXML, vboxAttachSerial) (vboxAttachParallel): Likewise. * src/security/security_dac.c (virSecurityDACSetChardevLabel) (virSecurityDACSetChardevCallback) (virSecurityDACRestoreChardevLabel) (virSecurityDACRestoreChardevCallback): Likewise. * src/security/security_selinux.c (SELinuxSetSecurityChardevLabel) (SELinuxSetSecurityChardevCallback) (SELinuxRestoreSecurityChardevLabel) (SELinuxSetSecurityChardevCallback): Likewise. * src/security/virt-aa-helper.c (get_files): Likewise. * src/lxc/lxc_driver.c (lxcVmStart, lxcDomainOpenConsole): Likewise. * src/uml/uml_conf.c (umlBuildCommandLineChr): Likewise. * src/uml/uml_driver.c (umlIdentifyOneChrPTY, umlIdentifyChrPTY) (umlDomainOpenConsole): Likewise. * src/qemu/qemu_command.c (qemuBuildChrChardevStr) (qemuBuildChrArgStr, qemuBuildCommandLine) (qemuParseCommandLineChr): Likewise. * src/qemu/qemu_domain.c (qemuDomainObjPrivateXMLFormat) (qemuDomainObjPrivateXMLParse): Likewise. * src/qemu/qemu_cgroup.c (qemuSetupChardevCgroup): Likewise. 
* src/qemu/qemu_hotplug.c (qemuDomainAttachNetDevice): Likewise. * src/qemu/qemu_driver.c (qemudFindCharDevicePTYsMonitor) (qemudFindCharDevicePTYs, qemuPrepareChardevDevice) (qemuPrepareMonitorChr, qemudShutdownVMDaemon) (qemuDomainOpenConsole): Likewise. * src/qemu/qemu_command.h (qemuBuildChrChardevStr) (qemuBuildChrArgStr): Delete, now that they are static. * src/libvirt_private.syms (domain_conf.h): New exports. * cfg.mk (useless_free_options): Update list. * tests/qemuxml2argvtest.c (testCompareXMLToArgvFiles): Update tests.	2011-01-14 09:54:26 -07:00
Daniel P. Berrange	52271cfc28	Move QEMU cgroup helper code out of the QEMU driver The QEMU driver file is far too large. Move all the cgroup helper code out into a separate file. No functional change. * src/qemu/qemu_cgroup.c, src/qemu/qemu_cgroup.h, src/Makefile.am: Add cgroup helper file * src/qemu/qemu_driver.c: Delete cgroup code	2010-12-17 13:48:30 +00:00


218 Commits