Commit Graph

12863 Commits

Author SHA1 Message Date
Daniel P. Berrange
e7d8ab016b Add support for perf_event and net_cls cgroup controllers
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-15 17:35:32 +01:00
Daniel P. Berrange
ff66b45e2b Replace LXC cgroup mount code with call to virCgroupIsolateMount
The LXC driver currently has code to detect cgroups mounts
and then re-mount them inside the new root filesystem. Replace
this fragile code with a call to virCgroupIsolateMount.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-15 17:35:32 +01:00
Daniel P. Berrange
1da631ecf3 Add an API for re-mounting cgroups, to isolate the process location
Add a virCgroupIsolateMount method which looks at where the
current process is place in the cgroups (eg /system/demo.lxc.libvirt)
and then remounts the cgroups such that this sub-directory
becomes the root directory from the current process' POV.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-15 17:35:32 +01:00
Daniel P. Berrange
83336118db Track symlinks for co-mounted cgroup controllers
If a cgroup controller is co-mounted with another, eg

   /sys/fs/cgroup/cpu,cpuacct

Then it is a requirement that there exist symlinks at

   /sys/fs/cgroup/cpu
   /sys/fs/cgroup/cpuacct

pointing to the real mount point. Add support to virCgroupPtr
to detect and track these symlinks

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-15 17:35:32 +01:00
Daniel P. Berrange
767596bdb4 Remove non-functional code for setting up non-root cgroups
The virCgroupNewDriver method had a 'bool privileged' param.
If a false value was ever passed in, it would simply not
work, since non-root users don't have any privileges to create
new cgroups. Just delete this broken code entirely and make
the QEMU driver skip cgroup setup in non-privileged mode

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-15 17:35:31 +01:00
Daniel P. Berrange
db44eb1b5f Change default cgroup layout for QEMU/LXC and honour XML config
Historically QEMU/LXC guests have been placed in a cgroup layout
that is

   $LOCATION-OF-LIBVIRTD/libvirt/{qemu,lxc}/$VMNAME

This is bad for a number of reasons

 - The cgroup hierarchy gets very deep which seriously
   impacts kernel performance due to cgroups scalability
   limitations.

 - It is hard to setup cgroup policies which apply across
   services and virtual machines, since all VMs are underneath
   the libvirtd service.

To address this the default cgroup location is changed to
be

    /system/$VMNAME.{lxc,qemu}.libvirt

This puts virtual machines at the same level in the hierarchy
as system services, allowing consistent policy to be setup
across all of them.

This also honours the new resource partition location from the
XML configuration, for example

  <resource>
    <partition>/virtualmachines/production</partitions>
  </resource>

will result in the VM being placed at

    /virtualmachines/production/$VMNAME.{lxc,qemu}.libvirt

NB, with the exception of the default, /system, path which
is intended to always exist, libvirt will not attempt to
auto-create the partitions in the XML. It is the responsibility
of the admin/app to configure the partitions. Later libvirt
APIs will provide a way todo this.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-15 17:35:31 +01:00
Daniel P. Berrange
8d4adf3efa Add XML config for resource partitions
Allow VMs to be placed into resource groups using the
following syntax

  <resource>
    <partition>/virtualmachines/production</partition>
  </resource>

A resource cgroup will be backed by some hypervisor specific
functionality, such as cgroups with KVM/LXC.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-15 17:35:31 +01:00
Daniel P. Berrange
aa8604dd45 Add a new virCgroupNewPartition for setting up resource partitions
A resource partition is an absolute cgroup path, ignoring the
current process placement. Expose a virCgroupNewPartition API
for constructing such cgroups

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-15 17:35:31 +01:00
Daniel P. Berrange
109554d714 Cleanup if creating cgroup directories fails
Currently if virCgroupMakeGroup fails, we can get in a situation
where some controllers have been setup, but others not. Ensure
we call virCgroupRemove to remove what we've done upon failure

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-15 17:35:31 +01:00
Daniel P. Berrange
854a004fd6 Add misc extra debugging into cgroups code
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-15 17:35:31 +01:00
Daniel P. Berrange
8d1c141a8d Refactor cgroups internal data structures
Currently the virCgroupPtr struct contains 3 pieces of
information

 - path - path of the cgroup, relative to current process'
   cgroup placement
 - placement - current process' placement in each controller
 - mounts - mount point of each controller

When reading/writing cgroup settings, the path & placement
strings are combined to form the file path. This approach
only works if we assume all cgroups will be relative to
the current process' cgroup placement.

To allow support for managing cgroups at any place in the
heirarchy a change is needed. The 'placement' data should
reflect the absolute path to the cgroup, and the 'path'
value should no longer be used to form the paths to the
cgroup attribute files.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-15 17:35:31 +01:00
Daniel P. Berrange
d14524701a Add a test suite for cgroups functionality
Some aspects of the cgroups setup / detection code are quite subtle
and easy to break. It would greatly benefit from unit testing, but
this is difficult because the test suite won't have privileges to
play around with cgroups. The solution is to use monkey patching
via LD_PRELOAD to override the fopen, open, mkdir, access functions
to redirect access of cgroups files to some magic stubs in the
test suite.

Using this we provide custom content for the /proc/cgroup and
/proc/self/mounts files which report a fixed cgroup setup. We
then override open/mkdir/access so that access to the cgroups
filesystem gets redirected into files in a temporary directory
tree in the test suite build dir.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-15 17:35:31 +01:00
Daniel P. Berrange
04c18d25f1 Rename virCgroupForXXX to virCgroupNewXXX
Rename all the virCgroupForXXX methods to use the form
virCgroupNewXXX since they are all constructors. Also
make sure the output parameter is the last one in the
list, and annotate all pointers as non-null. Fix up
all callers, and make sure they use true/false not 0/1
for the boolean parameters

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-15 17:35:31 +01:00
Daniel P. Berrange
f0e5f92434 Pull definition of structs out of vircgroup.c to vircgrouppriv.h
The definition of structs for cgroups are kept in vircgroup.c since
they are intended to be private from users of the API. To enable
effective testing, however, they need to be accessible. To address
the latter issue, without compronmising the former, this introduces
a new vircgrouppriv.h file to hold the struct definitions.

To prevent other files including this private header, it requires
that __VIR_CGROUP_ALLOW_INCLUDE_PRIV_H__ be defined before inclusion

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-15 17:35:31 +01:00
Daniel P. Berrange
cfed9ad4fb Store a virCgroupPtr instance in virLXCDomainObjPrivatePtr
Instead of calling virCgroupForDomain every time we need
the virCgrouPtr instance, just do it once at Vm startup
and cache a reference to the object in virLXCDomainObjPrivatePtr
until shutdown of the VM. Removing the virCgroupPtr from
the LXC driver state also means we don't have stale mount
info, if someone mounts the cgroups filesystem after libvirtd
has been started

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-15 17:35:31 +01:00
Daniel P. Berrange
632f78caaf Store a virCgroupPtr instance in qemuDomainObjPrivatePtr
Instead of calling virCgroupForDomain every time we need
the virCgrouPtr instance, just do it once at Vm startup
and cache a reference to the object in qemuDomainObjPrivatePtr
until shutdown of the VM. Removing the virCgroupPtr from
the QEMU driver state also means we don't have stale mount
info, if someone mounts the cgroups filesystem after libvirtd
has been started

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-15 17:35:31 +01:00
Daniel P. Berrange
c9b8cdfec1 Add missing param to virCgroupForDriver stub
The virCgroupForDriver method recently gained an 'int controllers'
parameter, but the stub impl did not

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-15 17:35:31 +01:00
Daniel P. Berrange
035cdaa00b Introduce a virFileDeleteTree method
Introduce a method virFileDeleteTree for recursively deleting
an entire directory tree

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-15 17:35:31 +01:00
Daniel P. Berrange
b7f1d218d8 Conditionally compile storagevolxml2argvtest
Only compile storagevolxml2argvtest if WITH_STORAGE is
set, because it links to that driver

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-15 17:35:31 +01:00
Daniel P. Berrange
c78a2b13a6 Conditionalize use of symlink() function in test suite
On Win32 symlink() is not available, so virstoragetest.c
must be conditionalized to avoid compile failures.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-15 17:35:30 +01:00
Daniel P. Berrange
3f85de5292 Fix signature of dummy virNetlinkCommand stub
The second param of virNetlinkCommand should be
struct nlmsghdr, not unsigned char.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-15 17:35:30 +01:00
Daniel P. Berrange
fd856af62b Add empty stub for virThreadCancel on Win32
Win32 does not like undefined symbols, so define an
empty virThreadCancel impl.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-15 17:35:30 +01:00
Daniel P. Berrange
c03eff7717 Don't enable -fPIE on Win32 platforms
On win32, all code is position independent and adding -fPIE
to the compiler flags results in warnings being printed

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-15 17:35:30 +01:00
Eric Blake
842432390b maint: update to latest gnulib
Upstream gnulib determined that we were needlessly compiling in
gnulib's regex instead of glibc's when targetting new-enough glibc,
because the m4 test was being too strict in requiring a particular
answer to undefined behavior.
https://lists.gnu.org/archive/html/bug-gnulib/2013-04/msg00032.html

* .gnulib: Update to latest, for regex.
2013-04-15 10:25:30 -06:00
Osier Yang
b1ea781eaa Use unsigned int instead of unsigned
Though they are the same thing, mixed use of them is uncomfortable.
"unsigned" is used a lot in old codes, this just tries to change the
ones in utils.
2013-04-15 23:07:08 +08:00
Daniel P. Berrange
e16e2a8bbb Do more complete initialization of libgcrypt
If libvirt makes any gcry_control() calls, then this
prevents gnutls for doing any initialization. As such
we must take care to do full initialization of libcrypt
on a par with what gnutls would have done. In particular
we must disable "sec mem" for cases where the user does
not have mlock() permission. We also skip our init of
libgcrypt if something else (ie the app using libvirt)
has beaten us to it.

https://bugzilla.redhat.com/show_bug.cgi?id=951630

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-15 12:09:10 +01:00
Peter Krempa
63b68f3cb4 qemu: Report also domain name in error message when domain object wasn't found
Report the errors as:
Domain not found: no domain with matching uuid '41414141-4141-4141-4141-414141414141' (crashtest)
instead of:
Domain not found: no domain with matching uuid '41414141-4141-4141-4141-414141414141'
2013-04-15 09:43:54 +02:00
Peter Krempa
54a99ba867 qemu: Refactor lookup of domain object
Use the helper to lookup the domain object in the remaining places.

This patch also fixes error reporting when the domain was not found in several
functions that were printing the raw UUID buffer instead of the formatted
string. The offending functions were:

qemuDomainGetInterfaceParameters
qemuDomainSetInterfaceParameters
qemuGetSchedulerParametersFlags
qemuSetSchedulerParametersFlags
qemuDomainGetNumaParameters
qemuDomainSetNumaParameters
qemuDomainGetMemoryParameters
qemuDomainSetMemoryParameters
qemuDomainGetBlkioParameters
qemuDomainSetBlkioParameters
qemuDomainGetCPUStats
2013-04-15 09:43:54 +02:00
Osier Yang
2f40ede4cd storage: Fix the indention
Pushed under trivial rule
2013-04-13 15:22:01 +08:00
Osier Yang
93002b9827 cleanup: Change datatype of net->stp to boolean 2013-04-13 13:28:36 +08:00
Osier Yang
f2adc3b435 cleanup: Change datatype of usbdev->allow to boolean 2013-04-13 13:28:36 +08:00
Osier Yang
00b6828dc2 cleanup: Change datatype of graphic's members to boolean 2013-04-13 13:28:36 +08:00
Osier Yang
b044b4d78f cleanup: Change datatype of accel's members to boolean 2013-04-13 13:28:36 +08:00
Stefan Berger
c772feb069 Add test case for TPM passthrough
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
Tested-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
2013-04-12 16:55:46 -04:00
Stefan Berger
291cfb83f3 TPM support for QEMU command line
For TPM passthrough device support create command line parameters like:

-tpmdev passthrough,id=tpm-tpm0,path=/dev/tpm0,cancel-path=/sys/class/misc/tpm0/device/cancel -device tpm-tis,tpmdev=tpm-tpm0,id=tpm0

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
Tested-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
2013-04-12 16:55:46 -04:00
Stefan Berger
22feb0d3e7 QEMU Cgroup support for TPM passthrough
Some refactoring for virDomainChrSourceDef type of devices so
we can use common code.

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
Tested-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
2013-04-12 16:55:46 -04:00
Stefan Berger
2c9a063973 Audit the starting of a guest using TPM passthrough
When a VM with a TPM passthrough device is started, the audit daemon
logs the following type of message:

type=VIRT_RESOURCE msg=audit(1365170222.460:3378): pid=16382 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:virtd_t:s0-s0:c0.c1023 msg='virt=kvm resrc=dev reason=start vm="TPM-PT" uuid=a4d7cd22-da89-3094-6212-079a48a309a1 device="/dev/tpm0" exe="/usr/sbin/libvirtd" hostname=? addr=? terminal=? res=success'

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
Tested-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
2013-04-12 16:55:46 -04:00
Stefan Berger
2a40a09220 Add SELinux and DAC labeling support for TPM passthrough
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
Tested-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
2013-04-12 16:55:46 -04:00
Stefan Berger
f447ff5982 Convert QMP strings into QEMU capability bits
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
Tested-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
2013-04-12 16:55:45 -04:00
Stefan Berger
6ecff413e1 Parse TPM passthrough XML in the domain XML
Parse the domain XML with TPM passthrough support.
The TPM passthrough XML may look like this:

    <tpm model='tpm-tis'>
      <backend type='passthrough'>
        <device path='/dev/tpm0'/>
      </backend>
    </tpm>


Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
Tested-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
2013-04-12 16:55:45 -04:00
Stefan Berger
06ba4bff91 Helper functions for host TPM support
Implement helper function to create the TPM's sysfs cancel file.

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
Tested-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
2013-04-12 16:55:45 -04:00
Stefan Berger
5eac4f600c Add documentation and schema for TPM passthrough
Supported TPM passthrough XML may look as follows:

    <tpm model='tpm-tis'>
      <backend type='passthrough'>
        <device path='/dev/tpm0'/>
      </backend>
    </tpm>

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
Tested-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
2013-04-12 16:55:45 -04:00
Stefan Berger
069219577b Add function to find a needle in a string array
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
Tested-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
2013-04-12 16:55:45 -04:00
Stefan Berger
ed1f031850 Add QMP probing for TPM
Probe for QEMU's QMP TPM support by querying the lists of
supported TPM models (query-tpm-models) and backend types
(query-tpm-types). 

The setting of the capability flags following the strings
returned from the commands above is only provided in the
patch where domain_conf.c gets TPM support due to dependencies
on functions only introduced there. 

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
Tested-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
2013-04-12 16:55:45 -04:00
Peter Krempa
039a3283fc conf: Allow for non-contiguous device boot orders
This patch adds the ability to configure non-contiguous boot orders on boot
devices. This allows unplugging devices that have boot order specified without
breaking migration.

The new code now uses a slightly less memory efficient approach to store the
boot order fields in a hashtable instead of a bitmap.
2013-04-12 14:43:12 +02:00
Daniel P. Berrange
e7b0382945 Tweak EOF handling of streams
Typically when you get EOF on a stream, poll will return
POLLIN|POLLHUP at the same time. Thus when we deal with
stream reads, if we see EOF during the read, we can then
clear the VIR_STREAM_EVENT_HANGUP & VIR_STREAM_EVENT_ERROR
event bits.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-12 11:27:57 +01:00
Li Zhang
a6e37aedff Add USB option capability
To avoid the collision for creating USB controllers in machine->init()
and -device xx command line, it needs to set usb=off to avoid one USB
controller created in machine->init(). So that libvirt can use -device
or -usb to create USB controller sucessfully.
So QEMU_CAPS_MACHINE_USB_OPT capability is added, and it is for QEMU
v1.3.0 onwards which supports USB option.

Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
2013-04-12 10:56:03 +01:00
John Ferlan
6528b7044f Add error handling to optional arguments in cmdCPUStats 2013-04-11 18:53:35 -04:00
John Ferlan
ad86ebb286 Remove extraneous comma in info_cpu_stats and opts_cpu_stats 2013-04-11 18:40:59 -04:00
Jiri Denemark
88624b5d4c qemu: Do not report unsafe migration for local files
When migrating a domain with disk images stored locally (and using
storage migration), we should not complain about unsafe migration no
matter what cache policy is used for that disk.
2013-04-11 21:57:50 +02:00