Commit Graph

600 Commits

Author SHA1 Message Date
Daniel P. Berrange
6263fc5a5b Wire up sysinfo for LXC driver
The sysinfo code used by QEMU is trivially portable to the
LXC driver
2013-04-04 11:07:00 +01:00
Daniel P. Berrange
edd87fa2ea Revert "lxc: Prevent shutting down the host"
This reverts commit c9c87376f2.

Now that we force all containers to have a root filesystem,
there is no way the host's /dev is ever exposed

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-04 10:51:59 +01:00
Daniel P. Berrange
c131525bec Auto-add a root <filesystem> element to LXC containers on startup
Currently the LXC container code has two codepaths, depending on
whether there is a <filesystem> element with a target path of '/'.
If we automatically add a <filesystem> device with src=/ and dst=/,
for any container which has not specified a root filesystem, then
we only need one codepath for setting up the filesystem.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-04 10:51:59 +01:00
Daniel P. Berrange
f7e8653f7e Remove support for old kernels lacking private devpts
Early on kernel support for private devpts was not widespread,
so we had compatibiltiy codepaths. Such old kernels are not
seriously used for LXC these days, so the compat code can go
away

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-04 10:51:59 +01:00
Martin Kletzander
c9c87376f2 lxc: Prevent shutting down the host
When the container has the same '/dev' mount as host (no chroot),
calling domainShutdown(WithFlags) shouldn't shutdown the host it is
running on.
2013-03-23 11:07:57 +01:00
Daniel P. Berrange
8dbe85886c Ensure root filesystem is mounted if a file/block mount.
For a root filesystem with type=file or type=block, the LXC
container was forgetting to actually mount it, before doing
the pivot root step.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-03-22 17:27:01 +00:00
Daniel P. Berrange
7e1a7444c6 Mount temporary devpts on /var/lib/libvirt/lxc/$NAME.devpts
Currently the lxc controller sets up the devpts instance on
$rootfsdef->src, but this only works if $rootfsdef is using
type=mount. To support type=block or type=file for the root
filesystem, we must use /var/lib/libvirt/lxc/$NAME.devpts
for the temporary devpts mount in the controller
2013-03-22 17:27:01 +00:00
Daniel P. Berrange
05f664b12c Move FUSE mount to /var/lib/libvirt/lxc/$NAME.fuse
Instead of using /var/lib/libvirt/lxc/$NAME for the FUSE
filesystem, use /var/lib/libvirt/lxc/$NAME.fuse. This allows
room for other temporary mounts in the same directory
2013-03-22 17:27:01 +00:00
Daniel P. Berrange
d50cb2b115 Fix thread safety in LXC callback handling
Some of the LXC callbacks did not lock the virDomainObjPtr
instance. This caused transient errors like

error: Failed to start domain busy-mount
error: cannot rename file '/var/run/libvirt/lxc/busy-mount.xml.new' as '/var/run/libvirt/lxc/busy-mount.xml': No such file or directory

as 2 threads tried to update the status file concurrently

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-03-22 17:27:01 +00:00
Daniel P. Berrange
c5f28d0117 Fix free of uninitialized value in LXC numad setup
The 'nodeset' variable was never initialized, causing a later
VIR_FREE(nodeset) to free uninitialized memory.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-03-22 11:44:35 +00:00
Gao feng
4dceffadc9 LXC: add cpuset cgroup support for lxc
This patch adds cpuset cgroup support for LXC.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
2013-03-20 19:37:16 +08:00
Gao feng
45e9d27ad8 NUMA: cleanup for numa related codes
Intend to reduce the redundant code,use virNumaSetupMemoryPolicy
to replace virLXCControllerSetupNUMAPolicy and
qemuProcessInitNumaMemoryPolicy.

This patch also moves the numa related codes to the
file virnuma.c and virnuma.h

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
2013-03-20 19:37:00 +08:00
Gao feng
c9759a7b63 LXC: allow uses advisory nodeset from querying numad
Allow lxc using the advisory nodeset from querying numad,
this means if user doesn't specify the numa nodes that
the lxc domain should assign to, libvirt will automatically
bind the lxc domain to the advisory nodeset which queried from
numad.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
2013-03-19 20:03:29 -06:00
Daniel P. Berrange
0a418355cc Do not prematurely close loop devices in LXC controller
The LXC controller is closing loop devices as soon as the
container has started. This is fine if the loop device
was setup as a mounted filesystem, but if we're just passing
through the loop device as a disk, nothing else is keeping
it open. Thus we must keep the loop device FDs open for as
long the libvirt_lxc process is running.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-03-19 14:46:40 +00:00
Daniel P. Berrange
1760258cc3 Setup LXC cgroups in two phases
Currently the LXC controller creates the cgroup, configures the
resources and adds the task all in one go. This is not sufficiently
flexible for the forthcoming NBD integration. We need to make sure
the NBD process gets into the right cgroup immediately, but we can
not have limits (in particular the device ACL) applied at the point
where we start qemu-nbd. So create a virLXCCgroupCreate method
which creates the cgroup and adds the current task to be called
early, and leave virLXCCgroupSetup to only do resource config.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-03-19 14:46:35 +00:00
Daniel P. Berrange
403594eb8c Fix generation of systemtap probes for RPC protocols
The naming used in the RPC protocols for the LXC monitor and
lock daemon confused the script used to generate systemtap
helper functions. Rename the LXC monitor protocol symbols to
reduce confusion. Adapt the gensystemtap.pl script to cope
with the LXC monitor / lock daemon naming conversions.

This has no functional impact on RPC wire protocol, since
names are only used in the C layer

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-03-14 12:42:22 +00:00
Daniel P. Berrange
2f98a7f7ba Avoid closing uninitialized FDs when LXC startup fails
If an LXC domain failed to start because of a bogus SELinux
label, virLXCProcessStart would call VIR_CLOSE(0) by mistake.
This is because the code which initializes the member of the
ttyFDs array to -1 got moved too far away from the place where
the array is first allocated.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-03-14 12:42:21 +00:00
Daniel P. Berrange
e31f32c6a3 Daemonize fuse thread in libvirt_lxc
In some startup failure modes, the fuse thread may get itself
wedged. This will cause the entire libvirt_lxc process to
hang trying to the join the thread. There is no compelling
reason to wait for the thread to exit if the whole process
is exiting, so just daemonize the fuse thread instead.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-03-13 15:54:06 +00:00
Daniel P. Berrange
a08810195c Fix query of LXC security label
The virDomainGetSecurityLabel method is currently (mistakenly)
showing the label of the libvirt_lxc process:

...snip...
Security model: selinux
Security DOI:   0
Security label: system_u:system_r:virtd_t:s0-s0:c0.c1023 (permissive)

when it should be showing the init process label

...snip...
Security model: selinux
Security DOI:   0
Security label: system_u:system_r:svirt_t:s0:c724,c995 (permissive)
2013-03-13 15:16:42 +00:00
Peter Krempa
27cf98e2d1 virCaps: conf: start splitting out irrelevat data
The virCaps structure gathered a ton of irrelevant data over time that.
The original reason is that it was propagated to the XML parser
functions.

This patch aims to create a new data structure virDomainXMLConf that
will contain immutable data that are used by the XML parser. This will
allow two things we need:

1) Get rid of the stuff from virCaps

2) Allow us to add callbacks to check and add driver specific stuff
after domain XML is parsed.

This first attempt removes pointers to private data allocation functions
to this new structure and update all callers and function that require
them.
2013-03-13 09:27:14 +01:00
Daniel P. Berrange
32b7e92db6 Add missing break in LXC loop device setup
When setting up disks with loop devices for LXC, one of the
switch cases was missing a 'break' causing it to fallthrough
to an error condition.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-03-12 11:52:52 +00:00
Guido Günther
531b4fe8d0 Convert HAVE_SELINUX to WITH_SELINUX
these were missed by 63f18f3786
2013-03-11 11:42:21 +01:00
Guido Günther
6082bc27d0 lxc: Init activeUsbHostdevs
otherwise we crash with

 #0  virUSBDeviceListFind (list=0x0, dev=dev@entry=0x8193d70) at util/virusb.c:526
 #1  0xb1a4995b in virLXCPrepareHostdevUSBDevices (driver=driver@entry=0x815d9a0, name=0x815dbf8 "debian-700267", list=list@entry=0x81d8f08) at lxc/lxc_hostdev.c:88
 #2  0xb1a49fce in virLXCPrepareHostUSBDevices (def=0x8193af8, driver=0x815d9a0) at lxc/lxc_hostdev.c:261
 #3  virLXCPrepareHostDevices (driver=driver@entry=0x815d9a0, def=0x8193af8) at lxc/lxc_hostdev.c:328
 #4  0xb1a4c5b1 in virLXCProcessStart (conn=0x817d3f8, driver=driver@entry=0x815d9a0, vm=vm@entry=0x8190908, autoDestroy=autoDestroy@entry=false, reason=reason@entry=VIR_DOMAIN_RUNNING_BOOTED)
     at lxc/lxc_process.c:1068
 #5  0xb1a57e00 in lxcDomainStartWithFlags (dom=dom@entry=0x815e460, flags=flags@entry=0) at lxc/lxc_driver.c:1014
 #6  0xb1a57fc3 in lxcDomainStart (dom=0x815e460) at lxc/lxc_driver.c:1046
 #7  0xb79c8375 in virDomainCreate (domain=domain@entry=0x815e460) at libvirt.c:8450
 #8  0x08078959 in remoteDispatchDomainCreate (args=0x81920a0, rerr=0xb65c21d0, client=0xb0d00490, server=<optimized out>, msg=<optimized out>) at remote_dispatch.h:1066
 #9  remoteDispatchDomainCreateHelper (server=0x80c4928, client=0xb0d00490, msg=0xb0d005b0, rerr=0xb65c21d0, args=0x81920a0, ret=0x815d208) at remote_dispatch.h:1044
 #10 0xb7a36901 in virNetServerProgramDispatchCall (msg=0xb0d005b0, client=0xb0d00490, server=0x80c4928, prog=0x80c6438) at rpc/virnetserverprogram.c:432
 #11 virNetServerProgramDispatch (prog=0x80c6438, server=server@entry=0x80c4928, client=0xb0d00490, msg=0xb0d005b0) at rpc/virnetserverprogram.c:305
 #12 0xb7a300a7 in virNetServerProcessMsg (msg=<optimized out>, prog=<optimized out>, client=<optimized out>, srv=0x80c4928) at rpc/virnetserver.c:162
 #13 virNetServerHandleJob (jobOpaque=0xb0d00510, opaque=0x80c4928) at rpc/virnetserver.c:183
 #14 0xb7924f98 in virThreadPoolWorker (opaque=opaque@entry=0x80a94b0) at util/virthreadpool.c:144
 #15 0xb7924515 in virThreadHelper (data=0x80a9440) at util/virthreadpthread.c:161
 #16 0xb7887c39 in start_thread (arg=0xb65c2b70) at pthread_create.c:304
 #17 0xb77eb78e in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130

when adding a domain with a usb device. This is Debian bug

    http://bugs.debian.org/700267
2013-03-11 11:41:58 +01:00
Guido Günther
c8871d8fbd lxc: include sys/stat.h
This fixes the build on Debian Wheezy which otherwise fails with:

  CC     libvirt_driver_lxc_impl_la-lxc_process.lo
  lxc/lxc_process.c: In function 'virLXCProcessGetNsInode':
  lxc/lxc_process.c:648:5: error: implicit declaration of function 'stat' [-Werror=implicit-function-declaration]
  lxc/lxc_process.c:648:5: error: nested extern declaration of 'stat' [-Werror=nested-externs]
  cc1: all warnings being treated as errors
2013-03-08 19:11:32 +01:00
Daniel P. Berrange
ab1ef3bc6c Include pid namespace inode in LXC audit messages
To allow the efficient correlation of container audit messages
with host hosts, include the pid namespace inode in audit
messages.
2013-03-07 19:43:53 +00:00
Daniel P. Berrange
eaf7d4ddff Add support for disks backed by plain files in LXC
By using a loopback device, disks backed by plain files can
be made available to LXC containers. We make no attempt to
auto-detect format if <driver type="raw"/> is not set,
instead we unconditionally treat that as meaning raw. This
is to avoid the security issues inherent with format
auto-detection

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-03-07 18:56:52 +00:00
Daniel P. Berrange
f0bfb6302d Refactor loop device setup code in LXC
Minor re-factoring of code for setting up loop devices in
the LXC controller

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-03-07 18:56:52 +00:00
Daniel P. Berrange
09f5e0123f Improve LXC startup error reporting
Currently we rely on a VIR_ERROR message being logged by the
virRaiseError function to report LXC startup errors. This gives
the right message, but is rather ugly and can be truncated
if lots of log messages are written. Change the LXC controller
to explicitly print any virErrorPtr message to stderr. Then
change the driver to skip over anything that looks like a log
message.

The result is that this

error: Failed to start domain busy
error: internal error guest failed to start: 2013-03-04 19:46:42.846+0000: 1734: info : libvirt version: 1.0.2
2013-03-04 19:46:42.846+0000: 1734: error : virFileLoopDeviceAssociate:600 : Unable to open /root/disk.raw: No such file or directory

changes to

error: Failed to start domain busy
error: internal error guest failed to start: Unable to open /root/disk.raw: No such file or directory
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-03-07 18:56:52 +00:00
Daniel P. Berrange
58e0accd8a Use VIR_MASS_CLOSE in LXC container startup
In the LXC container startup code when switching stdio
streams, we call VIR_FORCE_CLOSE on all FDs. This triggers
a huge number of warnings, but we don't see them because
stdio is closed at this point. strace() however shows them
which can confuse people debugging the code. Switch to
VIR_MASS_CLOSE to avoid this

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-03-07 18:10:36 +00:00
Daniel P. Berrange
11d926659b Turn virSecurityManager into a virObjectLockable
To enable locking to be introduced to the security manager
objects later, turn virSecurityManager into a virObjectLockable
class

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-02-11 12:33:41 +00:00
Daniel P. Berrange
fed92f08db Turn virCapabilities into a virObject
To enable virCapabilities instances to be reference counted,
turn it into a virObject. All cases of virCapabilitiesFree
turn into virObjectUnref

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-02-08 11:34:26 +00:00
Daniel P. Berrange
0f9ef55814 Convert virPCIDeviceList and virUSBDeviceList into virObjectLockable
To allow modifications to the lists to be synchronized, convert
virPCIDeviceList and virUSBDeviceList into virObjectLockable
classes. The locking, however, will not be self-contained. The
users of these classes will have to call virObjectLock/Unlock
in the critical regions.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-02-05 19:22:26 +00:00
Daniel P. Berrange
77c3015f9c Rename all USB device functions to have a standard name prefix
Rename all the usbDeviceXXX and usbXXXDevice APIs to have a
fixed virUSBDevice name prefix
2013-02-05 19:22:25 +00:00
Daniel P. Berrange
3e86e8f327 Fix leak of usbDevice struct when initializing cgroups
When iterating over USB host devices to setup cgroups, the
usbDevice object was leaked in both LXC and QEMU driers

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-02-05 19:22:25 +00:00
Daniel P. Berrange
eea87129f1 Merge virDomainObjListIsDuplicate into virDomainObjListAdd
The duplicate VM checking should be done atomically with
virDomainObjListAdd, so shoud not be a separate function.
Instead just use flags to indicate what kind of checks are
required.

This pair, used in virDomainCreateXML:

   if (virDomainObjListIsDuplicate(privconn->domains, def, 1) < 0)
     goto cleanup;
   if (!(dom = virDomainObjListAdd(privconn->domains,
                                   privconn->caps,
                                   def, false)))
     goto cleanup;

Changes to

   if (!(dom = virDomainObjListAdd(privconn->domains,
                                   privconn->caps,
                                   def,
                                   VIR_DOMAIN_OBJ_LIST_ADD_CHECK_LIVE,
                                   NULL)))
     goto cleanup;

This pair, used in virDomainRestoreFlags:

   if (virDomainObjListIsDuplicate(privconn->domains, def, 1) < 0)
     goto cleanup;
   if (!(dom = virDomainObjListAdd(privconn->domains,
                                   privconn->caps,
                                   def, true)))
     goto cleanup;

Changes to

   if (!(dom = virDomainObjListAdd(privconn->domains,
                                   privconn->caps,
                                   def,
                                   VIR_DOMAIN_OBJ_LIST_ADD_LIVE |
                                   VIR_DOMAIN_OBJ_LIST_ADD_CHECK_LIVE,
                                   NULL)))
     goto cleanup;

This pair, used in virDomainDefineXML:

   if (virDomainObjListIsDuplicate(privconn->domains, def, 0) < 0)
     goto cleanup;
   if (!(dom = virDomainObjListAdd(privconn->domains,
                                   privconn->caps,
                                   def, false)))
     goto cleanup;

Changes to

   if (!(dom = virDomainObjListAdd(privconn->domains,
                                   privconn->caps,
                                   def,
                                   0, NULL)))
     goto cleanup;
2013-02-05 19:22:25 +00:00
Daniel P. Berrange
37abd47165 Turn virDomainObjList into an opaque virObject
As a step towards making virDomainObjList thread-safe turn it
into an opaque virObject, preventing any direct access to its
internals.

As part of this a new method virDomainObjListForEach is
introduced to replace all existing usage of virHashForEach
2013-02-05 15:49:25 +00:00
Daniel P. Berrange
4f6ed6c33a Rename all domain list APIs to have virDomainObjList prefix
The APIs names for accessing the domain list object are
very inconsistent. Rename them all to have a standard
virDomainObjList prefix.
2013-02-05 15:49:25 +00:00
John Ferlan
73cdac3f72 lxc_process: Avoid passing NULL iface->iname
A followon to commit id: 68dceb635 - if iface->iname is NULL, then
neither virNetDevOpenvswitchRemovePort() nor virNetDevVethDelete()
should be called.  Found by Coverity.
2013-01-23 15:02:06 +01:00
John Ferlan
2e774db80e lxc_driver: Need to check for vm before calling virDomainUnlock(vm) 2013-01-23 15:02:06 +01:00
John Ferlan
e2ea90ce26 lxc: Need to initialize 'dst'
It was possible to call VIR_FREE in cleanup prior to initialization
2013-01-22 17:29:26 +01:00
John Ferlan
15666e026f lxc: Add coverity[dead_error_begin] tag in switch stmts
The use of switch statements inside a bounded for loop resulted in some
false positives regarding the "default:" label which cannot be reached
since each of the other case statements use the possible for loop values.
2013-01-22 16:59:45 +01:00
Daniel P. Berrange
325b02b5a3 Convert virDomainObj, qemuAgent, qemuMonitor, lxcMonitor to virObjectLockable
The  virDomainObj, qemuAgent, qemuMonitor, lxcMonitor classes
all require a mutex, so can be switched to use virObjectLockable

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-01-16 11:02:58 +00:00
Daniel P. Berrange
69218922e8 Allow for multi-level inheritance of virObject classes
Currently all classes must directly inherit from virObject.
This allows for arbitrarily deep hierarchy. There's not much
to this aside from chaining up the 'dispose' handlers from
each class & providing APIs to check types.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-01-15 19:21:31 +00:00
Gao feng
8d63af22de libvirt: lxc: don't mkdir when selinux is disabled
libvirt lxc will fail to start when selinux is disabled.
error: Failed to start domain noroot
error: internal error guest failed to start: PATH=/bin:/sbin TERM=linux container=lxc-libvirt container_uuid=b9873916-3516-c199-8112-1592ff694a9e LIBVIRT_LXC_UUID=b9873916-3516-c199-8112-1592ff694a9e LIBVIRT_LXC_NAME=noroot /bin/sh
2013-01-09 11:04:05.384+0000: 1: info : libvirt version: 1.0.1
2013-01-09 11:04:05.384+0000: 1: error : lxcContainerMountBasicFS:546 : Failed to mkdir /sys/fs/selinux: No such file or directory
2013-01-09 11:04:05.384+0000: 7536: info : libvirt version: 1.0.1
2013-01-09 11:04:05.384+0000: 7536: error : virLXCControllerRun:1466 : error receiving signal from container: Input/output error
2013-01-09 11:04:05.404+0000: 7536: error : virCommandWait:2287 : internal error Child process (ip link del veth1) unexpected exit status 1: Cannot find device "veth1"

fix this problem by checking if selinuxfs is mounted
in host before we try to create dir /sys/fs/selinux.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
2013-01-15 12:01:22 -07:00
Daniel P. Berrange
2b1cd1f148 Add implementation of virDomainLxcOpenNamespace to LXC driver
The virDomainLxcOpenNamespace method needs to open every file
in /proc/$INITPID/ns and return the open file descriptor to the
client application.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-01-15 18:16:54 +00:00
John Ferlan
0b79971b84 lxc: Initialize dst due to potential cleanup usage before setting 2013-01-15 12:12:10 +01:00
Daniel P. Berrange
2ec48f7aa9 Fix build due to previous LXC patch
Mark virDomainLxcEnterNamespace as skipped in python binding
and remove reference to lxcDomainOpenNamespace which doesn't
arrive until a later patch
2013-01-14 16:35:40 +00:00
Daniel P. Berrange
3d1596b048 Introduce an LXC specific public API & library
This patch introduces support for LXC specific public APIs. In
common with what was done for QEMU, this creates a libvirt_lxc.so
library and libvirt/libvirt-lxc.h header file.

The actual APIs are

  int virDomainLxcOpenNamespace(virDomainPtr domain,
                                int **fdlist,
                                unsigned int flags);

  int virDomainLxcEnterNamespace(virDomainPtr domain,
                                 unsigned int nfdlist,
                                 int *fdlist,
                                 unsigned int *noldfdlist,
                                 int **oldfdlist,
                                 unsigned int flags);

which provide a way to use the setns() system call to move the
calling process into the container's namespace. It is not
practical to write in a generically applicable manner. The
nearest that we could get to such an API would be an API which
allows to pass a command + argv to be executed inside a
container. Even if we had such a generic API, this LXC specific
API is still useful, because it allows the caller to maintain
the current process context, in particular any I/O streams they
have open.

NB the virDomainLxcEnterNamespace() API is special in that it
runs client side, so does not involve the internal driver API.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-01-14 13:58:34 +00:00
Daniel P. Berrange
8c1e9be48f Rename HAVE_FUSE to WITH_FUSE 2013-01-14 13:26:47 +00:00
Daniel P. Berrange
bccd4a8cbc Rename HAVE_GNUTLS to WITH_GNUTLS 2013-01-14 13:26:47 +00:00
Daniel P. Berrange
7db9ac8260 Convert HAVE_LIBBLKID to WITH_BLKID
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-01-14 13:26:47 +00:00
Daniel P. Berrange
ef38965c30 Convert HAVE_CAPNG to WITH_CAPNG
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-01-14 13:25:06 +00:00
Daniel P. Berrange
6f736c83e5 Convert HAVE_NUMACTL to WITH_NUMACTL
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-01-14 13:25:06 +00:00
Daniel P. Berrange
63f18f3786 Convert HAVE_SELINUX to WITH_SELINUX
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-01-14 13:24:49 +00:00
Gao feng
ae9874e471 libvirt: lxc: fix incorrect parameter of lxcContainerMountProcFuse
when we has no host's src mapped to container.
there is no .oldroot dir,so libvirt lxc will fail
to start when mouting meminfo.

in this case,the parameter srcprefix of function
lxcContainerMountProcFuse should be NULL.and make
this method handle NULL correctly.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
2013-01-09 15:08:42 +01:00
Daniel P. Berrange
f587c27768 Make TLS support conditional
Add checks for existence of GNUTLS and automatically disable
it if not found.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-01-08 20:57:31 +00:00
Daniel P. Berrange
014afe6501 Rename lxc_protocol.x to lxc_monitor_protocol.x
To avoid confusion between the LXC driver <-> controller
monitor RPC protocol and the libvirt-lxc.so <-> libvirtd public
RPC protocol, rename the former to lxc_monitor_protocol.x

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-01-08 09:21:08 +00:00
John Ferlan
36ac6e37be lxc: Avoid possible NULL dereference on *root prior to opendir().
If running on older Linux without mounted cgroups then its possible that
*root would be NULL.
2013-01-07 17:11:57 -07:00
Daniel P. Berrange
4f1f9d91ab Fix virLXCPrepareHostDevices method
The virLXCPrepareHostDevices method was returning success even
when it reported an error, and failed to handle several host
device types
2013-01-07 18:16:54 +00:00
Daniel P. Berrange
f0e4af91e4 Ensure we always setup a private mount namespace for LXC controller
The code for setting up a private /dev/pts for the containers
is also responsible for making the LXC controller have a
private mount namespace. Unfortunately the /dev/pts code is
not run if launching a container without a custom root. This
causes the LXC FUSE mount to leak into the host FS.
2013-01-07 18:14:34 +00:00
Daniel P. Berrange
f24404a324 Rename virterror.c virterror_internal.h to virerror.{c,h} 2012-12-21 11:19:50 +00:00
Daniel P. Berrange
e861b31275 Rename uuid.{c,h} to viruuid.{c,h} 2012-12-21 11:19:49 +00:00
Daniel P. Berrange
44f6ae27fe Rename util.{c,h} to virutil.{c,h} 2012-12-21 11:19:49 +00:00
Daniel P. Berrange
404174cad3 Rename threads.{c,h} to virthread.{c,h} 2012-12-21 11:19:49 +00:00
Daniel P. Berrange
fde9df8dcc Rename stats_linux.{c,h} to virstatslinux.{c,h} 2012-12-21 11:19:48 +00:00
Daniel P. Berrange
f56c773bf8 Merge processinfo.{c,h} into virprocess.{c,h} 2012-12-21 11:19:45 +00:00
Daniel P. Berrange
ab9b7ec2f6 Rename memory.{c,h} to viralloc.{c,h} 2012-12-21 11:17:14 +00:00
Daniel P. Berrange
936d95d347 Rename logging.{c,h} to virlog.{c,h} 2012-12-21 11:17:14 +00:00
Daniel P. Berrange
ebc8db5189 Rename hostusb.{c,h} to virusb.{c,h} 2012-12-21 11:17:13 +00:00
Daniel P. Berrange
30f3a005ff Rename hooks.{c,h} to virhook.{c,h} 2012-12-21 11:17:13 +00:00
Daniel P. Berrange
0f8454101d Rename conf.{c,h} to virconf.{c,h} 2012-12-21 11:17:13 +00:00
Daniel P. Berrange
04d9510f50 Rename command.{c,h} to vircommand.{c,h} 2012-12-21 11:17:13 +00:00
Daniel P. Berrange
2005f7b552 Rename buf.{c,h} to virbuffer.{c,h}
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-21 11:17:12 +00:00
Daniel P. Berrange
f9c7020c1f Rename cgroup.{h,c} to vircgroup.{h,c}
To bring in line with new naming practice, rename the=
src/util/cgroup.{h,c} files to vircgroup.{h,c}

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-21 11:17:12 +00:00
Daniel P. Berrange
c25c18f71b Convert capabilities / domain_conf to use virArch
Convert the host capabilities and domain config structs to
use the virArch datatype. Update the parsers and all drivers
to take account of datatype change

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-18 16:53:03 +00:00
Daniel P. Berrange
4ad6a01330 Add support for hotplug/unplug of host misc devices in LXC
Wire up the attach/detach device drivers in LXC to support the
hotplug/unplug of host misc devices.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-17 17:50:52 +00:00
Daniel P. Berrange
a5efb31909 Add support for hotplug/unplug of host storage devices in LXC
Wire up the attach/detach device drivers in LXC to support the
hotplug/unplug of host storage devices.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-17 17:50:51 +00:00
Daniel P. Berrange
ed77abc58b Add support for hotplug/unplug of USB host devices in LXC
Wire up the attach/detach device drivers in LXC to support the
hotplug/unplug of USB host devices.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-17 17:50:51 +00:00
Daniel P. Berrange
af7ab7fc5d Add support for hotplug/unplug of NIC devices in LXC
Wire up the attach/detach device drivers in LXC to support the
hotplug/unplug of NICs.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-17 17:50:51 +00:00
Daniel P. Berrange
de858e3fa7 Add support for hotplug/unplug of disk devices in LXC
Wire up the attach/detach device drivers in LXC to support the
hotplug/unplug of disks.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-17 17:50:51 +00:00
Daniel P. Berrange
986c270dac Add support for attach/detach/update hostdev devices in config for LXC
Wire up the attach/detach/update device APIs to support changing
of hostdevs in the persistent config file

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-17 17:50:51 +00:00
Daniel P. Berrange
8cacd8b4ea Add support for attach/detach/update disk devices in config for LXC
Wire up the attach/detach/update device APIs to support changing
of disks in the persistent config file

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-17 17:50:51 +00:00
Daniel P. Berrange
74a909fef1 Add support for attach/detach/update net devices in config for LXC
Wire up the attach/detach/update device APIs to support changing
of network interfaces in the persistent config file

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-17 17:50:51 +00:00
Daniel P. Berrange
d4e5359a1c Add basic driver API framework for device attach/detach support in LXC
This wires up the LXC driver to support the domain device attach/
detach/update APIs, following the same code design as used in
the QEMU driver. No actual changes are possible with this commit,
it is only providing the framework

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-17 17:50:51 +00:00
Daniel P. Berrange
83a9c93807 Add support for misc host device passthrough with LXC
This extends support for host device passthrough with LXC to
cover misc devices. In this case all we need todo is a
mknod in the container's /dev and whitelist the device in
cgroups

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-17 17:50:51 +00:00
Daniel P. Berrange
313669d1c1 Add support for storage host device passthrough with LXC
This extends support for host device passthrough with LXC to
cover storage devices. In this case all we need todo is a
mknod in the container's /dev and whitelist the device in
cgroups

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-17 17:50:51 +00:00
Daniel P. Berrange
95fef5f407 Add support for USB host device passthrough with LXC
This adds support for host device passthrough with the
LXC driver. Since there is only a single kernel image,
it doesn't make sense to pass through PCI devices, but
USB devices are fine. For the latter we merely need to
make the /dev/bus/usb/NNN/MMM character device exist
in the container's /dev

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-17 17:50:51 +00:00
Daniel P. Berrange
368e341ac1 Add support for disks with LXC
Currently LXC guests can be given arbitrary pre-mounted
filesystems, however, for some usecases it is more appropriate
to provide block devices which the container can mount itself.
This first impl only allows for <disk type='block'>, in other
words exposing a host disk device to a container. Since LXC
does not have device namespace virtualization, we are cheating
a little bit. If the XML specifies /dev/sdc4 to be given to
the container as /dev/sda1, when we do the mknod /dev/sda1
in the container's /dev, we actually use the major:minor
number of /dev/sdc4, not /dev/sda1.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-17 17:50:51 +00:00
Daniel P. Berrange
e89c68b8bb Refactor LXC NIC creation to allow reuse by hotplug code
The code for creating veth/macvlan devices is part of the
LXC process startup code. Refactor this a little and export
the methods to the rest of the LXC driver. This allows them
to be reused for NIC hotplug code

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-17 17:50:51 +00:00
Michal Privoznik
67159f1c60 bandwidth: Create hierarchical shaping classes
These classes can borrow unused bandwidth. Basically,
only egress qdsics can have classes, therefore we can
do this kind of traffic shaping only on host's outgoing,
that is domain's incoming traffic.
2012-12-11 18:36:55 +01:00
Daniel P. Berrange
79b8a56995 Replace polling for active VMs with signalling by drivers
Currently to deal with auto-shutdown libvirtd must periodically
poll all stateful drivers. Thus sucks because it requires
acquiring both the driver lock and locks on every single virtual
machine. Instead pass in a "inhibit" callback to virStateInitialize
which drivers can invoke whenever they want to inhibit shutdown
due to existance of active VMs.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-04 12:14:04 +00:00
Daniel P. Berrange
cbb106f807 Add support for shutdown / reboot APIs in LXC driver
Add support for doing controlled shutdown / reboot in the LXC
driver. The default behaviour is to try talking to /dev/initctl
inside the container's virtual root (/proc/$INITPID/root). This
works with sysvinit or systemd. If that file does not exist
then send SIGTERM (for shutdown) or SIGHUP (for reboot). These
signals are not any kind of particular standard for shutdown
or reboot, just something apps can choose to handle. The new
virDomainSendProcessSignal allows for sending custom signals.

We might allow the choice of SIGTERM/HUP to be configured for
LXC containers via the XML in the future.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-30 19:20:14 +00:00
Daniel P. Berrange
f4ea67f5b3 Turn some dual-state int parameters into booleans
The virStateInitialize method and several cgroups methods were
using an 'int privileged' parameter or similar for dual-state
values. These are better represented with the bool type.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-29 16:14:43 +00:00
Daniel P. Berrange
992ed55fea Implement virDomainSendProcessSignal for LXC driver
Implement the new API for sending signals to processes in a guest
for the LXC driver. Only support sending signals to the init
process for now, because

 - The kernel does not appear to expose the mapping between
   container PID numbers and host PID numbers anywhere in the
   host OS namespace
 - There is no race-free way to validate whether a host PID
   corresponds to a process in a container.

* src/lxc/lxc_driver.c: Allow sending processes signals

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-29 15:50:12 +00:00
Gao feng
df33ecdd9e mount fuse's meminfo file to container's /proc/meminfo
we already have virtualize meminfo for container through fuse filesystem,
add function lxcContainerMountProcFuse to mount this meminfo file to
the container's /proc/meminfo.

So we can isolate container's /proc/meminfo from host now.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
2012-11-28 10:28:49 +00:00
Gao feng
d671c0ed1b make /proc/meminfo isolate with host through fuse
with this patch,container's meminfo will be shown based on
containers' mem cgroup.

Right now,it's impossible to virtualize all values in meminfo,
I collect some values such as MemTotal,MemFree,Cached,Active,
Inactive,Active(anon),Inactive(anon),Active(file),Inactive(anon),
Active(file),Inactive(file),Unevictable,SwapTotal,SwapFree.

if I miss something, please let me know.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
2012-11-28 10:28:49 +00:00
Gao feng
2a596dac5e add fuse support for libvirt lxc
this patch addes fuse support for libvirt lxc.
we can use fuse filesystem to generate sysinfo dynamically,
So we can isolate /proc/meminfo,cpuinfo and so on through
fuse filesystem.

we mount fuse filesystem for every container.
the mount name is libvirt,mount point is
localstatedir/run/libvirt/lxc/containername.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
2012-11-28 10:28:49 +00:00
Daniel P. Berrange
509ce9437f Fix leak of virNetworkPtr in LXC startup failure path
When starting an LXC guest with a virNetwork based NIC device,
if the network was not active, the virNetworkPtr device would
be leaked

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-27 17:59:28 +00:00
Daniel P. Berrange
9d2bfc1ca7 Ensure transient def is removed if LXC start fails
When starting a container, newDef is initialized to a
copy of 'def', but when startup fails newDef is never
removed. This cause later attempts to use 'virDomainDefine'
to lose the new data being defined.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-27 17:59:23 +00:00
Daniel P. Berrange
43db9cf4ed Ensure failure to create macvtap device aborts LXC start
A mistaken initialization of 'ret' caused failure to create
macvtap devices to be ignored. The libvirt_lxc process
would later fail to start due to missing devices

Also make sure code checks '< 0' and not '!= 0' since only
-1 is considered an error condition

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-27 17:02:22 +00:00
Daniel P. Berrange
68dceb635d Avoid crash when LXC start fails with no interface target
If the <interface> device did not contain any <target>
element, LXC would crash on a NULL pointer if starting
the container failed

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-27 17:02:22 +00:00
Daniel P. Berrange
7c5ba648f7 Treat missing driver cgroup as fatal in LXC driver
The LXC driver relies on use of cgroups to kill off LXC processes
in shutdown. If cgroups aren't available, we're unable to kill
off processes, so we must treat lack of cgroups as a fatal startup
error.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-27 17:02:22 +00:00
Daniel P. Berrange
8e1f0c38fa Ensure LXC container exits if cgroups setup fails
The code setting up LXC cgroups used an 'rc' variable both
for capturing the return value of methods it calls, and
its own return status. The result was that several failures
in setting up cgroups would actually result in success being
returned.

Use a separate 'ret' for tracking return value as per normal
code design in other parts of libvirt

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-27 17:02:22 +00:00
Daniel P. Berrange
ea2fec86dd Store initpid in the domain status XML for LXC
The initpid will be required long term to enable LXC to
implement various hotplug operations. Thus it needs to be
persisted in the domain status XML. LXC has not used the
domain status XML before, so this introduces use of the
helpers.
2012-11-27 17:02:22 +00:00
Daniel P. Berrange
a33d8fceee Remove bogus newline at end of debug log message
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-27 17:02:22 +00:00
Daniel P. Berrange
f999e2fdce Pass virSecurityManagerPtr object further down into LXC setup code
Currently the lxcContainerSetupMounts method uses the
virSecurityManagerPtr instance to obtain the mount options
string and then only passes the string down into methods
it calls. As functionality in LXC grows though, those
methods need to have direct access to the virSecurityManagerPtr
instance. So push the code down a level.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-27 16:45:09 +00:00
Daniel P. Berrange
3f6470f753 Fix error handling in virSecurityManagerGetMountOptions
The impls of virSecurityManagerGetMountOptions had no way to
return errors, since the code was treating 'NULL' as a success
value. This is somewhat pointless, since the calling code did
not want NULL in the first place and has to translate it into
the empty string "". So change the code so that the impls can
return "" directly, allowing use of NULL for error reporting
once again

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-27 16:45:04 +00:00
Peter Krempa
99a388e612 lxc: Don't crash if no security driver is specified in libvirt_lxc
When no security driver is specified libvirt_lxc segfaults as a debug
message tries to access security labels for the container that are not
present.

This problem was introduced in commit 6c3cf57d6c.
2012-11-26 15:48:31 +01:00
Peter Krempa
81efb13b4a lxc: Avoid segfault of libvirt_lxc helper on early cleanup paths
Early jumps to the cleanup label caused a crash of the libvirt_lxc
container helper as the cleanup section called
virLXCControllerDeleteInterfaces(ctrl) without checking the ctrl argument
for NULL. The argument was de-referenced soon after.

$ /usr/libexec/libvirt_lxc
/usr/libexec/libvirt_lxc: missing --name argument for configuration
Segmentation fault
2012-11-26 15:48:31 +01:00
Daniel P. Berrange
37db3f5dfe Fix exiting of libvirt_lxc program on container quit
The virLXCControllerClientCloseHook method was mistakenly
assuming that the private data associated with the network
client was the virLXCControllerPtr. In fact it was just a
dummy int, so we were derefencing a bogus struct. The
frequent result of this was that we would never quit, because
we tried to arm a non-existant timer.

Fix the code by removing the dummy private data and just
using the virLXCControllerPtr instance as private data

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-23 10:11:56 +00:00
Daniel P. Berrange
a615833664 Log an audit message with the LXC init pid
Currently the LXC driver logs audit messages when a container
is started or stopped. These audit messages, however, contain
the PID of the libvirt_lxc supervisor process. To enable
sysadmins to correlate with audit messages generated by
processes /inside/ the container, we need to include the
container init process PID.

We can't do this in the main 'start' audit message, since
the init PID is not available at that point. Instead we output
a completely new audit record, that lists both PIDs.

type=VIRT_CONTROL msg=audit(1353433750.071:363): pid=20180 uid=0 auid=501 ses=3 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 msg='virt=lxc op=init vm="busy" uuid=dda7b947-0846-1759-2873-0f375df7d7eb vm-pid=20371 init-pid=20372 exe="/home/berrange/src/virt/libvirt/daemon/.libs/lt-libvirtd" hostname=? addr=? terminal=pts/6 res=success'

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-22 10:46:40 +00:00
Daniel P. Berrange
f33e43c235 Use virNetServerRun instead of custom main loop
The LXC controller code currently directly invokes the
libvirt main loop code. The problem is that this misses
the cleanup of virNetServerClient connections that
virNetServerRun takes care of.

The result is that when libvirtd is stopped, the
libvirt_lxc controller process gets stuck in a I/O loop.
When libvirtd is then started again, it fails to connect
to the controller and thus kills off the entire domain.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-22 08:51:03 +00:00
Viktor Mihajlovski
a2b3d7cff8 qemu, lxc: Change host CPU number detection logic.
The drivers for QEMU and LXC use virNodeGetInfo only to determine
the number of host CPUs. On Linux hosts nodeGetCPUCount has less
overhead.
2012-11-15 08:48:19 -07:00
Daniel P. Berrange
3782814d4a Fix uninitialized variable in virLXCControllerSetupDevPTS
The lack of initialization of 'opts' caused a SEGV in the
cleanup: path if the root->src directory did not exist
2012-11-14 15:39:48 +00:00
Viktor Mihajlovski
b1c88c1476 capabilities: defaultConsoleTargetType can depend on architecture
For S390, the default console target type cannot be of type 'serial'.
It is necessary to at least interpret the 'arch' attribute
value of the os/type element to produce the correct default type.

Therefore we need to extend the signature of defaultConsoleTargetType
to account for architecture. As a consequence all the drivers
supporting this capability function must be updated.

Despite the amount of changed files, the only change in behavior is
that for S390 the default console target type will be 'virtio'.

N.B.: A more future-proof approach could be to to use hypervisor
specific capabilities to determine the best possible console type.
For instance one could add an opaque private data pointer to the
virCaps structure (in case of QEMU to hold capsCache) which could
then be passed to the defaultConsoleTargetType callback to determine
the console target type.
Seems to be however a bit overengineered for the use case...

Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
2012-11-09 09:20:59 -07:00
Daniel P. Berrange
1c04f99970 Remove spurious whitespace between function name & open brackets
The libvirt coding standard is to use 'function(...args...)'
instead of 'function (...args...)'. A non-trivial number of
places did not follow this rule and are fixed in this patch.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-02 13:36:49 +00:00
Dan Walsh
2e03b08ead Linux Containers are not allowed to create device nodes.
This needs to be done before the container starts. Turning
off the mknod capability is noticed by systemd, which will
no longer attempt to create device nodes.

This eliminates SELinux AVC messages and ugly failure messages in the journal.
2012-11-01 15:14:25 -06:00
Viktor Mihajlovski
e3ba67037b virNodeGetCPUMap: Implement driver support
Driver support added for:
- test: pretending 8 host CPUS, 3 being online
- qemu, lxc, openvz, uml: using nodeGetCPUMap

Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
2012-10-25 11:20:15 -06:00
Daniel P. Berrange
3cfc3d7d2c Add JSON serialization of virNetServerClientPtr objects for process re-exec()
Add two new APIs virNetServerClientNewPostExecRestart and
virNetServerClientPreExecRestart which allow a virNetServerClientPtr
object to be created from a JSON object and saved to a
JSON object, for the purpose of re-exec'ing a process.

This includes serialization of the connected socket associated
with the client

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-10-16 15:45:55 +01:00
Michal Privoznik
0dddd680c2 lxc: Correctly report active cgroups
There was an inverted return value in lxcCgroupControllerActive().
The function assumes cgroups are active and do couple of checks
to prove that. If any of them fails, false is returned. Therefore,
at the end, after all checks are done we must return true, not false.
2012-10-02 10:28:38 +02:00
Daniel P. Berrange
5cbb0d37d4 Use size_t instead of int for virDomainDefPtr struct
Many parts of virDomainDefPtr were using 'int' variables as
array length counts. Replace all these with size_t and update
various format strings & API signatures to adapt

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-09-27 10:11:44 +01:00
Daniel P. Berrange
36c1fc189d Fix deadlock in handling EOF in LXC monitor
Depending on the scenario in which LXC containers exit, it is
possible for the EOF callback of the LXC monitor to deadlock
the driver.

  #0  0x00000038a0a0de4d in __lll_lock_wait () from /lib64/libpthread.so.0
  #1  0x00000038a0a09ca6 in _L_lock_840 () from /lib64/libpthread.so.0
  #2  0x00000038a0a09ba8 in pthread_mutex_lock () from /lib64/libpthread.so.0
  #3  0x00007f4bd9579d55 in virMutexLock (m=<optimized out>) at util/threads-pthread.c:85
  #4  0x00007f4bcacc7597 in lxcDriverLock (driver=0x7f4bc40c8290) at lxc/lxc_conf.h:81
  #5  virLXCProcessMonitorEOFNotify (mon=<optimized out>, vm=0x7f4bb4000b00) at lxc/lxc_process.c:581
  #6  0x00007f4bd9645c91 in virNetClientCloseLocked (client=client@entry=0x7f4bb4009e60)
      at rpc/virnetclient.c:554
  #7  0x00007f4bd96460f8 in virNetClientIOEventLoopPassTheBuck (thiscall=0x0, client=0x7f4bb4009e60)
      at rpc/virnetclient.c:1306
  #8  virNetClientIOEventLoopPassTheBuck (client=0x7f4bb4009e60, thiscall=0x0)
      at rpc/virnetclient.c:1287
  #9  0x00007f4bd96467a2 in virNetClientCloseInternal (reason=3, client=0x7f4bb4009e60)
      at rpc/virnetclient.c:589
  #10 virNetClientCloseInternal (client=0x7f4bb4009e60, reason=3) at rpc/virnetclient.c:561
  #11 0x00007f4bcacc4a82 in virLXCMonitorClose (mon=0x7f4bb4000a00) at lxc/lxc_monitor.c:201
  #12 0x00007f4bcacc55ac in virLXCProcessCleanup (reason=<optimized out>, vm=0x7f4bb4000b00,
      driver=0x7f4bc40c8290) at lxc/lxc_process.c:240
  #13 virLXCProcessStop (driver=0x7f4bc40c8290, vm=vm@entry=0x7f4bb4000b00,
      reason=reason@entry=VIR_DOMAIN_SHUTOFF_DESTROYED) at lxc/lxc_process.c:735
  #14 0x00007f4bcacc5bd2 in virLXCProcessAutoDestroyDom (payload=<optimized out>,
      name=0x7f4bb4003c80, opaque=0x7fff41af2df0) at lxc/lxc_process.c:94
  #15 0x00007f4bd9586649 in virHashForEach (table=0x7f4bc409b270,
      iter=iter@entry=0x7f4bcacc5ab0 <virLXCProcessAutoDestroyDom>, data=data@entry=0x7fff41af2df0)
      at util/virhash.c:514
  #16 0x00007f4bcacc52d7 in virLXCProcessAutoDestroyRun (driver=driver@entry=0x7f4bc40c8290,
      conn=conn@entry=0x7f4bb8000ab0) at lxc/lxc_process.c:120
  #17 0x00007f4bcacca628 in lxcClose (conn=0x7f4bb8000ab0) at lxc/lxc_driver.c:128
  #18 0x00007f4bd95e67ab in virReleaseConnect (conn=conn@entry=0x7f4bb8000ab0) at datatypes.c:114

When the driver calls virLXCMonitorClose, there is really no
need for the EOF callback to be invoked in this case, since
the caller can easily handle events itself. In changing this,
the monitor needs to take a deep copy of the callback list,
not merely a reference.

Also adds debug statements in various places to aid
troubleshooting

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-09-27 10:11:44 +01:00
Daniel P. Berrange
dd0371764f Remove pointless virLXCProcessMonitorDestroy method
Asynchronously setting priv->mon to NULL was pointless,
just remove the destroy callback entirely.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-09-26 10:09:57 +01:00
Daniel P. Berrange
09e0cb4218 Convert virLXCMonitor to use virObject
Remove custom reference counting from virLXCMonitor, using
virObject instead

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-09-26 10:09:57 +01:00
Daniel P. Berrange
9467ab6074 Move virProcess{Kill,Abort,TranslateStatus} into virprocess.{c,h}
Continue consolidation of process functions by moving some
helpers out of command.{c,h} into virprocess.{c,h}

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-09-26 10:09:57 +01:00
Daniel P. Berrange
0fb58ef5cd Rename virPid{Abort,Wait} to virProcess{Abort,Wait}
Change "Pid" to "Process" to align with the virProcessKill
API naming prefix

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-09-26 10:09:57 +01:00
Daniel P. Berrange
1532bd498a Fix start of containers with custom root filesystem
A prefix change to unmount the SELinux filesystem broke starting
of LXC containers with a custom root filesystem
2012-09-26 10:09:50 +01:00
Daniel P. Berrange
2b9189e8ad Improve some debugging log messages in LXC mount setup
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-09-21 10:43:25 +01:00
Daniel P. Berrange
c15d893252 Ensure existing selinux mount is removed before mounting new one in LXC
Some kernel versions (at least RHEL-6 2.6.32) do not let you over-mount
an existing selinuxfs instance with a new one. Thus we must unmount the
existing instance inside our namespace.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-09-21 10:27:42 +01:00
Eric Blake
4ecb723b9e maint: fix up copyright notice inconsistencies
https://www.gnu.org/licenses/gpl-howto.html recommends that
the 'If not, see <url>.' phrase be a separate sentence.

* tests/securityselinuxhelper.c: Remove doubled line.
* tests/securityselinuxtest.c: Likewise.
* globally: s/;  If/.  If/
2012-09-20 16:30:55 -06:00
Hu Tao
ee7d23ba4b use virBitmap to store cpumask info. 2012-09-17 14:59:37 -04:00
Hu Tao
75b198b3e7 use virBitmap to store numa nodemask info. 2012-09-17 14:59:37 -04:00
Hu Tao
f1a43a8e41 use virBitmap to store cpu affinity info 2012-09-17 14:59:37 -04:00
Osier Yang
8268a24548 node_memory: Support get/set memory parameters for drivers
Including QEMU, LXC, UML, XEN drivers.
2012-09-17 13:55:22 +08:00
Daniel P. Berrange
a4fd740561 Don't assume use of /sys/fs/cgroup
The introduction of /sys/fs/cgroup came in fairly recent kernels.
Prior to that time distros would pick a custom directory like
/cgroup or /dev/cgroup. We need to auto-detect where this is,
rather than hardcoding it

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-09-07 13:30:20 +01:00
Laine Stump
b3bd5d6c5a network: get vlan info for Open vSwitch interfaces from proper source
This bug was revealed by the crash described in

  https://bugzilla.redhat.com/show_bug.cgi?id=852383

The vlan info pointer sent to virNetDevOpenvswitchAddPort should never
be non-NULL unless there is at least one tag. The factthat such a vlan
info pointer was receveid pointed out that a caller was passing the
wrong pointer. Instead of sending &net->vlan, the result of
virDomainNetGetActualVlan(net) should be sent - that function will
look for vlan info in net->data.network.actual->vlan, and in cany case
return NULL instead of a pointer if the vlan info it finds has no
tags.

Aside from causing the crash, sending a hardcoded &net->vlan has the
effect of ignoring vlan info from a <network> or <portgroup> config.
2012-08-30 18:05:18 +08:00
Marcelo Cerri
6c3cf57d6c Internal refactory of data structures
This patch updates the structures that store information about each
domain and each hypervisor to support multiple security labels and
drivers. It also updates all the remaining code to use the new fields.

Signed-off-by: Marcelo Cerri <mhcerri@linux.vnet.ibm.com>
2012-08-20 19:13:33 +02:00
Kyle Mestery
7d2b91b86a network: add support for setting VLANs on Open vSwitch ports
Add the ability to support VLAN tags for Open vSwitch virtual port
types. To accomplish this, modify virNetDevOpenvswitchAddPort and
virNetDevTapCreateInBridgePort to take a virNetDevVlanPtr
argument. When adding the port to the OVS bridge, setup either a
single VLAN or a trunk port based on the configuration from the
virNetDevVlanPtr.

Signed-off-by: Kyle Mestery <kmestery@cisco.com>
2012-08-17 11:12:29 -04:00
Daniel P. Berrange
39b5e4d4d8 Refactor RPC client private data setup
Currently there is a hook function that is invoked when a
new client connection comes in, which allows an app to
setup private data. This setup will make it difficult to
serialize client state during process re-exec(). Change to
a model where the app registers a callback when creating
the virNetServerPtr instance, which is used to allocate
the client private data immediately during virNetClientPtr
construction.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-08-15 10:59:10 +01:00
Daniel P. Berrange
86f5457d49 Allow sync IO and keepalives to be skipped in RPC client setup
Currently the virNetClientPtr constructor will always register
the async IO event handler and the keepalive objects. In the
case of the lock manager, there will be no event loop available
nor keepalive support required. Split this setup out of the
constructor and into separate methods.

The remote driver will enable async IO and keepalives, while
the LXC driver will only enable async IO

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-08-15 10:58:30 +01:00
Osier Yang
bb705e2519 Destroy virdomainlist.[ch]
As the consensus in:
https://www.redhat.com/archives/libvir-list/2012-July/msg01692.html,
this patch is to destroy conf/virdomainlist.[ch], folding the
helpers into conf/domain_conf.[ch].

* src/Makefile.am:
  - Various indention fixes incidentally
  - Add macro DATATYPES_SOURCES (datatypes.[ch])
  - Link datatypes.[ch] for libvirt_lxc

* src/conf/domain_conf.c:
  - Move all the stuffs from virdomainlist.c into it
  - Use virUnrefDomain and virUnrefDomainSnapshot instead of
    virDomainFree and virDomainSnapshotFree, which are defined
    in libvirt.c, and we don't want to link to it.
  - Remove "if" before "free" the object, as virObjectUnref
    is in the list "useless_free_options".

* src/conf/domain_conf.h:
  - Move all the stuffs from virdomainlist.h into it
  - s/LIST_FILTER/LIST_DOMAINS_FILTER/

* src/libxl/libxl_driver.c:
  - s/LIST_FILTER/LIST_DOMAINS_FILTER/
  - no (include "virdomainlist.h")

* src/libxl/libxl_driver.c: Likewise

* src/lxc/lxc_driver.c: Likewise

* src/openvz/openvz_driver.c: Likewise

* src/parallels/parallels_driver.c: Likewise

* src/qemu/qemu_driver.c: Likewise

* src/test/test_driver.c: Likewise

* src/uml/uml_driver.c: Likewise

* src/vbox/vbox_tmpl.c: Likewise

* src/vmware/vmware_driver.c: Likewise

* tools/virsh-domain-monitor.c: Likewise

* tools/virsh.c: Likewise
2012-08-14 17:27:49 +08:00
Laine Stump
b8a56f12f5 nwfilter: fix crash during filter define when lxc driver failed startup
The meat of this patch is just moving the calls to
virNWFilterRegisterCallbackDriver from each hypervisor's "register"
function into its "initialize" function. The rest is just code
movement to allow that, and a new virNWFilterUnRegisterCallbackDriver
function to undo what the register function does.

The long explanation:

There is an array in nwfilter called callbackDrvArray that has
pointers to a table of functions for each hypervisor driver that are
called by nwfilter. One of those function pointers is to a function
that will lock the hypervisor driver. Entries are added to the table
by calling each driver's "register" function, which happens quite
early in libvirtd's startup.

Sometime later, each driver's "initialize" function is called. This
function allocates a driver object and stores a pointer to it in a
static variable that was previously initialized to NULL. (and here's
the important part...) If the "initialize" function fails, the driver
object is freed, and that pointer set back to NULL (but the entry in
nwfilter's callbackDrvArray is still there).

When the "lock the driver" function mentioned above is called, it
assumes that the driver was successfully loaded, so it blindly tries
to call virMutexLock on "driver->lock".

BUT, if the initialize never happened, or if it failed, "driver" is
NULL. And it just happens that "lock" is always the first field in
driver so it is also NULL.

Boom.

To fix this, the call to virNWFilterRegisterCallbackDriver for each
driver shouldn't be called until the end of its (*already guaranteed
successful*) "initialize" function, not during its "register" function
(which is currently the case). This implies that there should also be
a virNWFilterUnregisterCallbackDriver() function that is called in a
driver's "shutdown" function (although in practice, that function is
currently never called).
2012-08-09 23:28:00 -04:00
Daniel P. Berrange
05e4e7b46e Turn virNetClient* into virObject instances
Make all the virNetClient* objects use virObject APIs for
reference counting

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-08-07 11:47:55 +01:00
Daniel P. Berrange
958499b0c1 Turn virNetServer* into virObject instances
Make all the virNetServer* objects use the virObject APIs
for reference counting

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-08-07 11:47:55 +01:00
Daniel P. Berrange
31cb030ab6 Turn virDomainObjPtr into a virObjectPtr
Switch virDomainObjPtr to use the virObject APIs for reference
counting. The main change is that virObjectUnref does not return
the reference count, merely a bool indicating whether the object
still has any refs left. Checking the return value is also not
mandatory.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-08-07 11:47:41 +01:00
Eric Blake
87de27b7f9 virrandom: make virRandomInitialize an automatic one-shot
All callers used the same initialization seed (well, the new
viratomictest forgot to look at getpid()); so we might as well
make this value automatic.  And while it may feel like we are
giving up functionality, I documented how to get it back in the
unlikely case that you actually need to debug with a fixed
pseudo-random sequence.  I left that crippled by default, so
that a stray environment variable doesn't cause a lack of
randomness to become a security issue.

* src/util/virrandom.c (virRandomInitialize): Rename...
(virRandomOnceInit): ...and make static, with one-shot call.
Document how to do fixed-seed debugging.
* src/util/virrandom.h (virRandomInitialize): Drop prototype.
* src/libvirt_private.syms (virrandom.h): Don't export it.
* src/libvirt.c (virInitialize): Adjust caller.
* src/lxc/lxc_controller.c (main): Likewise.
* src/security/virt-aa-helper.c (main): Likewise.
* src/util/iohelper.c (main): Likewise.
* tests/seclabeltest.c (main): Likewise.
* tests/testutils.c (virtTestMain): Likewise.
* tests/viratomictest.c (mymain): Likewise.
2012-08-06 08:15:13 -06:00
Eric Blake
6f926c5ef6 build: fix build without HAVE_CAPNG
Otherwise, a build may fail with:

lxc/lxc_conatiner.c: In function 'lxcContainerDropCapabilities':
lxc/lxc_container.c:1662:46: error: unused parameter 'keepReboot' [-Werror=unused-parameter]

* src/lxc/lxc_container.c (lxcContainerDropCapabilities): Mark
parameter unused.
2012-07-30 11:59:25 -06:00
Daniel P. Berrange
ac97c2ba4c Improve error message in LXC startup with network is not active
If an LXC container is using a virtual network and that network
is not active, currently the user gets a rather unhelpful
error message about tap device setup failure. Add an explicit
check for whether the network is active, in exactly the same
way as the QEMU driver
2012-07-30 13:09:57 +01:00
Daniel P. Berrange
cb612ee489 Add handling for reboots of LXC containers
The reboot() syscall is allowed by new kernels for LXC containers.
The LXC controller can detect whether a reboot was requested
(instead of a normal shutdown) by looking at the "init" process
exit status. If a reboot was triggered, the exit status will
record SIGHUP as the kill reason.

The LXC controller has cleared all its capabilities, and the
veth network devices will no longer exist at this time. Thus
it cannot restart the container init process itself. Instead
it emits an event which is picked up by the LXC driver in
libvirtd. This will then re-create the container, using the
same configuration as it was previously running with (ie it
will not activate 'newDef').

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-07-30 13:09:56 +01:00
Daniel P. Berrange
b46b1c762a Allow CAP_SYS_REBOOT on new enough kernels
Check whether the reboot() system call is virtualized, and if
it is, then allow the container to keep CAP_SYS_REBOOT.

Based on an original patch by Serge Hallyn

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-07-30 13:07:45 +01:00