Commit Graph

200 Commits

Author SHA1 Message Date
Eric Blake
64b2335c2a maint: fix comma style issues: remaining drivers
Most of our code base uses space after comma but not before;
fix the remaining uses before adding a syntax check.

* src/lxc/lxc_container.c: Consistently use commas.
* src/openvz/openvz_driver.c: Likewise.
* src/openvz/openvz_util.c: Likewise.
* src/remote/remote_driver.c: Likewise.
* src/test/test_driver.c: Likewise.

Signed-off-by: Eric Blake <eblake@redhat.com>
2013-11-20 09:14:55 -07:00
Peter Krempa
de7b5faf43 conf: Refactor storing and usage of feature flags
Currently we were storing domain feature flags in a bit field as the
they were either enabled or disabled. New features such as paravirtual
spinlocks however can be tri-state as the default option may depend on
hypervisor version.

To allow storing tri-state feature state in the same place instead of
having to declare dedicated variables for each feature this patch
refactors the bit field to an array.
2013-11-08 09:44:42 +01:00
Daniel P. Berrange
9ecbd38c4c Skip any files which are not mounted on the host
Currently the LXC container tries to skip selinux/securityfs
mounts if the directory does not exist in the filesystem,
or if SELinux is disabled.

The former check is flawed because the /sys/fs/selinux
or /sys/kernel/securityfs directories may exist in sysfs
even if the mount type is disabled. Instead of just doing
an access() check, use an virFileIsMounted() to see if
the FS is actually present in the host OS. This also
avoids the need to check is_selinux_enabled().

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-11-05 15:51:48 +08:00
Daniel P. Berrange
bf8874025e Add flag to lxcBasicMounts to control use in user namespaces
Some mounts must be skipped if running inside a user namespace,
since the kernel forbids their use. Instead of strcmp'ing the
filesystem type in the body of the loop, set an explicit flag
in the lxcBasicMounts table.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-11-05 15:51:47 +08:00
Daniel P. Berrange
6d5fdde3dd Remove duplicate entries in lxcBasicMounts array
Currently the lxcBasicMounts array has separate entries for
most mounts, to reflect that we must do a separate mount
operation to make mounts read-only. Remove the duplicate
entries and instead set the MS_RDONLY flag against the main
entry. Then change lxcContainerMountBasicFS to look for the
MS_RDONLY flag, mask it out & do a separate bind mount.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-11-05 15:51:47 +08:00
Daniel P. Berrange
f567a583f3 Remove pointless 'srcpath' variable in lxcContainerMountBasicFS
The 'srcpath' variable is initialized from 'mnt->src' and never
changed thereafter. Some places continue to use 'mnt->src' and
others use 'srcpath'. Remove the pointless 'srcpath' variable
and use 'mnt->src' everywhere.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-11-05 15:51:47 +08:00
Daniel P. Berrange
c6b84a9dee Remove unused 'opts' field from LXC basic mounts struct
The virLXCBasicMountInfo struct contains a 'char *opts'
field passed onto the mount() syscall. Every entry in the
list sets this to NULL though, so it can be removed to
simplify life.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-11-05 15:51:47 +08:00
Gao feng
919374c73e LXC: don't free tty before using it in lxcContainerSetupDevices
Introduced by commit 0f31f7b.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2013-10-29 15:44:56 +01:00
Chen Hanxiao
8e1336fea9 Skip debug message in lxcContainerSetID if no map is set.
The lxcContainerSetID() method prints a misleading log
message about setting the uid/gid when no ID map is
present in the XML config. Skip the debug message in
this case.

Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com>
2013-10-28 11:19:20 +00:00
Daniel P. Berrange
01100c7f60 Ensure lxcContainerResolveSymlinks reports errors
The lxcContainerResolveSymlinks method merely logged some errors
as debug messages, rather than reporting them as proper errors.
This meant startup failures were not diagnosed at all.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-10-14 15:38:20 +01:00
Daniel P. Berrange
558546fb8f Ensure lxcContainerMain reports errors on stderr
Ensure the lxcContainerMain method reports any errors that
occur during setup to stderr, where libvirtd will pick them
up.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-10-14 15:38:20 +01:00
Ján Tomko
3f029fb531 LXC: Fix handling of RAM filesystem size units
Since 76b644c when the support for RAM filesystems was introduced,
libvirt accepted the following XML:
<source usage='1024' unit='KiB'/>

This was parsed correctly and internally stored in bytes, but it
was formatted as (with an extra 's'):
<source usage='1024' units='KiB'/>
When read again, this was treated as if the units were missing,
meaning libvirt was unable to parse its own XML correctly.

The usage attribute was documented as being in KiB, but it was not
scaled if the unit was missing. Transient domains still worked,
because this was balanced by an extra 'k' in the mount options.

This patch:
Changes the parser to use 'units' instead of 'unit', as the latter
was never documented (fixing persistent domains) and some programs
(libvirt-glib, libvirt-sandbox) already parse the 'units' attribute.

Removes the extra 'k' from the tmpfs mount options, which is needed
because now we parse our own XML correctly.

Changes the default input unit to KiB to match documentation, fixing:
https://bugzilla.redhat.com/show_bug.cgi?id=1015689
2013-10-09 17:44:45 +02:00
Chen Hanxiao
4b2b078a8b lxc: do cleanup when failed to bind fs as read-only
We forgot to do cleanup when lxcContainerMountFSTmpfs
failed to bind fs as read-only.

Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2013-09-30 13:30:43 -06:00
Chen Hanxiao
9a08e2cbc6 LXC: Check the existence of dir before resolving symlinks
If a dir does not exist, raise an immediate error in logs
rather than letting virFileResolveAllLinks fail, since this
gives better error reporting to the user.

Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com>
2013-09-23 11:22:17 +01:00
Gao feng
1c7037cff4 LXC: don't try to mount selinux filesystem when user namespace enabled
Right now we mount selinuxfs even user namespace is enabled and
ignore the error. But we shouldn't ignore these errors when user
namespace is not enabled.

This patch skips mounting selinuxfs when user namespace enabled.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
2013-09-12 15:18:01 +01:00
Daniel P. Berrange
75235a52bc Ensure root filesystem is recursively mounted readonly
If the guest is configured with

    <filesystem type='mount'>
      <source dir='/'/>
      <target dir='/'/>
      <readonly/>
    </filesystem>

Then any submounts under / should also end up readonly, except
for those setup as basic mounts. eg if the user has /home on a
separate volume, they'd expect /home to be readonly, but we
should not touch the /sys, /proc, etc dirs we setup ourselves.

Users can selectively make sub-mounts read-write again by
simply listing them as new mounts without the <readonly>
flag set

    <filesystem type='mount'>
      <source dir='/home'/>
      <target dir='/home'/>
    </filesystem>

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-09-12 12:01:49 +01:00
Daniel P. Berrange
f27f5f7edd Move array of mounts out of lxcContainerMountBasicFS
Move the array of basic mounts out of the lxcContainerMountBasicFS
function, to a global variable. This is to allow it to be referenced
by other methods wanting to know what the basic mount paths are.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-09-12 11:52:12 +01:00
Gao feng
66e2adb2ba LXC: introduce lxcContainerUnmountForSharedRoot
Move the unmounting private or useless filesystems for
container to this function.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
2013-09-11 13:09:31 +01:00
Gao feng
4142bf46b8 LXC: umount the temporary filesystem created by libvirt
The devpts, dev and fuse filesystems are mounted temporarily.
there is no need to export them to container if container shares
the root directory with host.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
2013-09-11 13:09:31 +01:00
Chen Hanxiao
744fb50831 LXC: fix typos in lxc_container.c
Fix docs and error message typos in lxc_container.c

Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com>
2013-09-06 12:14:00 +01:00
Gao feng
1583dfda7c LXC: Don't mount securityfs when user namespace enabled
Right now, securityfs is disallowed to be mounted in non-initial
user namespace, so we must avoid trying to mount securityfs in
a container which has user namespace enabled.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
2013-09-05 12:00:07 +01:00
Daniel P. Berrange
c13a2c282b Ensure that /dev exists in the container root filesystem
If booting a container with a root FS that isn't the host's
root, we must ensure that the /dev mount point exists.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-08-13 16:26:44 +01:00
Daniel P. Berrange
2d07f84302 Honour root prefix in lxcContainerMountFSBlockAuto
The lxcContainerMountFSBlockAuto method can be used to mount the
initial root filesystem, so it cannot assume a prefix of /.oldroot.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-08-13 14:04:28 +01:00
Dan Walsh
6807238d87 Ensure securityfs is mounted readonly in container
If securityfs is available on the host, we should ensure to
mount it read-only in the container. This will avoid systemd
trying to mount it during startup causing SELinux AVCs.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-08-08 14:25:50 +01:00
Daniel P. Berrange
b64dabff27 Report full errors from virCgroupNew*
Instead of returning raw errno values, report full libvirt
errors in virCgroupNew* functions.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-07-22 13:09:58 +01:00
Daniel P. Berrange
2e832b18d6 LXC: Fix some error reporting in filesystem setup
A couple of places in LXC setup for filesystems did not do
a "goto cleanup" after reporting errors. While fixing this,
also add in many more debug statements to aid troubleshooting

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-07-22 12:32:07 +01:00
Daniel P. Berrange
11693bc6f0 LXC: Wire up the virDomainCreate{XML}WithFiles methods
Wire up the new virDomainCreate{XML}WithFiles methods in the
LXC driver, so that FDs get passed down to the init process.

The lxc_container code needs to do a little dance in order
to renumber the file descriptors it receives into linear
order, starting from STDERR_FILENO + 1.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-07-18 12:07:51 +01:00
Michal Privoznik
192a86cadf lxc_container: Don't call virGetGroupList during exec
Commit 75c1256 states that virGetGroupList must not be called
between fork and exec, then commit ee777e99 promptly violated
that for lxc.

Patch originally posted by Eric Blake <eblake@redhat.com>.
2013-07-17 14:26:09 +02:00
Gao feng
f87be04fd8 LXC: Create host devices for container on host side
Otherwise the container will fail to start if we
enable user namespace, since there is no rights to
do mknod in uninit user namespace.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
2013-07-16 09:59:24 -06:00
Gao feng
14a0c4084d LXC: Move virLXCControllerChown to lxc_container.c
lxc driver will use this function to change the owner
of hot added devices.

Move virLXCControllerChown to lxc_container.c and Rename
it to lxcContainerChown.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
2013-07-16 09:59:14 -06:00
Gao feng
7161f0a385 LXC: Setup disks for container on host side
Since mknod in container is forbidden, we should setup disks
on host side.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
2013-07-16 09:57:38 -06:00
Daniel P. Berrange
f45dbdb213 Add a couple of debug statements to LXC driver
When failing to start a container due to inaccessible root
filesystem path, we did not log any meaningful error. Add a
few debug statements to assist diagnosis

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-07-12 11:06:08 +01:00
Eric Blake
ee777e9949 util: make virSetUIDGID async-signal-safe
https://bugzilla.redhat.com/show_bug.cgi?id=964358

POSIX states that multi-threaded apps should not use functions
that are not async-signal-safe between fork and exec, yet we
were using getpwuid_r and initgroups.  Although rare, it is
possible to hit deadlock in the child, when it tries to grab
a mutex that was already held by another thread in the parent.
I actually hit this deadlock when testing multiple domains
being started in parallel with a command hook, with the following
backtrace in the child:

 Thread 1 (Thread 0x7fd56bbf2700 (LWP 3212)):
 #0  __lll_lock_wait ()
     at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
 #1  0x00007fd5761e7388 in _L_lock_854 () from /lib64/libpthread.so.0
 #2  0x00007fd5761e7257 in __pthread_mutex_lock (mutex=0x7fd56be00360)
     at pthread_mutex_lock.c:61
 #3  0x00007fd56bbf9fc5 in _nss_files_getpwuid_r (uid=0, result=0x7fd56bbf0c70,
     buffer=0x7fd55c2a65f0 "", buflen=1024, errnop=0x7fd56bbf25b8)
     at nss_files/files-pwd.c:40
 #4  0x00007fd575aeff1d in __getpwuid_r (uid=0, resbuf=0x7fd56bbf0c70,
     buffer=0x7fd55c2a65f0 "", buflen=1024, result=0x7fd56bbf0cb0)
     at ../nss/getXXbyYY_r.c:253
 #5  0x00007fd578aebafc in virSetUIDGID (uid=0, gid=0) at util/virutil.c:1031
 #6  0x00007fd578aebf43 in virSetUIDGIDWithCaps (uid=0, gid=0, capBits=0,
     clearExistingCaps=true) at util/virutil.c:1388
 #7  0x00007fd578a9a20b in virExec (cmd=0x7fd55c231f10) at util/vircommand.c:654
 #8  0x00007fd578a9dfa2 in virCommandRunAsync (cmd=0x7fd55c231f10, pid=0x0)
     at util/vircommand.c:2247
 #9  0x00007fd578a9d74e in virCommandRun (cmd=0x7fd55c231f10, exitstatus=0x0)
     at util/vircommand.c:2100
 #10 0x00007fd56326fde5 in qemuProcessStart (conn=0x7fd53c000df0,
     driver=0x7fd55c0dc4f0, vm=0x7fd54800b100, migrateFrom=0x0, stdin_fd=-1,
     stdin_path=0x0, snapshot=0x0, vmop=VIR_NETDEV_VPORT_PROFILE_OP_CREATE,
     flags=1) at qemu/qemu_process.c:3694
 ...

The solution is to split the work of getpwuid_r/initgroups into the
unsafe portions (getgrouplist, called pre-fork) and safe portions
(setgroups, called post-fork).

* src/util/virutil.h (virSetUIDGID, virSetUIDGIDWithCaps): Adjust
signature.
* src/util/virutil.c (virSetUIDGID): Add parameters.
(virSetUIDGIDWithCaps): Adjust clients.
* src/util/vircommand.c (virExec): Likewise.
* src/util/virfile.c (virFileAccessibleAs, virFileOpenForked)
(virDirCreate): Likewise.
* src/security/security_dac.c (virSecurityDACSetProcessLabel):
Likewise.
* src/lxc/lxc_container.c (lxcContainerSetID): Likewise.
* configure.ac (AC_CHECK_FUNCS_ONCE): Check for setgroups, not
initgroups.

Signed-off-by: Eric Blake <eblake@redhat.com>
2013-07-11 15:46:42 -06:00
John Ferlan
8283ef9ea2 testutils: Resolve Coverity issues
Recent changes uncovered a NEGATIVE_RETURNS in the return from sysconf()
when processing a for loop in virtTestCaptureProgramExecChild() in
testutils.c

Code review uncovered 3 other code paths with the same condition that
weren't found by Covirity, so fixed those as well.
2013-07-11 14:18:11 -04:00
Gao feng
46a46563ca LXC: remove some incorrect setting ATTRIBUTE_UNUSED
these parameters shouldn't be marked as ATTRIBUTE_UNUSED.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
2013-07-11 13:43:31 +02:00
Daniel P. Berrange
a4b57dfb9e Convert 'int i' to 'size_t i' in src/lxc/ files
Convert the type of loop iterators named 'i', 'j', k',
'ii', 'jj', 'kk', to be 'size_t' instead of 'int' or
'unsigned int', also santizing 'ii', 'jj', 'kk' to use
the normal 'i', 'j', 'k' naming

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-07-10 17:55:16 +01:00
Michal Privoznik
56965922ab Adapt to VIR_ALLOC and virAsprintf in src/lxc/* 2013-07-10 11:07:32 +02:00
Gao feng
468ee0bc4d LXC: hostdev: create parent directory for hostdev
Create parent directroy for hostdev automatically when we
start a lxc domain or attach a hostdev to a lxc domain.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
2013-07-09 11:16:20 +01:00
Gao feng
c0d8c7c885 LXC: hostdev: introduce lxcContainerSetupHostdevCapsMakePath
This helper function is used to create parent directory for
the hostdev which will be added to the container. If the
parent directory of this hostdev doesn't exist, the mknod of
the hostdev will fail. eg with /dev/net/tun

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
2013-07-09 11:15:11 +01:00
Gao feng
e7b3349f5a LXC: fix memory leak when userns configuration is incorrect
We forgot to free the stack when Kernel doesn't
support user namespace.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
2013-07-03 12:19:50 +01:00
Daniel P. Berrange
1165e39ca3 Add some misc debugging to LXC startup
Add some debug logging of LXC wait/continue messages
and uid/gid map update code.
2013-07-02 14:00:13 +01:00
Daniel P. Berrange
293f717028 Ignore failure to mount SELinux filesystem in container
User namespaces will deny the ability to mount the SELinux
filesystem. This is harmless for libvirt's LXC needs, so the
error can be ignored.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-07-02 14:00:13 +01:00
Gao feng
e1d32bb955 LXC: Creating devices for container on host side
user namespace doesn't allow to create devices in
uninit userns. We should create devices on host side.

We first mount tmpfs on dev directroy under state dir
of container. then create devices under this dev dir.

Finally in container, mount the dev directroy created
on host to the /dev/ directroy of container.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
2013-07-02 11:20:04 +01:00
Gao feng
9a085a228c LXC: introduce virLXCControllerSetupUserns and lxcContainerSetID
This patch introduces new helper function
virLXCControllerSetupUserns, in this function,
we set the files uid_map and gid_map of the init
task of container.

lxcContainerSetID is used for creating cred for
tasks running in container. Since after setuid/setgid,
we may be a new user. This patch calls lxcContainerSetUserns
at first to make sure the new created files belong to
right user.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
2013-07-02 11:20:04 +01:00
Gao feng
8b58336eec LXC: enable user namespace only when user set the uidmap
User namespace will be enabled only when the idmap exist
in configuration.

If you want disable user namespace,just remove these
elements from XML.

If kernel doesn't support user namespace and idmap exist
in configuration file, libvirt lxc will start failed and
return "Kernel doesn't support user namespace" message.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
2013-07-02 11:20:04 +01:00
Richard Weinberger
1133404c73 LXC: s/chroot/chdir in lxcContainerPivotRoot()
...fixes a trivial copy&paste error.

Signed-off-by: Richard Weinberger <richard@nod.at>
2013-06-14 11:24:41 +02:00
Daniel P. Berrange
61e672b23e Remove legacy code for single-instance devpts filesystem
Earlier commit f7e8653f dropped support for using LXC with
kernels having single-instance devpts filesystem from the
LXC controller. It forgot to remove the same code from the
LXC container setup.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-06-05 14:01:54 +01:00
Osier Yang
1ea88abd7e src/lxc: Remove the whitespace before ";" 2013-05-21 23:41:45 +08:00
Gao feng
eae1c286a1 LXC: remove unnecessary check on root filesystem
After commit c131525bec
"Auto-add a root <filesystem> element to LXC containers on startup"
for libvirt lxc, root must be existent.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
2013-05-20 12:45:01 -06:00
Daniel P. Berrange
63ea1e5432 Re-add selinux/selinux.h to lxc_container.c
Re-add the selinux header to lxc_container.c since other
functions now use it, beyond the patch that was just
reverted.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-05-17 10:59:25 +01:00