This allows a container-type domain to have exclusive access to one of
the host's NICs.
Wire <hostdev caps=net> with the lxc_controller - when moving the newly
created veth devices into a new namespace, also look for any hostdev
devices that should be moved. Note: once the container domain has been
destroyed, there is no code that moves the interfaces back to the
original namespace. This does happen, though, probably due to default
cleanup on namespace destruction.
Signed-off-by: Bogdan Purcareata <bogdan.purcareata@freescale.com>
Currently the LXC container code has two codepaths, depending on
whether there is a <filesystem> element with a target path of '/'.
If we automatically add a <filesystem> device with src=/ and dst=/,
for any container which has not specified a root filesystem, then
we only need one codepath for setting up the filesystem.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
For a root filesystem with type=file or type=block, the LXC
container was forgetting to actually mount it, before doing
the pivot root step.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently the lxc controller sets up the devpts instance on
$rootfsdef->src, but this only works if $rootfsdef is using
type=mount. To support type=block or type=file for the root
filesystem, we must use /var/lib/libvirt/lxc/$NAME.devpts
for the temporary devpts mount in the controller
Instead of using /var/lib/libvirt/lxc/$NAME for the FUSE
filesystem, use /var/lib/libvirt/lxc/$NAME.fuse. This allows
room for other temporary mounts in the same directory
In the LXC container startup code when switching stdio
streams, we call VIR_FORCE_CLOSE on all FDs. This triggers
a huge number of warnings, but we don't see them because
stdio is closed at this point. strace() however shows them
which can confuse people debugging the code. Switch to
VIR_MASS_CLOSE to avoid this
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
libvirt lxc will fail to start when selinux is disabled.
error: Failed to start domain noroot
error: internal error guest failed to start: PATH=/bin:/sbin TERM=linux container=lxc-libvirt container_uuid=b9873916-3516-c199-8112-1592ff694a9e LIBVIRT_LXC_UUID=b9873916-3516-c199-8112-1592ff694a9e LIBVIRT_LXC_NAME=noroot /bin/sh
2013-01-09 11:04:05.384+0000: 1: info : libvirt version: 1.0.1
2013-01-09 11:04:05.384+0000: 1: error : lxcContainerMountBasicFS:546 : Failed to mkdir /sys/fs/selinux: No such file or directory
2013-01-09 11:04:05.384+0000: 7536: info : libvirt version: 1.0.1
2013-01-09 11:04:05.384+0000: 7536: error : virLXCControllerRun:1466 : error receiving signal from container: Input/output error
2013-01-09 11:04:05.404+0000: 7536: error : virCommandWait:2287 : internal error Child process (ip link del veth1) unexpected exit status 1: Cannot find device "veth1"
fix this problem by checking if selinuxfs is mounted
in host before we try to create dir /sys/fs/selinux.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
when we has no host's src mapped to container.
there is no .oldroot dir,so libvirt lxc will fail
to start when mouting meminfo.
in this case,the parameter srcprefix of function
lxcContainerMountProcFuse should be NULL.and make
this method handle NULL correctly.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Convert the host capabilities and domain config structs to
use the virArch datatype. Update the parsers and all drivers
to take account of datatype change
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This extends support for host device passthrough with LXC to
cover misc devices. In this case all we need todo is a
mknod in the container's /dev and whitelist the device in
cgroups
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This extends support for host device passthrough with LXC to
cover storage devices. In this case all we need todo is a
mknod in the container's /dev and whitelist the device in
cgroups
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This adds support for host device passthrough with the
LXC driver. Since there is only a single kernel image,
it doesn't make sense to pass through PCI devices, but
USB devices are fine. For the latter we merely need to
make the /dev/bus/usb/NNN/MMM character device exist
in the container's /dev
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently LXC guests can be given arbitrary pre-mounted
filesystems, however, for some usecases it is more appropriate
to provide block devices which the container can mount itself.
This first impl only allows for <disk type='block'>, in other
words exposing a host disk device to a container. Since LXC
does not have device namespace virtualization, we are cheating
a little bit. If the XML specifies /dev/sdc4 to be given to
the container as /dev/sda1, when we do the mknod /dev/sda1
in the container's /dev, we actually use the major:minor
number of /dev/sdc4, not /dev/sda1.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
we already have virtualize meminfo for container through fuse filesystem,
add function lxcContainerMountProcFuse to mount this meminfo file to
the container's /proc/meminfo.
So we can isolate container's /proc/meminfo from host now.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Currently the lxcContainerSetupMounts method uses the
virSecurityManagerPtr instance to obtain the mount options
string and then only passes the string down into methods
it calls. As functionality in LXC grows though, those
methods need to have direct access to the virSecurityManagerPtr
instance. So push the code down a level.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The impls of virSecurityManagerGetMountOptions had no way to
return errors, since the code was treating 'NULL' as a success
value. This is somewhat pointless, since the calling code did
not want NULL in the first place and has to translate it into
the empty string "". So change the code so that the impls can
return "" directly, allowing use of NULL for error reporting
once again
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The libvirt coding standard is to use 'function(...args...)'
instead of 'function (...args...)'. A non-trivial number of
places did not follow this rule and are fixed in this patch.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This needs to be done before the container starts. Turning
off the mknod capability is noticed by systemd, which will
no longer attempt to create device nodes.
This eliminates SELinux AVC messages and ugly failure messages in the journal.
Continue consolidation of process functions by moving some
helpers out of command.{c,h} into virprocess.{c,h}
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Some kernel versions (at least RHEL-6 2.6.32) do not let you over-mount
an existing selinuxfs instance with a new one. Thus we must unmount the
existing instance inside our namespace.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
https://www.gnu.org/licenses/gpl-howto.html recommends that
the 'If not, see <url>.' phrase be a separate sentence.
* tests/securityselinuxhelper.c: Remove doubled line.
* tests/securityselinuxtest.c: Likewise.
* globally: s/; If/. If/
The introduction of /sys/fs/cgroup came in fairly recent kernels.
Prior to that time distros would pick a custom directory like
/cgroup or /dev/cgroup. We need to auto-detect where this is,
rather than hardcoding it
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Otherwise, a build may fail with:
lxc/lxc_conatiner.c: In function 'lxcContainerDropCapabilities':
lxc/lxc_container.c:1662:46: error: unused parameter 'keepReboot' [-Werror=unused-parameter]
* src/lxc/lxc_container.c (lxcContainerDropCapabilities): Mark
parameter unused.
Check whether the reboot() system call is virtualized, and if
it is, then allow the container to keep CAP_SYS_REBOOT.
Based on an original patch by Serge Hallyn
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Per the FSF address could be changed from time to time, and GNU
recommends the following now: (http://www.gnu.org/licenses/gpl-howto.html)
You should have received a copy of the GNU General Public License
along with Foobar. If not, see <http://www.gnu.org/licenses/>.
This patch removes the explicit FSF address, and uses above instead
(of course, with inserting 'Lesser' before 'General').
Except a bunch of files for security driver, all others are changed
automatically, the copyright for securify files are not complete,
that's why to do it manually:
src/security/security_selinux.h
src/security/security_driver.h
src/security/security_selinux.c
src/security/security_apparmor.h
src/security/security_apparmor.c
src/security/security_driver.c
Basically within a Secure Linux Container (virt-sandbox) we want all content
that the process within the container can write to be labeled the same. We
are labeling the physical disk correctly but when we create "RAM" based file
systems
libvirt is not labeling them, and they are defaulting to tmpfs_t, which will
will not allow the processes to write. This patch labels the RAM based file
systems correctly.
Previous commits added code to unmount the existing /proc,
/sys and /dev hierarchies on the root filesystem of the
container. This should only have been done if the container's
root filesystem was the same as the host's root. ie if
the root source is '/'. As it is, this causes LXC containersr
to fail to start if their root source is not '/'
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Move the veth device name state into the virLXCControllerPtr
object and stop passing it around. Also use size_t instead
of unsigned int for the array length parameters.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Since we are mounting a new /dev in the container, we must
remove any sub-mounts like /dev/shm, /dev/mqueue, etc,
otherwise they'll be recorded in /proc/mounts, but not be
accessible to applications.
Currently libvirt-lxc checks to see if the destination exists and is a
directory. If it is not a directory then the mount fails. Since
libvirt-lxc can bind mount files on an inode, this patch is needed to
allow us to bind mount files on files. Currently we want to bind mount
on top of /etc/machine-id, and /etc/adjtime
If the destination of the mount point does not exists, it checks if the
src is a directory and then attempts to create a directory, otherwise it
creates an empty file for the destination. The code will then bind mount
over the destination.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently you can configure LXC to bind a host directory to
a guest directory, but not to bind a guest directory to a
guest directory. While the guest container init could do
this itself, allowing it in the libvirt XML means a stricter
SELinux policy can be written
Introduce a new syntax for filesystems to allow use of a RAM
filesystem
<filesystem type='ram'>
<source usage='10' units='MiB'/>
<target dir='/mnt'/>
</filesystem>
The usage units default to KiB to limit consumption of host memory.
* docs/formatdomain.html.in: Document new syntax
* docs/schemas/domaincommon.rng: Add new attributes
* src/conf/domain_conf.c: Parsing/formatting of RAM filesystems
* src/lxc/lxc_container.c: Mounting of RAM filesystems
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>