docs: Convert 'drvlxc' page to rST

Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Erik Skultety <eskultet@redhat.com>
2025-03-03 23:53:51 +00:00 · 2022-03-10 17:57:53 +01:00 · 2022-03-10 17:57:53 +01:00 · 05a514b0b3
commit 05a514b0b3
parent c4611b327e
3 changed files with 671 additions and 823 deletions
--- a/docs/drvlxc.html.in
+++ b/docs/drvlxc.html.in
@ -1,822 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!DOCTYPE html>
-<html xmlns="http://www.w3.org/1999/xhtml">
-  <body>
-    <h1>LXC container driver</h1>
-
-    <ul id="toc"></ul>
-
-<p>
-The libvirt LXC driver manages "Linux Containers". At their simplest, containers
-can just be thought of as a collection of processes, separated from the main
-host processes via a set of resource namespaces and constrained via control
-groups resource tunables. The libvirt LXC driver has no dependency on the LXC
-userspace tools hosted on sourceforge.net. It directly utilizes the relevant
-kernel features to build the container environment. This allows for sharing
-of many libvirt technologies across both the QEMU/KVM and LXC drivers. In
-particular sVirt for mandatory access control, auditing of operations,
-integration with control groups and many other features.
-</p>
-
-<h2><a id="cgroups">Control groups Requirements</a></h2>
-
-<p>
-In order to control the resource usage of processes inside containers, the
-libvirt LXC driver requires that certain cgroups controllers are mounted on
-the host OS. The minimum required controllers are 'cpuacct', 'memory' and
-'devices', while recommended extra controllers are 'cpu', 'freezer' and
-'blkio'. Libvirt will not mount the cgroups filesystem itself, leaving
-this up to the init system to take care of. Systemd will do the right thing
-in this respect, while for other init systems the <code>cgconfig</code>
-init service will be required. For further information, consult the general
-libvirt <a href="cgroups.html">cgroups documentation</a>.
-</p>
-
-<h2><a id="namespaces">Namespace requirements</a></h2>
-
-<p>
-In order to separate processes inside a container from those in the
-primary "host" OS environment, the libvirt LXC driver requires that
-certain kernel namespaces are compiled in. Libvirt currently requires
-the 'mount', 'ipc', 'pid', and 'uts' namespaces to be available. If
-separate network interfaces are desired, then the 'net' namespace is
-required. If the guest configuration declares a
-<a href="formatdomain.html#elementsOSContainer">UID or GID mapping</a>,
-the 'user' namespace will be enabled to apply these. <strong>A suitably
-configured UID/GID mapping is a pre-requisite to making containers
-secure, in the absence of sVirt confinement.</strong>
-</p>
-
-<h2><a id="init">Default container setup</a></h2>
-
-<h3><a id="cliargs">Command line arguments</a></h3>
-
-<p>
-When the container "init" process is started, it will typically
-not be given any command line arguments (eg the equivalent of
-the bootloader args visible in <code>/proc/cmdline</code>). If
-any arguments are desired, then must be explicitly set in the
-container XML configuration via one or more <code>initarg</code>
-elements. For example, to run <code>systemd --unit emergency.service</code>
-would use the following XML
-</p>
-
-<pre>
-&lt;os&gt;
-  &lt;type arch='x86_64'&gt;exe&lt;/type&gt;
-  &lt;init&gt;/bin/systemd&lt;/init&gt;
-  &lt;initarg&gt;--unit&lt;/initarg&gt;
-  &lt;initarg&gt;emergency.service&lt;/initarg&gt;
-&lt;/os&gt;
-</pre>
-
-<h3><a id="envvars">Environment variables</a></h3>
-
-<p>
-When the container "init" process is started, it will be given several useful
-environment variables. The following standard environment variables are mandated
-by <a href="https://www.freedesktop.org/wiki/Software/systemd/ContainerInterface">systemd container interface</a>
-to be provided by all container technologies on Linux.
-</p>
-
-<dl>
-<dt><code>container</code></dt>
-<dd>The fixed string <code>libvirt-lxc</code> to identify libvirt as the creator</dd>
-<dt><code>container_uuid</code></dt>
-<dd>The UUID assigned to the container by libvirt</dd>
-<dt><code>PATH</code></dt>
-<dd>The fixed string <code>/bin:/usr/bin</code></dd>
-<dt><code>TERM</code></dt>
-<dd>The fixed string <code>linux</code></dd>
-<dt><code>HOME</code></dt>
-<dd>The fixed string <code>/</code></dd>
-</dl>
-
-<p>
-In addition to the standard variables, the following libvirt specific
-environment variables are also provided
-</p>
-
-<dl>
-<dt><code>LIBVIRT_LXC_NAME</code></dt>
-<dd>The name assigned to the container by libvirt</dd>
-<dt><code>LIBVIRT_LXC_UUID</code></dt>
-<dd>The UUID assigned to the container by libvirt</dd>
-<dt><code>LIBVIRT_LXC_CMDLINE</code></dt>
-<dd>The unparsed command line arguments specified in the container configuration.
-Use of this is discouraged, in favour of passing arguments directly to the
-container init process via the <code>initarg</code> config element.</dd>
-</dl>
-
-<h3><a id="fsmounts">Filesystem mounts</a></h3>
-
-<p>
-In the absence of any explicit configuration, the container will
-inherit the host OS filesystem mounts. A number of mount points will
-be made read only, or re-mounted with new instances to provide
-container specific data. The following special mounts are setup
-by libvirt
-</p>
-
-<ul>
-<li><code>/dev</code> a new "tmpfs" pre-populated with authorized device nodes</li>
-<li><code>/dev/pts</code> a new private "devpts" instance for console devices</li>
-<li><code>/sys</code> the host "sysfs" instance remounted read-only</li>
-<li><code>/proc</code> a new instance of the "proc" filesystem</li>
-<li><code>/proc/sys</code> the host "/proc/sys" bind-mounted read-only</li>
-<li><code>/sys/fs/selinux</code> the host "selinux" instance remounted read-only</li>
-<li><code>/sys/fs/cgroup/NNNN</code> the host cgroups controllers bind-mounted to
-only expose the sub-tree associated with the container</li>
-<li><code>/proc/meminfo</code> a FUSE backed file reflecting memory limits of the container</li>
-</ul>
-
-
-<h3><a id="devnodes">Device nodes</a></h3>
-
-<p>
-The container init process will be started with <code>CAP_MKNOD</code>
-capability removed and blocked from re-acquiring it. As such it will
-not be able to create any device nodes in <code>/dev</code> or anywhere
-else in its filesystems. Libvirt itself will take care of pre-populating
-the <code>/dev</code> filesystem with any devices that the container
-is authorized to use. The current devices that will be made available
-to all containers are
-</p>
-
-<ul>
-<li><code>/dev/zero</code></li>
-<li><code>/dev/null</code></li>
-<li><code>/dev/full</code></li>
-<li><code>/dev/random</code></li>
-<li><code>/dev/urandom</code></li>
-<li><code>/dev/stdin</code> symlinked to <code>/proc/self/fd/0</code></li>
-<li><code>/dev/stdout</code> symlinked to <code>/proc/self/fd/1</code></li>
-<li><code>/dev/stderr</code> symlinked to <code>/proc/self/fd/2</code></li>
-<li><code>/dev/fd</code> symlinked to <code>/proc/self/fd</code></li>
-<li><code>/dev/ptmx</code> symlinked to <code>/dev/pts/ptmx</code></li>
-<li><code>/dev/console</code> symlinked to <code>/dev/pts/0</code></li>
-</ul>
-
-<p>
-In addition, for every console defined in the guest configuration,
-a symlink will be created from <code>/dev/ttyN</code> symlinked to
-the corresponding <code>/dev/pts/M</code> pseudo TTY device. The
-first console will be <code>/dev/tty1</code>, with further consoles
-numbered incrementally from there.
-</p>
-
-<p>
-Since /dev/ttyN and /dev/console are linked to the pts devices. The
-tty device of login program is pts device. The pam module securetty
-may prevent root user from logging in container. If you want root
-user to log in container successfully, add the pts device to the file
-/etc/securetty of container.
-</p>
-
-<p>
-Further block or character devices will be made available to containers
-depending on their configuration.
-</p>
-
-<h2><a id="security">Security considerations</a></h2>
-
-<p>
-The libvirt LXC driver is fairly flexible in how it can be configured,
-and as such does not enforce a requirement for strict security
-separation between a container and the host. This allows it to be used
-in scenarios where only resource control capabilities are important,
-and resource sharing is desired. Applications wishing to ensure secure
-isolation between a container and the host must ensure that they are
-writing a suitable configuration.
-</p>
-
-<h3><a id="securenetworking">Network isolation</a></h3>
-
-<p>
-If the guest configuration does not list any network interfaces,
-the <code>network</code> namespace will not be activated, and thus
-the container will see all the host's network interfaces. This will
-allow apps in the container to bind to/connect from TCP/UDP addresses
-and ports from the host OS. It also allows applications to access
-UNIX domain sockets associated with the host OS, which are in the
-abstract namespace. If access to UNIX domains sockets in the abstract
-namespace is not wanted, then applications should set the
-<code>&lt;privnet/&gt;</code> flag in the
-<code>&lt;features&gt;....&lt;/features&gt;</code> element.
-</p>
-
-<h3><a id="securefs">Filesystem isolation</a></h3>
-
-<p>
-If the guest configuration does not list any filesystems, then
-the container will be set up with a root filesystem that matches
-the host's root filesystem. As noted earlier, only a few locations
-such as <code>/dev</code>, <code>/proc</code> and <code>/sys</code>
-will be altered. This means that, in the absence of restrictions
-from sVirt, a process running as user/group N:M inside the container
-will be able to access almost exactly the same files as a process
-running as user/group N:M in the host.
-</p>
-
-<p>
-There are multiple options for restricting this. It is possible to
-simply map the existing root filesystem through to the container in
-read-only mode. Alternatively a completely separate root filesystem
-can be configured for the guest. In both cases, further sub-mounts
-can be applied to customize the content that is made visible. Note
-that in the absence of sVirt controls, it is still possible for the
-root user in a container to unmount any sub-mounts applied. The user
-namespace feature can also be used to restrict access to files based
-on the UID/GID mappings.
-</p>
-
-<p>
-Sharing the host filesystem tree, also allows applications to access
-UNIX domains sockets associated with the host OS, which are in the
-filesystem namespaces. It should be noted that a number of init
-systems including at least <code>systemd</code> and <code>upstart</code>
-have UNIX domain socket which are used to control their operation.
-Thus, if the directory/filesystem holding their UNIX domain socket is
-exposed to the container, it will be possible for a user in the container
-to invoke operations on the init service in the same way it could if
-outside the container. This also applies to other applications in the
-host which use UNIX domain sockets in the filesystem, such as DBus,
-Libvirtd, and many more. If this is not desired, then applications
-should either specify the UID/GID mapping in the configuration to
-enable user namespaces and thus block access to the UNIX domain socket
-based on permissions, or should ensure the relevant directories have
-a bind mount to hide them. This is particularly important for the
-<code>/run</code> or <code>/var/run</code> directories.
-</p>
-
-
-<h3><a id="secureusers">User and group isolation</a></h3>
-
-<p>
-If the guest configuration does not list any ID mapping, then the
-user and group IDs used inside the container will match those used
-outside the container. In addition, the capabilities associated with
-a process in the container will infer the same privileges they would
-for a process in the host. This has obvious implications for security,
-since a root user inside the container will be able to access any
-file owned by root that is visible to the container, and perform more
-or less any privileged kernel operation. In the absence of additional
-protection from sVirt, this means that the root user inside a container
-is effectively as powerful as the root user in the host. There is no
-security isolation of the root user.
-</p>
-
-<p>
-The ID mapping facility was introduced to allow for stricter control
-over the privileges of users inside the container. It allows apps to
-define rules such as "user ID 0 in the container maps to user ID 1000
-in the host". In addition the privileges associated with capabilities
-are somewhat reduced so that they cannot be used to escape from the
-container environment. A full description of user namespaces is outside
-the scope of this document, however LWN has
-<a href="https://lwn.net/Articles/532593/">a good write-up on the topic</a>.
-From the libvirt point of view, the key thing to remember is that defining
-an ID mapping for users and groups in the container XML configuration
-causes libvirt to activate the user namespace feature.
-</p>
-
-
-<h2><a id="configFiles">Location of configuration files</a></h2>
-
-<p>
-The LXC driver comes with sane default values. However, during its
-initialization it reads a configuration file which offers system
-administrator to override some of that default. The file is located
-under <code>/etc/libvirt/lxc.conf</code>
-</p>
-
-
-<h2><a id="activation">Systemd Socket Activation Integration</a></h2>
-
-<p>
-The libvirt LXC driver provides the ability to pass across pre-opened file
-descriptors when starting LXC guests. This allows for libvirt LXC to support
-systemd's <a href="http://0pointer.de/blog/projects/socket-activated-containers.html">socket
-activation capability</a>, where an incoming client connection
-in the host OS will trigger the startup of a container, which runs another
-copy of systemd which gets passed the server socket, and then activates the
-actual service handler in the container.
-</p>
-
-<p>
-Let us assume that you already have a LXC guest created, running
-a systemd instance as PID 1 inside the container, which has an
-SSHD service configured. The goal is to automatically activate
-the container when the first SSH connection is made. The first
-step is to create a couple of unit files for the host OS systemd
-instance. The <code>/etc/systemd/system/mycontainer.service</code>
-unit file specifies how systemd will start the libvirt LXC container
-</p>
-
-<pre>
-[Unit]
-Description=My little container
-
-[Service]
-ExecStart=/usr/bin/virsh -c lxc:///system start --pass-fds 3 mycontainer
-ExecStop=/usr/bin/virsh -c lxc:///system destroy mycontainer
-Type=oneshot
-RemainAfterExit=yes
-KillMode=none
-</pre>
-
-<p>
-The <code>--pass-fds 3</code> argument specifies that the file
-descriptor number 3 that <code>virsh</code> inherits from systemd,
-is to be passed into the container. Since <code>virsh</code> will
-exit immediately after starting the container, the <code>RemainAfterExit</code>
-and <code>KillMode</code> settings must be altered from their defaults.
-</p>
-
-<p>
-Next, the <code>/etc/systemd/system/mycontainer.socket</code> unit
-file is created to get the host systemd to listen on port 23 for
-TCP connections. When this unit file is activated by the first
-incoming connection, it will cause the <code>mycontainer.service</code>
-unit to be activated with the FD corresponding to the listening TCP
-socket passed in as FD 3.
-</p>
-
-<pre>
-[Unit]
-Description=The SSH socket of my little container
-
-[Socket]
-ListenStream=23
-</pre>
-
-<p>
-Port 23 was picked here so that the container doesn't conflict
-with the host's SSH which is on the normal port 22. That's it
-in terms of host side configuration.
-</p>
-
-<p>
-Inside the container, the <code>/etc/systemd/system/sshd.socket</code>
-unit file must be created
-</p>
-
-<pre>
-[Unit]
-Description=SSH Socket for Per-Connection Servers
-
-[Socket]
-ListenStream=23
-Accept=yes
-</pre>
-
-<p>
-The <code>ListenStream</code> value listed in this unit file, must
-match the value used in the host file. When systemd in the container
-receives the pre-opened FD from libvirt during container startup, it
-looks at the <code>ListenStream</code> values to figure out which
-FD to give to which service. The actual service to start is defined
-by a correspondingly named <code>/etc/systemd/system/sshd@.service</code>
-</p>
-
-<pre>
-[Unit]
-Description=SSH Per-Connection Server for %I
-
-[Service]
-ExecStart=-/usr/sbin/sshd -i
-StandardInput=socket
-</pre>
-
-<p>
-Finally, make sure this SSH service is set to start on boot of the container,
-by running the following command inside the container:
-</p>
-
-<pre>
-# mkdir -p /etc/systemd/system/sockets.target.wants/
-# ln -s /etc/systemd/system/sshd.socket /etc/systemd/system/sockets.target.wants/
-</pre>
-
-<p>
-This example shows how to activate the container based on an incoming
-SSH connection. If the container was also configured to have an httpd
-service, it may be desirable to activate it upon either an httpd or a
-sshd connection attempt. In this case, the <code>mycontainer.socket</code>
-file in the host would simply list multiple socket ports. Inside the
-container a separate <code>xxxxx.socket</code> file would need to be
-created for each service, with a corresponding <code>ListenStream</code>
-value set.
-</p>
-
-<!--
-<h2>Container configuration</h2>
-
-<h3>Init process</h3>
-
-<h3>Console devices</h3>
-
-<h3>Filesystem devices</h3>
-
-<h3>Disk devices</h3>
-
-<h3>Block devices</h3>
-
-<h3>USB devices</h3>
-
-<h3>Character devices</h3>
-
-<h3>Network devices</h3>
-->
-
-<h2>Container security</h2>
-
-<h3>sVirt SELinux</h3>
-
-<p>
-In the absence of the "user" namespace being used, containers cannot
-be considered secure against exploits of the host OS. The sVirt SELinux
-driver provides a way to secure containers even when the "user" namespace
-is not used. The cost is that writing a policy to allow execution of
-arbitrary OS is not practical. The SELinux sVirt policy is typically
-tailored to work with a simpler application confinement use case,
-as provided by the "libvirt-sandbox" project.
-</p>
-
-<h3>Auditing</h3>
-
-<p>
-The LXC driver is integrated with libvirt's auditing subsystem, which
-causes audit messages to be logged whenever there is an operation
-performed against a container which has impact on host resources.
-So for example, start/stop, device hotplug will all log audit messages
-providing details about what action occurred and any resources
-associated with it. There are the following 3 types of audit messages
-</p>
-
-<ul>
-<li><code>VIRT_MACHINE_ID</code> - details of the SELinux process and
-image security labels assigned to the container.</li>
-<li><code>VIRT_CONTROL</code> - details of an action / operation
-performed against a container. There are the following types of
-operation
-  <ul>
-    <li><code>op=start</code> - a container has been started. Provides
-      the machine name, uuid and PID of the <code>libvirt_lxc</code>
-      controller process</li>
-    <li><code>op=init</code> - the init PID of the container has been
-      started. Provides the machine name, uuid and PID of the
-      <code>libvirt_lxc</code> controller process and PID of the
-      init process (in the host PID namespace)</li>
-    <li><code>op=stop</code> - a container has been stopped. Provides
-      the machine name, uuid</li>
-  </ul>
-</li>
-<li><code>VIRT_RESOURCE</code> - details of a host resource
-associated with a container action.</li>
-</ul>
-
-<h3>Device access</h3>
-
-<p>
-All containers are launched with the CAP_MKNOD capability cleared
-and removed from the bounding set. Libvirt will ensure that the
-/dev filesystem is pre-populated with all devices that a container
-is allowed to use. In addition, the cgroup "device" controller is
-configured to block read/write/mknod from all devices except those
-that a container is authorized to use.
-</p>
-
-<h2><a id="exconfig">Example configurations</a></h2>
-
-<h3>Example config version 1</h3>
-<p></p>
-<pre>
-&lt;domain type='lxc'&gt;
-  &lt;name&gt;vm1&lt;/name&gt;
-  &lt;memory&gt;500000&lt;/memory&gt;
-  &lt;os&gt;
-    &lt;type&gt;exe&lt;/type&gt;
-    &lt;init&gt;/bin/sh&lt;/init&gt;
-  &lt;/os&gt;
-  &lt;vcpu&gt;1&lt;/vcpu&gt;
-  &lt;clock offset='utc'/&gt;
-  &lt;on_poweroff&gt;destroy&lt;/on_poweroff&gt;
-  &lt;on_reboot&gt;restart&lt;/on_reboot&gt;
-  &lt;on_crash&gt;destroy&lt;/on_crash&gt;
-  &lt;devices&gt;
-    &lt;emulator&gt;/usr/libexec/libvirt_lxc&lt;/emulator&gt;
-    &lt;interface type='network'&gt;
-      &lt;source network='default'/&gt;
-    &lt;/interface&gt;
-    &lt;console type='pty' /&gt;
-  &lt;/devices&gt;
-&lt;/domain&gt;
-</pre>
-
-<p>
-In the &lt;emulator&gt; element, be sure you specify the correct path
-to libvirt_lxc, if it does not live in /usr/libexec on your system.
-</p>
-
-<p>
-The next example assumes there is a private root filesystem
-(perhaps hand-crafted using busybox, or installed from media,
-debootstrap, whatever) under /opt/vm-1-root:
-</p>
-<p></p>
-<pre>
-&lt;domain type='lxc'&gt;
-  &lt;name&gt;vm1&lt;/name&gt;
-  &lt;memory&gt;32768&lt;/memory&gt;
-  &lt;os&gt;
-    &lt;type&gt;exe&lt;/type&gt;
-    &lt;init&gt;/init&lt;/init&gt;
-  &lt;/os&gt;
-  &lt;vcpu&gt;1&lt;/vcpu&gt;
-  &lt;clock offset='utc'/&gt;
-  &lt;on_poweroff&gt;destroy&lt;/on_poweroff&gt;
-  &lt;on_reboot&gt;restart&lt;/on_reboot&gt;
-  &lt;on_crash&gt;destroy&lt;/on_crash&gt;
-  &lt;devices&gt;
-    &lt;emulator&gt;/usr/libexec/libvirt_lxc&lt;/emulator&gt;
-    &lt;filesystem type='mount'&gt;
-      &lt;source dir='/opt/vm-1-root'/&gt;
-      &lt;target dir='/'/&gt;
-    &lt;/filesystem&gt;
-    &lt;interface type='network'&gt;
-      &lt;source network='default'/&gt;
-    &lt;/interface&gt;
-    &lt;console type='pty' /&gt;
-  &lt;/devices&gt;
-&lt;/domain&gt;
-</pre>
-
-<h2><a id="capabilities">Altering the available capabilities</a></h2>
-
-<p>
-By default the libvirt LXC driver drops some capabilities among which CAP_MKNOD.
-However <span class="since">since 1.2.6</span> libvirt can be told to keep or
-drop some capabilities using a domain configuration like the following:
-</p>
-<pre>
-...
-&lt;features&gt;
-  &lt;capabilities policy='default'&gt;
-    &lt;mknod state='on'/&gt;
-    &lt;sys_chroot state='off'/&gt;
-  &lt;/capabilities&gt;
-&lt;/features&gt;
-...
-</pre>
-<p>
-The capabilities children elements are named after the capabilities as defined in
-<code>man 7 capabilities</code>. An <code>off</code> state tells libvirt to drop the
-capability, while an <code>on</code> state will force to keep the capability even though
-this one is dropped by default.
-</p>
-<p>
-The <code>policy</code> attribute can be one of <code>default</code>, <code>allow</code>
-or <code>deny</code>. It defines the default rules for capabilities: either keep the
-default behavior that is dropping a few selected capabilities, or keep all capabilities
-or drop all capabilities. The interest of <code>allow</code> and <code>deny</code> is that
-they guarantee that all capabilities will be kept (or removed) even if new ones are added
-later.
-</p>
-<p>
-The following example, drops all capabilities but CAP_MKNOD:
-</p>
-<pre>
-...
-&lt;features&gt;
-  &lt;capabilities policy='deny'&gt;
-    &lt;mknod state='on'/&gt;
-  &lt;/capabilities&gt;
-&lt;/features&gt;
-...
-</pre>
-<p>
-Note that allowing capabilities that are normally dropped by default can seriously
-affect the security of the container and the host.
-</p>
-
-<h2><a id="share">Inherit namespaces</a></h2>
-
-<p>
-Libvirt allows you to inherit the namespace from container/process just like lxc tools
-or docker provides to share the network namespace. The following can be used to share
-required namespaces. If we want to share only one then the other namespaces can be ignored.
-The netns option is specific to sharenet. It can be used in cases we want to use existing network namespace
-rather than creating new network namespace for the container. In this case privnet option will be
-ignored.
-</p>
-<pre>
-&lt;domain type='lxc' xmlns:lxc='http://libvirt.org/schemas/domain/lxc/1.0'&gt;
-...
-&lt;lxc:namespace&gt;
-  &lt;lxc:sharenet type='netns' value='red'/&gt;
-  &lt;lxc:shareuts type='name' value='container1'/&gt;
-  &lt;lxc:shareipc type='pid' value='12345'/&gt;
-&lt;/lxc:namespace&gt;
-&lt;/domain&gt;
-</pre>
-
-<p>
-The use of namespace passthrough requires libvirt >= 1.2.19
-</p>
-
-<h2><a id="usage">Container usage / management</a></h2>
-
-<p>
-As with any libvirt virtualization driver, LXC containers can be
-managed via a wide variety of libvirt based tools. At the lowest
-level the <code>virsh</code> command can be used to perform many
-tasks, by passing the <code>-c lxc:///system</code> argument. As an
-alternative to repeating the URI with every command, the <code>LIBVIRT_DEFAULT_URI</code>
-environment variable can be set to <code>lxc:///system</code>. The
-examples that follow outline some common operations with virsh
-and LXC. For further details about usage of virsh consult its
-manual page.
-</p>
-
-<h3><a id="usageSave">Defining (saving) container configuration</a></h3>
-
-<p>
-The <code>virsh define</code> command takes an XML configuration
-document and loads it into libvirt, saving the configuration on disk
-</p>
-
-<pre>
-# virsh -c lxc:///system define myguest.xml
-</pre>
-
-<h3><a id="usageView">Viewing container configuration</a></h3>
-
-<p>
-The <code>virsh dumpxml</code> command can be used to view the
-current XML configuration of a container. By default the XML
-output reflects the current state of the container. If the
-container is running, it is possible to explicitly request the
-persistent configuration, instead of the current live configuration
-using the <code>--inactive</code> flag
-</p>
-
-<pre>
-# virsh -c lxc:///system dumpxml myguest
-</pre>
-
-<h3><a id="usageStart">Starting containers</a></h3>
-
-<p>
-The <code>virsh start</code> command can be used to start a
-container from a previously defined persistent configuration
-</p>
-
-<pre>
-# virsh -c lxc:///system start myguest
-</pre>
-
-<p>
-It is also possible to start so called "transient" containers,
-which do not require a persistent configuration to be saved
-by libvirt, using the <code>virsh create</code> command.
-</p>
-
-<pre>
-# virsh -c lxc:///system create myguest.xml
-</pre>
-
-
-<h3><a id="usageStop">Stopping containers</a></h3>
-
-<p>
-The <code>virsh shutdown</code> command can be used
-to request a graceful shutdown of the container. By default
-this command will first attempt to send a message to the
-init process via the <code>/dev/initctl</code> device node.
-If no such device node exists, then it will send SIGTERM
-to PID 1 inside the container.
-</p>
-
-<pre>
-# virsh -c lxc:///system shutdown myguest
-</pre>
-
-<p>
-If the container does not respond to the graceful shutdown
-request, it can be forcibly stopped using the <code>virsh destroy</code>
-</p>
-
-<pre>
-# virsh -c lxc:///system destroy myguest
-</pre>
-
-
-<h3><a id="usageReboot">Rebooting a container</a></h3>
-
-<p>
-The <code>virsh reboot</code> command can be used
-to request a graceful shutdown of the container. By default
-this command will first attempt to send a message to the
-init process via the <code>/dev/initctl</code> device node.
-If no such device node exists, then it will send SIGHUP
-to PID 1 inside the container.
-</p>
-
-<pre>
-# virsh -c lxc:///system reboot myguest
-</pre>
-
-<h3><a id="usageDelete">Undefining (deleting) a container configuration</a></h3>
-
-<p>
-The <code>virsh undefine</code> command can be used to delete the
-persistent configuration of a container. If the guest is currently
-running, this will turn it into a "transient" guest.
-</p>
-
-<pre>
-# virsh -c lxc:///system undefine myguest
-</pre>
-
-<h3><a id="usageConnect">Connecting to a container console</a></h3>
-
-<p>
-The <code>virsh console</code> command can be used to connect
-to the text console associated with a container.
-</p>
-
-<pre>
-# virsh -c lxc:///system console myguest
-</pre>
-
-<p>
-If the container has been configured with multiple console devices,
-then the <code>--devname</code> argument can be used to choose the
-console to connect to.
-In LXC, multiple consoles will be named
-as 'console0', 'console1', 'console2', etc.
-</p>
-
-<pre>
-# virsh -c lxc:///system console myguest --devname console1
-</pre>
-
-<h3><a id="usageEnter">Running commands in a container</a></h3>
-
-<p>
-The <code>virsh lxc-enter-namespace</code> command can be used
-to enter the namespaces and security context of a container
-and then execute an arbitrary command.
-</p>
-
-<pre>
-# virsh -c lxc:///system lxc-enter-namespace myguest -- /bin/ls -al /dev
-</pre>
-
-<h3><a id="usageTop">Monitoring container utilization</a></h3>
-
-<p>
-The <code>virt-top</code> command can be used to monitor the
-activity and resource utilization of all containers on a
-host
-</p>
-
-<pre>
-# virt-top -c lxc:///system
-</pre>
-
-<h3><a id="usageConvert">Converting LXC container configuration</a></h3>
-
-<p>
-The <code>virsh domxml-from-native</code> command can be used to convert
-most of the LXC container configuration into a domain XML fragment
-</p>
-
-<pre>
-# virsh -c lxc:///system domxml-from-native lxc-tools /var/lib/lxc/myguest/config
-</pre>
-
-<p>
-This conversion has some limitations due to the fact that the
-domxml-from-native command output has to be independent of the host. Here
-are a few things to take care of before converting:
-</p>
-
-<ul>
-<li>
-Replace the fstab file referenced by <tt>lxc.mount</tt> by the corresponding
-lxc.mount.entry lines.
-</li>
-<li>
-Replace all relative sizes of tmpfs mount entries to absolute sizes. Also
-make sure that tmpfs entries all have a size option (default is 50%).
-</li>
-<li>
-Define <tt>lxc.cgroup.memory.limit_in_bytes</tt> to properly limit the memory
-available to the container. The conversion will use 64MiB as the default.
-</li>
-</ul>
-
-  </body>
-</html>
--- a/docs/drvlxc.rst
+++ b/docs/drvlxc.rst
@ -0,0 +1,670 @@
+.. role:: since
+
+====================
+LXC container driver
+====================
+
+.. contents::
+
+The libvirt LXC driver manages "Linux Containers". At their simplest, containers
+can just be thought of as a collection of processes, separated from the main
+host processes via a set of resource namespaces and constrained via control
+groups resource tunables. The libvirt LXC driver has no dependency on the LXC
+userspace tools hosted on sourceforge.net. It directly utilizes the relevant
+kernel features to build the container environment. This allows for sharing of
+many libvirt technologies across both the QEMU/KVM and LXC drivers. In
+particular sVirt for mandatory access control, auditing of operations,
+integration with control groups and many other features.
+
+Control groups Requirements
+---------------------------
+
+In order to control the resource usage of processes inside containers, the
+libvirt LXC driver requires that certain cgroups controllers are mounted on the
+host OS. The minimum required controllers are 'cpuacct', 'memory' and 'devices',
+while recommended extra controllers are 'cpu', 'freezer' and 'blkio'. Libvirt
+will not mount the cgroups filesystem itself, leaving this up to the init system
+to take care of. Systemd will do the right thing in this respect, while for
+other init systems the ``cgconfig`` init service will be required. For further
+information, consult the general libvirt `cgroups
+documentation <cgroups.html>`__.
+
+Namespace requirements
+----------------------
+
+In order to separate processes inside a container from those in the primary
+"host" OS environment, the libvirt LXC driver requires that certain kernel
+namespaces are compiled in. Libvirt currently requires the 'mount', 'ipc',
+'pid', and 'uts' namespaces to be available. If separate network interfaces are
+desired, then the 'net' namespace is required. If the guest configuration
+declares a `UID or GID mapping <formatdomain.html#elementsOSContainer>`__, the
+'user' namespace will be enabled to apply these. **A suitably configured UID/GID
+mapping is a pre-requisite to making containers secure, in the absence of sVirt
+confinement.**
+
+Default container setup
+-----------------------
+
+Command line arguments
+~~~~~~~~~~~~~~~~~~~~~~
+
+When the container "init" process is started, it will typically not be given any
+command line arguments (eg the equivalent of the bootloader args visible in
+``/proc/cmdline``). If any arguments are desired, then must be explicitly set in
+the container XML configuration via one or more ``initarg`` elements. For
+example, to run ``systemd --unit emergency.service`` would use the following XML
+
+::
+
+   <os>
+     <type arch='x86_64'>exe</type>
+     <init>/bin/systemd</init>
+     <initarg>--unit</initarg>
+     <initarg>emergency.service</initarg>
+   </os>
+
+Environment variables
+~~~~~~~~~~~~~~~~~~~~~
+
+When the container "init" process is started, it will be given several useful
+environment variables. The following standard environment variables are mandated
+by `systemd container
+interface <https://www.freedesktop.org/wiki/Software/systemd/ContainerInterface>`__
+to be provided by all container technologies on Linux.
+
+``container``
+   The fixed string ``libvirt-lxc`` to identify libvirt as the creator
+``container_uuid``
+   The UUID assigned to the container by libvirt
+``PATH``
+   The fixed string ``/bin:/usr/bin``
+``TERM``
+   The fixed string ``linux``
+``HOME``
+   The fixed string ``/``
+
+In addition to the standard variables, the following libvirt specific
+environment variables are also provided
+
+``LIBVIRT_LXC_NAME``
+   The name assigned to the container by libvirt
+``LIBVIRT_LXC_UUID``
+   The UUID assigned to the container by libvirt
+``LIBVIRT_LXC_CMDLINE``
+   The unparsed command line arguments specified in the container configuration.
+   Use of this is discouraged, in favour of passing arguments directly to the
+   container init process via the ``initarg`` config element.
+
+Filesystem mounts
+~~~~~~~~~~~~~~~~~
+
+In the absence of any explicit configuration, the container will inherit the
+host OS filesystem mounts. A number of mount points will be made read only, or
+re-mounted with new instances to provide container specific data. The following
+special mounts are setup by libvirt
+
+-  ``/dev`` a new "tmpfs" pre-populated with authorized device nodes
+-  ``/dev/pts`` a new private "devpts" instance for console devices
+-  ``/sys`` the host "sysfs" instance remounted read-only
+-  ``/proc`` a new instance of the "proc" filesystem
+-  ``/proc/sys`` the host "/proc/sys" bind-mounted read-only
+-  ``/sys/fs/selinux`` the host "selinux" instance remounted read-only
+-  ``/sys/fs/cgroup/NNNN`` the host cgroups controllers bind-mounted to only
+   expose the sub-tree associated with the container
+-  ``/proc/meminfo`` a FUSE backed file reflecting memory limits of the
+   container
+
+Device nodes
+~~~~~~~~~~~~
+
+The container init process will be started with ``CAP_MKNOD`` capability removed
+and blocked from re-acquiring it. As such it will not be able to create any
+device nodes in ``/dev`` or anywhere else in its filesystems. Libvirt itself
+will take care of pre-populating the ``/dev`` filesystem with any devices that
+the container is authorized to use. The current devices that will be made
+available to all containers are
+
+-  ``/dev/zero``
+-  ``/dev/null``
+-  ``/dev/full``
+-  ``/dev/random``
+-  ``/dev/urandom``
+-  ``/dev/stdin`` symlinked to ``/proc/self/fd/0``
+-  ``/dev/stdout`` symlinked to ``/proc/self/fd/1``
+-  ``/dev/stderr`` symlinked to ``/proc/self/fd/2``
+-  ``/dev/fd`` symlinked to ``/proc/self/fd``
+-  ``/dev/ptmx`` symlinked to ``/dev/pts/ptmx``
+-  ``/dev/console`` symlinked to ``/dev/pts/0``
+
+In addition, for every console defined in the guest configuration, a symlink
+will be created from ``/dev/ttyN`` symlinked to the corresponding ``/dev/pts/M``
+pseudo TTY device. The first console will be ``/dev/tty1``, with further
+consoles numbered incrementally from there.
+
+Since /dev/ttyN and /dev/console are linked to the pts devices. The tty device
+of login program is pts device. The pam module securetty may prevent root user
+from logging in container. If you want root user to log in container
+successfully, add the pts device to the file /etc/securetty of container.
+
+Further block or character devices will be made available to containers
+depending on their configuration.
+
+Security considerations
+-----------------------
+
+The libvirt LXC driver is fairly flexible in how it can be configured, and as
+such does not enforce a requirement for strict security separation between a
+container and the host. This allows it to be used in scenarios where only
+resource control capabilities are important, and resource sharing is desired.
+Applications wishing to ensure secure isolation between a container and the host
+must ensure that they are writing a suitable configuration.
+
+Network isolation
+~~~~~~~~~~~~~~~~~
+
+If the guest configuration does not list any network interfaces, the ``network``
+namespace will not be activated, and thus the container will see all the host's
+network interfaces. This will allow apps in the container to bind to/connect
+from TCP/UDP addresses and ports from the host OS. It also allows applications
+to access UNIX domain sockets associated with the host OS, which are in the
+abstract namespace. If access to UNIX domains sockets in the abstract namespace
+is not wanted, then applications should set the ``<privnet/>`` flag in the
+``<features>....</features>`` element.
+
+Filesystem isolation
+~~~~~~~~~~~~~~~~~~~~
+
+If the guest configuration does not list any filesystems, then the container
+will be set up with a root filesystem that matches the host's root filesystem.
+As noted earlier, only a few locations such as ``/dev``, ``/proc`` and ``/sys``
+will be altered. This means that, in the absence of restrictions from sVirt, a
+process running as user/group N:M inside the container will be able to access
+almost exactly the same files as a process running as user/group N:M in the
+host.
+
+There are multiple options for restricting this. It is possible to simply map
+the existing root filesystem through to the container in read-only mode.
+Alternatively a completely separate root filesystem can be configured for the
+guest. In both cases, further sub-mounts can be applied to customize the content
+that is made visible. Note that in the absence of sVirt controls, it is still
+possible for the root user in a container to unmount any sub-mounts applied. The
+user namespace feature can also be used to restrict access to files based on the
+UID/GID mappings.
+
+Sharing the host filesystem tree, also allows applications to access UNIX
+domains sockets associated with the host OS, which are in the filesystem
+namespaces. It should be noted that a number of init systems including at least
+``systemd`` and ``upstart`` have UNIX domain socket which are used to control
+their operation. Thus, if the directory/filesystem holding their UNIX domain
+socket is exposed to the container, it will be possible for a user in the
+container to invoke operations on the init service in the same way it could if
+outside the container. This also applies to other applications in the host which
+use UNIX domain sockets in the filesystem, such as DBus, Libvirtd, and many
+more. If this is not desired, then applications should either specify the
+UID/GID mapping in the configuration to enable user namespaces and thus block
+access to the UNIX domain socket based on permissions, or should ensure the
+relevant directories have a bind mount to hide them. This is particularly
+important for the ``/run`` or ``/var/run`` directories.
+
+User and group isolation
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+If the guest configuration does not list any ID mapping, then the user and group
+IDs used inside the container will match those used outside the container. In
+addition, the capabilities associated with a process in the container will infer
+the same privileges they would for a process in the host. This has obvious
+implications for security, since a root user inside the container will be able
+to access any file owned by root that is visible to the container, and perform
+more or less any privileged kernel operation. In the absence of additional
+protection from sVirt, this means that the root user inside a container is
+effectively as powerful as the root user in the host. There is no security
+isolation of the root user.
+
+The ID mapping facility was introduced to allow for stricter control over the
+privileges of users inside the container. It allows apps to define rules such as
+"user ID 0 in the container maps to user ID 1000 in the host". In addition the
+privileges associated with capabilities are somewhat reduced so that they cannot
+be used to escape from the container environment. A full description of user
+namespaces is outside the scope of this document, however LWN has `a good
+write-up on the topic <https://lwn.net/Articles/532593/>`__. From the libvirt
+point of view, the key thing to remember is that defining an ID mapping for
+users and groups in the container XML configuration causes libvirt to activate
+the user namespace feature.
+
+Location of configuration files
+-------------------------------
+
+The LXC driver comes with sane default values. However, during its
+initialization it reads a configuration file which offers system administrator
+to override some of that default. The file is located under
+``/etc/libvirt/lxc.conf``
+
+Systemd Socket Activation Integration
+-------------------------------------
+
+The libvirt LXC driver provides the ability to pass across pre-opened file
+descriptors when starting LXC guests. This allows for libvirt LXC to support
+systemd's `socket activation
+capability <http://0pointer.de/blog/projects/socket-activated-containers.html>`__,
+where an incoming client connection in the host OS will trigger the startup of a
+container, which runs another copy of systemd which gets passed the server
+socket, and then activates the actual service handler in the container.
+
+Let us assume that you already have a LXC guest created, running a systemd
+instance as PID 1 inside the container, which has an SSHD service configured.
+The goal is to automatically activate the container when the first SSH
+connection is made. The first step is to create a couple of unit files for the
+host OS systemd instance. The ``/etc/systemd/system/mycontainer.service`` unit
+file specifies how systemd will start the libvirt LXC container
+
+::
+
+   [Unit]
+   Description=My little container
+
+   [Service]
+   ExecStart=/usr/bin/virsh -c lxc:///system start --pass-fds 3 mycontainer
+   ExecStop=/usr/bin/virsh -c lxc:///system destroy mycontainer
+   Type=oneshot
+   RemainAfterExit=yes
+   KillMode=none
+
+The ``--pass-fds 3`` argument specifies that the file descriptor number 3 that
+``virsh`` inherits from systemd, is to be passed into the container. Since
+``virsh`` will exit immediately after starting the container, the
+``RemainAfterExit`` and ``KillMode`` settings must be altered from their
+defaults.
+
+Next, the ``/etc/systemd/system/mycontainer.socket`` unit file is created to get
+the host systemd to listen on port 23 for TCP connections. When this unit file
+is activated by the first incoming connection, it will cause the
+``mycontainer.service`` unit to be activated with the FD corresponding to the
+listening TCP socket passed in as FD 3.
+
+::
+
+   [Unit]
+   Description=The SSH socket of my little container
+
+   [Socket]
+   ListenStream=23
+
+Port 23 was picked here so that the container doesn't conflict with the host's
+SSH which is on the normal port 22. That's it in terms of host side
+configuration.
+
+Inside the container, the ``/etc/systemd/system/sshd.socket`` unit file must be
+created
+
+::
+
+   [Unit]
+   Description=SSH Socket for Per-Connection Servers
+
+   [Socket]
+   ListenStream=23
+   Accept=yes
+
+The ``ListenStream`` value listed in this unit file, must match the value used
+in the host file. When systemd in the container receives the pre-opened FD from
+libvirt during container startup, it looks at the ``ListenStream`` values to
+figure out which FD to give to which service. The actual service to start is
+defined by a correspondingly named ``/etc/systemd/system/sshd@.service``
+
+::
+
+   [Unit]
+   Description=SSH Per-Connection Server for %I
+
+   [Service]
+   ExecStart=-/usr/sbin/sshd -i
+   StandardInput=socket
+
+Finally, make sure this SSH service is set to start on boot of the container, by
+running the following command inside the container:
+
+::
+
+   # mkdir -p /etc/systemd/system/sockets.target.wants/
+   # ln -s /etc/systemd/system/sshd.socket /etc/systemd/system/sockets.target.wants/
+
+This example shows how to activate the container based on an incoming SSH
+connection. If the container was also configured to have an httpd service, it
+may be desirable to activate it upon either an httpd or a sshd connection
+attempt. In this case, the ``mycontainer.socket`` file in the host would simply
+list multiple socket ports. Inside the container a separate ``xxxxx.socket``
+file would need to be created for each service, with a corresponding
+``ListenStream`` value set.
+
+Container security
+------------------
+
+sVirt SELinux
+~~~~~~~~~~~~~
+
+In the absence of the "user" namespace being used, containers cannot be
+considered secure against exploits of the host OS. The sVirt SELinux driver
+provides a way to secure containers even when the "user" namespace is not used.
+The cost is that writing a policy to allow execution of arbitrary OS is not
+practical. The SELinux sVirt policy is typically tailored to work with a simpler
+application confinement use case, as provided by the "libvirt-sandbox" project.
+
+Auditing
+~~~~~~~~
+
+The LXC driver is integrated with libvirt's auditing subsystem, which causes
+audit messages to be logged whenever there is an operation performed against a
+container which has impact on host resources. So for example, start/stop, device
+hotplug will all log audit messages providing details about what action occurred
+and any resources associated with it. There are the following 3 types of audit
+messages
+
+-  ``VIRT_MACHINE_ID`` - details of the SELinux process and image security
+   labels assigned to the container.
+-  ``VIRT_CONTROL`` - details of an action / operation performed against a
+   container. There are the following types of operation
+
+   -  ``op=start`` - a container has been started. Provides the machine name,
+      uuid and PID of the ``libvirt_lxc`` controller process
+   -  ``op=init`` - the init PID of the container has been started. Provides the
+      machine name, uuid and PID of the ``libvirt_lxc`` controller process and
+      PID of the init process (in the host PID namespace)
+   -  ``op=stop`` - a container has been stopped. Provides the machine name,
+      uuid
+
+-  ``VIRT_RESOURCE`` - details of a host resource associated with a container
+   action.
+
+Device access
+~~~~~~~~~~~~~
+
+All containers are launched with the CAP_MKNOD capability cleared and removed
+from the bounding set. Libvirt will ensure that the /dev filesystem is
+pre-populated with all devices that a container is allowed to use. In addition,
+the cgroup "device" controller is configured to block read/write/mknod from all
+devices except those that a container is authorized to use.
+
+Example configurations
+----------------------
+
+Example config version 1
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+::
+
+   <domain type='lxc'>
+     <name>vm1</name>
+     <memory>500000</memory>
+     <os>
+       <type>exe</type>
+       <init>/bin/sh</init>
+     </os>
+     <vcpu>1</vcpu>
+     <clock offset='utc'/>
+     <on_poweroff>destroy</on_poweroff>
+     <on_reboot>restart</on_reboot>
+     <on_crash>destroy</on_crash>
+     <devices>
+       <emulator>/usr/libexec/libvirt_lxc</emulator>
+       <interface type='network'>
+         <source network='default'/>
+       </interface>
+       <console type='pty' />
+     </devices>
+   </domain>
+
+In the <emulator> element, be sure you specify the correct path to libvirt_lxc,
+if it does not live in /usr/libexec on your system.
+
+The next example assumes there is a private root filesystem (perhaps
+hand-crafted using busybox, or installed from media, debootstrap, whatever)
+under /opt/vm-1-root:
+
+::
+
+   <domain type='lxc'>
+     <name>vm1</name>
+     <memory>32768</memory>
+     <os>
+       <type>exe</type>
+       <init>/init</init>
+     </os>
+     <vcpu>1</vcpu>
+     <clock offset='utc'/>
+     <on_poweroff>destroy</on_poweroff>
+     <on_reboot>restart</on_reboot>
+     <on_crash>destroy</on_crash>
+     <devices>
+       <emulator>/usr/libexec/libvirt_lxc</emulator>
+       <filesystem type='mount'>
+         <source dir='/opt/vm-1-root'/>
+         <target dir='/'/>
+       </filesystem>
+       <interface type='network'>
+         <source network='default'/>
+       </interface>
+       <console type='pty' />
+     </devices>
+   </domain>
+
+Altering the available capabilities
+-----------------------------------
+
+By default the libvirt LXC driver drops some capabilities among which CAP_MKNOD.
+However :since:`since 1.2.6` libvirt can be told to keep or drop some
+capabilities using a domain configuration like the following:
+
+::
+
+   ...
+   <features>
+     <capabilities policy='default'>
+       <mknod state='on'/>
+       <sys_chroot state='off'/>
+     </capabilities>
+   </features>
+   ...
+
+The capabilities children elements are named after the capabilities as defined
+in ``man 7 capabilities``. An ``off`` state tells libvirt to drop the
+capability, while an ``on`` state will force to keep the capability even though
+this one is dropped by default.
+
+The ``policy`` attribute can be one of ``default``, ``allow`` or ``deny``. It
+defines the default rules for capabilities: either keep the default behavior
+that is dropping a few selected capabilities, or keep all capabilities or drop
+all capabilities. The interest of ``allow`` and ``deny`` is that they guarantee
+that all capabilities will be kept (or removed) even if new ones are added
+later.
+
+The following example, drops all capabilities but CAP_MKNOD:
+
+::
+
+   ...
+   <features>
+     <capabilities policy='deny'>
+       <mknod state='on'/>
+     </capabilities>
+   </features>
+   ...
+
+Note that allowing capabilities that are normally dropped by default can
+seriously affect the security of the container and the host.
+
+Inherit namespaces
+------------------
+
+Libvirt allows you to inherit the namespace from container/process just like lxc
+tools or docker provides to share the network namespace. The following can be
+used to share required namespaces. If we want to share only one then the other
+namespaces can be ignored. The netns option is specific to sharenet. It can be
+used in cases we want to use existing network namespace rather than creating new
+network namespace for the container. In this case privnet option will be
+ignored.
+
+::
+
+   <domain type='lxc' xmlns:lxc='http://libvirt.org/schemas/domain/lxc/1.0'>
+   ...
+   <lxc:namespace>
+     <lxc:sharenet type='netns' value='red'/>
+     <lxc:shareuts type='name' value='container1'/>
+     <lxc:shareipc type='pid' value='12345'/>
+   </lxc:namespace>
+   </domain>
+
+The use of namespace passthrough requires libvirt >= 1.2.19
+
+Container usage / management
+----------------------------
+
+As with any libvirt virtualization driver, LXC containers can be managed via a
+wide variety of libvirt based tools. At the lowest level the ``virsh`` command
+can be used to perform many tasks, by passing the ``-c lxc:///system`` argument.
+As an alternative to repeating the URI with every command, the
+``LIBVIRT_DEFAULT_URI`` environment variable can be set to ``lxc:///system``.
+The examples that follow outline some common operations with virsh and LXC. For
+further details about usage of virsh consult its manual page.
+
+Defining (saving) container configuration
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The ``virsh define`` command takes an XML configuration document and loads it
+into libvirt, saving the configuration on disk
+
+::
+
+   # virsh -c lxc:///system define myguest.xml
+
+Viewing container configuration
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The ``virsh dumpxml`` command can be used to view the current XML configuration
+of a container. By default the XML output reflects the current state of the
+container. If the container is running, it is possible to explicitly request the
+persistent configuration, instead of the current live configuration using the
+``--inactive`` flag
+
+::
+
+   # virsh -c lxc:///system dumpxml myguest
+
+Starting containers
+~~~~~~~~~~~~~~~~~~~
+
+The ``virsh start`` command can be used to start a container from a previously
+defined persistent configuration
+
+::
+
+   # virsh -c lxc:///system start myguest
+
+It is also possible to start so called "transient" containers, which do not
+require a persistent configuration to be saved by libvirt, using the
+``virsh create`` command.
+
+::
+
+   # virsh -c lxc:///system create myguest.xml
+
+Stopping containers
+~~~~~~~~~~~~~~~~~~~
+
+The ``virsh shutdown`` command can be used to request a graceful shutdown of the
+container. By default this command will first attempt to send a message to the
+init process via the ``/dev/initctl`` device node. If no such device node
+exists, then it will send SIGTERM to PID 1 inside the container.
+
+::
+
+   # virsh -c lxc:///system shutdown myguest
+
+If the container does not respond to the graceful shutdown request, it can be
+forcibly stopped using the ``virsh destroy``
+
+::
+
+   # virsh -c lxc:///system destroy myguest
+
+Rebooting a container
+~~~~~~~~~~~~~~~~~~~~~
+
+The ``virsh reboot`` command can be used to request a graceful shutdown of the
+container. By default this command will first attempt to send a message to the
+init process via the ``/dev/initctl`` device node. If no such device node
+exists, then it will send SIGHUP to PID 1 inside the container.
+
+::
+
+   # virsh -c lxc:///system reboot myguest
+
+Undefining (deleting) a container configuration
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The ``virsh undefine`` command can be used to delete the persistent
+configuration of a container. If the guest is currently running, this will turn
+it into a "transient" guest.
+
+::
+
+   # virsh -c lxc:///system undefine myguest
+
+Connecting to a container console
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The ``virsh console`` command can be used to connect to the text console
+associated with a container.
+
+::
+
+   # virsh -c lxc:///system console myguest
+
+If the container has been configured with multiple console devices, then the
+``--devname`` argument can be used to choose the console to connect to. In LXC,
+multiple consoles will be named as 'console0', 'console1', 'console2', etc.
+
+::
+
+   # virsh -c lxc:///system console myguest --devname console1
+
+Running commands in a container
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The ``virsh lxc-enter-namespace`` command can be used to enter the namespaces
+and security context of a container and then execute an arbitrary command.
+
+::
+
+   # virsh -c lxc:///system lxc-enter-namespace myguest -- /bin/ls -al /dev
+
+Monitoring container utilization
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The ``virt-top`` command can be used to monitor the activity and resource
+utilization of all containers on a host
+
+::
+
+   # virt-top -c lxc:///system
+
+Converting LXC container configuration
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The ``virsh domxml-from-native`` command can be used to convert most of the LXC
+container configuration into a domain XML fragment
+
+::
+
+   # virsh -c lxc:///system domxml-from-native lxc-tools /var/lib/lxc/myguest/config
+
+This conversion has some limitations due to the fact that the domxml-from-native
+command output has to be independent of the host. Here are a few things to take
+care of before converting:
+
+-  Replace the fstab file referenced by lxc.mount by the corresponding
+   lxc.mount.entry lines.
+-  Replace all relative sizes of tmpfs mount entries to absolute sizes. Also
+   make sure that tmpfs entries all have a size option (default is 50%).
+-  Define lxc.cgroup.memory.limit_in_bytes to properly limit the memory
+   available to the container. The conversion will use 64MiB as the default.
--- a/docs/meson.build
+++ b/docs/meson.build
@ -22,7 +22,6 @@ docs_html_in_files = [
  'csharp',
  'dbus',
  'docs',
-  'drvlxc',
  'drvnodedev',
  'drvopenvz',
  'drvsecret',
@ -80,6 +79,7 @@ docs_rst_files = [
  'drvch',
  'drvesx',
  'drvhyperv',
+  'drvlxc',
  'drvqemu',
  'errors',
  'formatbackup',