diff --git a/docs/drvqemu.html b/docs/drvqemu.html index d207e8e442..f95e08fcca 100644 --- a/docs/drvqemu.html +++ b/docs/drvqemu.html @@ -142,6 +142,21 @@ Deployment pre-requisites
+ There are multiple layers to security in the QEMU driver, allowing for
+ flexibility in the use of QEMU based virtual machines.
+
+ As explained above there are two ways to access the QEMU driver
+ in libvirt. The "qemu:///session" family of URIs connects to a
+ libvirtd instance running as the same user/group ID as the client
+ application. Thus the QEMU instances spawned from this driver will
+ share the same privileges as the client application. The intended
+ use case for this driver is desktop virtualization, with virtual
+ machines storing their disk images in the user's home directory and
+ being managed from the local desktop login session.
+
+ The "qemu:///system" family of URIs connects to a
+ libvirtd instance running as the privileged system account 'root'.
+ Thus the QEMU instances spawned from this driver may have much
+ higher privileges than the client application managing them.
+ The intended use case for this driver is server virtualization,
+ where the virtual machines may need to be connected to host
+ resources (block, PCI, USB, network devices) whose access requires
+ elevated privileges.
+
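+ For example, the two driver instances can be reached with the virsh
+ command line tool, as in the following sketch (the guest name "demo"
+ is purely illustrative):
+
+   # Manage per-user guests via the unprivileged session instance
+   virsh -c qemu:///session list --all
+
+   # Manage host-wide guests via the privileged system instance
+   # ('demo' is a placeholder guest name)
+   virsh -c qemu:///system start demo
+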
+ In the "session" instance, the POSIX DAC model restricts QEMU virtual
+ machines (and libvirtd in general) to only have access to resources
+ with the same user/group ID as the client application. There is no
+ finer level of configuration possible for the "session" instances.
+
+ In the "system" instance, libvirt releases from 0.7.0 onwards allow
+ control over the user/group that the QEMU virtual machines are run
+ as. A build of libvirt with no configuration parameters set will
+ still run QEMU processes as root:root. It is possible to change
+ this default by using the --with-qemu-user=$USERNAME and
+ --with-qemu-group=$GROUPNAME arguments to 'configure' during
+ build. It is strongly recommended that vendors build with both
+ of these arguments set to 'qemu'. Regardless of this build time
+ default, administrators can set a per-host default setting in
+ the /etc/libvirt/qemu.conf configuration file via the
+ user=$USERNAME and group=$GROUPNAME
+ parameters. When a non-root user or group is configured, the
+ libvirt QEMU driver will change uid/gid to match immediately
+ before executing the QEMU binary for a virtual machine.
+
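+ As a sketch, assuming the chosen non-root account is named 'qemu', the
+ build-time default and the per-host override would look like this:
+
+   # Build-time default chosen by the vendor / distributor
+   # (assumed account name 'qemu')
+   ./configure --with-qemu-user=qemu --with-qemu-group=qemu
+
+   # Per-host override: set these lines in /etc/libvirt/qemu.conf
+   #   user = "qemu"
+   #   group = "qemu"
+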
+ If QEMU virtual machines from the "system" instance are being
+ run as non-root, there will be greater restrictions on what
+ host resources the QEMU process will be able to access. The
+ libvirtd daemon will attempt to manage permissions on resources
+ to minimise the likelihood of unintentional security denials,
+ but the administrator / application developer must be aware of
+ some of the consequences / restrictions.
+
+ The directories /var/run/libvirt/qemu/,
+ /var/lib/libvirt/qemu/ and
+ /var/cache/libvirt/qemu/ must all have their
+ ownership set to match the user / group ID that QEMU
+ guests will be run as. If the vendor has set a non-root
+ user/group for the QEMU driver at build time, the
+ permissions should be set automatically at install time.
+ If a host administrator customizes user/group in
+ /etc/libvirt/qemu.conf, they will need to
+ manually set the ownership on these directories.
+
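+ For example, if guests are configured to run as an (assumed) 'qemu'
+ user and group, the ownership of these directories can be corrected
+ with:
+
+   # assumes the 'qemu' account configured in /etc/libvirt/qemu.conf
+   chown qemu:qemu /var/run/libvirt/qemu/ \
+                   /var/lib/libvirt/qemu/ \
+                   /var/cache/libvirt/qemu/
+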
+ When attaching PCI and USB devices to a QEMU guest,
+ QEMU will need to access files in /dev/bus/usb
+ and /sys/bus/pci/devices. The libvirtd daemon
+ will automatically set the ownership on specific devices
+ that are assigned to a guest at start time. There should
+ not be any need for administrator changes in this respect.
+
+ Any files/devices used as guest disk images must be
+ accessible to the user/group ID that QEMU guests are
+ configured to run as. The libvirtd daemon will automatically
+ set the ownership of the file/device path to the correct
+ user/group ID. Applications / administrators must be aware
+ though that the parent directory permissions may still
+ deny access. The directories containing disk images
+ must either have their ownership set to match the user/group
+ configured for QEMU, or their UNIX file permissions must
+ have the 'execute/search' bit enabled for 'others'.
+
+ The simplest option is the latter one, of just enabling
+ the 'execute/search' bit. For any directory to be used
+ for storing disk images, this can be achieved by running
+ the following command on the directory itself, and any
+ parent directories:
+
+   chmod o+x /path/to/directory
+
+ In particular note that if using the "system" instance
+ and attempting to store disk images in a user home
+ directory, the default permissions on $HOME are typically
+ too restrictive to allow access.
+
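+ A quick way to check and, if desired, relax this is shown below; the
+ $HOME/VirtualMachines directory is a hypothetical image location:
+
+   # Inspect the current permissions on the path components
+   # ($HOME/VirtualMachines is a hypothetical image directory)
+   ls -ld $HOME $HOME/VirtualMachines
+
+   # Grant the search bit to 'others' on each directory in the path
+   chmod o+x $HOME $HOME/VirtualMachines
+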
+ The libvirt QEMU driver has a build time option allowing it to use
+ the libcap-ng library to manage process capabilities. If this build
+ option is enabled, then the QEMU driver will use this to ensure that all
+ process capabilities are dropped before executing a QEMU virtual
+ machine. Process capabilities are what gives the 'root' account
+ its high power, in particular the CAP_DAC_OVERRIDE capability
+ is what allows a process running as 'root' to access files owned
+ by any user.
+
+ If the QEMU driver is configured to run virtual machines as non-root,
+ then they will already lose all their process capabilities at time
+ of startup. The Linux capability feature is thus aimed primarily at
+ the scenario where the QEMU processes are running as root. In this
+ case, before launching a QEMU virtual machine, libvirtd will use
+ libcap-ng APIs to drop all process capabilities. It is important
+ for administrators to note that this implies the QEMU process will
+ only be able to access files owned by root, and
+ not files owned by any other user.
+
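+ Whether a running QEMU process has in fact dropped its capabilities can
+ be checked roughly as follows (a sketch; the process lookup is
+ illustrative and assumes a single QEMU process on the host):
+
+   # The CapPrm/CapEff bitmasks should read all zeroes once dropped
+   # (assumes a single QEMU process; adjust the pgrep as needed)
+   grep ^Cap /proc/$(pgrep -f qemu | head -1)/status
+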
+ Thus, if a vendor / distributor has configured their libvirt package
+ to run as 'qemu' by default, a number of changes will be required
+ before an administrator can change a host to run guests as root.
+ In particular it will be necessary to change ownership on the
+ directories /var/run/libvirt/qemu/,
+ /var/lib/libvirt/qemu/ and
+ /var/cache/libvirt/qemu/ back to root, in addition
+ to changing the /etc/libvirt/qemu.conf settings.
+
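+ As a sketch, the reverse migration on such a host would look roughly
+ like this:
+
+   # In /etc/libvirt/qemu.conf, set the per-host default back to root:
+   #   user = "root"
+   #   group = "root"
+
+   # Return ownership of the runtime directories to root
+   chown root:root /var/run/libvirt/qemu/ \
+                   /var/lib/libvirt/qemu/ \
+                   /var/cache/libvirt/qemu/
+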
+ The basic SELinux protection for QEMU virtual machines is intended to
+ protect the host OS from a compromised virtual machine process. There
+ is no protection between guests.
+
+ In the basic model, all QEMU virtual machines run under the confined
+ domain root:system_r:qemu_t. It is required that any
+ disk image assigned to a QEMU virtual machine is labelled with
+ system_u:object_r:virt_image_t. In a default deployment,
+ package vendors/distributors will typically ensure that the directory
+ /var/lib/libvirt/images has this label, such that any
+ disk images created in this directory will automatically inherit the
+ correct labelling. If attempting to use disk images in another
+ location, the user/administrator must ensure the directory has been
+ given the requisite label. Likewise physical block devices must
+ be labelled system_u:object_r:virt_image_t.
+
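+ For a disk image kept outside the default location, the label can be
+ applied by hand, for example (the /srv/images path is illustrative):
+
+   # One-off labelling of the directory and its images
+   # (/srv/images is an example path)
+   chcon -R -t virt_image_t /srv/images
+
+   # Optionally record the rule so it survives a filesystem relabel
+   semanage fcontext -a -t virt_image_t "/srv/images(/.*)?"
+   restorecon -R -v /srv/images
+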
+ Not all filesystems allow for labelling of individual files. In
+ particular NFS, VFat and NTFS have no support for labelling. In
+ these cases administrators must use the 'context' option when
+ mounting the filesystem to set the default label to
+ system_u:object_r:virt_image_t. In the case of
+ NFS, there is an alternative option, of enabling the virt_use_nfs
+ SELinux boolean.
+
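+ For example, an NFS export holding disk images could be handled in
+ either of these two ways (the server and export names are illustrative):
+
+   # Option 1: force a default label at mount time
+   # (nfshost:/export/images is a placeholder export)
+   mount -t nfs -o context=system_u:object_r:virt_image_t:s0 \
+         nfshost:/export/images /var/lib/libvirt/images
+
+   # Option 2: enable the SELinux boolean allowing NFS image access
+   setsebool -P virt_use_nfs 1
+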
+ The SELinux sVirt protection for QEMU virtual machines builds on the
+ basic level of protection, to also allow individual guests to be
+ protected from each other.
+
+ In the sVirt model, each QEMU virtual machine runs under its own
+ confined domain, which is based on system_u:system_r:svirt_t:s0
+ with a unique category appended, e.g. system_u:system_r:svirt_t:s0:c34,c44.
+ The rules are set up such that a domain can only access files which are
+ labelled with the matching category level, e.g.
+ system_u:object_r:svirt_image_t:s0:c34,c44. This prevents one
+ QEMU process accessing any file resources that are assigned to another QEMU
+ process.
+
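+ The per-guest categories can be observed on a running host, for example:
+
+   # Show the confined QEMU processes and their category pairs
+   ps -eZ | grep qemu
+
+   # Show the matching labels on the disk images
+   ls -Z /var/lib/libvirt/images
+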
+ There are two ways of assigning labels to virtual machines under sVirt.
+ In the default setup, if sVirt is enabled, guests will get an automatically
+ assigned unique label each time they are booted. The libvirtd daemon will
+ also automatically relabel exclusive access disk images to match this
+ label. Disks that are marked as <shared> will get a generic
+ label system_u:object_r:svirt_image_t:s0 allowing all guests
+ read/write access to them, while disks marked as <readonly> will
+ get a generic label system_u:object_r:svirt_content_t:s0
+ which allows all guests read-only access.
+
+ With statically assigned labels, the application should include the
+ desired guest and file labels in the XML at time of creating the
+ guest with libvirt. In this scenario the application is responsible
+ for ensuring the disk images and similar resources are suitably
+ labelled to match; libvirtd will not attempt any relabelling.
+
+ If the sVirt security model is active, then the node capabilities
+ XML will include its details. If a virtual machine is currently
+ protected by the security model, then the guest XML will include
+ its assigned labels. If enabled at compile time, the sVirt security
+ model will always be activated if SELinux is available on the host
+ OS. To disable sVirt, and revert to the basic level of SELinux
+ protection (host protection only), the /etc/libvirt/qemu.conf
+ file can be used to change the setting to security_driver="none".
+
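+ A rough way to inspect this from the command line is sketched below
+ (the guest name "demo" is illustrative):
+
+   # Security model advertised in the node capabilities XML
+   virsh capabilities | grep -A3 '<secmodel>'
+
+   # Labels assigned to a running, protected guest (here named 'demo')
+   virsh dumpxml demo | grep -A3 '<seclabel'
+
+   # To disable sVirt, set in /etc/libvirt/qemu.conf and restart libvirtd:
+   #   security_driver = "none"
+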
+ Recent Linux kernels have a capability known as "cgroups" which is used
+ for resource management. It is implemented via a number of "controllers",
+ each controller covering a specific task/functional area. One of the
+ available controllers is the "devices" controller, which is able to
+ set up whitelists of block/character devices that a cgroup should be
+ allowed to access. If the "devices" controller is mounted on a host,
+ then libvirt will automatically create a dedicated cgroup for each
+ QEMU virtual machine and set up the device whitelist so that the QEMU
+ process can only access shared devices, and explicitly assigned disk
+ images backed by block devices.
+
+ The list of shared devices a guest is allowed access to is:
+
+   /dev/null, /dev/full, /dev/zero,
+   /dev/random, /dev/urandom,
+   /dev/ptmx, /dev/kvm, /dev/kqemu,
+   /dev/rtc, /dev/hpet, /dev/net/tun
+
+ In the event of unanticipated needs arising, this can be customized
+ via the /etc/libvirt/qemu.conf file.
+ To mount the cgroups device controller, the following commands
+ should be run as root, prior to starting libvirtd:
+
+   mkdir /dev/cgroup
+   mount -t cgroup none /dev/cgroup -o devices
+
+ libvirt will then place each virtual machine in a cgroup at
+ /dev/cgroup/libvirt/qemu/$VMNAME/
+
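+ The resulting device whitelist for a guest can then be inspected via the
+ "devices" controller files, for example (the guest name is illustrative):
+
+   # 'demo' is a placeholder guest name
+   cat /dev/cgroup/libvirt/qemu/demo/devices.list
+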
The QEMU driver currently supports a single native