mirror of
https://gitlab.com/libvirt/libvirt.git
synced 2024-11-03 20:01:16 +00:00
429281e7b7
Specifically, list sub-elements and where they can be used. In addition, describe supported machine types for Xen. Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Reviewed-by: Jim Fehlig <jfehlig@suse.com>
<?xml version="1.0" encoding="UTF-8"?>
|
||
<!DOCTYPE html>
|
||
<html xmlns="http://www.w3.org/1999/xhtml">
|
||
<body>
|
||
<h1>Domain XML format</h1>
|
||
|
||
<ul id="toc"></ul>
|
||
|
||
<p>
|
||
This section describes the XML format used to represent domains. There are
variations on the format based on the kind of domains run and the options
used to launch them. For hypervisor-specific details consult the
<a href="drivers.html">driver docs</a>.
|
||
</p>
|
||
|
||
|
||
<h2><a id="elements">Element and attribute overview</a></h2>
|
||
|
||
<p>
|
||
The root element required for all virtual machines is
named <code>domain</code>. It has two attributes. The
<a id="attributeDomainType"><code>type</code></a> attribute
specifies the hypervisor used for running
the domain. The allowed values are driver specific, but
include "xen", "kvm", "qemu", "lxc" and "kqemu". The
second attribute is <code>id</code>, which is a unique
integer identifier for the running guest machine. Inactive
machines have no id value.
|
||
</p>
|
||
|
||
|
||
<h3><a id="elementsMetadata">General metadata</a></h3>
|
||
|
||
<pre>
|
||
<domain type='kvm' id='1'>
|
||
<name>MyGuest</name>
|
||
<uuid>4dea22b3-1d52-d8f3-2516-782e98ab3fa0</uuid>
|
||
<genid>43dc0cf8-809b-4adb-9bea-a9abb5f3d90e</genid>
|
||
<title>A short description - title - of the domain</title>
|
||
<description>Some human readable description</description>
|
||
<metadata>
|
||
<app1:foo xmlns:app1="http://app1.org/app1/">..</app1:foo>
|
||
<app2:bar xmlns:app2="http://app1.org/app2/">..</app2:bar>
|
||
</metadata>
|
||
...</pre>
|
||
|
||
<dl>
|
||
<dt><code>name</code></dt>
|
||
<dd>The content of the <code>name</code> element provides
|
||
a short name for the virtual machine. This name should
|
||
consist only of alpha-numeric characters and is required
|
||
to be unique within the scope of a single host. It is
|
||
often used to form the filename for storing the persistent
|
||
configuration file. <span class="since">Since 0.0.1</span></dd>
|
||
<dt><code>uuid</code></dt>
|
||
<dd>The content of the <code>uuid</code> element provides
|
||
a globally unique identifier for the virtual machine.
|
||
The format must be RFC 4122 compliant,
e.g. <code>3e3fce45-4f53-4fa7-bb32-11f34168b82b</code>.
|
||
If omitted when defining/creating a new machine, a random
|
||
UUID is generated. It is also possible to provide the UUID
|
||
via a <a href="#elementsSysinfo"><code>sysinfo</code></a>
|
||
specification. <span class="since">Since 0.0.1, sysinfo
|
||
since 0.8.7</span></dd>
|
||
|
||
<dt><code>genid</code></dt>
|
||
<dd><span class="since">Since 4.4.0</span>, the <code>genid</code>
|
||
element can be used to add a Virtual Machine Generation ID which
|
||
exposes a 128-bit, cryptographically random, integer value identifier,
|
||
referred to as a Globally Unique Identifier (GUID) using the same
|
||
format as the <code>uuid</code>. The value is used to help notify
|
||
the guest operating system when the virtual machine is re-executing
|
||
something that has already executed before, such as:
|
||
|
||
<ul>
|
||
<li>VM starts executing a snapshot</li>
|
||
<li>VM is recovered from backup</li>
|
||
<li>VM is failed over in a disaster recovery environment</li>
|
||
<li>VM is imported, copied, or cloned</li>
|
||
</ul>
|
||
|
||
The guest operating system notices the change and is then able to
|
||
react as appropriate by marking its copies of distributed databases
|
||
as dirty, re-initializing its random number generator, etc.
|
||
|
||
<p>
|
||
The libvirt XML parser will accept either a provided GUID value
|
||
or just <genid/> in which case a GUID will be generated
|
||
and saved in the XML. For the transitions such as above, libvirt
|
||
will change the GUID before re-executing.</p></dd>
|
||
|
||
<dt><code>title</code></dt>
|
||
<dd>The optional element <code>title</code> provides space for a
|
||
short description of the domain. The title should not contain
|
||
any newlines. <span class="since">Since 0.9.10</span>.</dd>
|
||
|
||
<dt><code>description</code></dt>
|
||
<dd>The content of the <code>description</code> element provides a
human readable description of the virtual machine. This data is not
used by libvirt in any way; it can contain any information the user
wants. <span class="since">Since 0.7.2</span></dd>
|
||
|
||
<dt><code>metadata</code></dt>
|
||
<dd>The <code>metadata</code> node can be used by applications
|
||
to store custom metadata in the form of XML
|
||
nodes/trees. Applications must use custom namespaces on their
|
||
XML nodes/trees, with only one top-level element per namespace
|
||
(if the application needs structure, they should have
|
||
sub-elements to their namespace
|
||
element). <span class="since">Since 0.9.10</span></dd>
|
||
</dl>
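<p>
As an illustration of how an application might consume these elements,
the following is a minimal Python sketch, not part of libvirt, that reads
the general metadata from a domain XML string using only the standard
library. The function name <code>read_general_metadata</code> is
hypothetical, and <code>uuid.UUID</code> only checks the textual form of
the UUID, not full RFC 4122 compliance.
</p>

```python
import uuid
import xml.etree.ElementTree as ET

DOMAIN_XML = """
<domain type='kvm' id='1'>
  <name>MyGuest</name>
  <uuid>4dea22b3-1d52-d8f3-2516-782e98ab3fa0</uuid>
  <title>A short description - title - of the domain</title>
  <description>Some human readable description</description>
</domain>
"""


def read_general_metadata(xml_text):
    """Return the general metadata of a domain as a plain dict (illustrative helper)."""
    root = ET.fromstring(xml_text)
    if root.tag != "domain":
        raise ValueError("root element must be <domain>")
    info = {
        "type": root.get("type"),
        "id": root.get("id"),  # None for inactive domains
        "name": root.findtext("name"),
        "title": root.findtext("title"),
        "description": root.findtext("description"),
    }
    raw_uuid = root.findtext("uuid")
    # uuid.UUID raises ValueError when the text is not a well-formed UUID.
    info["uuid"] = str(uuid.UUID(raw_uuid)) if raw_uuid else None
    title = info["title"]
    if title is not None and "\n" in title:
        raise ValueError("title should not contain newlines")
    return info


meta = read_general_metadata(DOMAIN_XML)
print(meta["name"], meta["uuid"])
```

<p>
When the <code>uuid</code> element is absent the sketch returns
<code>None</code> for it, mirroring the case where libvirt would generate
a random UUID at define time.
</p>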
|
||
|
||
<h3><a id="elementsOS">Operating system booting</a></h3>
|
||
|
||
<p>
|
||
There are a number of different ways to boot virtual machines,
each with its own pros and cons.
|
||
</p>
|
||
|
||
<h4><a id="elementsOSBIOS">BIOS bootloader</a></h4>
|
||
|
||
<p>
|
||
Booting via the BIOS is available for hypervisors supporting
|
||
full virtualization. In this case the BIOS has a boot order
|
||
priority (floppy, harddisk, cdrom, network) determining where
|
||
to obtain/find the boot image.
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<os>
|
||
<type>hvm</type>
|
||
<loader readonly='yes' secure='no' type='rom'>/usr/lib/xen/boot/hvmloader</loader>
|
||
<nvram template='/usr/share/OVMF/OVMF_VARS.fd'>/var/lib/libvirt/nvram/guest_VARS.fd</nvram>
|
||
<boot dev='hd'/>
|
||
<boot dev='cdrom'/>
|
||
<bootmenu enable='yes' timeout='3000'/>
|
||
<smbios mode='sysinfo'/>
|
||
<bios useserial='yes' rebootTimeout='0'/>
|
||
</os>
|
||
...</pre>
|
||
|
||
<dl>
|
||
<dt><code>type</code></dt>
|
||
<dd>The content of the <code>type</code> element specifies the
|
||
type of operating system to be booted in the virtual machine.
|
||
<code>hvm</code> indicates that the OS is one designed to run
|
||
on bare metal, so requires full virtualization. <code>linux</code>
|
||
(badly named!) refers to an OS that supports the Xen 3 hypervisor
|
||
guest ABI. There are also two optional attributes, <code>arch</code>
|
||
specifying the CPU architecture to virtualize,
|
||
and <a id="attributeOSTypeMachine"><code>machine</code></a> referring
|
||
to the machine type. The <a href="formatcaps.html">Capabilities XML</a>
|
||
provides details on allowed values for
|
||
these. <span class="since">Since 0.0.1</span></dd>
|
||
<dt><a id="elementLoader"><code>loader</code></a></dt>
|
||
<dd>The optional <code>loader</code> tag refers to a firmware blob,
|
||
which is specified by absolute path,
|
||
used to assist the domain creation process. It is used by Xen
|
||
fully virtualized domains as well as setting the QEMU BIOS file
|
||
path for QEMU/KVM domains. <span class="since">Xen since 0.1.0,
QEMU/KVM since 0.9.12</span> <span class="since">Since
1.2.8</span> the element can have two
optional attributes. The <code>readonly</code> attribute (accepted values are
<code>yes</code> and <code>no</code>) indicates whether the
image is read-only or writable. The second attribute,
<code>type</code>, accepts the values <code>rom</code> and
<code>pflash</code>. It tells the hypervisor where in the guest
memory the file should be mapped. For instance, if the loader
path points to a UEFI image, <code>type</code> should be
<code>pflash</code>. Moreover, some firmware may
implement the Secure Boot feature. The
<code>secure</code> attribute can then be used to control it.
|
||
<span class="since">Since 2.1.0</span></dd>
|
||
<dt><code>nvram</code></dt>
|
||
<dd>Some UEFI firmware may want to use non-volatile memory to store
some variables. In the host, this is represented as a file and the
absolute path to the file is stored in this element. Moreover, when the
domain is started up, libvirt copies the so-called master NVRAM store file
defined in <code>qemu.conf</code>. If needed, the <code>template</code>
attribute can be used to override, per domain, the map of master NVRAM stores
from the config file. Note that for transient domains, if the NVRAM file
has been created by libvirt, it is left behind, and it is the management
application's responsibility to save the file (if it needs to be
persistent) and remove it. <span class="since">Since 1.2.8</span></dd>
|
||
<dt><code>boot</code></dt>
|
||
<dd>The <code>dev</code> attribute takes one of the values "fd", "hd",
|
||
"cdrom" or "network" and is used to specify the next boot device
|
||
to consider. The <code>boot</code> element can be repeated multiple
times to set up a priority list of boot devices to try in turn.
Multiple devices of the same type are sorted according to their
targets while preserving the order of buses. After defining the
domain, its XML configuration returned by libvirt (through
virDomainGetXMLDesc) lists devices in the sorted order. Once sorted,
the first device is marked as bootable. Thus, e.g., a domain
configured to boot from "hd" with vdb, hda, vda, and hdc disks
assigned to it will boot from vda (the sorted list is vda, vdb, hda,
hdc). A similar domain with hdc, vda, vdb, and hda disks will boot from
hda (sorted disks are: hda, hdc, vda, vdb). This can be tricky to
configure in the desired way, which is why per-device boot elements
(see the <a href="#elementsDisks">disks</a>,
<a href="#elementsNICS">network interfaces</a>, and
<a href="#elementsHostDev">USB and PCI devices</a> sections below) were
introduced. They are the preferred way of providing full control over
booting order. The <code>boot</code> element and per-device boot
|
||
elements are mutually exclusive. <span class="since">Since 0.1.3,
|
||
per-device boot since 0.8.8</span>
|
||
</dd>
|
||
<dt><code>smbios</code></dt>
|
||
<dd>How to populate SMBIOS information visible in the guest.
|
||
The <code>mode</code> attribute must be specified, and is either
|
||
"emulate" (let the hypervisor generate all values), "host" (copy
|
||
all of Block 0 and Block 1, except for the UUID, from the host's
|
||
SMBIOS values;
|
||
the <a href="html/libvirt-libvirt-host.html#virConnectGetSysinfo">
|
||
<code>virConnectGetSysinfo</code></a> call can be
|
||
used to see what values are copied), or "sysinfo" (use the values in
|
||
the <a href="#elementsSysinfo">sysinfo</a> element). If not
|
||
specified, the hypervisor default is used. <span class="since">
|
||
Since 0.8.7</span>
|
||
</dd>
|
||
</dl>
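<p>
The sorting rule described for the <code>boot</code> element (targets
sorted within a bus, buses kept in their order of first appearance) can be
sketched in Python as follows. This is only an illustration of the rule as
worded above, not libvirt's implementation; the function name and the
representation of disks as (bus, target) pairs are assumptions made for
the example.
</p>

```python
def sorted_boot_candidates(disks):
    """Order (bus, target) pairs: buses keep first-seen order, targets sort within a bus."""
    bus_rank = {}
    for bus, _target in disks:
        # Remember the position at which each bus first appears.
        bus_rank.setdefault(bus, len(bus_rank))
    # Compare by length before name so that 'vdz' sorts before 'vdaa'.
    return sorted(disks, key=lambda d: (bus_rank[d[0]], len(d[1]), d[1]))


first = [("virtio", "vdb"), ("ide", "hda"), ("virtio", "vda"), ("ide", "hdc")]
second = [("ide", "hdc"), ("virtio", "vda"), ("virtio", "vdb"), ("ide", "hda")]

first_sorted = [target for _bus, target in sorted_boot_candidates(first)]
second_sorted = [target for _bus, target in sorted_boot_candidates(second)]
print(first_sorted)   # boot device is the first entry
print(second_sorted)
```

<p>
With the two disk lists from the text, the first entries are vda and hda
respectively, matching the boot devices named above.
</p>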
|
||
<p>Up to this point the BIOS/UEFI configuration knobs are generic enough to
be implemented by the majority (if not all) of firmware implementations. However,
the settings that follow do not make sense for every firmware. For
instance, <code>rebootTimeout</code> doesn't make sense for UEFI, and
<code>useserial</code> might not be usable with a BIOS firmware that
doesn't produce any output on the serial line. Moreover, firmware
doesn't usually export its capabilities for libvirt (or users) to check,
and the set of capabilities can change with every new release.
Hence users are advised to try the settings they use before relying on
them in production.</p>
|
||
<dl>
|
||
<dt><code>bootmenu</code></dt>
|
||
<dd> Whether or not to enable an interactive boot menu prompt on guest
|
||
startup. The <code>enable</code> attribute can be either "yes" or "no".
|
||
If not specified, the hypervisor default is used. <span class="since">
|
||
Since 0.8.3</span>
|
||
The additional attribute <code>timeout</code> takes the number of milliseconds
the boot menu should wait until it times out. Allowed values are numbers
in the range [0, 65535] inclusive; the attribute is ignored unless <code>enable</code>
|
||
is set to "yes". <span class="since">Since 1.2.8</span>
|
||
</dd>
|
||
<dt><code>bios</code></dt>
|
||
<dd>This element has the attribute <code>useserial</code> with possible
values <code>yes</code> or <code>no</code>. It enables or disables the
Serial Graphics Adapter, which allows users to see BIOS messages
on a serial port. Therefore, a
<a href="#elementCharSerial">serial port</a> needs to be defined.
<span class="since">Since 0.9.4</span>.
<span class="since">Since 0.10.2 (QEMU only)</span> there is
another attribute, <code>rebootTimeout</code>, that controls
whether and after how long the guest should start booting
again in case the boot fails (according to the BIOS). The value is
in milliseconds with a maximum of <code>65535</code>; the special
value <code>-1</code> disables the reboot.
|
||
</dd>
|
||
</dl>
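<p>
The value ranges given for <code>bootmenu</code> and <code>bios</code>
can be checked before a configuration is submitted. The sketch below is a
hypothetical pre-flight check in Python, not something libvirt provides;
the function names are invented for the example, and the lower bound of 0
for <code>rebootTimeout</code> is an assumption, since the text only
states the maximum and the special value -1.
</p>

```python
MAX_TIMEOUT_MS = 65535


def effective_bootmenu_timeout(enable, timeout_ms):
    """Return the timeout that applies, or None when the attribute is ignored."""
    if not 0 <= timeout_ms <= MAX_TIMEOUT_MS:
        raise ValueError("bootmenu timeout must be within [0, 65535]")
    # The timeout is ignored unless enable is set to "yes".
    return timeout_ms if enable == "yes" else None


def describe_reboot_timeout(value_ms):
    """Explain what a rebootTimeout value means."""
    if value_ms == -1:
        return "reboot disabled"
    # Assumption: values other than -1 must lie in [0, 65535].
    if not 0 <= value_ms <= MAX_TIMEOUT_MS:
        raise ValueError("rebootTimeout must be -1 or within [0, 65535]")
    return f"reboot after {value_ms} ms"


print(effective_bootmenu_timeout("yes", 3000))
print(effective_bootmenu_timeout("no", 3000))
print(describe_reboot_timeout(0))
print(describe_reboot_timeout(-1))
```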
|
||
|
||
<h4><a id="elementsOSBootloader">Host bootloader</a></h4>
|
||
|
||
<p>
|
||
Hypervisors employing paravirtualization do not usually emulate
|
||
a BIOS, and instead the host is responsible for kicking off the
|
||
operating system boot. This may use a pseudo-bootloader in the
|
||
host to provide an interface to choose a kernel for the guest.
|
||
An example is <code>pygrub</code> with Xen. The Bhyve hypervisor
|
||
also uses a host bootloader, either <code>bhyveload</code> or
|
||
<code>grub-bhyve</code>.
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<bootloader>/usr/bin/pygrub</bootloader>
|
||
<bootloader_args>--append single</bootloader_args>
|
||
...</pre>
|
||
|
||
<dl>
|
||
<dt><code>bootloader</code></dt>
|
||
<dd>The content of the <code>bootloader</code> element provides
|
||
a fully qualified path to the bootloader executable in the
|
||
host OS. This bootloader will be run to choose which kernel
|
||
to boot. The required output of the bootloader is dependent
|
||
on the hypervisor in use. <span class="since">Since 0.1.0</span></dd>
|
||
<dt><code>bootloader_args</code></dt>
|
||
<dd>The optional <code>bootloader_args</code> element allows
|
||
command line arguments to be passed to the bootloader.
|
||
<span class="since">Since 0.2.3</span>
|
||
</dd>
|
||
|
||
</dl>
|
||
|
||
<h4><a id="elementsOSKernel">Direct kernel boot</a></h4>
|
||
|
||
<p>
|
||
When installing a new guest OS it is often useful to boot directly
|
||
from a kernel and initrd stored in the host OS, allowing command
|
||
line arguments to be passed directly to the installer. This capability
|
||
is usually available for both paravirtualized and fully virtualized guests.
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<os>
|
||
<type>hvm</type>
|
||
<loader>/usr/lib/xen/boot/hvmloader</loader>
|
||
<kernel>/root/f8-i386-vmlinuz</kernel>
|
||
<initrd>/root/f8-i386-initrd</initrd>
|
||
<cmdline>console=ttyS0 ks=http://example.com/f8-i386/os/</cmdline>
|
||
<dtb>/root/ppc.dtb</dtb>
|
||
<acpi>
|
||
<table type='slic'>/path/to/slic.dat</table>
|
||
</acpi>
|
||
</os>
|
||
...</pre>
|
||
|
||
<dl>
|
||
<dt><code>type</code></dt>
|
||
<dd>This element has the same semantics as described earlier in the
|
||
<a href="#elementsOSBIOS">BIOS boot section</a></dd>
|
||
<dt><code>loader</code></dt>
|
||
<dd>This element has the same semantics as described earlier in the
|
||
<a href="#elementsOSBIOS">BIOS boot section</a></dd>
|
||
<dt><code>kernel</code></dt>
|
||
<dd>The contents of this element specify the fully-qualified path
|
||
to the kernel image in the host OS.</dd>
|
||
<dt><code>initrd</code></dt>
|
||
<dd>The contents of this element specify the fully-qualified path
|
||
to the (optional) ramdisk image in the host OS.</dd>
|
||
<dt><code>cmdline</code></dt>
|
||
<dd>The contents of this element specify arguments to be passed to
|
||
the kernel (or installer) at boot time. This is often used to
specify an alternate primary console (e.g. serial port), or the
installation media source / kickstart file.</dd>
|
||
<dt><code>dtb</code></dt>
|
||
<dd>The contents of this element specify the fully-qualified path
|
||
to the (optional) device tree binary (dtb) image in the host OS.
|
||
<span class="since">Since 1.0.4</span></dd>
|
||
<dt><code>acpi</code></dt>
|
||
<dd>The <code>table</code> element contains a fully-qualified path
|
||
to the ACPI table. The <code>type</code> attribute contains the
|
||
ACPI table type (currently only <code>slic</code> is supported)
|
||
<span class="since">Since 1.3.5 (QEMU only)</span></dd>
|
||
</dl>
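<p>
A management application that installs guests may want to generate the
direct kernel boot elements programmatically. The following Python sketch
builds an <code>os</code> element with the standard library only. The
helper name <code>build_direct_boot_os</code> and its parameters are
hypothetical, and the absolute-path check is merely one way to interpret
the "fully-qualified path" requirement.
</p>

```python
import xml.etree.ElementTree as ET


def build_direct_boot_os(kernel, initrd=None, cmdline=None, os_type="hvm"):
    """Build an <os> element for direct kernel boot and return it as a string."""
    for path in (kernel, initrd):
        # The text asks for fully-qualified paths in the host OS.
        if path is not None and not path.startswith("/"):
            raise ValueError(f"path must be absolute: {path!r}")
    os_el = ET.Element("os")
    ET.SubElement(os_el, "type").text = os_type
    ET.SubElement(os_el, "kernel").text = kernel
    if initrd is not None:
        ET.SubElement(os_el, "initrd").text = initrd
    if cmdline is not None:
        ET.SubElement(os_el, "cmdline").text = cmdline
    return ET.tostring(os_el, encoding="unicode")


os_xml = build_direct_boot_os(
    "/root/f8-i386-vmlinuz",
    initrd="/root/f8-i386-initrd",
    cmdline="console=ttyS0 ks=http://example.com/f8-i386/os/",
)
print(os_xml)
```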
|
||
|
||
<h4><a id="elementsOSContainer">Container boot</a></h4>
|
||
|
||
<p>
|
||
When booting a domain using container based virtualization, instead
|
||
of a kernel / boot image, a path to the init binary is required, using
|
||
the <code>init</code> element. By default this will be launched with
|
||
no arguments. To specify the initial argv, use the <code>initarg</code>
element, repeated as many times as required. The <code>cmdline</code>
element, if set, will be used to provide an equivalent to <code>/proc/cmdline</code>
but will not affect init argv.
|
||
</p>
|
||
<p>
|
||
To set environment variables, use the <code>initenv</code> element, one
|
||
for each variable.
|
||
</p>
|
||
<p>
|
||
To set a custom work directory for the init, use the <code>initdir</code>
|
||
element.
|
||
</p>
|
||
<p>
|
||
To run the init command as a given user or group, use the <code>inituser</code>
|
||
or <code>initgroup</code> elements respectively. Both elements accept
either a user (resp. group) id or a name. Prefixing the user or group id with
a <code>+</code> forces it to be treated as a numeric value. Without
the prefix, the value is first tried as a user or group name.
|
||
</p>
|
||
|
||
<pre>
|
||
<os>
|
||
<type arch='x86_64'>exe</type>
|
||
<init>/bin/systemd</init>
|
||
<initarg>--unit</initarg>
|
||
<initarg>emergency.service</initarg>
|
||
<initenv name='MYENV'>some value</initenv>
|
||
<initdir>/my/custom/cwd</initdir>
|
||
<inituser>tester</inituser>
|
||
<initgroup>1000</initgroup>
|
||
</os>
|
||
</pre>
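<p>
The lookup order described for <code>inituser</code> and
<code>initgroup</code> can be sketched as below. This is an illustrative
Python model rather than libvirt code: the function name is hypothetical,
the <code>names</code> dictionary stands in for the system's user or group
database, and the fallback to a numeric interpretation when no name
matches is an assumption drawn from the word "first" in the text.
</p>

```python
def resolve_init_id(value, names):
    """Resolve an inituser/initgroup value to a numeric id.

    `names` maps known user (or group) names to ids; it is a stand-in
    for a real account database lookup.
    """
    if value.startswith("+"):
        # A leading '+' forces numeric interpretation.
        return int(value[1:])
    if value in names:
        # Without the prefix, the value is first tried as a name.
        return names[value]
    # Assumed fallback: treat the value as a numeric id.
    return int(value)


accounts = {"tester": 1001, "1000": 4242}
print(resolve_init_id("tester", accounts))
print(resolve_init_id("1000", accounts))   # a name that happens to look numeric
print(resolve_init_id("+1000", accounts))  # forced numeric
```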
|
||
|
||
|
||
<p>
|
||
If you want to enable user namespace, set the <code>idmap</code> element.
|
||
The <code>uid</code> and <code>gid</code> elements have three attributes:
|
||
</p>
|
||
|
||
<dl>
|
||
<dt><code>start</code></dt>
|
||
<dd>First user ID in container. It must be '0'.</dd>
|
||
<dt><code>target</code></dt>
|
||
<dd>The first user ID in container will be mapped to this target user
|
||
ID in host.</dd>
|
||
<dt><code>count</code></dt>
|
||
<dd>How many users in the container are allowed to be mapped to host users.</dd>
|
||
</dl>
|
||
|
||
<pre>
|
||
<idmap>
|
||
<uid start='0' target='1000' count='10'/>
|
||
<gid start='0' target='1000' count='10'/>
|
||
</idmap>
|
||
</pre>
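<p>
To make the meaning of <code>start</code>, <code>target</code> and
<code>count</code> concrete, the following Python sketch computes the host
ID that a container ID corresponds to under a single mapping. The function
name is hypothetical, and the contiguous-range arithmetic reflects the
usual semantics of Linux user namespace ID maps, which is assumed here
rather than stated in the text above.
</p>

```python
def map_container_id(container_id, start, target, count):
    """Map a container uid/gid to the host id, or None if outside the mapped range."""
    if start != 0:
        raise ValueError("start must be 0")
    if start <= container_id < start + count:
        # IDs are mapped as one contiguous block beginning at `target`.
        return target + (container_id - start)
    return None


# Values from the example: start='0' target='1000' count='10'
print(map_container_id(0, 0, 1000, 10))
print(map_container_id(9, 0, 1000, 10))
print(map_container_id(10, 0, 1000, 10))
```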
|
||
|
||
|
||
<h3><a id="elementsSysinfo">SMBIOS System Information</a></h3>
|
||
|
||
<p>
|
||
Some hypervisors allow control over what system information is
|
||
presented to the guest (for example, SMBIOS fields can be
|
||
populated by a hypervisor and inspected via
|
||
the <code>dmidecode</code> command in the guest). The
|
||
optional <code>sysinfo</code> element covers all such categories
|
||
of information. <span class="since">Since 0.8.7</span>
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<os>
|
||
<smbios mode='sysinfo'/>
|
||
...
|
||
</os>
|
||
<sysinfo type='smbios'>
|
||
<bios>
|
||
<entry name='vendor'>LENOVO</entry>
|
||
</bios>
|
||
<system>
|
||
<entry name='manufacturer'>Fedora</entry>
|
||
<entry name='product'>Virt-Manager</entry>
|
||
<entry name='version'>0.9.4</entry>
|
||
</system>
|
||
<baseBoard>
|
||
<entry name='manufacturer'>LENOVO</entry>
|
||
<entry name='product'>20BE0061MC</entry>
|
||
<entry name='version'>0B98401 Pro</entry>
|
||
<entry name='serial'>W1KS427111E</entry>
|
||
</baseBoard>
|
||
<chassis>
|
||
<entry name='manufacturer'>Dell Inc.</entry>
|
||
<entry name='version'>2.12</entry>
|
||
<entry name='serial'>65X0XF2</entry>
|
||
<entry name='asset'>40000101</entry>
|
||
<entry name='sku'>Type3Sku1</entry>
|
||
</chassis>
|
||
<oemStrings>
|
||
<entry>myappname:some arbitrary data</entry>
|
||
<entry>otherappname:more arbitrary data</entry>
|
||
</oemStrings>
|
||
</sysinfo>
|
||
...</pre>
|
||
|
||
<p>
|
||
The <code>sysinfo</code> element has a mandatory
|
||
attribute <code>type</code> that determines the layout of
|
||
sub-elements, with supported values of:
|
||
</p>
|
||
|
||
<dl>
|
||
<dt><code>smbios</code></dt>
|
||
<dd>Sub-elements call out specific SMBIOS values, which will
|
||
affect the guest if used in conjunction with
|
||
the <code>smbios</code> sub-element of
|
||
the <a href="#elementsOS"><code>os</code></a> element. Each
|
||
sub-element of <code>sysinfo</code> names a SMBIOS block, and
|
||
within those elements can be a list of <code>entry</code>
|
||
elements that describe a field within the block. The following
|
||
blocks and entries are recognized:
|
||
<dl>
|
||
<dt><code>bios</code></dt>
|
||
<dd>
|
||
This is block 0 of SMBIOS, with entry names drawn from:
|
||
<dl>
|
||
<dt><code>vendor</code></dt>
|
||
<dd>BIOS Vendor's Name</dd>
|
||
<dt><code>version</code></dt>
|
||
<dd>BIOS Version</dd>
|
||
<dt><code>date</code></dt>
|
||
<dd>BIOS release date. If supplied, is in either mm/dd/yy or
|
||
mm/dd/yyyy format. If the year portion of the string is
|
||
two digits, the year is assumed to be 19yy.</dd>
|
||
<dt><code>release</code></dt>
|
||
<dd>System BIOS Major and Minor release number values
|
||
concatenated together as one string separated by
|
||
a period, for example, 10.22.</dd>
|
||
</dl>
|
||
</dd>
|
||
<dt><code>system</code></dt>
|
||
<dd>
|
||
This is block 1 of SMBIOS, with entry names drawn from:
|
||
<dl>
|
||
<dt><code>manufacturer</code></dt>
|
||
<dd>Manufacturer of the system</dd>
|
||
<dt><code>product</code></dt>
|
||
<dd>Product Name</dd>
|
||
<dt><code>version</code></dt>
|
||
<dd>Version of the product</dd>
|
||
<dt><code>serial</code></dt>
|
||
<dd>Serial number</dd>
|
||
<dt><code>uuid</code></dt>
|
||
<dd>Universal Unique ID number. If this entry is provided
|
||
alongside a top-level
|
||
<a href="#elementsMetadata"><code>uuid</code></a> element,
|
||
then the two values must match.</dd>
|
||
<dt><code>sku</code></dt>
|
||
<dd>SKU number to identify a particular configuration.</dd>
|
||
<dt><code>family</code></dt>
|
||
<dd>Identify the family a particular computer belongs to.</dd>
|
||
</dl>
|
||
</dd>
|
||
<dt><code>baseBoard</code></dt>
|
||
<dd>
|
||
This is block 2 of SMBIOS. This element can be repeated multiple
|
||
times to describe all the base boards; however, not all
|
||
hypervisors necessarily support the repetition. The element can
|
||
have the following children:
|
||
<dl>
|
||
<dt><code>manufacturer</code></dt>
|
||
<dd>Manufacturer of the base board</dd>
|
||
<dt><code>product</code></dt>
|
||
<dd>Product Name</dd>
|
||
<dt><code>version</code></dt>
|
||
<dd>Version of the product</dd>
|
||
<dt><code>serial</code></dt>
|
||
<dd>Serial number</dd>
|
||
<dt><code>asset</code></dt>
|
||
<dd>Asset tag</dd>
|
||
<dt><code>location</code></dt>
|
||
<dd>Location in chassis</dd>
|
||
</dl>
|
||
NB: Incorrectly supplied entries for the
|
||
<code>bios</code>, <code>system</code> or <code>baseBoard</code>
|
||
blocks will be ignored without error. Other than <code>uuid</code>
|
||
validation and <code>date</code> format checking, all values are
|
||
passed as strings to the hypervisor driver.
|
||
</dd>
|
||
<dt><code>chassis</code></dt>
|
||
<dd>
|
||
<span class="since">Since 4.1.0,</span> this is block 3 of
|
||
SMBIOS, with entry names drawn from:
|
||
<dl>
|
||
<dt><code>manufacturer</code></dt>
|
||
<dd>Manufacturer of Chassis</dd>
|
||
<dt><code>version</code></dt>
|
||
<dd>Version of the Chassis</dd>
|
||
<dt><code>serial</code></dt>
|
||
<dd>Serial number</dd>
|
||
<dt><code>asset</code></dt>
|
||
<dd>Asset tag</dd>
|
||
<dt><code>sku</code></dt>
|
||
<dd>SKU number</dd>
|
||
</dl>
|
||
</dd>
|
||
<dt><code>oemStrings</code></dt>
|
||
<dd>
|
||
This is block 11 of SMBIOS. This element should appear once and
|
||
can have multiple <code>entry</code> child elements, each providing
|
||
arbitrary string data. There are no restrictions on what data can
be provided in the entries. However, if the data is intended to be
|
||
consumed by an application in the guest, it is recommended to use
|
||
the application name as a prefix in the string. (<span class="since">Since 4.1.0</span>)
|
||
</dd>
|
||
</dl>
|
||
</dd>
|
||
</dl>
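<p>
The <code>date</code> entry of the <code>bios</code> block has a small
parsing rule of its own (mm/dd/yy or mm/dd/yyyy, with two-digit years read
as 19yy). A minimal Python sketch of that rule, with a hypothetical
function name and not taken from libvirt's sources, could look like this:
</p>

```python
from datetime import date


def parse_bios_date(text):
    """Parse a BIOS release date given as mm/dd/yy or mm/dd/yyyy."""
    month, day, year = text.split("/")
    if len(year) == 2:
        # Two-digit years are assumed to be 19yy.
        year = "19" + year
    elif len(year) != 4:
        raise ValueError(f"unsupported year format: {year!r}")
    # date() raises ValueError for impossible dates such as month 13.
    return date(int(year), int(month), int(day))


print(parse_bios_date("01/02/98").isoformat())
print(parse_bios_date("12/31/2017").isoformat())
```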
|
||
|
||
<h3><a id="elementsCPUAllocation">CPU Allocation</a></h3>
|
||
|
||
<pre>
|
||
<domain>
|
||
...
|
||
<vcpu placement='static' cpuset="1-4,^3,6" current="1">2</vcpu>
|
||
<vcpus>
|
||
<vcpu id='0' enabled='yes' hotpluggable='no' order='1'/>
|
||
<vcpu id='1' enabled='no' hotpluggable='yes'/>
|
||
</vcpus>
|
||
...
|
||
</domain>
|
||
</pre>
|
||
|
||
<dl>
|
||
<dt><code>vcpu</code></dt>
|
||
<dd>The content of this element defines the maximum number of virtual
|
||
CPUs allocated for the guest OS, which must be between 1 and
|
||
the maximum supported by the hypervisor.
|
||
<dl>
|
||
<dt><code>cpuset</code></dt>
|
||
<dd>
|
||
The optional attribute <code>cpuset</code> is a comma-separated
list of physical CPU numbers that the domain process and virtual CPUs
can be pinned to by default. (NB: The pinning policy of the domain
process and virtual CPUs can be specified separately by
<code>cputune</code>. If the <code>emulatorpin</code> element
of <code>cputune</code> is specified, the <code>cpuset</code>
specified by <code>vcpu</code> here will be ignored. Similarly,
for virtual CPUs which have <code>vcpupin</code> specified,
the <code>cpuset</code> specified by <code>vcpu</code> here
will be ignored. Virtual CPUs which don't have
<code>vcpupin</code> specified will each be pinned to the physical
CPUs specified by <code>cpuset</code> here.)
|
||
Each element in that list is either a single CPU number,
|
||
a range of CPU numbers, or a caret followed by a CPU number to
|
||
be excluded from a previous range.
|
||
<span class="since">Since 0.4.4</span>
|
||
</dd>
|
||
<dt><code>current</code></dt>
|
||
<dd>
|
||
The optional attribute <code>current</code> can
|
||
be used to specify whether fewer than the maximum number of
|
||
virtual CPUs should be enabled.
|
||
<span class="since">Since 0.8.5</span>
|
||
</dd>
|
||
<dt><code>placement</code></dt>
|
||
<dd>
|
||
The optional attribute <code>placement</code> can be used to
indicate the CPU placement mode for the domain process. The value can
be either "static" or "auto". It defaults to the <code>placement</code>
of <code>numatune</code>, or to "static" if <code>cpuset</code> is
specified. Using "auto" indicates that the domain process will be pinned
to the advisory nodeset obtained by querying numad, and the value of the
<code>cpuset</code> attribute will be ignored if it is specified.
If neither <code>cpuset</code> nor <code>placement</code> is
specified, or if <code>placement</code> is "static" but no
<code>cpuset</code> is specified, the domain process will be
pinned to all the available physical CPUs.
|
||
<span class="since">Since 0.9.11 (QEMU and KVM only)</span>
|
||
</dd>
|
||
</dl>
|
||
</dd>
|
||
<dt><code>vcpus</code></dt>
|
||
<dd>
|
||
The <code>vcpus</code> element allows the state of individual vCPUs to be controlled.

The <code>id</code> attribute specifies the vCPU id as used by libvirt
in other places such as vCPU pinning, scheduler information and NUMA
assignment. Note that the vCPU ID as seen in the guest may differ from
the libvirt ID in certain cases. Valid IDs range from 0 to the maximum vCPU
count as set by the <code>vcpu</code> element minus 1.

The <code>enabled</code> attribute controls the state of the
vCPU. Valid values are <code>yes</code> and <code>no</code>.

<code>hotpluggable</code> controls whether the given vCPU can be hotplugged
and hotunplugged in cases when the CPU is enabled at boot. Note that
all disabled vCPUs must be hotpluggable. Valid values are
<code>yes</code> and <code>no</code>.

<code>order</code> specifies the order in which the online vCPUs are added.
For hypervisors/platforms that require multiple vCPUs to be inserted at once,
the order may be duplicated across all vCPUs that need to be
enabled at once. Specifying the order is not necessary; vCPUs are then
added in an arbitrary order. If order info is used, it must be used for
all online vCPUs. Hypervisors may clear or update ordering information
during certain operations to ensure a valid configuration.

Note that hypervisors may create hotpluggable vCPUs differently from
boot vCPUs, thus special initialization may be necessary.

Hypervisors may require that vCPUs enabled on boot which are not
hotpluggable are clustered at the beginning starting with ID 0. It may
also be required that vCPU 0 is always present and non-hotpluggable.

Note that providing state for individual CPUs may be necessary to enable
support of addressable vCPU hotplug, and this feature may not be
supported by all hypervisors.

For QEMU the following conditions are required. vCPU 0 needs to be
enabled and non-hotpluggable. On PPC64, the vCPUs that are in
the same core need to be enabled along with it. All non-hotpluggable CPUs
present at boot need to be grouped after vCPU 0.
|
||
<span class="since">Since 2.2.0 (QEMU only)</span>
|
||
</dd>
|
||
</dl>
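<p>
The <code>cpuset</code> syntax described above (single CPU numbers,
ranges, and caret-prefixed exclusions) can be expanded with a few lines of
Python. This is a sketch for illustration, with a hypothetical function
name, and is not libvirt's own parser; in particular it applies each
exclusion to whatever has been accumulated so far, reading the list left
to right.
</p>

```python
def parse_cpuset(spec):
    """Expand a cpuset string such as '1-4,^3,6' into a sorted list of CPU numbers."""
    cpus = set()
    for item in spec.split(","):
        item = item.strip()
        if item.startswith("^"):
            # A caret excludes a CPU from the entries seen so far.
            cpus.discard(int(item[1:]))
        elif "-" in item:
            first, last = (int(part) for part in item.split("-"))
            if first > last:
                raise ValueError(f"invalid range: {item!r}")
            cpus.update(range(first, last + 1))
        else:
            cpus.add(int(item))
    return sorted(cpus)


print(parse_cpuset("1-4,^3,6"))
print(parse_cpuset("0,1"))
```

<p>
For the example attribute <code>cpuset="1-4,^3,6"</code> the result is the
CPUs 1, 2, 4 and 6.
</p>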
|
||
|
||
<h3><a id="elementsIOThreadsAllocation">IOThreads Allocation</a></h3>
|
||
<p>
|
||
IOThreads are dedicated event loop threads for supported disk
|
||
devices to perform block I/O requests in order to improve
|
||
scalability especially on an SMP host/guest with many LUNs.
|
||
<span class="since">Since 1.2.8 (QEMU only)</span>
|
||
</p>
|
||
|
||
<pre>
|
||
<domain>
|
||
...
|
||
<iothreads>4</iothreads>
|
||
...
|
||
</domain>
|
||
</pre>
|
||
<pre>
|
||
<domain>
|
||
...
|
||
<iothreadids>
|
||
<iothread id="2"/>
|
||
<iothread id="4"/>
|
||
<iothread id="6"/>
|
||
<iothread id="8"/>
|
||
</iothreadids>
|
||
...
|
||
</domain>
|
||
</pre>
|
||
|
||
<dl>
|
||
<dt><code>iothreads</code></dt>
|
||
<dd>
|
||
The content of this optional element defines the number
|
||
of IOThreads to be assigned to the domain for use by
|
||
supported target storage devices. There
|
||
should be only 1 or 2 IOThreads per host CPU. There may be more
|
||
than one supported device assigned to each IOThread.
|
||
<span class="since">Since 1.2.8</span>
|
||
</dd>
|
||
<dt><code>iothreadids</code></dt>
|
||
<dd>
|
||
The optional <code>iothreadids</code> element provides the capability
to specifically define the IOThread IDs for the domain. By default,
IOThread IDs are sequentially numbered starting from 1 through the
number of <code>iothreads</code> defined for the domain. The
<code>id</code> attribute is used to define the IOThread ID. The
<code>id</code> attribute must be a positive integer.
If there are fewer <code>iothreadids</code> defined than
<code>iothreads</code> defined for the domain, then libvirt will
sequentially fill <code>iothreadids</code> starting at 1, avoiding
any predefined <code>id</code>. If there are more
<code>iothreadids</code> defined than <code>iothreads</code>
defined for the domain, then the <code>iothreads</code> value
will be adjusted accordingly.
|
||
<span class="since">Since 1.2.15</span>
|
||
</dd>
|
||
</dl>
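<p>
The interaction between <code>iothreads</code> and
<code>iothreadids</code> can be modelled as below. This Python sketch
follows the wording above (fill sequentially from 1 while avoiding
predefined IDs, and grow the count when more IDs are predefined than
requested); the function name is hypothetical and the code is not taken
from libvirt.
</p>

```python
def assign_iothread_ids(iothreads, predefined=()):
    """Return the IOThread IDs in effect for a domain.

    `iothreads` is the requested count and `predefined` holds IDs taken
    from <iothreadids>. The length of the result is the adjusted count.
    """
    ids = sorted(set(predefined))
    if any(i <= 0 for i in ids):
        raise ValueError("IOThread IDs must be positive integers")
    candidate = 1
    while len(ids) < iothreads:
        # Fill sequentially from 1, skipping IDs that are already defined.
        if candidate not in ids:
            ids.append(candidate)
        candidate += 1
    return sorted(ids)


print(assign_iothread_ids(4))
print(assign_iothread_ids(4, [6, 8]))
print(assign_iothread_ids(2, [2, 4, 6, 8]))
```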
|
||
|
||
<h3><a id="elementsCPUTuning">CPU Tuning</a></h3>
|
||
|
||
<pre>
|
||
<domain>
|
||
...
|
||
<cputune>
|
||
<vcpupin vcpu="0" cpuset="1-4,^2"/>
|
||
<vcpupin vcpu="1" cpuset="0,1"/>
|
||
<vcpupin vcpu="2" cpuset="2,3"/>
|
||
<vcpupin vcpu="3" cpuset="0,4"/>
|
||
<emulatorpin cpuset="1-3"/>
|
||
<iothreadpin iothread="1" cpuset="5,6"/>
|
||
<iothreadpin iothread="2" cpuset="7,8"/>
|
||
<shares>2048</shares>
|
||
<period>1000000</period>
|
||
<quota>-1</quota>
|
||
<global_period>1000000</global_period>
|
||
<global_quota>-1</global_quota>
|
||
<emulator_period>1000000</emulator_period>
|
||
<emulator_quota>-1</emulator_quota>
|
||
<iothread_period>1000000</iothread_period>
|
||
<iothread_quota>-1</iothread_quota>
|
||
<vcpusched vcpus='0-4,^3' scheduler='fifo' priority='1'/>
|
||
<iothreadsched iothreads='2' scheduler='batch'/>
|
||
<cachetune vcpus='0-3'>
|
||
<cache id='0' level='3' type='both' size='3' unit='MiB'/>
|
||
<cache id='1' level='3' type='both' size='3' unit='MiB'/>
|
||
</cachetune>
|
||
<memorytune vcpus='0-3'>
|
||
<node id='0' bandwidth='60'/>
|
||
</memorytune>
|
||
|
||
</cputune>
|
||
...
|
||
</domain>
|
||
</pre>
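<p>
The <code>cpuset</code> and <code>vcpus</code> values in the example use a
comma-separated list of single numbers, ranges, and <code>^N</code>
exclusions. A minimal sketch of how such a string expands is shown below.
The helper name <code>parse_cpuset</code> is hypothetical, and the sketch
assumes that entries are applied from left to right, so an exclusion only
affects numbers added earlier in the string.
</p>

```python
def parse_cpuset(spec):
    """Expand a cpuset string such as "1-4,^2" into a sorted list of numbers."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        negate = part.startswith("^")
        if negate:
            part = part[1:]
        if "-" in part:
            first, last = part.split("-", 1)
            members = range(int(first), int(last) + 1)
        else:
            members = [int(part)]
        # Entries are applied in order: add a range, or drop excluded numbers.
        if negate:
            cpus.difference_update(members)
        else:
            cpus.update(members)
    return sorted(cpus)


if __name__ == "__main__":
    print(parse_cpuset("1-4,^2"))  # vcpupin for vCPU 0 in the example
    print(parse_cpuset("0-4,^3"))  # vcpusched vcpus in the example
```

<p>
For the example above, <code>1-4,^2</code> expands to CPUs 1, 3 and 4, and
<code>0-4,^3</code> expands to 0, 1, 2 and 4.
</p>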
|
||
|
||
<dl>
|
||
<dt><code>cputune</code></dt>
|
||
<dd>
|
||
The optional <code>cputune</code> element provides details
|
||
regarding the CPU tunable parameters for the domain.
|
||
<span class="since">Since 0.9.0</span>
|
||
</dd>
|
||
<dt><code>vcpupin</code></dt>
|
||
<dd>
|
||
The optional <code>vcpupin</code> element specifies which of the host's
|
||
physical CPUs the domain vCPU will be pinned to. If this is omitted,
|
||
and attribute <code>cpuset</code> of element <code>vcpu</code> is
|
||
not specified, the vCPU is pinned to all the physical CPUs by default.
|
||
It contains two required attributes, the attribute <code>vcpu</code>
|
||
specifies the vCPU id, and the attribute <code>cpuset</code> is the same as
|
||
attribute <code>cpuset</code> of element <code>vcpu</code>.
|
||
(NB: only supported by the QEMU driver)
|
||
<span class="since">Since 0.9.0</span>
|
||
</dd>
|
||
<dt><code>emulatorpin</code></dt>
|
||
<dd>
|
||
The optional <code>emulatorpin</code> element specifies which of the host's
|
||
physical CPUs the "emulator", a subset of a domain not including vCPUs
or IOThreads, will be pinned to. If this is omitted, and attribute
|
||
<code>cpuset</code> of element <code>vcpu</code> is not specified,
|
||
"emulator" is pinned to all the physical CPUs by default. It contains
|
||
one required attribute <code>cpuset</code> specifying which physical
|
||
CPUs to pin to.
|
||
</dd>
|
||
<dt><code>iothreadpin</code></dt>
|
||
<dd>
|
||
The optional <code>iothreadpin</code> element specifies which of the host's
|
||
physical CPUs the IOThreads will be pinned to. If this is omitted
|
||
and attribute <code>cpuset</code> of element <code>vcpu</code> is
|
||
not specified, the IOThreads are pinned to all the physical CPUs
|
||
by default. There are two required attributes, the attribute
|
||
<code>iothread</code> specifies the IOThread ID and the attribute
|
||
<code>cpuset</code> specifies which physical CPUs to pin to. See
|
||
the <code>iothreadids</code>
|
||
<a href="#elementsIOThreadsAllocation"><code>description</code></a>
|
||
for valid <code>iothread</code> values.
|
||
<span class="since">Since 1.2.9</span>
|
||
</dd>
|
||
<dt><code>shares</code></dt>
|
||
<dd>
|
||
The optional <code>shares</code> element specifies the proportional
|
||
weighted share for the domain. If this is omitted, it defaults to
|
||
the OS provided defaults. NB, there is no unit for the value;
it's a relative measure based on the settings of other VMs,
e.g. a VM configured with value
|
||
2048 will get twice as much CPU time as a VM configured with value 1024.
|
||
<span class="since">Since 0.9.0</span>
|
||
</dd>
|
||
<dt><code>period</code></dt>
|
||
<dd>
|
||
The optional <code>period</code> element specifies the enforcement
|
||
interval (unit: microseconds). Within <code>period</code>, each vCPU of
|
||
the domain will not be allowed to consume more than <code>quota</code>
|
||
worth of runtime. The value should be in range [1000, 1000000]. A period
|
||
with value 0 means no value.
|
||
<span class="since">Only QEMU driver support since 0.9.4, LXC since
|
||
0.9.10</span>
|
||
</dd>
|
||
<dt><code>quota</code></dt>
|
||
<dd>
|
||
The optional <code>quota</code> element specifies the maximum allowed
|
||
bandwidth (unit: microseconds). A domain with <code>quota</code> as any
|
||
negative value indicates that the domain has infinite bandwidth for
|
||
vCPU threads, which means that it is not bandwidth controlled. The value
|
||
should be in range [1000, 18446744073709551] or less than 0. A quota
|
||
with value 0 means no value. You can use this feature to ensure that all
|
||
vCPUs run at the same speed.
|
||
<span class="since">Only QEMU driver support since 0.9.4, LXC since
|
||
0.9.10</span>
|
||
</dd>
|
||
<dt><code>global_period</code></dt>
|
||
<dd>
|
||
The optional <code>global_period</code> element specifies the
|
||
enforcement CFS scheduler interval (unit: microseconds) for the whole
|
||
domain in contrast with <code>period</code> which enforces the interval
|
||
per vCPU. The value should be in range [1000, 1000000]. A
|
||
<code>global_period</code> with value 0 means no value.
|
||
<span class="since">Only QEMU driver support since 1.3.3</span>
|
||
</dd>
|
||
<dt><code>global_quota</code></dt>
|
||
<dd>
|
||
The optional <code>global_quota</code> element specifies the maximum
|
||
allowed bandwidth (unit: microseconds) within a period for the whole
|
||
domain. A domain with <code>global_quota</code> as any negative
|
||
value indicates that the domain has infinite bandwidth, which means that
|
||
it is not bandwidth controlled. The value should be in range
|
||
[1000, 18446744073709551] or less than 0. A <code>global_quota</code>
|
||
with value 0 means no value.
|
||
<span class="since">Only QEMU driver support since 1.3.3</span>
|
||
</dd>
|
||
|
||
<dt><code>emulator_period</code></dt>
|
||
<dd>
|
||
The optional <code>emulator_period</code> element specifies the enforcement
|
||
interval (unit: microseconds). Within <code>emulator_period</code>, emulator
|
||
threads (those excluding vCPUs) of the domain will not be allowed to consume
|
||
more than <code>emulator_quota</code> worth of runtime. The value should be
|
||
in range [1000, 1000000]. A period with value 0 means no value.
|
||
<span class="since">Only QEMU driver support since 0.10.0</span>
|
||
</dd>
|
||
<dt><code>emulator_quota</code></dt>
|
||
<dd>
|
||
The optional <code>emulator_quota</code> element specifies the maximum
|
||
allowed bandwidth (unit: microseconds) for domain's emulator threads (those
|
||
excluding vCPUs). A domain with <code>emulator_quota</code> as any negative
|
||
value indicates that the domain has infinite bandwidth for emulator threads
|
||
(those excluding vCPUs), which means that it is not bandwidth controlled.
|
||
The value should be in range [1000, 18446744073709551] or less than 0. A
|
||
quota with value 0 means no value.
|
||
<span class="since">Only QEMU driver support since 0.10.0</span>
|
||
</dd>
|
||
|
||
<dt><code>iothread_period</code></dt>
|
||
<dd>
|
||
The optional <code>iothread_period</code> element specifies the
|
||
enforcement interval (unit: microseconds) for IOThreads. Within
|
||
<code>iothread_period</code>, each IOThread of the domain will
|
||
not be allowed to consume more than <code>iothread_quota</code>
|
||
worth of runtime. The value should be in range [1000, 1000000].
|
||
An iothread_period with value 0 means no value.
|
||
<span class="since">Only QEMU driver support since 2.1.0</span>
|
||
</dd>
|
||
<dt><code>iothread_quota</code></dt>
|
||
<dd>
|
||
The optional <code>iothread_quota</code> element specifies the maximum
|
||
allowed bandwidth (unit: microseconds) for IOThreads. A domain with
|
||
<code>iothread_quota</code> as any negative value indicates that the
|
||
domain IOThreads have infinite bandwidth, which means that they are
|
||
not bandwidth controlled. The value should be in range
|
||
[1000, 18446744073709551] or less than 0. An <code>iothread_quota</code>
|
||
with value 0 means no value. You can use this feature to ensure that
|
||
all IOThreads run at the same speed.
|
||
<span class="since">Only QEMU driver support since 2.1.0</span>
|
||
</dd>
|
||
|
||
<dt><code>vcpusched</code> and <code>iothreadsched</code></dt>
|
||
<dd>
|
||
The optional <code>vcpusched</code> and <code>iothreadsched</code> elements specify the scheduler
|
||
type (values <code>batch</code>, <code>idle</code>, <code>fifo</code>,
|
||
<code>rr</code>) for particular vCPU/IOThread threads (based on
|
||
<code>vcpus</code> and <code>iothreads</code>, leaving out
|
||
<code>vcpus</code>/<code>iothreads</code> sets the default). Valid
|
||
<code>vcpus</code> values start at 0 through one less than the
|
||
number of vCPUs defined for the domain. Valid <code>iothreads</code>
|
||
values are described in the <code>iothreadids</code>
|
||
<a href="#elementsIOThreadsAllocation"><code>description</code></a>.
|
||
If no <code>iothreadids</code> are defined, then libvirt numbers
|
||
IOThreads from 1 to the number of <code>iothreads</code> available
|
||
for the domain. For real-time schedulers (<code>fifo</code>,
|
||
<code>rr</code>), priority must be specified as
|
||
well (and is ignored for non-real-time ones). The value range
|
||
for the priority depends on the host kernel (usually 1-99).
|
||
<span class="since">Since 1.2.13</span>
|
||
</dd>
|
||
|
||
<dt><code>cachetune</code><span class="since">Since 4.1.0</span></dt>
|
||
<dd>
|
||
The optional <code>cachetune</code> element can control allocations for CPU
caches using resctrl on the host. Whether or not this is supported
|
||
can be gathered from capabilities where some limitations like minimum
|
||
size and required granularity are reported as well. The required
|
||
attribute <code>vcpus</code> specifies to which vCPUs this allocation
|
||
applies. A vCPU can only be a member of one <code>cachetune</code> element
allocation. The vCPUs specified by <code>cachetune</code> can be identical to those
in <code>memorytune</code>; however, they are not allowed to partially overlap.
|
||
Supported subelements are:
|
||
<dl>
|
||
<dt><code>cache</code></dt>
|
||
<dd>
|
||
This element controls the allocation of CPU cache and has the
|
||
following attributes:
|
||
<dl>
|
||
<dt><code>level</code></dt>
|
||
<dd>
|
||
Host cache level from which to allocate.
|
||
</dd>
|
||
<dt><code>id</code></dt>
|
||
<dd>
|
||
Host cache id from which to allocate.
|
||
</dd>
|
||
<dt><code>type</code></dt>
|
||
<dd>
|
||
Type of allocation. Can be <code>code</code> for code
|
||
(instructions), <code>data</code> for data or <code>both</code>
|
||
for both code and data (unified). Currently the allocation can
|
||
be done only with the same type as the host supports, meaning
|
||
you cannot request <code>both</code> for a host with CDP
|
||
(code/data prioritization) enabled.
|
||
</dd>
|
||
<dt><code>size</code></dt>
|
||
<dd>
|
||
The size of the region to allocate. The value by default is in
|
||
bytes, but the <code>unit</code> attribute can be used to scale
|
||
the value.
|
||
</dd>
|
||
<dt><code>unit</code> (optional)</dt>
|
||
<dd>
|
||
If specified it is the unit such as KiB, MiB, GiB, or TiB
|
||
(described in the <code>memory</code> element
|
||
for <a href="#elementsMemoryAllocation">Memory Allocation</a>)
|
||
in which <code>size</code> is specified, defaults to bytes.
|
||
</dd>
|
||
</dl>
|
||
</dd>
|
||
</dl>
|
||
</dd>
|
||
|
||
<dt><code>memorytune</code><span class="since">Since 4.7.0</span></dt>
|
||
<dd>
|
||
The optional <code>memorytune</code> element can control allocations for
memory bandwidth using resctrl on the host. Whether or not this is
|
||
supported can be gathered from capabilities where some limitations like
|
||
minimum bandwidth and required granularity are reported as well. The
|
||
required attribute <code>vcpus</code> specifies to which vCPUs this
|
||
allocation applies. A vCPU can only be a member of one
|
||
<code>memorytune</code> element allocation. The <code>vcpus</code> specified
|
||
by <code>memorytune</code> can be identical to those specified by
|
||
<code>cachetune</code>; however, they are not allowed to partially overlap.
|
||
Supported subelements are:
|
||
<dl>
|
||
<dt><code>node</code></dt>
|
||
<dd>
|
||
This element controls the allocation of CPU memory bandwidth and has the
|
||
following attributes:
|
||
<dl>
|
||
<dt><code>id</code></dt>
|
||
<dd>
|
||
Host node id from which to allocate memory bandwidth.
|
||
</dd>
|
||
<dt><code>bandwidth</code></dt>
|
||
<dd>
|
||
The memory bandwidth to allocate from this node. The value is a
percentage by default.
|
||
</dd>
|
||
</dl>
|
||
</dd>
|
||
</dl>
|
||
</dd>
|
||
</dl>
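<p>
The relationship between the <code>period</code> and <code>quota</code>
style elements above can be illustrated with a short calculation. The sketch
below is an assumption-laden illustration rather than libvirt code: the name
<code>cpu_cap</code> is hypothetical, and it treats a value of 0 ("no value")
the same as an unset limit. The ranges are the ones quoted in this section.
</p>

```python
PERIOD_MIN, PERIOD_MAX = 1000, 1000000
QUOTA_MIN, QUOTA_MAX = 1000, 18446744073709551


def cpu_cap(period, quota):
    """Return quota/period as a fraction of one host CPU, or None if uncapped.

    Both values are in microseconds. A negative quota means the threads are
    not bandwidth controlled; 0 is treated here as "no value configured".
    """
    if period != 0 and not PERIOD_MIN <= period <= PERIOD_MAX:
        raise ValueError(f"period {period} outside [{PERIOD_MIN}, {PERIOD_MAX}]")
    if quota > 0 and not QUOTA_MIN <= quota <= QUOTA_MAX:
        raise ValueError(f"quota {quota} outside [{QUOTA_MIN}, {QUOTA_MAX}]")
    if period == 0 or quota <= 0:
        return None
    return quota / period


if __name__ == "__main__":
    print(cpu_cap(1000000, -1))      # values from the example: no cap
    print(cpu_cap(1000000, 500000))  # half of one host CPU per period
```

<p>
With the values from the example (<code>period</code> 1000000 and
<code>quota</code> -1) no cap applies; a <code>quota</code> of 500000 with the
same period would allow each vCPU half of one host CPU's time per period.
</p>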
|
||
|
||
|
||
<h3><a id="elementsMemoryAllocation">Memory Allocation</a></h3>
|
||
|
||
<pre>
|
||
<domain>
|
||
...
|
||
<maxMemory slots='16' unit='KiB'>1524288</maxMemory>
|
||
<memory unit='KiB'>524288</memory>
|
||
<currentMemory unit='KiB'>524288</currentMemory>
|
||
...
|
||
</domain>
|
||
</pre>
|
||
|
||
<dl>
|
||
<dt><code>memory</code></dt>
|
||
<dd>The maximum allocation of memory for the guest at boot time. The
|
||
memory allocation includes possible additional memory devices specified
|
||
at start or hotplugged later.
|
||
The units for this value are determined by the optional
|
||
attribute <code>unit</code>, which defaults to "KiB"
|
||
(kibibytes, 2<sup>10</sup> or blocks of 1024 bytes). Valid
|
||
units are "b" or "bytes" for bytes, "KB" for kilobytes
|
||
(10<sup>3</sup> or 1,000 bytes), "k" or "KiB" for kibibytes
|
||
(1024 bytes), "MB" for megabytes (10<sup>6</sup> or 1,000,000
|
||
bytes), "M" or "MiB" for mebibytes (2<sup>20</sup> or
|
||
1,048,576 bytes), "GB" for gigabytes (10<sup>9</sup> or
|
||
1,000,000,000 bytes), "G" or "GiB" for gibibytes
|
||
(2<sup>30</sup> or 1,073,741,824 bytes), "TB" for terabytes
|
||
(10<sup>12</sup> or 1,000,000,000,000 bytes), or "T" or "TiB"
|
||
for tebibytes (2<sup>40</sup> or 1,099,511,627,776 bytes).
|
||
However, the value will be rounded up to the nearest kibibyte
|
||
by libvirt, and may be further rounded to the granularity
|
||
supported by the hypervisor. Some hypervisors also enforce a
|
||
minimum, such as 4000KiB.
|
||
|
||
In case <a href="#elementsCPU">NUMA</a> is configured for the guest, the
|
||
<code>memory</code> element can be omitted.
|
||
|
||
In the case of a crash, the optional attribute <code>dumpCore</code>
|
||
can be used to control whether the guest memory should be
|
||
included in the generated coredump or not (values "on", "off").
|
||
|
||
<span class='since'><code>unit</code> since 0.9.11</span>,
|
||
<span class='since'><code>dumpCore</code> since 0.10.2
|
||
(QEMU only)</span></dd>
|
||
<dt><code>maxMemory</code></dt>
|
||
<dd>The run time maximum memory allocation of the guest. The initial
|
||
memory specified by either the <code><memory></code> element or
|
||
the NUMA cell size configuration can be increased by hot-plugging of
|
||
memory to the limit specified by this element.
|
||
|
||
The <code>unit</code> attribute behaves the same as for
|
||
<code><memory></code>.
|
||
|
||
The <code>slots</code> attribute specifies the number of slots
|
||
available for adding memory to the guest. The bounds are hypervisor
|
||
specific.
|
||
|
||
Note that due to alignment of the memory chunks added via memory
|
||
hotplug the full size allocation specified by this element may be
|
||
impossible to achieve.
|
||
<span class='since'>Since 1.2.14 supported by the QEMU driver.</span>
|
||
</dd>
|
||
|
||
<dt><code>currentMemory</code></dt>
|
||
<dd>The actual allocation of memory for the guest. This value can
|
||
be less than the maximum allocation, to allow for ballooning
|
||
up the guest's memory on the fly. If this is omitted, it defaults
|
||
to the same value as the <code>memory</code> element.
|
||
The <code>unit</code> attribute behaves the same as
|
||
for <code>memory</code>.</dd>
|
||
</dl>
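<p>
The unit handling described for <code>memory</code> can be sketched as a
small conversion table. The helper name <code>to_kib</code> is hypothetical,
and the sketch accepts only the exact spellings listed above; it models the
rounding up to the nearest kibibyte but not any further hypervisor-specific
rounding or minimum.
</p>

```python
# Bytes per unit, using the unit names listed in this section.
UNIT_BYTES = {
    "b": 1, "bytes": 1,
    "KB": 10**3, "k": 2**10, "KiB": 2**10,
    "MB": 10**6, "M": 2**20, "MiB": 2**20,
    "GB": 10**9, "G": 2**30, "GiB": 2**30,
    "TB": 10**12, "T": 2**40, "TiB": 2**40,
}


def to_kib(value, unit="KiB"):
    """Convert a memory value to KiB, rounding up to the nearest kibibyte."""
    if unit not in UNIT_BYTES:
        raise ValueError(f"unknown unit: {unit!r}")
    total_bytes = value * UNIT_BYTES[unit]
    return -(-total_bytes // 1024)  # ceiling division


if __name__ == "__main__":
    print(to_kib(524288))             # default unit is KiB
    print(to_kib(1, "G"))             # 2**30 bytes
    print(to_kib(1, "GB"))            # 10**9 bytes, rounded up
    print(to_kib(67108864, "bytes"))
```

<p>
Note how the decimal and binary suffixes differ: one "G" is 1048576 KiB,
while one "GB" rounds up to 976563 KiB.
</p>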
|
||
|
||
|
||
<h3><a id="elementsMemoryBacking">Memory Backing</a></h3>
|
||
|
||
<pre>
|
||
<domain>
|
||
...
|
||
<memoryBacking>
|
||
<hugepages>
|
||
<page size="1" unit="G" nodeset="0-3,5"/>
|
||
<page size="2" unit="M" nodeset="4"/>
|
||
</hugepages>
|
||
<nosharepages/>
|
||
<locked/>
|
||
<source type="file|anonymous"/>
|
||
<access mode="shared|private"/>
|
||
<allocation mode="immediate|ondemand"/>
|
||
<discard/>
|
||
</memoryBacking>
|
||
...
|
||
</domain>
|
||
</pre>
|
||
|
||
<p>The optional <code>memoryBacking</code> element may contain several
|
||
elements that influence how virtual memory pages are backed by host
|
||
pages.</p>
|
||
|
||
<dl>
|
||
<dt><code>hugepages</code></dt>
|
||
<dd>This tells the hypervisor that the guest should have its memory
|
||
allocated using hugepages instead of the normal native page size.
|
||
<span class='since'>Since 1.2.5</span> it's possible to set hugepages
|
||
more specifically per NUMA node. The <code>page</code> element is
|
||
introduced. It has one compulsory attribute <code>size</code> which
|
||
specifies which hugepages should be used (especially useful on systems
|
||
supporting hugepages of different sizes). The default unit for the
|
||
<code>size</code> attribute is kibibytes (multiplier of 1024). If you
want to use a different unit, use the optional <code>unit</code> attribute.
|
||
For systems with NUMA, the optional <code>nodeset</code> attribute may
|
||
come in handy as it ties the given guest's NUMA nodes to certain hugepage
|
||
sizes. From the example snippet, one gigabyte hugepages are used for
|
||
every NUMA node except node number four. For the correct syntax see
|
||
<a href="#elementsNUMATuning">this</a>.</dd>
|
||
<dt><code>nosharepages</code></dt>
|
||
<dd>Instructs the hypervisor to disable shared pages (memory merge, KSM) for
|
||
this domain. <span class="since">Since 1.0.6</span></dd>
|
||
<dt><code>locked</code></dt>
|
||
<dd>When set and supported by the hypervisor, memory pages belonging
|
||
to the domain will be locked in the host's memory and the host will not
|
||
be allowed to swap them out, which might be required for some
|
||
workloads such as real-time. For QEMU/KVM guests, the memory used by
|
||
the QEMU process itself will be locked too: unlike guest memory, this
|
||
is an amount libvirt has no way of figuring out in advance, so it has
|
||
to remove the limit on locked memory altogether. Thus, enabling this
|
||
option opens up a potential security risk: the host will be unable
|
||
to reclaim the locked memory back from the guest when it's running out
|
||
of memory, which means a malicious guest allocating large amounts of
|
||
locked memory could cause a denial-of-service attack on the host.
|
||
Because of this, using this option is discouraged unless your workload
|
||
demands it; even then, it's highly recommended to set a
|
||
<code>hard_limit</code> (see
|
||
<a href="#elementsMemoryTuning">memory tuning</a>) on memory allocation
|
||
suitable for the specific environment at the same time to mitigate
|
||
the risks described above. <span class="since">Since 1.0.6</span></dd>
|
||
<dt><code>source</code></dt>
|
||
<dd>Using the <code>type</code> attribute, it's possible to provide
"file" to utilize file memory backing or keep the default
|
||
"anonymous".</dd>
|
||
<dt><code>access</code></dt>
|
||
<dd>Using the <code>mode</code> attribute, specify if the memory is
|
||
to be "shared" or "private". This can be overridden per NUMA node by
|
||
<code>memAccess</code>.</dd>
|
||
<dt><code>allocation</code></dt>
|
||
<dd>Using the <code>mode</code> attribute, specify when to allocate
|
||
the memory by supplying either "immediate" or "ondemand".</dd>
|
||
<dt><code>discard</code></dt>
|
||
<dd>When set and supported by the hypervisor, the memory
content is discarded just before the guest shuts down (or
when a DIMM module is unplugged). Please note that this is
just an optimization and is not guaranteed to work in
all cases (e.g. when the hypervisor crashes).
|
||
<span class="since">Since 4.4.0</span> (QEMU/KVM only)
|
||
</dd>
|
||
</dl>
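<p>
To make the <code>nodeset</code> mapping in the example concrete, the
following sketch reads the <code>hugepages</code> snippet and reports the
page size chosen for each guest NUMA node. It is an illustration only: the
names <code>hugepage_sizes_kib</code> and <code>expand_nodeset</code> are
hypothetical, only a few unit spellings are handled, and storing a
<code>page</code> without <code>nodeset</code> under the key
<code>None</code> is a convention invented for this example.
</p>

```python
import xml.etree.ElementTree as ET

# KiB per unit; the default unit for the size attribute is KiB.
UNIT_KIB = {"KiB": 1, "k": 1, "MiB": 1024, "M": 1024, "GiB": 1024**2, "G": 1024**2}

EXAMPLE = """
<memoryBacking>
  <hugepages>
    <page size="1" unit="G" nodeset="0-3,5"/>
    <page size="2" unit="M" nodeset="4"/>
  </hugepages>
</memoryBacking>
"""


def expand_nodeset(spec):
    """Expand a nodeset such as "0-3,5" into a sorted list of node numbers."""
    nodes = set()
    for part in spec.split(","):
        part = part.strip()
        negate = part.startswith("^")
        if negate:
            part = part[1:]
        if "-" in part:
            first, last = part.split("-", 1)
            members = range(int(first), int(last) + 1)
        else:
            members = [int(part)]
        if negate:
            nodes.difference_update(members)
        else:
            nodes.update(members)
    return sorted(nodes)


def hugepage_sizes_kib(xml_text):
    """Map each guest NUMA node to its hugepage size in KiB."""
    sizes = {}
    for page in ET.fromstring(xml_text.strip()).iter("page"):
        size_kib = int(page.get("size")) * UNIT_KIB[page.get("unit", "KiB")]
        nodeset = page.get("nodeset")
        if nodeset is None:
            sizes[None] = size_kib  # no nodeset: recorded as the default size
            continue
        for node in expand_nodeset(nodeset):
            sizes[node] = size_kib
    return sizes


if __name__ == "__main__":
    for node, size in sorted(hugepage_sizes_kib(EXAMPLE).items()):
        print(f"guest node {node}: {size} KiB pages")
```

<p>
For the example snippet, nodes 0 to 3 and 5 map to 1048576 KiB (1 GiB) pages
and node 4 maps to 2048 KiB (2 MiB) pages.
</p>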
|
||
|
||
|
||
<h3><a id="elementsMemoryTuning">Memory Tuning</a></h3>
|
||
|
||
<pre>
|
||
<domain>
|
||
...
|
||
<memtune>
|
||
<hard_limit unit='G'>1</hard_limit>
|
||
<soft_limit unit='M'>128</soft_limit>
|
||
<swap_hard_limit unit='G'>2</swap_hard_limit>
|
||
<min_guarantee unit='bytes'>67108864</min_guarantee>
|
||
</memtune>
|
||
...
|
||
</domain>
|
||
</pre>
|
||
|
||
<dl>
|
||
<dt><code>memtune</code></dt>
|
||
<dd> The optional <code>memtune</code> element provides details
|
||
regarding the memory tunable parameters for the domain. If this is
|
||
omitted, it defaults to the OS provided defaults. For QEMU/KVM, the
|
||
parameters are applied to the QEMU process as a whole. Thus, when
|
||
counting them, one needs to add up guest RAM, guest video RAM, and
|
||
some memory overhead of QEMU itself. The last piece is hard to
|
||
determine so one needs to guess and try. For each tunable, it
|
||
is possible to designate which unit the number is in on
|
||
input, using the same values as
|
||
for <code><memory></code>. For backwards
|
||
compatibility, output is always in
|
||
KiB. <span class='since'><code>unit</code>
|
||
since 0.9.11</span>
|
||
Possible values for all *_limit parameters are in range from 0 to
|
||
VIR_DOMAIN_MEMORY_PARAM_UNLIMITED.</dd>
|
||
<dt><code>hard_limit</code></dt>
|
||
<dd> The optional <code>hard_limit</code> element is the maximum memory
|
||
the guest can use. The units for this value are kibibytes (i.e. blocks
|
||
of 1024 bytes). Users of QEMU and KVM are strongly advised not to set
|
||
this limit as the domain may get killed by the kernel if the guess is too
|
||
low, and determining the memory needed for a process to run is an
|
||
<a href="http://en.wikipedia.org/wiki/Undecidable_problem">
|
||
undecidable problem</a>; that said, if you already set
|
||
<code>locked</code> in
|
||
<a href="#elementsMemoryBacking">memory backing</a> because your
|
||
workload demands it, you'll have to take into account the specifics of
|
||
your deployment and figure out a value for <code>hard_limit</code> that
|
||
balances the risk of your guest being killed because the limit was set
|
||
too low and the risk of your host crashing because it cannot reclaim
|
||
the memory used by the guest due to <code>locked</code>. Good luck!</dd>
|
||
<dt><code>soft_limit</code></dt>
|
||
<dd> The optional <code>soft_limit</code> element is the memory limit to
|
||
enforce during memory contention. The units for this value are
|
||
kibibytes (i.e. blocks of 1024 bytes).</dd>
|
||
<dt><code>swap_hard_limit</code></dt>
|
||
<dd> The optional <code>swap_hard_limit</code> element is the maximum
|
||
memory plus swap the guest can use. The units for this value are
|
||
kibibytes (i.e. blocks of 1024 bytes). This has to be more than the
<code>hard_limit</code> value provided.</dd>
|
||
<dt><code>min_guarantee</code></dt>
|
||
<dd> The optional <code>min_guarantee</code> element is the guaranteed
|
||
minimum memory allocation for the guest. The units for this value are
|
||
kibibytes (i.e. blocks of 1024 bytes). This element is only supported
|
||
by VMware ESX and OpenVZ drivers.</dd>
|
||
</dl>
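<p>
Two of the rules above lend themselves to a quick sanity check: the
<code>hard_limit</code> for QEMU/KVM has to cover guest RAM, guest video RAM
and QEMU's own overhead, and <code>swap_hard_limit</code> has to be larger
than <code>hard_limit</code>. The sketch below uses hypothetical helper names
and works in KiB; the overhead figure in the usage example is an arbitrary
placeholder, since this section notes that the real value can only be found
by guessing and trying.
</p>

```python
def estimate_hard_limit_kib(guest_ram_kib, video_ram_kib, overhead_guess_kib):
    """Add up the pieces a QEMU/KVM hard_limit has to cover (all in KiB)."""
    return guest_ram_kib + video_ram_kib + overhead_guess_kib


def memtune_is_consistent(hard_limit_kib=None, swap_hard_limit_kib=None):
    """Check that swap_hard_limit, if given, exceeds hard_limit."""
    if hard_limit_kib is None or swap_hard_limit_kib is None:
        return True  # nothing to compare
    return swap_hard_limit_kib > hard_limit_kib


if __name__ == "__main__":
    # Values from the example: hard_limit 1 G, swap_hard_limit 2 G.
    print(memtune_is_consistent(1 * 1024**2, 2 * 1024**2))
    # 512 MiB guest RAM, 16 MiB video RAM, and a guessed 256 MiB overhead.
    print(estimate_hard_limit_kib(524288, 16384, 262144))
```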
|
||
|
||
|
||
<h3><a id="elementsNUMATuning">NUMA Node Tuning</a></h3>
|
||
|
||
<pre>
|
||
<domain>
|
||
...
|
||
<numatune>
|
||
<memory mode="strict" nodeset="1-4,^3"/>
|
||
<memnode cellid="0" mode="strict" nodeset="1"/>
|
||
<memnode cellid="2" mode="preferred" nodeset="2"/>
|
||
</numatune>
|
||
...
|
||
</domain>
|
||
</pre>
|
||
|
||
<dl>
|
||
<dt><code>numatune</code></dt>
|
||
<dd>
|
||
The optional <code>numatune</code> element provides details of
|
||
how to tune the performance of a NUMA host via controlling NUMA policy
|
||
for the domain process. NB, only supported by the QEMU driver.
|
||
<span class='since'>Since 0.9.3</span>
|
||
</dd>
|
||
<dt><code>memory</code></dt>
|
||
<dd>
|
||
The optional <code>memory</code> element specifies how to allocate memory
|
||
for the domain process on a NUMA host. It contains several optional
|
||
attributes. Attribute <code>mode</code> is either 'interleave',
|
||
'strict', or 'preferred', defaults to 'strict'. Attribute
|
||
<code>nodeset</code> specifies the NUMA nodes, using the same syntax as
|
||
attribute <code>cpuset</code> of element <code>vcpu</code>. Attribute
|
||
<code>placement</code> (<span class='since'>since 0.9.12</span>) can be
|
||
used to indicate the memory placement mode for the domain process; its value
|
||
can be either "static" or "auto", defaults to <code>placement</code> of
|
||
<code>vcpu</code>, or "static" if <code>nodeset</code> is specified.
|
||
"auto" indicates the domain process will only allocate memory from the
|
||
advisory nodeset returned from querying numad, and the value of attribute
|
||
<code>nodeset</code> will be ignored if it's specified.
|
||
|
||
If <code>placement</code> of <code>vcpu</code> is 'auto', and
|
||
<code>numatune</code> is not specified, a default <code>numatune</code>
|
||
with <code>placement</code> 'auto' and <code>mode</code> 'strict' will
|
||
be added implicitly.
|
||
|
||
<span class='since'>Since 0.9.3</span>
|
||
</dd>
|
||
<dt><code>memnode</code></dt>
|
||
<dd>
|
||
Optional <code>memnode</code> elements can specify memory allocation
|
||
policies for each guest NUMA node. For those nodes having no
|
||
corresponding <code>memnode</code> element, the default from
|
||
element <code>memory</code> will be used. Attribute <code>cellid</code>
|
||
addresses the guest NUMA node for which the settings are applied.
|
||
Attributes <code>mode</code> and <code>nodeset</code> have the same
|
||
meaning and syntax as in <code>memory</code> element.
|
||
|
||
This setting is not compatible with automatic placement.
|
||
<span class='since'>QEMU Since 1.2.7</span>
|
||
</dd>
|
||
</dl>
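<p>
The defaulting rules for the <code>memory</code> element's attributes can be
summarized in a few lines. This is a sketch of the rules as documented above,
not of libvirt's parser: the function name
<code>effective_numatune_memory</code> is hypothetical, and the caller is
assumed to pass in the already-resolved <code>placement</code> of
<code>vcpu</code>.
</p>

```python
VALID_MODES = ("interleave", "strict", "preferred")


def effective_numatune_memory(mode=None, nodeset=None, placement=None,
                              vcpu_placement="static"):
    """Resolve the defaults of numatune's memory element (illustrative only)."""
    if mode is None:
        mode = "strict"  # documented default
    if mode not in VALID_MODES:
        raise ValueError(f"invalid mode: {mode!r}")

    if placement is None:
        # "static" if a nodeset is given, else follow the vcpu placement.
        placement = "static" if nodeset is not None else vcpu_placement
    if placement not in ("static", "auto"):
        raise ValueError(f"invalid placement: {placement!r}")

    if placement == "auto":
        # The advisory nodeset from numad is used; an explicit one is ignored.
        nodeset = None
    return {"mode": mode, "placement": placement, "nodeset": nodeset}


if __name__ == "__main__":
    print(effective_numatune_memory(nodeset="1-4,^3"))
    print(effective_numatune_memory(vcpu_placement="auto"))
```

<p>
The second call mirrors the implicit <code>numatune</code> that is added when
<code>placement</code> of <code>vcpu</code> is 'auto': the result has
<code>placement</code> 'auto' and <code>mode</code> 'strict'.
</p>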
|
||
|
||
|
||
<h3><a id="elementsBlockTuning">Block I/O Tuning</a></h3>
|
||
<pre>
|
||
<domain>
|
||
...
|
||
<blkiotune>
|
||
<weight>800</weight>
|
||
<device>
|
||
<path>/dev/sda</path>
|
||
<weight>1000</weight>
|
||
</device>
|
||
<device>
|
||
<path>/dev/sdb</path>
|
||
<weight>500</weight>
|
||
<read_bytes_sec>10000</read_bytes_sec>
|
||
<write_bytes_sec>10000</write_bytes_sec>
|
||
<read_iops_sec>20000</read_iops_sec>
|
||
<write_iops_sec>20000</write_iops_sec>
|
||
</device>
|
||
</blkiotune>
|
||
...
|
||
</domain>
|
||
</pre>
|
||
|
||
<dl>
|
||
<dt><code>blkiotune</code></dt>
|
||
<dd> The optional <code>blkiotune</code> element provides the ability
|
||
to tune Blkio cgroup tunable parameters for the domain. If this is
|
||
omitted, it defaults to the OS provided
|
||
defaults. <span class="since">Since 0.8.8</span></dd>
|
||
<dt><code>weight</code></dt>
|
||
<dd> The optional <code>weight</code> element is the overall I/O
|
||
weight of the guest. The value should be in the range [100,
|
||
1000]. After kernel 2.6.39, the value could be in the
|
||
range [10, 1000].</dd>
|
||
<dt><code>device</code></dt>
|
||
<dd>The domain may have multiple <code>device</code> elements
|
||
that further tune the weights for each host block device in
|
||
use by the domain. Note that
|
||
multiple <a href="#elementsDisks">guest disks</a> can share a
|
||
single host block device, if they are backed by files within
|
||
the same host file system, which is why this tuning parameter
|
||
is at the global domain level rather than associated with each
|
||
guest disk device (contrast this to
|
||
the <a href="#elementsDisks"><code><iotune></code></a>
|
||
element which can apply to an
|
||
individual <code><disk></code>).
|
||
Each <code>device</code> element has two
|
||
mandatory sub-elements, <code>path</code> describing the
|
||
absolute path of the device, and <code>weight</code> giving
|
||
the relative weight of that device, in the range [100,
|
||
1000]. After kernel 2.6.39, the value could be in the
|
||
range [10, 1000]. <span class="since">Since 0.9.8</span><br/>
|
||
Additionally, the following optional sub-elements can be used:
|
||
<dl>
|
||
<dt><code>read_bytes_sec</code></dt>
|
||
<dd>Read throughput limit in bytes per second.
|
||
<span class="since">Since 1.2.2</span></dd>
|
||
<dt><code>write_bytes_sec</code></dt>
|
||
<dd>Write throughput limit in bytes per second.
|
||
<span class="since">Since 1.2.2</span></dd>
|
||
<dt><code>read_iops_sec</code></dt>
|
||
<dd>Read I/O operations per second limit.
|
||
<span class="since">Since 1.2.2</span></dd>
|
||
<dt><code>write_iops_sec</code></dt>
|
||
<dd>Write I/O operations per second limit.
|
||
<span class="since">Since 1.2.2</span></dd>
|
||
</dl></dd></dl>
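<p>
The constraints on a <code>device</code> entry can be checked with a small
helper before the XML is generated. The sketch below follows the rules as
stated in this section (mandatory absolute <code>path</code>, mandatory
<code>weight</code> within range); the function name
<code>validate_blkio_device</code> and the dictionary layout are hypothetical,
and the non-negative check on the throttling values is an assumption. Pass
<code>min_weight=10</code> for kernels after 2.6.39.
</p>

```python
THROTTLE_KEYS = ("read_bytes_sec", "write_bytes_sec",
                 "read_iops_sec", "write_iops_sec")


def validate_blkio_device(device, min_weight=100, max_weight=1000):
    """Return a list of problems found in one blkiotune device entry."""
    problems = []

    path = device.get("path")
    if not path or not path.startswith("/"):
        problems.append("path must be an absolute path")

    weight = device.get("weight")
    if weight is None:
        problems.append("weight is mandatory")
    elif not min_weight <= weight <= max_weight:
        problems.append(f"weight must be in [{min_weight}, {max_weight}]")

    for key in THROTTLE_KEYS:
        # Assumption: throttling limits are non-negative integers.
        if key in device and device[key] < 0:
            problems.append(f"{key} must not be negative")
    return problems


if __name__ == "__main__":
    sdb = {"path": "/dev/sdb", "weight": 500,
           "read_bytes_sec": 10000, "write_bytes_sec": 10000,
           "read_iops_sec": 20000, "write_iops_sec": 20000}
    print(validate_blkio_device(sdb))  # the second device from the example
    print(validate_blkio_device({"path": "sda", "weight": 50}))
```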
|
||
|
||
|
||
<h3><a id="resPartition">Resource partitioning</a></h3>
|
||
|
||
<p>
|
||
Hypervisors may allow for virtual machines to be placed into
|
||
resource partitions, potentially with nesting of said partitions.
|
||
The <code>resource</code> element groups together configuration
|
||
related to resource partitioning. It currently supports a child
|
||
element <code>partition</code> whose content defines the absolute path
|
||
of the resource partition in which to place the domain. If no
|
||
partition is listed, then the domain will be placed in a default
|
||
partition. It is the responsibility of the app/admin to ensure
|
||
that the partition exists prior to starting the guest. Only the
|
||
(hypervisor specific) default partition can be assumed to exist
|
||
by default.
|
||
</p>
|
||
<pre>
|
||
...
|
||
<resource>
|
||
<partition>/virtualmachines/production</partition>
|
||
</resource>
|
||
...
|
||
</pre>
|
||
|
||
<p>
|
||
Resource partitions are currently supported by the QEMU and
|
||
LXC drivers, which map partition paths to cgroups directories,
|
||
in all mounted controllers. <span class="since">Since 1.0.5</span>
|
||
</p>
|
||
|
||
<h3><a id="elementsCPU">CPU model and topology</a></h3>
|
||
|
||
<p>
|
||
Requirements for CPU model, its features and topology can be specified
|
||
using the following collection of elements.
|
||
<span class="since">Since 0.7.5</span>
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<cpu match='exact'>
|
||
<model fallback='allow'>core2duo</model>
|
||
<vendor>Intel</vendor>
|
||
<topology sockets='1' cores='2' threads='1'/>
|
||
<cache level='3' mode='emulate'/>
|
||
<feature policy='disable' name='lahf_lm'/>
|
||
</cpu>
|
||
...</pre>
|
||
|
||
<pre>
|
||
<cpu mode='host-model'>
|
||
<model fallback='forbid'/>
|
||
<topology sockets='1' cores='2' threads='1'/>
|
||
</cpu>
|
||
...</pre>
|
||
|
||
<pre>
|
||
<cpu mode='host-passthrough'>
|
||
<cache mode='passthrough'/>
|
||
<feature policy='disable' name='lahf_lm'/>
|
||
...</pre>
|
||
|
||
<p>
|
||
In case no restrictions need to be put on CPU model and its features, a
|
||
simpler <code>cpu</code> element can be used.
|
||
<span class="since">Since 0.7.6</span>
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<cpu>
|
||
<topology sockets='1' cores='2' threads='1'/>
|
||
</cpu>
|
||
...</pre>
|
||
|
||
<dl>
|
||
<dt><code>cpu</code></dt>
|
||
<dd>The <code>cpu</code> element is the main container for describing
|
||
guest CPU requirements. Its <code>match</code> attribute specifies how
|
||
strictly the virtual CPU provided to the guest matches these
|
||
requirements. <span class="since">Since 0.7.6</span> the
|
||
<code>match</code> attribute can be omitted if <code>topology</code>
|
||
is the only element within <code>cpu</code>. Possible values for the
|
||
<code>match</code> attribute are:
|
||
|
||
<dl>
|
||
<dt><code>minimum</code></dt>
|
||
<dd>The specified CPU model and features describe the minimum
|
||
requested CPU. A better CPU will be provided to the guest if it
|
||
is possible with the requested hypervisor on the current host.
|
||
This is a constrained <code>host-model</code> mode; the domain
|
||
will not be created if the provided virtual CPU does not meet
|
||
the requirements.</dd>
|
||
|
||
<dt><code>exact</code></dt>
|
||
<dd>The virtual CPU provided to the guest should exactly match the
|
||
specification. If such a CPU is not supported, libvirt will refuse
|
||
to start the domain.</dd>
|
||
|
||
<dt><code>strict</code></dt>
|
||
<dd>The domain will not be created unless the host CPU exactly
|
||
matches the specification. This is not very useful in practice
|
||
and should only be used if there is a real reason.</dd>
|
||
</dl>
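<p>
As an illustrative sketch (the model and feature names are simply reused
from the first example in this section, not a recommendation), a
<code>minimum</code> match asks for at least a <code>core2duo</code>
with <code>lahf_lm</code>, while leaving the hypervisor free to provide
a better CPU if the host allows it:
</p>
<pre>
...
<cpu match='minimum'>
  <model>core2duo</model>
  <feature policy='require' name='lahf_lm'/>
</cpu>
...</pre>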
|
||
|
||
<span class="since">Since 0.8.5</span> the <code>match</code>
|
||
attribute can be omitted and will default to <code>exact</code>.
|
||
|
||
Sometimes the hypervisor is not able to create a virtual CPU exactly
|
||
matching the specification passed by libvirt.
|
||
<span class="since">Since 3.2.0</span>, an optional <code>check</code>
|
||
attribute can be used to request a specific way of checking whether
|
||
the virtual CPU matches the specification. It is usually safe to omit
|
||
this attribute when starting a domain and stick with the default
|
||
value. Once the domain starts, libvirt will automatically change the
|
||
<code>check</code> attribute to the best supported value to ensure the
|
||
virtual CPU does not change when the domain is migrated to another
|
||
host. The following values can be used:
|
||
|
||
<dl>
|
||
<dt><code>none</code></dt>
|
||
<dd>Libvirt does no checking and it is up to the hypervisor to
|
||
refuse to start the domain if it cannot provide the requested CPU.
|
||
With QEMU this means no checking is done at all since the default
|
||
behavior of QEMU is to emit warnings, but start the domain anyway.
|
||
</dd>
|
||
|
||
<dt><code>partial</code></dt>
|
||
<dd>Libvirt will check the guest CPU specification before starting
|
||
a domain, but the rest is left to the hypervisor. It can still
|
||
provide a different virtual CPU.</dd>
|
||
|
||
<dt><code>full</code></dt>
|
||
<dd>The virtual CPU created by the hypervisor will be checked
|
||
against the CPU specification and the domain will not be started
|
||
unless the two CPUs match.</dd>
|
||
</dl>
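<p>
A minimal sketch of requesting the strictest check explicitly. The
model name is only an example, and, as noted above, it is usually
safe to omit <code>check</code> and rely on the default:
</p>
<pre>
...
<cpu mode='custom' match='exact' check='full'>
  <model fallback='forbid'>core2duo</model>
</cpu>
...</pre>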
|
||
|
||
<span class="since">Since 0.9.10</span>, an optional <code>mode</code>
|
||
attribute may be used to make it easier to configure a guest CPU to be
|
||
as close to host CPU as possible. Possible values for the
|
||
<code>mode</code> attribute are:
|
||
|
||
<dl>
|
||
<dt><code>custom</code></dt>
|
||
<dd>In this mode, the <code>cpu</code> element describes the CPU
|
||
that should be presented to the guest. This is the default when no
|
||
<code>mode</code> attribute is specified. This mode makes it so that
|
||
a persistent guest will see the same hardware no matter what host
|
||
the guest is booted on.</dd>
|
||
<dt><code>host-model</code></dt>
|
||
<dd>The <code>host-model</code> mode is essentially a shortcut to
|
||
copying host CPU definition from capabilities XML into domain XML.
|
||
Since the CPU definition is copied just before starting a domain,
|
||
exactly the same XML can be used on different hosts while still
|
||
providing the best guest CPU each host supports. The
|
||
<code>match</code> attribute can't be used in this mode. Specifying
|
||
CPU model is not supported either, but <code>model</code>'s
|
||
<code>fallback</code> attribute may still be used. Using the
|
||
<code>feature</code> element, specific flags may be enabled or
|
||
disabled individually in addition to the host model. This may be
|
||
used to fine tune features that can be emulated.
|
||
<span class="since">(Since 1.1.1)</span>.
|
||
Libvirt does not model every aspect of each CPU so
|
||
the guest CPU will not match the host CPU exactly. On the other
|
||
hand, the ABI provided to the guest is reproducible. During
|
||
migration, complete CPU model definition is transferred to the
|
||
destination host so the migrated guest will see exactly the same CPU
|
||
model even if the destination host contains more capable CPUs for
|
||
the running instance of the guest; but shutting down and restarting
|
||
the guest may present different hardware to the guest according to
|
||
the capabilities of the new host. Prior to libvirt 3.2.0 and QEMU
|
||
2.9.0 detection of the host CPU model via QEMU is not supported.
|
||
Thus the CPU configuration created using <code>host-model</code>
|
||
may not work as expected.
|
||
<span class="since">Since 3.2.0 and QEMU 2.9.0</span> this mode
|
||
works the way it was designed and it is indicated by the
|
||
<code>fallback</code> attribute set to <code>forbid</code> in the
|
||
host-model CPU definition advertised in
|
||
<a href="formatdomaincaps.html#elementsCPU">domain capabilities XML</a>.
|
||
When <code>fallback</code> attribute is set to <code>allow</code>
|
||
in the domain capabilities XML, it is recommended to use
|
||
<code>custom</code> mode with just the CPU model from the host
|
||
capabilities XML. <span class="since">Since 1.2.11</span> PowerISA
|
||
allows processors to run VMs in binary compatibility mode supporting
|
||
an older version of ISA. Libvirt on PowerPC architecture uses the
|
||
<code>host-model</code> to signify a guest mode CPU running in
|
||
binary compatibility mode. Example:
|
||
When a user needs a power7 VM to run in compatibility mode
|
||
on a Power8 host, this can be described in XML as follows:
|
||
<pre>
|
||
<cpu mode='host-model'>
|
||
<model>power7</model>
|
||
</cpu>
|
||
...</pre>
|
||
</dd>
|
||
<dt><code>host-passthrough</code></dt>
|
||
<dd>With this mode, the CPU visible to the guest should be exactly
|
||
the same as the host CPU even in the aspects that libvirt does not
|
||
understand. The downside of this mode is that the guest
|
||
environment cannot be reproduced on different hardware. Thus, if you
|
||
hit any bugs, you are on your own. Further details of that CPU can
|
||
be changed using <code>feature</code> elements. Migration of a guest
|
||
using host-passthrough is dangerous if the source and destination
|
||
hosts are not identical in both hardware and configuration. If such
|
||
a migration is attempted then the guest may hang or crash upon
|
||
resuming execution on the destination host.</dd>
|
||
</dl>
|
||
|
||
Both <code>host-model</code> and <code>host-passthrough</code> modes
|
||
make sense when a domain can run directly on the host CPUs (for
|
||
example, domains with type <code>kvm</code>). The actual host CPU is
|
||
irrelevant for domains with emulated virtual CPUs (such as domains with
|
||
type <code>qemu</code>). However, for backward compatibility
|
||
<code>host-model</code> may be implemented even for domains running on
|
||
emulated CPUs in which case the best CPU the hypervisor is able to
|
||
emulate may be used rather than trying to mimic the host CPU model.
|
||
</dd>
|
||
|
||
<dt><code>model</code></dt>
|
||
<dd>The content of the <code>model</code> element specifies CPU model
|
||
requested by the guest. The list of available CPU models and their
|
||
definition can be found in <code>cpu_map.xml</code> file installed
|
||
in libvirt's data directory. If a hypervisor is not able to use the
|
||
exact CPU model, libvirt automatically falls back to the closest model
|
||
supported by the hypervisor while maintaining the list of CPU
|
||
features. <span class="since">Since 0.9.10</span>, an optional
|
||
<code>fallback</code> attribute can be used to forbid this behavior,
|
||
in which case an attempt to start a domain requesting an unsupported
|
||
CPU model will fail. Supported values for <code>fallback</code>
|
||
attribute are: <code>allow</code> (this is the default), and
|
||
<code>forbid</code>. The optional <code>vendor_id</code> attribute
|
||
(<span class="since">Since 0.10.0</span>) can be used to set the
|
||
vendor id seen by the guest. It must be exactly 12 characters long.
|
||
If not set the vendor id of the host is used. Typical possible
|
||
values are "AuthenticAMD" and "GenuineIntel".</dd>
|
||
|
||
<dt><code>vendor</code></dt>
|
||
<dd><span class="since">Since 0.8.3</span> the content of the
|
||
<code>vendor</code> element specifies CPU vendor requested by the
|
||
guest. If this element is missing, the guest can be run on a CPU
|
||
matching the given features regardless of its vendor. The list of
|
||
supported vendors can be found in <code>cpu_map.xml</code>.</dd>
|
||
|
||
<dt><code>topology</code></dt>
|
||
<dd>The <code>topology</code> element specifies requested topology of
|
||
virtual CPU provided to the guest. Three non-zero values have to be
|
||
given for <code>sockets</code>, <code>cores</code>, and
|
||
<code>threads</code>: total number of CPU sockets, number of cores per
|
||
socket, and number of threads per core, respectively. Hypervisors may
|
||
require that the maximum number of vCPUs specified by the
|
||
<code>cpus</code> element equals the number of vCPUs resulting
|
||
from the topology.</dd>
|
||
|
||
<dt><code>feature</code></dt>
|
||
<dd>The <code>cpu</code> element can contain zero or more
|
||
<code>feature</code> elements used to fine-tune features provided by the
|
||
selected CPU model. The list of known feature names can be found in
|
||
the same file as CPU models. The meaning of each <code>feature</code>
|
||
element depends on its <code>policy</code> attribute, which has to be
|
||
set to one of the following values:
|
||
|
||
<dl>
|
||
<dt><code>force</code></dt>
|
||
<dd>The virtual CPU will claim the feature is supported regardless
|
||
of it being supported by host CPU.</dd>
|
||
<dt><code>require</code></dt>
|
||
<dd>Guest creation will fail unless the feature is supported by the
|
||
host CPU or the hypervisor is able to emulate it.</dd>
|
||
<dt><code>optional</code></dt>
|
||
<dd>The feature will be supported by virtual CPU if and only if it
|
||
is supported by host CPU.</dd>
|
||
<dt><code>disable</code></dt>
|
||
<dd>The feature will not be supported by virtual CPU.</dd>
|
||
<dt><code>forbid</code></dt>
|
||
<dd>Guest creation will fail if the feature is supported by host
|
||
CPU.</dd>
|
||
</dl>
|
||
|
||
<span class="since">Since 0.8.5</span> the <code>policy</code>
|
||
attribute can be omitted and will default to <code>require</code>.
|
||
|
||
<p> Individual CPU feature names are specified as part of the
|
||
<code>name</code> attribute. For example, to explicitly specify
|
||
the 'pcid' feature with Intel IvyBridge CPU model:
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<cpu match='exact'>
|
||
<model fallback='forbid'>IvyBridge</model>
|
||
<vendor>Intel</vendor>
|
||
<feature policy='require' name='pcid'/>
|
||
</cpu>
|
||
...</pre>
|
||
|
||
</dd>
|
||
|
||
<dt><code>cache</code></dt>
|
||
<dd><span class="since">Since 3.3.0</span> the <code>cache</code>
|
||
element describes the virtual CPU cache. If the element is missing,
|
||
the hypervisor will use a sensible default.
|
||
|
||
<dl>
|
||
<dt><code>level</code></dt>
|
||
<dd>This optional attribute specifies which cache level is described
|
||
by the element. Missing attribute means the element describes all
|
||
CPU cache levels at once. Mixing <code>cache</code> elements with
|
||
the <code>level</code> attribute set and those without the
|
||
attribute is forbidden.</dd>
|
||
|
||
<dt><code>mode</code></dt>
|
||
<dd>
|
||
The following values are supported:
|
||
<dl>
|
||
<dt><code>emulate</code></dt>
|
||
<dd>The hypervisor will provide fake CPU cache data.</dd>
|
||
|
||
<dt><code>passthrough</code></dt>
|
||
<dd>The real CPU cache data reported by the host CPU will be
|
||
passed through to the virtual CPU.</dd>
|
||
|
||
<dt><code>disable</code></dt>
|
||
<dd>The virtual CPU will report no CPU cache of the specified
|
||
level (or no cache at all if the <code>level</code> attribute
|
||
is missing).</dd>
|
||
</dl>
|
||
</dd>
|
||
</dl>
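<p>
A minimal sketch of a <code>cache</code> element without the
<code>level</code> attribute, which therefore describes all cache
levels at once. Whether a given hypervisor accepts this particular
combination of CPU mode and cache mode is an assumption that should
be checked against the driver in use:
</p>
<pre>
...
<cpu mode='host-model'>
  <cache mode='disable'/>
</cpu>
...</pre>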
|
||
</dd>
|
||
</dl>
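<p>
The defaults described above can be combined in one short sketch. The
model, vendor id and feature name are illustrative only. Here
<code>fallback='forbid'</code> makes startup fail instead of silently
choosing another model, <code>vendor_id</code> is exactly 12 characters
long, and the <code>feature</code> element without a
<code>policy</code> attribute is treated as <code>require</code>:
</p>
<pre>
...
<cpu match='exact'>
  <model fallback='forbid' vendor_id='GenuineIntel'>core2duo</model>
  <feature name='lahf_lm'/>
</cpu>
...</pre>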
|
||
|
||
<p>
|
||
Guest NUMA topology can be specified using the <code>numa</code> element.
|
||
<span class="since">Since 0.9.8</span>
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<cpu>
|
||
...
|
||
<numa>
|
||
<cell id='0' cpus='0-3' memory='512000' unit='KiB' discard='yes'/>
|
||
<cell id='1' cpus='4-7' memory='512000' unit='KiB' memAccess='shared'/>
|
||
</numa>
|
||
...
|
||
</cpu>
|
||
...</pre>
|
||
|
||
<p>
|
||
Each <code>cell</code> element specifies a NUMA cell or a NUMA node.
|
||
<code>cpus</code> specifies the CPU or range of CPUs that are
|
||
part of the node. <code>memory</code> specifies the node memory
|
||
in kibibytes (i.e. blocks of 1024 bytes).
|
||
<span class="since">Since 1.2.11</span> one can use an additional <a
|
||
href="#elementsMemoryAllocation"><code>unit</code></a> attribute to
|
||
define units in which <code>memory</code> is specified.
|
||
<span class="since">Since 1.2.7</span> all cells should
|
||
have an <code>id</code> attribute in case referring to some cell is
|
||
necessary in the code, otherwise the cells are
|
||
assigned <code>id</code>s in increasing order starting from
|
||
0. Mixing cells with and without the <code>id</code> attribute
|
||
is not recommended as it may result in unwanted behaviour.
|
||
|
||
<span class='since'>Since 1.2.9</span> the optional attribute
|
||
<code>memAccess</code> can control whether the memory is to be
|
||
mapped as "shared" or "private". This is valid only for
|
||
hugepages-backed memory and nvdimm modules.
|
||
|
||
Each <code>cell</code> element can have an optional
|
||
<code>discard</code> attribute which fine tunes the discard
|
||
feature for given numa node as described under
|
||
<a href="#elementsMemoryBacking">Memory Backing</a>.
|
||
Accepted values are <code>yes</code> and <code>no</code>.
|
||
<span class='since'>Since 4.4.0</span>
|
||
</p>
|
||
|
||
<p>
|
||
This guest NUMA specification is currently available only for
|
||
QEMU/KVM and Xen.
|
||
</p>
|
||
|
||
<p>
|
||
A NUMA hardware architecture supports the notion of distances
|
||
between NUMA cells. <span class="since">Since 3.10.0</span> it
|
||
is possible to define the distance between NUMA cells using the
|
||
<code>distances</code> element within a NUMA <code>cell</code>
|
||
description. The <code>sibling</code> sub-element is used to
|
||
specify the distance value between sibling NUMA cells. For more
|
||
details, see the chapter explaining the system's SLIT (System
|
||
Locality Information Table) within the ACPI (Advanced
|
||
Configuration and Power Interface) specification.
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<cpu>
|
||
...
|
||
<numa>
|
||
<cell id='0' cpus='0,4-7' memory='512000' unit='KiB'>
|
||
<distances>
|
||
<sibling id='0' value='10'/>
|
||
<sibling id='1' value='21'/>
|
||
<sibling id='2' value='31'/>
|
||
<sibling id='3' value='41'/>
|
||
</distances>
|
||
</cell>
|
||
<cell id='1' cpus='1,8-10,12-15' memory='512000' unit='KiB' memAccess='shared'>
|
||
<distances>
|
||
<sibling id='0' value='21'/>
|
||
<sibling id='1' value='10'/>
|
||
<sibling id='2' value='21'/>
|
||
<sibling id='3' value='31'/>
|
||
</distances>
|
||
</cell>
|
||
<cell id='2' cpus='2,11' memory='512000' unit='KiB' memAccess='shared'>
|
||
<distances>
|
||
<sibling id='0' value='31'/>
|
||
<sibling id='1' value='21'/>
|
||
<sibling id='2' value='10'/>
|
||
<sibling id='3' value='21'/>
|
||
</distances>
|
||
</cell>
|
||
<cell id='3' cpus='3' memory='512000' unit='KiB'>
|
||
<distances>
|
||
<sibling id='0' value='41'/>
|
||
<sibling id='1' value='31'/>
|
||
<sibling id='2' value='21'/>
|
||
<sibling id='3' value='10'/>
|
||
</distances>
|
||
</cell>
|
||
</numa>
|
||
...
|
||
</cpu>
|
||
...</pre>
|
||
|
||
<p>
|
||
Describing distances between NUMA cells is currently only supported
|
||
by Xen and QEMU. If no <code>distances</code> are given to describe
|
||
the SLIT data between different cells, it will default to a scheme
|
||
using 10 for local and 20 for remote distances.
|
||
</p>
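<p>
As a sketch of what that default scheme amounts to, the following
hypothetical two-cell guest spells out the same values explicitly
(10 for the local cell, 20 for the remote one). The CPU ranges and
memory sizes are placeholders; under the defaults described above,
omitting both <code>distances</code> elements should describe the
same SLIT data:
</p>
<pre>
...
<cpu>
  ...
  <numa>
    <cell id='0' cpus='0-1' memory='512000' unit='KiB'>
      <distances>
        <sibling id='0' value='10'/>
        <sibling id='1' value='20'/>
      </distances>
    </cell>
    <cell id='1' cpus='2-3' memory='512000' unit='KiB'>
      <distances>
        <sibling id='0' value='20'/>
        <sibling id='1' value='10'/>
      </distances>
    </cell>
  </numa>
  ...
</cpu>
...</pre>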
|
||
|
||
<h3><a id="elementsEvents">Events configuration</a></h3>
|
||
|
||
<p>
|
||
It is sometimes necessary to override the default actions taken
|
||
on various events. Not all hypervisors support all events and actions.
|
||
The actions may be taken as a result of calls to libvirt APIs
|
||
<a href="html/libvirt-libvirt-domain.html#virDomainReboot">
|
||
<code>virDomainReboot</code>
|
||
</a>,
|
||
<a href="html/libvirt-libvirt-domain.html#virDomainShutdown">
|
||
<code>virDomainShutdown</code>
|
||
</a>,
|
||
or
|
||
<a href="html/libvirt-libvirt-domain.html#virDomainShutdownFlags">
|
||
<code>virDomainShutdownFlags</code>
|
||
</a>.
|
||
Using <code>virsh reboot</code> or <code>virsh shutdown</code> would
|
||
also trigger the event.
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<on_poweroff>destroy</on_poweroff>
|
||
<on_reboot>restart</on_reboot>
|
||
<on_crash>restart</on_crash>
|
||
<on_lockfailure>poweroff</on_lockfailure>
|
||
...</pre>
|
||
|
||
<p>
|
||
The following collections of elements allow the actions to be
|
||
specified when a guest OS triggers a lifecycle operation. A
|
||
common use case is to force a reboot to be treated as a poweroff
|
||
when doing the initial OS installation. This allows the VM to be
|
||
re-configured for the first post-install bootup.
|
||
</p>
|
||
<dl>
|
||
<dt><code>on_poweroff</code></dt>
|
||
<dd>The content of this element specifies the action to take when
|
||
the guest requests a poweroff.</dd>
|
||
<dt><code>on_reboot</code></dt>
|
||
<dd>The content of this element specifies the action to take when
|
||
the guest requests a reboot.</dd>
|
||
<dt><code>on_crash</code></dt>
|
||
<dd>The content of this element specifies the action to take when
|
||
the guest crashes.</dd>
|
||
</dl>
|
||
|
||
<p>
|
||
Each of these states allows for the same four possible actions.
|
||
</p>
|
||
|
||
<dl>
|
||
<dt><code>destroy</code></dt>
|
||
<dd>The domain will be terminated completely and all resources
|
||
released.</dd>
|
||
<dt><code>restart</code></dt>
|
||
<dd>The domain will be terminated and then restarted with
|
||
the same configuration.</dd>
|
||
<dt><code>preserve</code></dt>
|
||
<dd>The domain will be terminated and its resources preserved
|
||
to allow analysis.</dd>
|
||
<dt><code>rename-restart</code></dt>
|
||
<dd>The domain will be terminated and then restarted with
|
||
a new name.</dd>
|
||
</dl>
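<p>
The installation use case mentioned above could be sketched as
follows. This is one possible configuration, not the only one: with
<code>on_reboot</code> set to <code>destroy</code>, the reboot
requested by the installer ends the domain, so it can be reconfigured
before its first regular boot:
</p>
<pre>
...
<on_poweroff>destroy</on_poweroff>
<on_reboot>destroy</on_reboot>
<on_crash>destroy</on_crash>
...</pre>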
|
||
|
||
<p>
|
||
QEMU/KVM supports the <code>on_poweroff</code> and <code>on_reboot</code>
|
||
events handling the <code>destroy</code> and <code>restart</code> actions.
|
||
The <code>preserve</code> action for an <code>on_reboot</code> event
|
||
is treated as a <code>destroy</code> and the <code>rename-restart</code>
|
||
action for an <code>on_poweroff</code> event is treated as a
|
||
<code>restart</code> action.
|
||
</p>
|
||
|
||
<p>
|
||
The <code>on_crash</code> event supports these additional
|
||
actions <span class="since">since 0.8.4</span>.
|
||
</p>
|
||
|
||
<dl>
|
||
<dt><code>coredump-destroy</code></dt>
|
||
<dd>The crashed domain's core will be dumped, and then the
|
||
domain will be terminated completely and all resources
|
||
released.</dd>
|
||
<dt><code>coredump-restart</code></dt>
|
||
<dd>The crashed domain's core will be dumped, and then the
|
||
domain will be restarted with the same configuration.</dd>
|
||
</dl>
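<p>
For instance, a guest that should be brought back automatically after
a crash while keeping a core dump for later analysis might use the
following sketch. Where the dump is stored depends on the hypervisor
driver configuration and is not shown here:
</p>
<pre>
...
<on_poweroff>destroy</on_poweroff>
<on_reboot>restart</on_reboot>
<on_crash>coredump-restart</on_crash>
...</pre>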
|
||
|
||
<p>
|
||
<span class="since">Since 3.9.0</span>, the lifecycle events can
|
||
be configured via the
|
||
<a href="html/libvirt-libvirt-domain.html#virDomainSetLifecycleAction">
|
||
<code>virDomainSetLifecycleAction</code></a> API.
|
||
</p>
|
||
|
||
<p>
|
||
The <code>on_lockfailure</code> element (<span class="since">since
|
||
1.0.0</span>) may be used to configure what action should be
|
||
taken when a lock manager loses resource locks. The following
|
||
actions are recognized by libvirt, although not all of them need
|
||
to be supported by individual lock managers. When no action is
|
||
specified, each lock manager will take its default action.
|
||
</p>
|
||
<dl>
|
||
<dt><code>poweroff</code></dt>
|
||
<dd>The domain will be forcefully powered off.</dd>
|
||
<dt><code>restart</code></dt>
|
||
<dd>The domain will be powered off and started up again to
|
||
reacquire its locks.</dd>
|
||
<dt><code>pause</code></dt>
|
||
<dd>The domain will be paused so that it can be manually resumed
|
||
when lock issues are solved.</dd>
|
||
<dt><code>ignore</code></dt>
|
||
<dd>Keep the domain running as if nothing happened.</dd>
|
||
</dl>
|
||
|
||
<h3><a id="elementsPowerManagement">Power Management</a></h3>
|
||
|
||
<p>
|
||
<span class="since">Since 0.10.2</span> it is possible to
|
||
forcibly enable or disable BIOS advertisements to the guest
|
||
OS. (NB: this is supported only by the QEMU driver.)
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<pm>
|
||
<suspend-to-disk enabled='no'/>
|
||
<suspend-to-mem enabled='yes'/>
|
||
</pm>
|
||
...</pre>
|
||
|
||
<dl>
|
||
<dt><code>pm</code></dt>
|
||
<dd>These elements enable ('yes') or disable ('no') BIOS support
|
||
for S3 (suspend-to-mem) and S4 (suspend-to-disk) ACPI sleep
|
||
states. If nothing is specified, then the hypervisor will be
|
||
left with its default value.<br/>
|
||
Note: This setting cannot prevent the guest OS from performing
|
||
a suspend as the guest OS itself can choose to circumvent the
|
||
unavailability of the sleep states (e.g. S4 by turning off
|
||
completely).</dd>
|
||
</dl>
|
||
|
||
<h3><a id="elementsFeatures">Hypervisor features</a></h3>
|
||
|
||
<p>
|
||
Hypervisors may allow certain CPU / machine features to be
|
||
toggled on/off.
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<features>
|
||
<pae/>
|
||
<acpi/>
|
||
<apic/>
|
||
<hap/>
|
||
<privnet/>
|
||
<hyperv>
|
||
<relaxed state='on'/>
|
||
<vapic state='on'/>
|
||
<spinlocks state='on' retries='4096'/>
|
||
<vpindex state='on'/>
|
||
<runtime state='on'/>
|
||
<synic state='on'/>
|
||
<reset state='on'/>
|
||
<vendor_id state='on' value='KVM Hv'/>
|
||
<frequencies state='on'/>
|
||
<reenlightenment state='on'/>
|
||
<tlbflush state='on'/>
|
||
</hyperv>
|
||
<kvm>
|
||
<hidden state='on'/>
|
||
</kvm>
|
||
<pvspinlock state='on'/>
|
||
<gic version='2'/>
|
||
<ioapic driver='qemu'/>
|
||
<hpt resizing='required'>
|
||
<maxpagesize unit='MiB'>16</maxpagesize>
|
||
</hpt>
|
||
<vmcoreinfo state='on'/>
|
||
<smm state='on'>
|
||
<tseg unit='MiB'>48</tseg>
|
||
</smm>
|
||
<htm state='on'/>
|
||
</features>
|
||
...</pre>
|
||
|
||
<p>
|
||
All features are listed within the <code>features</code>
|
||
element; omitting a togglable feature tag turns it off.
|
||
The available features can be found by asking
|
||
for the <a href="formatcaps.html">capabilities XML</a> and
|
||
<a href="formatdomaincaps.html">domain capabilities XML</a>,
|
||
but a common set for fully virtualized domains is:
|
||
</p>
|
||
|
||
<dl>
|
||
<dt><code>pae</code></dt>
|
||
<dd>Physical address extension mode allows 32-bit guests
|
||
to address more than 4 GB of memory.</dd>
|
||
<dt><code>acpi</code></dt>
|
||
<dd>ACPI is useful for power management; for example, with
|
||
KVM guests it is required for graceful shutdown to work.
|
||
</dd>
|
||
<dt><code>apic</code></dt>
|
||
<dd>APIC allows the use of programmable IRQ
|
||
management. <span class="since">Since 0.10.2 (QEMU only)</span> there is
|
||
an optional attribute <code>eoi</code> with values <code>on</code>
|
||
and <code>off</code> which toggles the availability of EOI (End of
|
||
Interrupt) for the guest.
|
||
</dd>
|
||
<dt><code>hap</code></dt>
|
||
<dd>Depending on the <code>state</code> attribute (values <code>on</code>,
|
||
<code>off</code>) enable or disable use of Hardware Assisted Paging.
|
||
The default is <code>on</code> if the hypervisor detects availability
|
||
of Hardware Assisted Paging.
|
||
</dd>
|
||
<dt><code>viridian</code></dt>
|
||
<dd>Enable Viridian hypervisor extensions for paravirtualizing
|
||
guest operating systems.
|
||
</dd>
|
||
<dt><code>privnet</code></dt>
|
||
<dd>Always create a private network namespace. This is
|
||
automatically set if any interface devices are defined.
|
||
This feature is only relevant for container based
|
||
virtualization drivers, such as LXC.
|
||
</dd>
|
||
<dt><code>hyperv</code></dt>
|
||
<dd>Enable various features improving behavior of guests
|
||
running Microsoft Windows.
|
||
<table class="top_table">
|
||
<tr>
|
||
<th>Feature</th>
|
||
<th>Description</th>
|
||
<th>Value</th>
|
||
<th>Since</th>
|
||
</tr>
|
||
<tr>
|
||
<td>relaxed</td>
|
||
<td>Relax constraints on timers</td>
|
||
<td> on, off</td>
|
||
<td><span class="since">1.0.0 (QEMU 2.0)</span></td>
|
||
</tr>
|
||
<tr>
|
||
<td>vapic</td>
|
||
<td>Enable virtual APIC</td>
|
||
<td>on, off</td>
|
||
<td><span class="since">1.1.0 (QEMU 2.0)</span></td>
|
||
</tr>
|
||
<tr>
|
||
<td>spinlocks</td>
|
||
<td>Enable spinlock support</td>
|
||
<td>on, off; retries - at least 4095</td>
|
||
<td><span class="since">1.1.0 (QEMU 2.0)</span></td>
|
||
</tr>
|
||
<tr>
|
||
<td>vpindex</td>
|
||
<td>Virtual processor index</td>
|
||
<td> on, off</td>
|
||
<td><span class="since">1.3.3 (QEMU 2.5)</span></td>
|
||
</tr>
|
||
<tr>
|
||
<td>runtime</td>
|
||
<td>Processor time spent on running guest code and on behalf of guest code</td>
|
||
<td> on, off</td>
|
||
<td><span class="since">1.3.3 (QEMU 2.5)</span></td>
|
||
</tr>
|
||
<tr>
|
||
<td>synic</td>
|
||
<td>Enable Synthetic Interrupt Controller (SyNIC)</td>
|
||
<td> on, off</td>
|
||
<td><span class="since">1.3.3 (QEMU 2.6)</span></td>
|
||
</tr>
|
||
<tr>
|
||
<td>stimer</td>
|
||
<td>Enable SyNIC timers</td>
|
||
<td> on, off</td>
|
||
<td><span class="since">1.3.3 (QEMU 2.6)</span></td>
|
||
</tr>
|
||
<tr>
|
||
<td>reset</td>
|
||
<td>Enable hypervisor reset</td>
|
||
<td> on, off</td>
|
||
<td><span class="since">1.3.3 (QEMU 2.5)</span></td>
|
||
</tr>
|
||
<tr>
|
||
<td>vendor_id</td>
|
||
<td>Set hypervisor vendor id</td>
|
||
<td>on, off; value - string, up to 12 characters</td>
|
||
<td><span class="since">1.3.3 (QEMU 2.5)</span></td>
|
||
</tr>
|
||
<tr>
|
||
<td>frequencies</td>
|
||
<td>Expose frequency MSRs</td>
|
||
<td> on, off</td>
|
||
<td><span class="since">4.7.0 (QEMU 2.12)</span></td>
|
||
</tr>
|
||
<tr>
|
||
<td>reenlightenment</td>
|
||
<td>Enable re-enlightenment notification on migration</td>
|
||
<td> on, off</td>
|
||
<td><span class="since">4.7.0 (QEMU 3.0)</span></td>
|
||
</tr>
|
||
<tr>
|
||
<td>tlbflush</td>
|
||
<td>Enable PV TLB flush support</td>
|
||
<td> on, off</td>
|
||
<td><span class="since">4.7.0 (QEMU 3.0)</span></td>
|
||
</tr>
|
||
</table>
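<p>
A minimal sketch for a Windows guest, enabling only the three oldest
enlightenments from the table above. The <code>retries</code> value is
an arbitrary example that satisfies the stated minimum of 4095; which
features are actually available depends on the libvirt and QEMU
versions listed in the table:
</p>
<pre>
...
<features>
  <acpi/>
  <apic/>
  <hyperv>
    <relaxed state='on'/>
    <vapic state='on'/>
    <spinlocks state='on' retries='8191'/>
  </hyperv>
</features>
...</pre>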
|
||
</dd>
|
||
<dt><code>pvspinlock</code></dt>
|
||
<dd>Notify the guest that the host supports paravirtual spinlocks
|
||
for example by exposing the pvticketlocks mechanism. This feature
|
||
can be explicitly disabled by using <code>state='off'</code>
|
||
attribute.
|
||
</dd>
|
||
<dt><code>kvm</code></dt>
|
||
<dd>Various features to change the behavior of the KVM hypervisor.
|
||
<table class="top_table">
|
||
<tr>
|
||
<th>Feature</th>
|
||
<th>Description</th>
|
||
<th>Value</th>
|
||
<th>Since</th>
|
||
</tr>
|
||
<tr>
|
||
<td>hidden</td>
|
||
<td>Hide the KVM hypervisor from standard MSR based discovery</td>
|
||
<td>on, off</td>
|
||
<td><span class="since">1.2.8 (QEMU 2.1.0)</span></td>
|
||
</tr>
|
||
</table>
|
||
</dd>
|
||
<dt><code>pmu</code></dt>
|
||
<dd>Depending on the <code>state</code> attribute (values <code>on</code>,
|
||
<code>off</code>, default <code>on</code>) enable or disable the
|
||
performance monitoring unit for the guest.
|
||
<span class="since">Since 1.2.12</span>
|
||
</dd>
|
||
<dt><code>vmport</code></dt>
|
||
<dd>Depending on the <code>state</code> attribute (values <code>on</code>,
|
||
<code>off</code>, default <code>on</code>) enable or disable
|
||
the emulation of VMware IO port, for vmmouse etc.
|
||
<span class="since">Since 1.2.16</span>
|
||
</dd>
|
||
<dt><code>gic</code></dt>
|
||
<dd>Enable for architectures using a Generic Interrupt
|
||
Controller instead of APIC in order to handle interrupts.
|
||
For example, the 'aarch64' architecture uses
|
||
<code>gic</code> instead of <code>apic</code>. The optional
|
||
attribute <code>version</code> specifies the GIC version;
|
||
however, it may not be supported by all hypervisors. Accepted
|
||
values are <code>2</code>, <code>3</code> and <code>host</code>.
|
||
<span class="since">Since 1.2.16</span>
|
||
</dd>
|
||
<dt><code>smm</code></dt>
|
||
<dd>
|
||
<p>
|
||
Depending on the <code>state</code> attribute (values <code>on</code>,
|
||
<code>off</code>, default <code>on</code>) enable or disable
|
||
System Management Mode.
|
||
<span class="since">Since 2.1.0</span>
|
||
</p><p> Optional sub-element <code>tseg</code> can be used to specify
|
||
the amount of memory dedicated to SMM's extended TSEG. That offers a
|
||
fourth option size apart from the existing ones (1 MiB, 2 MiB and 8
|
||
MiB) that the guest OS (or rather loader) can choose from. The size
|
||
can be specified as a value of that element, optional attribute
|
||
<code>unit</code> can be used to specify the unit of the
|
||
aforementioned value (defaults to 'MiB'). If set to 0 the extended
|
||
size is not advertised and only the default ones (see above) are
|
||
available.
|
||
</p><p>
|
||
<b>If the VM boots successfully, you should leave this option alone, unless you
|
||
are very certain you know what you are doing.</b>
|
||
</p><p>
|
||
This value is configurable due to the fact that the calculation cannot
|
||
be done right with the guarantee that it will work correctly. In
|
||
QEMU, the user-configurable extended TSEG feature was unavailable up
|
||
to and including <code>pc-q35-2.9</code>. Starting with
|
||
<code>pc-q35-2.10</code> the feature is available, with default size
|
||
16 MiB. That should suffice for up to roughly 272 vCPUs, 5 GiB guest
|
||
RAM in total, no hotplug memory range, and 32 GiB of 64-bit PCI MMIO
|
||
aperture. Alternatively, it should suffice for 48 vCPUs with 1 TB of guest RAM, no hotplug DIMM
|
||
range, and 32 GB of 64-bit PCI MMIO aperture. The values may also vary
|
||
based on the loader the VM is using.
|
||
</p><p>
|
||
Additional size might be needed for significantly higher vCPU counts
|
||
or increased address space (that can be memory, maxMemory, 64-bit PCI
|
||
MMIO aperture size; roughly 8 MiB of TSEG per 1 TiB of address space)
|
||
which can also be rounded up.
|
||
</p><p>
|
||
Due to the nature of this setting being similar to "how much RAM
|
||
should the guest have" users are advised to either consult the
|
||
documentation of the guest OS or loader (if there is any), or test
|
||
this by trial-and-error changing the value until the VM boots
|
||
successfully. Yet another guiding value for users might be the fact
|
||
that 48 MiB should be enough for pretty large guests (240 vCPUs and
|
||
4 TB guest RAM), but it is on purpose not set as default as 48 MiB of
|
||
unavailable RAM might be too much for small guests (e.g. with 512 MiB
|
||
of RAM).
|
||
</p><p>
|
||
See <a href="#elementsMemoryAllocation">Memory Allocation</a>
|
||
for more details about the <code>unit</code> attribute.
|
||
<span class="since">Since 4.5.0</span> (QEMU only)
|
||
</p>
|
||
</dd>
|
||
<dt><code>ioapic</code></dt>
|
||
<dd>Tune the I/O APIC. Possible values for the
|
||
<code>driver</code> attribute are:
|
||
<code>kvm</code> (default for KVM domains)
|
||
and <code>qemu</code> which puts I/O APIC in userspace
|
||
which is also known as a split I/O APIC mode.
|
||
<span class="since">Since 3.4.0</span> (QEMU/KVM only)
|
||
</dd>
|
||
<dt><code>hpt</code></dt>
|
||
<dd>Configure the HPT (Hash Page Table) of a pSeries guest. Possible
|
||
values for the <code>resizing</code> attribute are
|
||
<code>enabled</code>, which causes HPT resizing to be enabled if
|
||
both the guest and the host support it; <code>disabled</code>, which
|
||
causes HPT resizing to be disabled regardless of guest and host
|
||
support; and <code>required</code>, which prevents the guest from
|
||
starting unless both the guest and the host support HPT resizing. If
|
||
the attribute is not defined, the hypervisor default will be used.
|
||
<span class="since">Since 3.10.0</span> (QEMU/KVM only).
|
||
|
||
<p>The optional <code>maxpagesize</code> subelement can be used to
|
||
limit the usable page size for HPT guests. Common values are 64 KiB,
|
||
16 MiB and 16 GiB; when not specified, the hypervisor default will
|
||
be used. <span class="since">Since 4.5.0</span> (QEMU/KVM only).</p>
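<p>
  A sketch combining both settings, assuming a pSeries guest whose
  host supports HPT resizing; the 16 MiB limit is only an example
  value:
</p>
<pre>
...
<features>
  <hpt resizing='required'>
    <maxpagesize unit='MiB'>16</maxpagesize>
  </hpt>
</features>
...</pre>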
|
||
</dd>
|
||
<dt><code>vmcoreinfo</code></dt>
|
||
<dd>Enable QEMU vmcoreinfo device to let the guest kernel save debug
|
||
details. <span class="since">Since 4.4.0</span> (QEMU only)
|
||
</dd>
|
||
<dt><code>htm</code></dt>
|
||
<dd>Configure HTM (Hardware Transactional Memory) availability for
|
||
pSeries guests. Possible values for the <code>state</code> attribute
|
||
are <code>on</code> and <code>off</code>. If the attribute is not
|
||
defined, the hypervisor default will be used.
|
||
<span class="since">Since 4.6.0</span> (QEMU/KVM only)
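<p>
  For example, HTM could be explicitly made available to a pSeries
  guest with a fragment along these lines (illustrative only):
</p>
<pre>
...
<features>
  <htm state='on'/>
</features>
...</pre>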
|
||
</dd>
|
||
</dl>
|
||
|
||
<h3><a id="elementsTime">Time keeping</a></h3>
|
||
|
||
<p>
|
||
The guest clock is typically initialized from the host clock.
|
||
Most operating systems expect the hardware clock to be kept
|
||
in UTC, and this is the default. Windows, however, expects
|
||
it to be in so-called 'localtime'.
|
||
</p>
|
||
|
||
<pre>
...
<clock offset='localtime'>
  <timer name='rtc' tickpolicy='catchup' track='guest'>
    <catchup threshold='123' slew='120' limit='10000'/>
  </timer>
  <timer name='pit' tickpolicy='delay'/>
</clock>
...</pre>
|
||
|
||
<dl>
|
||
<dt><code>clock</code></dt>
|
||
<dd>
|
||
<p>The <code>offset</code> attribute takes four possible
|
||
values, allowing fine grained control over how the guest
|
||
clock is synchronized to the host. NB, not all hypervisors
|
||
support all modes.</p>
|
||
<dl>
|
||
<dt><code>utc</code></dt>
|
||
<dd>
|
||
The guest clock will always be synchronized to UTC when
|
||
booted.
|
||
<span class="since">Since 0.9.11</span> 'utc' mode can be converted
|
||
to 'variable' mode, which can be controlled by using the
|
||
<code>adjustment</code> attribute. If the value is 'reset', the
|
||
conversion is never done (not all hypervisors can
|
||
synchronize to UTC on each boot; use of 'reset' will cause
|
||
an error on those hypervisors). A numeric value
|
||
forces the conversion to 'variable' mode using the value as the
|
||
initial adjustment. The default <code>adjustment</code> is
|
||
hypervisor specific.
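<p>
  As a sketch, the following keeps the clock in 'utc' mode and
  never converts it to 'variable' mode. On hypervisors that cannot
  resynchronize to UTC on each boot this is expected to be rejected:
</p>
<pre>
...
<clock offset='utc' adjustment='reset'/>
...</pre>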
|
||
</dd>
|
||
<dt><code>localtime</code></dt>
|
||
<dd>
|
||
The guest clock will be synchronized to the host's configured
|
||
timezone when booted, if any.
|
||
<span class="since">Since 0.9.11,</span> the <code>adjustment</code>
|
||
attribute behaves the same as in 'utc' mode.
|
||
</dd>
|
||
<dt><code>timezone</code></dt>
|
||
<dd>
|
||
The guest clock will be synchronized to the requested timezone
|
||
using the <code>timezone</code> attribute.
|
||
<span class="since">Since 0.7.7</span>
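<p>
  For illustration, assuming the host knows a zone named
  "Europe/Paris" (the zone name is only an example):
</p>
<pre>
...
<clock offset='timezone' timezone='Europe/Paris'/>
...</pre>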
|
||
</dd>
|
||
<dt><code>variable</code></dt>
|
||
<dd>
|
||
The guest clock will have an arbitrary offset applied
|
||
relative to UTC or localtime, depending on the <code>basis</code>
|
||
attribute. The delta relative to UTC (or localtime) is specified
|
||
in seconds, using the <code>adjustment</code> attribute.
|
||
The guest is free to adjust the RTC over time and expect
|
||
that it will be honored at next reboot. This is in
|
||
contrast to 'utc' and 'localtime' mode (with the optional
|
||
attribute adjustment='reset'), where the RTC adjustments are
|
||
lost at each reboot. <span class="since">Since 0.7.7</span>
|
||
<span class="since">Since 0.9.11</span> the <code>basis</code>
|
||
attribute can be either 'utc' (default) or 'localtime'.
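<p>
  A minimal sketch of a guest clock running one hour (3600 seconds)
  ahead of UTC; the adjustment value is an arbitrary example:
</p>
<pre>
...
<clock offset='variable' adjustment='3600' basis='utc'/>
...</pre>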
|
||
</dd>
|
||
</dl>
|
||
<p>
|
||
A <code>clock</code> may have zero or more
|
||
<code>timer</code> sub-elements. <span class="since">Since
|
||
0.8.0</span>
|
||
</p>
|
||
</dd>
|
||
<dt><code>timer</code></dt>
|
||
<dd>
|
||
<p>
|
||
Each timer element requires a <code>name</code> attribute,
|
||
and has other optional attributes that depend on
|
||
the <code>name</code> specified. Various hypervisors
|
||
support different combinations of attributes.
|
||
</p>
|
||
<dl>
|
||
<dt><code>name</code></dt>
|
||
<dd>
|
||
The <code>name</code> attribute selects which timer is
|
||
being modified, and can be one of
|
||
"platform" (currently unsupported),
|
||
"hpet" (libxl, xen, qemu), "kvmclock" (qemu),
|
||
"pit" (qemu), "rtc" (qemu), "tsc" (libxl) or "hypervclock"
|
||
(qemu - <span class="since">since 1.2.2</span>).
|
||
|
||
The <code>hypervclock</code> timer adds support for the
|
||
reference time counter and the reference page for iTSC
|
||
feature for guests running the Microsoft Windows
|
||
operating system.
|
||
</dd>
|
||
<dt><code>track</code></dt>
|
||
<dd>
|
||
The <code>track</code> attribute specifies what the timer
|
||
tracks, and can be "boot", "guest", or "wall".
|
||
Only valid for <code>name="rtc"</code>
|
||
or <code>name="platform"</code>.
|
||
</dd>
|
||
<dt><code>tickpolicy</code></dt>
|
||
<dd>
|
||
<p>
|
||
The <code>tickpolicy</code> attribute determines what
|
||
happens when QEMU misses a deadline for injecting a
|
||
tick to the guest:
|
||
</p>
|
||
<dl>
|
||
<dt><code>delay</code></dt>
|
||
<dd>Continue to deliver ticks at the normal rate.
|
||
The guest time will be delayed due to the late
|
||
tick</dd>
|
||
<dt><code>catchup</code></dt>
|
||
<dd>Deliver ticks at a higher rate to catch up
|
||
with the missed tick. The guest time should
|
||
not be delayed once catchup is complete.</dd>
|
||
<dt><code>merge</code></dt>
|
||
<dd>Merge the missed tick(s) into one tick and
|
||
inject. The guest time may be delayed, depending
|
||
on how the OS reacts to the merging of ticks</dd>
|
||
<dt><code>discard</code></dt>
|
||
<dd>Throw away the missed tick(s) and continue
|
||
with future injection normally. The guest time
|
||
may be delayed, unless the OS has explicit
|
||
handling of lost ticks</dd>
|
||
</dl>
|
||
<p>If the policy is "catchup", there can be further details in
|
||
the <code>catchup</code> sub-element.</p>
|
||
<dl>
|
||
<dt><code>catchup</code></dt>
|
||
<dd>
|
||
The <code>catchup</code> element has three optional
|
||
attributes, each a positive integer. The attributes
|
||
are <code>threshold</code>, <code>slew</code>,
|
||
and <code>limit</code>.
|
||
</dd>
|
||
</dl>
|
||
<p>
|
||
Note that hypervisors are not required to support all policies across all time sources.
|
||
</p>
|
||
</dd>
|
||
<dt><code>frequency</code></dt>
|
||
<dd>
|
||
The <code>frequency</code> attribute is an unsigned
|
||
integer specifying the frequency at
|
||
which <code>name="tsc"</code> runs.
|
||
</dd>
|
||
<dt><code>mode</code></dt>
|
||
<dd>
|
||
The <code>mode</code> attribute controls how
|
||
the <code>name="tsc"</code> timer is managed, and can be
|
||
"auto", "native", "emulate", "paravirt", or "smpsafe".
|
||
Other timers are always emulated.
|
||
</dd>
|
||
<dt><code>present</code></dt>
|
||
<dd>
|
||
The <code>present</code> attribute can be "yes" or "no" to
|
||
specify whether a particular timer is available to the guest.
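<p>
  As an illustrative sketch for a Windows guest on QEMU, the
  following hides the HPET and exposes the Hyper-V reference
  clock. Whether this combination is appropriate depends on the
  guest and hypervisor:
</p>
<pre>
...
<clock offset='localtime'>
  <timer name='hpet' present='no'/>
  <timer name='hypervclock' present='yes'/>
</clock>
...</pre>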
|
||
</dd>
|
||
</dl>
|
||
</dd>
|
||
</dl>
|
||
|
||
<h3><a id="elementsPerf">Performance monitoring events</a></h3>
|
||
|
||
<p>
|
||
Some platforms allow monitoring of performance of the virtual machine and
|
||
the code executed inside. To enable the performance monitoring events
|
||
you can either specify them in the <code>perf</code> element or enable
them via the <code>virDomainSetPerfEvents</code> API. The performance values
are then retrieved using the <code>virConnectGetAllDomainStats</code> API.
|
||
<span class="since">Since 2.0.0</span>
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<perf>
|
||
<event name='cmt' enabled='yes'/>
|
||
<event name='mbmt' enabled='no'/>
|
||
<event name='mbml' enabled='yes'/>
|
||
<event name='cpu_cycles' enabled='no'/>
|
||
<event name='instructions' enabled='yes'/>
|
||
<event name='cache_references' enabled='no'/>
|
||
<event name='cache_misses' enabled='no'/>
|
||
<event name='branch_instructions' enabled='no'/>
|
||
<event name='branch_misses' enabled='no'/>
|
||
<event name='bus_cycles' enabled='no'/>
|
||
<event name='stalled_cycles_frontend' enabled='no'/>
|
||
<event name='stalled_cycles_backend' enabled='no'/>
|
||
<event name='ref_cpu_cycles' enabled='no'/>
|
||
<event name='cpu_clock' enabled='no'/>
|
||
<event name='task_clock' enabled='no'/>
|
||
<event name='page_faults' enabled='no'/>
|
||
<event name='context_switches' enabled='no'/>
|
||
<event name='cpu_migrations' enabled='no'/>
|
||
<event name='page_faults_min' enabled='no'/>
|
||
<event name='page_faults_maj' enabled='no'/>
|
||
<event name='alignment_faults' enabled='no'/>
|
||
<event name='emulation_faults' enabled='no'/>
|
||
</perf>
|
||
...
|
||
</pre>
|
||
|
||
<table class="top_table">
|
||
<tr>
|
||
<th>event name</th>
|
||
<th>Description</th>
|
||
<th>stats parameter name</th>
|
||
</tr>
|
||
<tr>
|
||
<td><code>cmt</code></td>
|
||
<td>usage of L3 cache in bytes by applications running on the platform</td>
|
||
<td><code>perf.cmt</code></td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>mbmt</code></td>
|
||
<td>total system bandwidth from one level of cache</td>
|
||
<td><code>perf.mbmt</code></td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>mbml</code></td>
|
||
<td>bandwidth of memory traffic for a memory controller</td>
|
||
<td><code>perf.mbml</code></td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>cpu_cycles</code></td>
|
||
<td>the count of CPU cycles (total/elapsed)</td>
|
||
<td><code>perf.cpu_cycles</code></td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>instructions</code></td>
|
||
<td>the count of instructions by applications running on the platform</td>
|
||
<td><code>perf.instructions</code></td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>cache_references</code></td>
|
||
<td>the count of cache hits by applications running on the platform</td>
|
||
<td><code>perf.cache_references</code></td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>cache_misses</code></td>
|
||
<td>the count of cache misses by applications running on the platform</td>
|
||
<td><code>perf.cache_misses</code></td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>branch_instructions</code></td>
|
||
<td>the count of branch instructions by applications running on the platform</td>
|
||
<td><code>perf.branch_instructions</code></td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>branch_misses</code></td>
|
||
<td>the count of branch misses by applications running on the platform</td>
|
||
<td><code>perf.branch_misses</code></td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>bus_cycles</code></td>
|
||
<td>the count of bus cycles by applications running on the platform</td>
|
||
<td><code>perf.bus_cycles</code></td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>stalled_cycles_frontend</code></td>
|
||
<td>the count of stalled CPU cycles in the frontend of the instruction
|
||
processor pipeline by applications running on the platform</td>
|
||
<td><code>perf.stalled_cycles_frontend</code></td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>stalled_cycles_backend</code></td>
|
||
<td>the count of stalled CPU cycles in the backend of the instruction
|
||
processor pipeline by applications running on the platform</td>
|
||
<td><code>perf.stalled_cycles_backend</code></td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>ref_cpu_cycles</code></td>
|
||
<td>the count of total CPU cycles not affected by CPU frequency scaling
|
||
by applications running on the platform</td>
|
||
<td><code>perf.ref_cpu_cycles</code></td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>cpu_clock</code></td>
|
||
<td>the count of CPU clock time, as measured by a monotonic
|
||
high-resolution per-CPU timer, by applications running on
|
||
the platform</td>
|
||
<td><code>perf.cpu_clock</code></td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>task_clock</code></td>
|
||
<td>the count of task clock time, as measured by a monotonic
|
||
high-resolution CPU timer, specific to the task that
|
||
is run by applications running on the platform</td>
|
||
<td><code>perf.task_clock</code></td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>page_faults</code></td>
|
||
<td>the count of page faults by applications running on the
|
||
platform. This includes minor, major, invalid and other
|
||
types of page faults</td>
|
||
<td><code>perf.page_faults</code></td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>context_switches</code></td>
|
||
<td>the count of context switches by applications running on
|
||
the platform</td>
|
||
<td><code>perf.context_switches</code></td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>cpu_migrations</code></td>
|
||
<td>the count of CPU migrations, that is, where the process
|
||
moved from one logical processor to another, by
|
||
applications running on the platform</td>
|
||
<td><code>perf.cpu_migrations</code></td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>page_faults_min</code></td>
|
||
<td>the count of minor page faults, that is, where the
|
||
page was present in the page cache, and therefore
|
||
the fault avoided loading it from storage, by
|
||
applications running on the platform</td>
|
||
<td><code>perf.page_faults_min</code></td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>page_faults_maj</code></td>
|
||
<td>the count of major page faults, that is, where the
|
||
page was not present in the page cache, and
|
||
therefore had to be fetched from storage, by
|
||
applications running on the platform</td>
|
||
<td><code>perf.page_faults_maj</code></td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>alignment_faults</code></td>
|
||
<td>the count of alignment faults, that is when
|
||
the load or store is not aligned properly, by
|
||
applications running on the platform</td>
|
||
<td><code>perf.alignment_faults</code></td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>emulation_faults</code></td>
|
||
<td>the count of emulation faults, that is when
|
||
the kernel traps on unimplemented instructions
|
||
and emulates them for user space, by
|
||
applications running on the platform</td>
|
||
<td><code>perf.emulation_faults</code></td>
|
||
</tr>
|
||
</table>
|
||
|
||
<h3><a id="elementsDevices">Devices</a></h3>
|
||
|
||
<p>
|
||
The final set of XML elements are all used to describe devices
|
||
provided to the guest domain. All devices occur as children
|
||
of the main <code>devices</code> element.
|
||
<span class="since">Since 0.1.3</span>
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<emulator>/usr/lib/xen/bin/qemu-dm</emulator>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<dl>
|
||
<dt><a id="elementEmulator"><code>emulator</code></a></dt>
|
||
<dd>
|
||
The contents of the <code>emulator</code> element specify
|
||
the fully qualified path to the device model emulator binary.
|
||
The <a href="formatcaps.html">capabilities XML</a> specifies
|
||
the recommended default emulator to use for each particular
|
||
domain type / architecture combination.
|
||
</dd>
|
||
</dl>
|
||
|
||
<p>
|
||
To help users identify the devices they care about, every
device can have a direct child <code>alias</code> element
whose <code>name</code> attribute stores a user-chosen
identifier for the device. The identifier must have the
"ua-" prefix and must be unique within the domain. Additionally, the
identifier must consist only of the following characters:
<code>[a-zA-Z0-9_-]</code>.
|
||
<span class="since">Since 3.9.0</span>
|
||
</p>
|
||
|
||
<pre>
|
||
<devices>
|
||
<disk type='file'>
|
||
<alias name='ua-myDisk'/>
|
||
</disk>
|
||
<interface type='network' trustGuestRxFilters='yes'>
|
||
<alias name='ua-myNIC'/>
|
||
</interface>
|
||
...
|
||
</devices>
|
||
</pre>
|
||
|
||
<h4><a id="elementsDisks">Hard drives, floppy disks, CDROMs</a></h4>
|
||
|
||
<p>
|
||
Any device that looks like a disk, be it a floppy, harddisk,
|
||
cdrom, or paravirtualized driver is specified via the <code>disk</code>
|
||
element.
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<disk type='file' snapshot='external'>
|
||
<driver name="tap" type="aio" cache="default"/>
|
||
<source file='/var/lib/xen/images/fv0' startupPolicy='optional'>
|
||
<seclabel relabel='no'/>
|
||
</source>
|
||
<target dev='hda' bus='ide'/>
|
||
<iotune>
|
||
<total_bytes_sec>10000000</total_bytes_sec>
|
||
<read_iops_sec>400000</read_iops_sec>
|
||
<write_iops_sec>100000</write_iops_sec>
|
||
</iotune>
|
||
<boot order='2'/>
|
||
<encryption type='...'>
|
||
...
|
||
</encryption>
|
||
<shareable/>
|
||
<serial>
|
||
...
|
||
</serial>
|
||
</disk>
|
||
...
|
||
<disk type='network'>
|
||
<driver name="qemu" type="raw" io="threads" ioeventfd="on" event_idx="off"/>
|
||
<source protocol="sheepdog" name="image_name">
|
||
<host name="hostname" port="7000"/>
|
||
</source>
|
||
<target dev="hdb" bus="ide"/>
|
||
<boot order='1'/>
|
||
<transient/>
|
||
<address type='drive' controller='0' bus='1' unit='0'/>
|
||
</disk>
|
||
<disk type='network'>
|
||
<driver name="qemu" type="raw"/>
|
||
<source protocol="rbd" name="image_name2">
|
||
<host name="hostname" port="7000"/>
|
||
<snapshot name="snapname"/>
|
||
<config file="/path/to/file"/>
|
||
<auth username='myuser'>
|
||
<secret type='ceph' usage='mypassid'/>
|
||
</auth>
|
||
</source>
|
||
<target dev="hdc" bus="ide"/>
|
||
</disk>
|
||
<disk type='block' device='cdrom'>
|
||
<driver name='qemu' type='raw'/>
|
||
<target dev='hdd' bus='ide' tray='open'/>
|
||
<readonly/>
|
||
</disk>
|
||
<disk type='network' device='cdrom'>
|
||
<driver name='qemu' type='raw'/>
|
||
<source protocol="http" name="url_path">
|
||
<host name="hostname" port="80"/>
|
||
</source>
|
||
<target dev='hde' bus='ide' tray='open'/>
|
||
<readonly/>
|
||
</disk>
|
||
<disk type='network' device='cdrom'>
|
||
<driver name='qemu' type='raw'/>
|
||
<source protocol="https" name="url_path">
|
||
<host name="hostname" port="443"/>
|
||
</source>
|
||
<target dev='hdf' bus='ide' tray='open'/>
|
||
<readonly/>
|
||
</disk>
|
||
<disk type='network' device='cdrom'>
|
||
<driver name='qemu' type='raw'/>
|
||
<source protocol="ftp" name="url_path">
|
||
<host name="hostname" port="21"/>
|
||
</source>
|
||
<target dev='hdg' bus='ide' tray='open'/>
|
||
<readonly/>
|
||
</disk>
|
||
<disk type='network' device='cdrom'>
|
||
<driver name='qemu' type='raw'/>
|
||
<source protocol="ftps" name="url_path">
|
||
<host name="hostname" port="990"/>
|
||
</source>
|
||
<target dev='hdh' bus='ide' tray='open'/>
|
||
<readonly/>
|
||
</disk>
|
||
<disk type='network' device='cdrom'>
|
||
<driver name='qemu' type='raw'/>
|
||
<source protocol="tftp" name="url_path">
|
||
<host name="hostname" port="69"/>
|
||
</source>
|
||
<target dev='hdi' bus='ide' tray='open'/>
|
||
<readonly/>
|
||
</disk>
|
||
<disk type='block' device='lun'>
|
||
<driver name='qemu' type='raw'/>
|
||
<source dev='/dev/sda'>
|
||
<reservations managed='no'>
|
||
<source type='unix' path='/path/to/qemu-pr-helper' mode='client'/>
|
||
</reservations>
</source>
|
||
<target dev='sda' bus='scsi'/>
|
||
<address type='drive' controller='0' bus='0' target='3' unit='0'/>
|
||
</disk>
|
||
<disk type='block' device='disk'>
|
||
<driver name='qemu' type='raw'/>
|
||
<source dev='/dev/sda'/>
|
||
<geometry cyls='16383' heads='16' secs='63' trans='lba'/>
|
||
<blockio logical_block_size='512' physical_block_size='4096'/>
|
||
<target dev='hdj' bus='ide'/>
|
||
</disk>
|
||
<disk type='volume' device='disk'>
|
||
<driver name='qemu' type='raw'/>
|
||
<source pool='blk-pool0' volume='blk-pool0-vol0'/>
|
||
<target dev='hdk' bus='ide'/>
|
||
</disk>
|
||
<disk type='network' device='disk'>
|
||
<driver name='qemu' type='raw'/>
|
||
<source protocol='iscsi' name='iqn.2013-07.com.example:iscsi-nopool/2'>
|
||
<host name='example.com' port='3260'/>
|
||
<auth username='myuser'>
|
||
<secret type='iscsi' usage='libvirtiscsi'/>
|
||
</auth>
|
||
</source>
|
||
<target dev='vda' bus='virtio'/>
|
||
</disk>
|
||
<disk type='network' device='lun'>
|
||
<driver name='qemu' type='raw'/>
|
||
<source protocol='iscsi' name='iqn.2013-07.com.example:iscsi-nopool/1'>
|
||
<host name='example.com' port='3260'/>
|
||
<auth username='myuser'>
|
||
<secret type='iscsi' usage='libvirtiscsi'/>
|
||
</auth>
|
||
</source>
|
||
<target dev='sdb' bus='scsi'/>
|
||
</disk>
|
||
<disk type='network' device='lun'>
|
||
<driver name='qemu' type='raw'/>
|
||
<source protocol='iscsi' name='iqn.2013-07.com.example:iscsi-nopool/0'>
|
||
<host name='example.com' port='3260'/>
|
||
<initiator>
|
||
<iqn name='iqn.2013-07.com.example:client'/>
|
||
</initiator>
|
||
</source>
|
||
<target dev='sdc' bus='scsi'/>
|
||
</disk>
|
||
<disk type='volume' device='disk'>
|
||
<driver name='qemu' type='raw'/>
|
||
<source pool='iscsi-pool' volume='unit:0:0:1' mode='host'/>
|
||
<target dev='vdb' bus='virtio'/>
|
||
</disk>
|
||
<disk type='volume' device='disk'>
|
||
<driver name='qemu' type='raw'/>
|
||
<source pool='iscsi-pool' volume='unit:0:0:2' mode='direct'/>
|
||
<target dev='vdc' bus='virtio'/>
|
||
</disk>
|
||
<disk type='file' device='disk'>
|
||
<driver name='qemu' type='qcow2' queues='4'/>
|
||
<source file='/var/lib/libvirt/images/domain.qcow'/>
|
||
<backingStore type='file'>
|
||
<format type='qcow2'/>
|
||
<source file='/var/lib/libvirt/images/snapshot.qcow'/>
|
||
<backingStore type='block'>
|
||
<format type='raw'/>
|
||
<source dev='/dev/mapper/base'/>
|
||
<backingStore/>
|
||
</backingStore>
|
||
</backingStore>
|
||
<target dev='vdd' bus='virtio'/>
|
||
</disk>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<dl>
|
||
<dt><code>disk</code></dt>
|
||
<dd>The <code>disk</code> element is the main container for
|
||
describing disks and supports the following attributes:
|
||
<dl>
|
||
<dt><code>type</code></dt>
|
||
<dd>
|
||
Valid values are "file", "block",
|
||
"dir" (<span class="since">since 0.7.5</span>),
|
||
"network" (<span class="since">since 0.8.7</span>), or
|
||
"volume" (<span class="since">since 1.0.5</span>)
|
||
and refer to the underlying source for the disk.
|
||
<span class="since">Since 0.0.3</span>
|
||
</dd>
|
||
<dt><code>device</code></dt>
|
||
<dd>
|
||
Indicates how the disk is to be exposed to the guest OS. Possible
|
||
values for this attribute are "floppy", "disk", "cdrom", and "lun",
|
||
defaulting to "disk".
|
||
<p>
|
||
Using "lun" (<span class="since">since 0.9.10</span>) is only
|
||
valid when the <code>type</code> is "block" or "network" for
|
||
<code>protocol='iscsi'</code> or when the <code>type</code>
|
||
is "volume" when using an iSCSI source <code>pool</code>
|
||
for <code>mode</code> "host" or as an
|
||
<a href="http://wiki.libvirt.org/page/NPIV_in_libvirt">NPIV</a>
|
||
virtual Host Bus Adapter (vHBA) using a Fibre Channel storage pool.
|
||
Configured in this manner, the LUN behaves identically to "disk",
|
||
except that generic SCSI commands from the guest are accepted
|
||
and passed through to the physical device. Also note that
|
||
device='lun' will only be recognized for actual raw devices,
|
||
but never for individual partitions or LVM partitions (in those
|
||
cases, the kernel will reject the generic SCSI commands, making
|
||
it identical to device='disk').
|
||
<span class="since">Since 0.1.4</span>
|
||
</p>
|
||
</dd>
|
||
<dt><code>rawio</code></dt>
|
||
<dd>
|
||
Indicates whether the disk needs rawio capability. Valid
|
||
settings are "yes" or "no" (default is "no"). If any one disk
|
||
in a domain has rawio='yes', rawio capability will be enabled
|
||
for all disks in the domain (because, in the case of QEMU, this
|
||
capability can only be set on a per-process basis). This attribute
|
||
is only valid when device is "lun". NB, <code>rawio</code> is intended
to confine the capability per-device; however, the current QEMU
implementation gives the domain process broader capability
than that (per-process basis, affects all the domain disks).
To confine the capability as much as possible for the QEMU driver
at this stage, <code>sgio</code> is recommended, as it is more
secure than <code>rawio</code>.
|
||
<span class="since">Since 0.9.10</span>
|
||
</dd>
|
||
<dt><code>sgio</code></dt>
|
||
<dd>
|
||
If supported by the hypervisor and OS, indicates whether
|
||
unprivileged SG_IO commands are filtered for the disk. Valid
|
||
settings are "filtered" or "unfiltered" where the default is
|
||
"filtered". Only available when the <code>device</code> is 'lun'.
|
||
<span class="since">Since 1.0.2</span>
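<p>
  As a sketch, a host SCSI device passed through as a LUN with
  SG_IO filtering left at its default could look like this. The
  device path and target name are hypothetical:
</p>
<pre>
...
<devices>
  <disk type='block' device='lun' sgio='filtered'>
    <driver name='qemu' type='raw'/>
    <source dev='/dev/sdb'/>
    <target dev='sda' bus='scsi'/>
  </disk>
</devices>
...</pre>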
|
||
</dd>
|
||
<dt><code>snapshot</code></dt>
|
||
<dd>
|
||
Indicates the default behavior of the disk during disk snapshots:
|
||
"<code>internal</code>" requires a file format such as qcow2 that
|
||
can store both the snapshot and the data changes since the snapshot;
|
||
"<code>external</code>" will separate the snapshot from the live
|
||
data; and "<code>no</code>" means the disk will not participate in
|
||
snapshots. Read-only disks default to "<code>no</code>", while the
|
||
default for other disks depends on the hypervisor's capabilities.
|
||
Some hypervisors allow a per-snapshot choice as well, during
|
||
<a href="formatsnapshot.html">domain snapshot creation</a>.
|
||
Not all snapshot modes are supported; for example, enabling
|
||
snapshots with a transient disk generally does not make sense.
|
||
<span class="since">Since 0.9.5</span>
|
||
</dd>
|
||
</dl>
|
||
</dd>
|
||
<dt><code>source</code></dt>
|
||
<dd>Representation of the disk <code>source</code> depends on the
|
||
disk <code>type</code> attribute value as follows:
|
||
<dl>
|
||
<dt><code>file</code></dt>
|
||
<dd>
|
||
The <code>file</code> attribute specifies the fully-qualified
|
||
path to the file holding the disk.
|
||
<span class="since">Since 0.0.3</span>
|
||
</dd>
|
||
<dt><code>block</code></dt>
|
||
<dd>
|
||
The <code>dev</code> attribute specifies the fully-qualified path
|
||
to the host device to serve as the disk.
|
||
<span class="since">Since 0.0.3</span>
|
||
</dd>
|
||
<dt><code>dir</code></dt>
|
||
<dd>
|
||
The <code>dir</code> attribute specifies the fully-qualified path
|
||
to the directory to use as the disk.
|
||
<span class="since">Since 0.7.5</span>
|
||
</dd>
|
||
<dt><code>network</code></dt>
|
||
<dd>
|
||
The <code>protocol</code> attribute specifies the protocol used to
access the requested image. Possible values are "nbd",
|
||
"iscsi", "rbd", "sheepdog", "gluster" or "vxhs".
|
||
|
||
<p>If the <code>protocol</code> attribute is "rbd", "sheepdog",
|
||
"gluster", or "vxhs", an additional attribute <code>name</code>
|
||
is mandatory to specify which volume/image will be used.
|
||
</p>
|
||
|
||
<p>For "nbd", the <code>name</code> attribute is optional. TLS
|
||
transport for NBD can be enabled by setting the <code>tls</code>
|
||
attribute to <code>yes</code>. For the QEMU hypervisor, usage of
|
||
a TLS environment can also be globally controlled on the host by
|
||
the <code>nbd_tls</code> and <code>nbd_tls_x509_cert_dir</code> in
|
||
/etc/libvirt/qemu.conf.
|
||
('tls' <span class="since">Since 4.5.0</span>)
|
||
</p>
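<p>
  For illustration, an NBD disk with TLS enabled might be
  described as follows. The host name, export name and target
  device are assumptions made for this sketch:
</p>
<pre>
...
<disk type='network' device='disk'>
  <driver name='qemu' type='raw'/>
  <source protocol='nbd' name='export' tls='yes'>
    <host name='nbd.example.com' port='10809'/>
  </source>
  <target dev='vde' bus='virtio'/>
</disk>
...</pre>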
|
||
|
||
<p>For "iscsi" (<span class="since">since 1.0.4</span>), the
|
||
<code>name</code> attribute may include a logical unit number,
|
||
separated from the target's name by a slash (e.g.,
|
||
<code>iqn.2013-07.com.example:iscsi-pool/1</code>). If not
|
||
specified, the default LUN is zero.
|
||
</p>
|
||
|
||
<p>For "vxhs" (<span class="since">since 3.8.0</span>), the
|
||
<code>name</code> is the UUID of the volume, assigned by the
|
||
HyperScale server. Additionally, an optional attribute
|
||
<code>tls</code> (QEMU only) can be used to control whether a
|
||
VxHS block device would utilize a hypervisor configured TLS
|
||
X.509 certificate environment in order to encrypt the data
|
||
channel. For the QEMU hypervisor, usage of a TLS environment can
|
||
also be globally controlled on the host by the
|
||
<code>vxhs_tls</code> and <code>vxhs_tls_x509_cert_dir</code> or
|
||
<code>default_tls_x509_cert_dir</code> settings in the file
|
||
/etc/libvirt/qemu.conf. If <code>vxhs_tls</code> is enabled,
|
||
then unless the domain <code>tls</code> attribute is set to "no",
|
||
libvirt will use the host configured TLS environment. If the
|
||
<code>tls</code> attribute is set to "yes", then regardless of
|
||
the qemu.conf setting, TLS authentication will be attempted.
|
||
</p>
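<p>
  A sketch of a VxHS disk that forces TLS regardless of the
  qemu.conf setting. The volume UUID, address and target device
  are placeholders, not values from a real deployment:
</p>
<pre>
...
<disk type='network' device='disk'>
  <driver name='qemu' type='raw'/>
  <source protocol='vxhs' name='eb90327c-8302-4725-9e1b-4e85ed4dc251' tls='yes'>
    <host name='192.168.0.1' port='9999'/>
  </source>
  <target dev='vdf' bus='virtio'/>
</disk>
...</pre>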
|
||
<span class="since">Since 0.8.7</span>
|
||
</dd>
|
||
<dt><code>volume</code></dt>
|
||
<dd>
|
||
The underlying disk source is represented by attributes
|
||
<code>pool</code> and <code>volume</code>. Attribute
|
||
<code>pool</code> specifies the name of the
|
||
<a href="formatstorage.html">storage pool</a> (managed
|
||
by libvirt) where the disk source resides. Attribute
|
||
<code>volume</code> specifies the name of storage volume (managed
|
||
by libvirt) used as the disk source. The value for the
|
||
<code>volume</code> attribute will be the output from the "Name"
|
||
column of a <code>virsh vol-list [pool-name]</code> command.
|
||
<p>
|
||
Use the attribute <code>mode</code>
|
||
(<span class="since">since 1.1.1</span>) to indicate how to
|
||
represent the LUN as the disk source. Valid values are
|
||
"direct" and "host". If <code>mode</code> is not specified,
|
||
the default is to use "host".
|
||
|
||
Using "direct" as the <code>mode</code> value indicates to use
|
||
the <a href="formatstorage.html">storage pool's</a>
|
||
<code>source</code> element <code>host</code> attribute as
|
||
the disk source to generate the libiscsi URI (e.g.
|
||
'file=iscsi://example.com:3260/iqn.2013-07.com.example:iscsi-pool/1').
|
||
|
||
Using "host" as the <code>mode</code> value indicates to use the
|
||
LUN's path as it shows up on host (e.g.
|
||
'file=/dev/disk/by-path/ip-example.com:3260-iscsi-iqn.2013-07.com.example:iscsi-pool-lun-1').
|
||
|
||
Using a LUN from an iSCSI source pool provides the same
|
||
features as a <code>disk</code> configured using
|
||
<code>type</code> 'block' or 'network' and <code>device</code>
|
||
of 'lun' with respect to how the LUN is presented to and
|
||
may be used by the guest.
|
||
|
||
<span class="since">Since 1.0.5</span>
|
||
</p>
|
||
</dd>
|
||
</dl>
|
||
With "file", "block", and "volume", one or more optional
|
||
sub-elements <code>seclabel</code>, <a href="#seclabel">described
|
||
below</a> (and <span class="since">since 0.9.9</span>), can be
|
||
used to override the domain security labeling policy for just
|
||
that source file. (NB, for "volume" type disk, <code>seclabel</code>
|
||
is only valid when the specified storage volume is of 'file' or
|
||
'block' type).
|
||
<p>
|
||
The <code>source</code> element may also have the <code>index</code>
attribute with the same semantics as the <a href='#elementsDiskBackingStoreIndex'>
<code>index</code></a> attribute of <code>backingStore</code>.
|
||
</p>
|
||
<p>
|
||
The <code>source</code> element may contain the following sub elements:
|
||
</p>
|
||
|
||
<dl>
|
||
<dt><code>host</code></dt>
|
||
<dd>
|
||
<p>
|
||
When the disk <code>type</code> is "network", the <code>source</code>
|
||
may have zero or more <code>host</code> sub-elements used to
specify the hosts to connect to.
|
||
|
||
The <code>host</code> element supports 4 attributes, viz. "name",
|
||
"port", "transport" and "socket", which specify the hostname,
|
||
the port number, transport type and path to socket, respectively.
|
||
The meaning of this element and the number of the elements depend
|
||
on the protocol attribute.
|
||
</p>
|
||
<table class="top_table">
  <tr>
    <th> Protocol </th>
    <th> Meaning </th>
    <th> Number of hosts </th>
    <th> Default port </th>
  </tr>
  <tr>
    <td> nbd </td>
    <td> a server running nbd-server </td>
    <td> only one </td>
    <td> 10809 </td>
  </tr>
  <tr>
    <td> iscsi </td>
    <td> an iSCSI server </td>
    <td> only one </td>
    <td> 3260 </td>
  </tr>
  <tr>
    <td> rbd </td>
    <td> monitor servers of RBD </td>
    <td> one or more </td>
    <td> librados default </td>
  </tr>
  <tr>
    <td> sheepdog </td>
    <td> one of the sheepdog servers (default is localhost:7000) </td>
    <td> zero or one </td>
    <td> 7000 </td>
  </tr>
  <tr>
    <td> gluster </td>
    <td> a server running glusterd daemon </td>
    <td> one or more (<span class="since">Since 2.1.0</span>), just one prior to that </td>
    <td> 24007 </td>
  </tr>
  <tr>
    <td> vxhs </td>
    <td> a server running Veritas HyperScale daemon </td>
    <td> only one </td>
    <td> 9999 </td>
  </tr>
</table>
|
||
<p>
|
||
gluster supports "tcp", "rdma", "unix" as valid values for the
|
||
transport attribute. nbd supports "tcp" and "unix". Others only
|
||
support "tcp". If nothing is specified, "tcp" is assumed. If the
|
||
transport is "unix", the socket attribute specifies the path to an
|
||
AF_UNIX socket.
|
||
</p>
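            <p>
              As an illustration of the table above, a minimal sketch of two
              network disks might look as follows: one RBD source with two
              monitor hosts, and one NBD source reached over an AF_UNIX socket.
              The host names, socket path, image name and target devices are
              placeholders chosen for this example, and the <code>name</code>
              attribute of <code>source</code> is assumed to carry the pool
              and image name.
            </p>
<pre>
<disk type='network' device='disk'>
  <driver name='qemu' type='raw'/>
  <source protocol='rbd' name='pool/image'>
    <host name='mon1.example.org' port='6789'/>
    <host name='mon2.example.org' port='6789'/>
  </source>
  <target dev='vda' bus='virtio'/>
</disk>
<disk type='network' device='disk'>
  <driver name='qemu' type='raw'/>
  <source protocol='nbd'>
    <host transport='unix' socket='/var/run/nbd.sock'/>
  </source>
  <target dev='vdb' bus='virtio'/>
</disk></pre>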
|
||
</dd>
|
||
<dt><code>snapshot</code></dt>
|
||
<dd>
|
||
The <code>name</code> attribute of <code>snapshot</code> element can
|
||
optionally specify an internal snapshot name to be used as the
|
||
source for storage protocols.
|
||
Supported for 'rbd' <span class="since">since 1.2.11 (QEMU only).</span>
|
||
</dd>
|
||
<dt><code>config</code></dt>
|
||
<dd>
|
||
The <code>file</code> attribute for the <code>config</code> element
|
||
provides a fully qualified path to a configuration file to be
|
||
provided as a parameter to the client of a networked storage
|
||
protocol. Supported for 'rbd' <span class="since">since 1.2.11
|
||
(QEMU only).</span>
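            <p>
              A possible RBD source using both the <code>snapshot</code> and
              <code>config</code> sub-elements is sketched below. The image
              name, monitor host, snapshot name and configuration file path
              are hypothetical values, not defaults.
            </p>
<pre>
<source protocol='rbd' name='pool/image'>
  <host name='mon1.example.org' port='6789'/>
  <snapshot name='snapshotname'/>
  <config file='/path/to/ceph.conf'/>
</source></pre>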
|
||
</dd>
|
||
<dt><code>auth</code></dt>
|
||
<dd><span class="since">Since libvirt 3.9.0</span>, the
|
||
<code>auth</code> element is supported for a disk
|
||
<code>type</code> "network" that is using a <code>source</code>
|
||
            element with the <code>protocol</code> attribute set to "rbd" or "iscsi".
|
||
If present, the <code>auth</code> element provides the
|
||
authentication credentials needed to access the source. It
|
||
includes a mandatory attribute <code>username</code>, which
|
||
identifies the username to use during authentication, as well
|
||
as a sub-element <code>secret</code> with mandatory
|
||
attribute <code>type</code>, to tie back to
|
||
a <a href="formatsecret.html">libvirt secret object</a> that
|
||
holds the actual password or other credentials (the domain XML
|
||
intentionally does not expose the password, only the reference
|
||
to the object that does manage the password).
|
||
Known secret types are "ceph" for Ceph RBD network sources and
|
||
"iscsi" for CHAP authentication of iSCSI targets.
|
||
Both will require either a <code>uuid</code> attribute
|
||
with the UUID of the secret object or a <code>usage</code>
|
||
attribute matching the key that was specified in the
|
||
secret object.
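            <p>
              For example, an RBD source that references a Ceph secret by its
              usage key could be written roughly as follows. The user name
              and the <code>usage</code> value are placeholders; the secret
              itself is assumed to have been defined separately as a libvirt
              secret object.
            </p>
<pre>
<source protocol='rbd' name='pool/image'>
  <host name='mon1.example.org' port='6789'/>
  <auth username='myuser'>
    <secret type='ceph' usage='mypassid'/>
  </auth>
</source></pre>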
|
||
</dd>
|
||
<dt><code>encryption</code></dt>
|
||
<dd><span class="since">Since libvirt 3.9.0</span>, the
|
||
<code>encryption</code> can be a sub-element of the
|
||
<code>source</code> element for encrypted storage sources.
|
||
            If present, it specifies how the storage source is encrypted.
|
||
See the
|
||
<a href="formatstorageencryption.html">Storage Encryption</a>
|
||
page for more information.
|
||
<p/>
|
||
Note that the 'qcow' format of encryption is broken and thus is no
|
||
longer supported for use with disk images.
|
||
(<span class="since">Since libvirt 4.5.0</span>)
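            <p>
              A sketch of an encrypted file source is shown below. It assumes
              the "luks" encryption format and a secret of type "passphrase",
              both of which are described on the Storage Encryption page
              rather than here; the image path and the UUID are placeholders.
            </p>
<pre>
<source file='/var/lib/libvirt/images/encrypted.img'>
  <encryption format='luks'>
    <secret type='passphrase' uuid='f52a81b2-424e-490c-823d-6bd4235bc572'/>
  </encryption>
</source></pre>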
|
||
</dd>
|
||
<dt><code>reservations</code></dt>
|
||
<dd><span class="since">Since libvirt 4.4.0</span>, the
|
||
<code>reservations</code> can be a sub-element of the
|
||
<code>source</code> element for storage sources (QEMU driver only).
|
||
If present it enables persistent reservations for SCSI
|
||
based disks. The element has one mandatory attribute
|
||
<code>managed</code> with accepted values <code>yes</code> and
|
||
<code>no</code>. If <code>managed</code> is enabled libvirt prepares
|
||
and manages any resources needed. When the persistent reservations
|
||
are unmanaged, then the hypervisor acts as a client and the path to
|
||
the server socket must be provided in the child element
|
||
<code>source</code>, which currently accepts only the following
|
||
attributes:
|
||
<code>type</code> with one value <code>unix</code>,
|
||
<code>path</code> path to the socket, and
|
||
finally <code>mode</code> which accepts one value
|
||
<code>client</code> specifying the role of hypervisor.
|
||
            It is recommended to let libvirt manage the persistent
|
||
reservations.
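            <p>
              The two modes might be expressed as in the following sketch.
              The device paths and the socket path are hypothetical; in the
              unmanaged case the socket is assumed to be served by a helper
              process that the administrator started beforehand.
            </p>
<pre>
<!-- managed: libvirt prepares and manages the needed resources -->
<source dev='/dev/sda'>
  <reservations managed='yes'/>
</source>

<!-- unmanaged: the hypervisor connects as a client to an existing socket -->
<source dev='/dev/sdb'>
  <reservations managed='no'>
    <source type='unix' path='/path/to/pr-helper.sock' mode='client'/>
  </reservations>
</source></pre>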
|
||
</dd>
|
||
<dt><code>initiator</code></dt>
|
||
<dd><span class="since">Since libvirt 4.7.0</span>, the
|
||
<code>initiator</code> element is supported for a disk
|
||
<code>type</code> "network" that is using a <code>source</code>
|
||
element with the <code>protocol</code> attribute "iscsi".
|
||
            If present, the <code>initiator</code> element provides the
            initiator IQN needed to access the source via the mandatory
            attribute <code>name</code> of its <code>iqn</code> sub-element.
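            <p>
              A minimal sketch of an iSCSI source with an explicit initiator
              follows. It assumes the IQN is carried by the <code>name</code>
              attribute of an <code>iqn</code> child element; the target
              name, host and IQN are placeholders.
            </p>
<pre>
<source protocol='iscsi' name='iqn.2013-07.com.example:iscsi-nopool/1'>
  <host name='example.com' port='3260'/>
  <initiator>
    <iqn name='iqn.2013-07.com.example:client'/>
  </initiator>
</source></pre>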
|
||
</dd>
|
||
</dl>
|
||
|
||
<p>
|
||
For a "file" or "volume" disk type which represents a cdrom or floppy
|
||
          (the <code>device</code> attribute), it is possible to define a
          policy for what to do with the disk if the source file is not accessible.
|
||
(NB, <code>startupPolicy</code> is not valid for "volume" disk unless
|
||
the specified storage volume is of "file" type). This is done by the
|
||
<code>startupPolicy</code> attribute
|
||
(<span class="since">since 0.9.7</span>),
|
||
accepting these values:
|
||
</p>
|
||
        <table class="top_table">
          <tr>
            <td> mandatory </td>
            <td> fail if missing for any reason (the default) </td>
          </tr>
          <tr>
            <td> requisite </td>
            <td> fail if missing on boot up,
                 drop if missing on migrate/restore/revert </td>
          </tr>
          <tr>
            <td> optional </td>
            <td> drop if missing at any start attempt </td>
          </tr>
        </table>
|
||
<p>
|
||
<span class="since">Since 1.1.2</span> the <code>startupPolicy</code>
|
||
is extended to support hard disks besides cdrom and floppy. On guest
|
||
cold bootup, if a certain disk is not accessible or its disk chain is
|
||
broken, with startupPolicy 'optional' the guest will drop this disk.
|
||
This feature doesn't support migration currently.
|
||
</p>
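        <p>
          As a sketch, a CD-ROM whose image may be absent at startup could be
          declared as follows; the image path and target device are
          placeholders.
        </p>
<pre>
<disk type='file' device='cdrom'>
  <driver name='qemu' type='raw'/>
  <source file='/var/lib/libvirt/images/install.iso' startupPolicy='optional'/>
  <target dev='hdc' bus='ide'/>
  <readonly/>
</disk></pre>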
|
||
</dd>
|
||
<dt><code>backingStore</code></dt>
|
||
<dd>
|
||
This element describes the backing store used by the disk
|
||
specified by sibling <code>source</code> element. It is
|
||
currently ignored on input and only used for output to
|
||
describe the detected backing chains of running
|
||
domains <span class="since">since 1.2.4</span> (although a
|
||
future version of libvirt may start accepting chains on input,
|
||
or output information for offline domains). An
|
||
empty <code>backingStore</code> element means the sibling
|
||
source is self-contained and is not based on any backing
|
||
store. For backing chain information to be accurate, the
|
||
backing format must be correctly specified in the metadata of
|
||
each file of the chain (files created by libvirt satisfy this
|
||
property, but using existing external files for snapshot or
|
||
block copy operations requires the end user to pre-create the
|
||
file correctly). The following attributes are
|
||
supported in <code>backingStore</code>:
|
||
<dl>
|
||
<dt><code>type</code></dt>
|
||
<dd>
|
||
The <code>type</code> attribute represents the type of disk used
|
||
by the backing store, see disk type attribute above for more
|
||
details and possible values.
|
||
</dd>
|
||
<dt><code><a id="elementsDiskBackingStoreIndex">index</a></code></dt>
|
||
<dd>
|
||
This attribute is only valid in output (and ignored on input) and
|
||
it can be used to refer to a specific part of the disk chain when
|
||
doing block operations (such as via the
|
||
<code>virDomainBlockRebase</code> API). For example,
|
||
<code>vda[2]</code> refers to the backing store with
|
||
<code>index='2'</code> of the disk with <code>vda</code> target.
|
||
</dd>
|
||
</dl>
|
||
Moreover, <code>backingStore</code> supports the following sub-elements:
|
||
<dl>
|
||
<dt><code>format</code></dt>
|
||
<dd>
|
||
The <code>format</code> element contains <code>type</code>
|
||
attribute which specifies the internal format of the backing
|
||
store, such as <code>raw</code> or <code>qcow2</code>.
|
||
</dd>
|
||
<dt><code>source</code></dt>
|
||
<dd>
|
||
This element has the same structure as the <code>source</code>
|
||
element in <code>disk</code>. It specifies which file, device,
|
||
or network location contains the data of the described backing
|
||
store.
|
||
</dd>
|
||
<dt><code>backingStore</code></dt>
|
||
<dd>
|
||
If the backing store is not self-contained, the next element
|
||
in the chain is described by nested <code>backingStore</code>
|
||
element.
|
||
</dd>
|
||
</dl>
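        <p>
          The output for a running domain with a one-level backing chain
          might resemble the following sketch. The file and device paths are
          hypothetical; the empty innermost <code>backingStore</code> marks
          the end of the chain, and <code>vdd[1]</code> would refer to the
          backing store shown here.
        </p>
<pre>
<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2'/>
  <source file='/var/lib/libvirt/images/snapshot.qcow'/>
  <backingStore type='block' index='1'>
    <format type='qcow2'/>
    <source dev='/dev/mapper/base'/>
    <backingStore/>
  </backingStore>
  <target dev='vdd' bus='virtio'/>
</disk></pre>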
|
||
</dd>
|
||
<dt><code>mirror</code></dt>
|
||
<dd>
|
||
This element is present if the hypervisor has started a
|
||
long-running block job operation, where the mirror location in
|
||
the <code>source</code> sub-element will eventually have the
|
||
same contents as the source, and with the file format in the
|
||
sub-element <code>format</code> (which might differ from the
|
||
format of the source). The details of the <code>source</code>
|
||
sub-element are determined by the <code>type</code> attribute
|
||
of the mirror, similar to what is done for the
|
||
overall <code>disk</code> device element. The <code>job</code>
|
||
attribute mentions which API started the operation ("copy" for
|
||
the <code>virDomainBlockRebase</code> API, or "active-commit"
|
||
for the <code>virDomainBlockCommit</code>
|
||
API), <span class="since">since 1.2.7</span>. The
|
||
attribute <code>ready</code>, if present, tracks progress of
|
||
the job: <code>yes</code> if the disk is known to be ready to
|
||
pivot, or, <span class="since">since
|
||
1.2.7</span>, <code>abort</code> or <code>pivot</code> if the
|
||
job is in the process of completing. If <code>ready</code> is
|
||
not present, the disk is probably still
|
||
        copying. For now, this element is only valid in output; it is
|
||
ignored on input. The <code>source</code> sub-element exists
|
||
for all two-phase jobs <span class="since">since 1.2.6</span>.
|
||
Older libvirt supported only block copy to a
|
||
file, <span class="since">since 0.9.12</span>; for
|
||
compatibility with older clients, such jobs include redundant
|
||
information in the attributes <code>file</code>
|
||
and <code>format</code> in the <code>mirror</code> element.
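        <p>
          For illustration, the output for a disk with a block copy job that
          is ready to pivot might look like the sketch below. The paths are
          placeholders, and the <code>file</code> and <code>format</code>
          attributes on <code>mirror</code> are the redundant compatibility
          attributes mentioned above.
        </p>
<pre>
<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2'/>
  <source file='/var/lib/libvirt/images/domain.qcow'/>
  <mirror type='file' file='/var/lib/libvirt/images/copy.img' format='qcow2' job='copy' ready='yes'>
    <format type='qcow2'/>
    <source file='/var/lib/libvirt/images/copy.img'/>
  </mirror>
  <target dev='vde' bus='virtio'/>
</disk></pre>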
|
||
</dd>
|
||
<dt><code>target</code></dt>
|
||
<dd>The <code>target</code> element controls the bus / device
|
||
under which the disk is exposed to the guest
|
||
OS. The <code>dev</code> attribute indicates the "logical"
|
||
device name. The actual device name specified is not
|
||
guaranteed to map to the device name in the guest OS. Treat it
|
||
as a device ordering hint. The optional <code>bus</code>
|
||
attribute specifies the type of disk device to emulate;
|
||
possible values are driver specific, with typical values being
|
||
"ide", "scsi", "virtio", "xen", "usb", "sata", or
|
||
"sd" <span class="since">"sd" since 1.1.2</span>. If omitted, the bus
|
||
type is inferred from the style of the device name (e.g. a device named
|
||
'sda' will typically be exported using a SCSI bus). The optional
|
||
attribute <code>tray</code> indicates the tray status of the
|
||
removable disks (i.e. CDROM or Floppy disk), the value can be either
|
||
"open" or "closed", defaults to "closed". NB, the value of
|
||
<code>tray</code> could be updated while the domain is running.
|
||
The optional attribute <code>removable</code> sets the
|
||
removable flag for USB disks, and its value can be either "on"
|
||
or "off", defaulting to "off". <span class="since">Since
|
||
0.0.3; <code>bus</code> attribute since 0.4.3;
|
||
<code>tray</code> attribute since 0.9.11; "usb" attribute value since
|
||
after 0.4.4; "sata" attribute value since 0.9.7; "removable" attribute
|
||
value since 1.1.3</span>
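        <p>
          A few <code>target</code> variants are sketched below; the device
          names are ordering hints chosen for this example only.
        </p>
<pre>
<!-- virtio disk; bus given explicitly -->
<target dev='vda' bus='virtio'/>

<!-- CD-ROM whose tray is currently open -->
<target dev='hdc' bus='ide' tray='open'/>

<!-- USB disk flagged as removable -->
<target dev='sdb' bus='usb' removable='on'/>

<!-- bus omitted; inferred from the device name -->
<target dev='sda'/></pre>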
|
||
</dd>
|
||
<dt><code>iotune</code></dt>
|
||
<dd>The optional <code>iotune</code> element provides the
|
||
ability to provide additional per-device I/O tuning, with
|
||
values that can vary for each device (contrast this to
|
||
the <a href="#elementsBlockTuning"><code><blkiotune></code></a>
|
||
element, which applies globally to the domain). Currently,
|
||
the only tuning available is Block I/O throttling for qemu.
|
||
This element has optional sub-elements; any sub-element not
|
||
specified or given with a value of 0 implies no
|
||
limit. <span class="since">Since 0.9.8</span>
|
||
<dl>
|
||
<dt><code>total_bytes_sec</code></dt>
|
||
<dd>The optional <code>total_bytes_sec</code> element is the
|
||
total throughput limit in bytes per second. This cannot
|
||
appear with <code>read_bytes_sec</code>
|
||
or <code>write_bytes_sec</code>.</dd>
|
||
<dt><code>read_bytes_sec</code></dt>
|
||
<dd>The optional <code>read_bytes_sec</code> element is the
|
||
read throughput limit in bytes per second.</dd>
|
||
<dt><code>write_bytes_sec</code></dt>
|
||
<dd>The optional <code>write_bytes_sec</code> element is the
|
||
write throughput limit in bytes per second.</dd>
|
||
<dt><code>total_iops_sec</code></dt>
|
||
<dd>The optional <code>total_iops_sec</code> element is the
|
||
total I/O operations per second. This cannot
|
||
appear with <code>read_iops_sec</code>
|
||
or <code>write_iops_sec</code>.</dd>
|
||
<dt><code>read_iops_sec</code></dt>
|
||
<dd>The optional <code>read_iops_sec</code> element is the
|
||
read I/O operations per second.</dd>
|
||
<dt><code>write_iops_sec</code></dt>
|
||
<dd>The optional <code>write_iops_sec</code> element is the
|
||
write I/O operations per second.</dd>
|
||
<dt><code>total_bytes_sec_max</code></dt>
|
||
<dd>The optional <code>total_bytes_sec_max</code> element is the
|
||
maximum total throughput limit in bytes per second. This cannot
|
||
appear with <code>read_bytes_sec_max</code>
|
||
or <code>write_bytes_sec_max</code>.</dd>
|
||
<dt><code>read_bytes_sec_max</code></dt>
|
||
<dd>The optional <code>read_bytes_sec_max</code> element is the
|
||
maximum read throughput limit in bytes per second.</dd>
|
||
<dt><code>write_bytes_sec_max</code></dt>
|
||
<dd>The optional <code>write_bytes_sec_max</code> element is the
|
||
maximum write throughput limit in bytes per second.</dd>
|
||
<dt><code>total_iops_sec_max</code></dt>
|
||
<dd>The optional <code>total_iops_sec_max</code> element is the
|
||
maximum total I/O operations per second. This cannot
|
||
appear with <code>read_iops_sec_max</code>
|
||
or <code>write_iops_sec_max</code>.</dd>
|
||
<dt><code>read_iops_sec_max</code></dt>
|
||
<dd>The optional <code>read_iops_sec_max</code> element is the
|
||
maximum read I/O operations per second.</dd>
|
||
<dt><code>write_iops_sec_max</code></dt>
|
||
<dd>The optional <code>write_iops_sec_max</code> element is the
|
||
maximum write I/O operations per second.</dd>
|
||
<dt><code>size_iops_sec</code></dt>
|
||
<dd>The optional <code>size_iops_sec</code> element is the
|
||
size of I/O operations per second.
|
||
<p>
|
||
<span class="since">Throughput limits since 1.2.11 and QEMU 1.7</span>
|
||
</p>
|
||
</dd>
|
||
<dt><code>group_name</code></dt>
|
||
          <dd>The optional <code>group_name</code> provides the capability
            to share I/O throttling quota between multiple drives. This
            prevents end-users from circumventing a hosting provider's
            throttling policy by splitting 1 large drive into N small drives
|
||
and getting N times the normal throttling quota. Any name may
|
||
be used.
|
||
<p>
|
||
<span class="since">group_name since 3.0.0 and QEMU 2.4</span>
|
||
</p>
|
||
</dd>
|
||
<dt><code>total_bytes_sec_max_length</code></dt>
|
||
<dd>The optional <code>total_bytes_sec_max_length</code>
|
||
element is the maximum duration in seconds for the
|
||
<code>total_bytes_sec_max</code> burst period. Only valid
|
||
when the <code>total_bytes_sec_max</code> is set.</dd>
|
||
<dt><code>read_bytes_sec_max_length</code></dt>
|
||
<dd>The optional <code>read_bytes_sec_max_length</code>
|
||
element is the maximum duration in seconds for the
|
||
<code>read_bytes_sec_max</code> burst period. Only valid
|
||
when the <code>read_bytes_sec_max</code> is set.</dd>
|
||
          <dt><code>write_bytes_sec_max_length</code></dt>
|
||
<dd>The optional <code>write_bytes_sec_max_length</code>
|
||
element is the maximum duration in seconds for the
|
||
<code>write_bytes_sec_max</code> burst period. Only valid
|
||
when the <code>write_bytes_sec_max</code> is set.</dd>
|
||
<dt><code>total_iops_sec_max_length</code></dt>
|
||
<dd>The optional <code>total_iops_sec_max_length</code>
|
||
element is the maximum duration in seconds for the
|
||
<code>total_iops_sec_max</code> burst period. Only valid
|
||
when the <code>total_iops_sec_max</code> is set.</dd>
|
||
<dt><code>read_iops_sec_max_length</code></dt>
|
||
<dd>The optional <code>read_iops_sec_max_length</code>
|
||
element is the maximum duration in seconds for the
|
||
<code>read_iops_sec_max</code> burst period. Only valid
|
||
when the <code>read_iops_sec_max</code> is set.</dd>
|
||
          <dt><code>write_iops_sec_max_length</code></dt>
|
||
<dd>The optional <code>write_iops_sec_max_length</code>
|
||
element is the maximum duration in seconds for the
|
||
<code>write_iops_sec_max</code> burst period. Only valid
|
||
when the <code>write_iops_sec_max</code> is set.
|
||
<p>
|
||
<span class="since">Throughput length since 2.4.0 and QEMU 2.6</span>
|
||
</p>
|
||
</dd>
|
||
</dl>
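        <p>
          The following sketch combines several of the sub-elements above
          while respecting their constraints: a total byte limit is used
          instead of separate read and write byte limits, and the burst
          length is given only together with its burst limit. All numbers
          and the group name are arbitrary example values.
        </p>
<pre>
<iotune>
  <total_bytes_sec>10000000</total_bytes_sec>
  <read_iops_sec>4000</read_iops_sec>
  <write_iops_sec>1000</write_iops_sec>
  <total_bytes_sec_max>20000000</total_bytes_sec_max>
  <group_name>tenant-a</group_name>
  <total_bytes_sec_max_length>10</total_bytes_sec_max_length>
</iotune></pre>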
|
||
</dd>
|
||
<dt><code>driver</code></dt>
|
||
<dd>
|
||
The optional driver element allows specifying further details
|
||
related to the hypervisor driver used to provide the disk.
|
||
<span class="since">Since 0.1.8</span>
|
||
<ul>
|
||
<li>
|
||
If the hypervisor supports multiple backend drivers, then
|
||
the <code>name</code> attribute selects the primary
|
||
backend driver name, while the optional <code>type</code>
|
||
attribute provides the sub-type. For example, xen
|
||
supports a name of "tap", "tap2", "phy", or "file", with a
|
||
type of "aio", while qemu only supports a name of "qemu",
|
||
but multiple types including "raw", "bochs", "qcow2", and
|
||
"qed".
|
||
</li>
|
||
<li>
|
||
The optional <code>cache</code> attribute controls the
|
||
cache mechanism, possible values are "default", "none",
|
||
"writethrough", "writeback", "directsync" (like
|
||
"writethrough", but it bypasses the host page cache) and
|
||
"unsafe" (host may cache all disk io, and sync requests from
|
||
guest are ignored).
|
||
<span class="since">
|
||
Since 0.6.0,
|
||
"directsync" since 0.9.5,
|
||
"unsafe" since 0.9.7
|
||
</span>
|
||
</li>
|
||
<li>
|
||
The optional <code>error_policy</code> attribute controls
|
||
how the hypervisor will behave on a disk read or write
|
||
            error; possible values are "stop", "report", "ignore", and
            "enospace". <span class="since">Since 0.8.0, "report" since
|
||
0.9.7</span> The default is left to the discretion of the
|
||
hypervisor. There is also an
|
||
optional <code>rerror_policy</code> that controls behavior
|
||
for read errors only. <span class="since">Since
|
||
0.9.7</span>. If no rerror_policy is given, error_policy
|
||
is used for both read and write errors. If rerror_policy
|
||
is given, it overrides the <code>error_policy</code> for
|
||
read errors. Also note that "enospace" is not a valid
|
||
policy for read errors, so if <code>error_policy</code> is
|
||
set to "enospace" and no <code>rerror_policy</code> is
|
||
given, the read error policy will be left at its default.
|
||
</li>
|
||
<li>
|
||
The optional <code>io</code> attribute controls specific
|
||
policies on I/O; qemu guests support "threads" and
|
||
"native". <span class="since">Since 0.8.8</span>
|
||
</li>
|
||
<li>
|
||
The optional <code>ioeventfd</code> attribute allows users to
|
||
set <a href='https://patchwork.kernel.org/patch/43390/'>
|
||
domain I/O asynchronous handling</a> for disk device.
|
||
The default is left to the discretion of the hypervisor.
|
||
Accepted values are "on" and "off". Enabling this allows
|
||
qemu to execute VM while a separate thread handles I/O.
|
||
Typically guests experiencing high system CPU utilization
|
||
during I/O will benefit from this. On the other hand,
|
||
            on an overloaded host it could increase guest I/O latency.
|
||
<span class="since">Since 0.9.3 (QEMU and KVM only)</span>
|
||
<b>In general you should leave this option alone, unless you
|
||
are very certain you know what you are doing.</b>
|
||
</li>
|
||
<li>
|
||
The optional <code>event_idx</code> attribute controls
|
||
some aspects of device event processing. The value can be
|
||
either 'on' or 'off' - if it is on, it will reduce the
|
||
number of interrupts and exits for the guest. The default
|
||
is determined by QEMU; usually if the feature is
|
||
supported, default is on. In case there is a situation
|
||
where this behavior is suboptimal, this attribute provides
|
||
a way to force the feature off.
|
||
<span class="since">Since 0.9.5 (QEMU and KVM only)</span>
|
||
<b>In general you should leave this option alone, unless you
|
||
are very certain you know what you are doing.</b>
|
||
</li>
|
||
<li>
|
||
            The optional <code>copy_on_read</code> attribute controls
            whether data read from the backing file is copied into the image file. The
|
||
value can be either "on" or "off".
|
||
Copy-on-read avoids accessing the same backing file sectors
|
||
repeatedly and is useful when the backing file is over a slow
|
||
network. By default copy-on-read is off.
|
||
<span class='since'>Since 0.9.10 (QEMU and KVM only)</span>
|
||
</li>
|
||
<li>
|
||
The optional <code>discard</code> attribute controls whether
|
||
discard requests (also known as "trim" or "unmap") are
|
||
ignored or passed to the filesystem. The value can be either
|
||
"unmap" (allow the discard request to be passed) or "ignore"
|
||
(ignore the discard request).
|
||
<span class='since'>Since 1.0.6 (QEMU and KVM only)</span>
|
||
</li>
|
||
<li>
|
||
The optional <code>detect_zeroes</code> attribute controls whether
|
||
            to detect zero write requests. The value can be "off", "on" or
            "unmap". The first two values turn the detection off and on,
|
||
respectively. The third value ("unmap") turns the detection on
|
||
and additionally tries to discard such areas from the image based
|
||
on the value of <code>discard</code> above (it will act as "on"
|
||
if <code>discard</code> is set to "ignore"). NB enabling the
|
||
detection is a compute intensive operation, but can save file
|
||
space and/or time on slow media.
|
||
<span class='since'>Since 2.0.0</span>
|
||
</li>
|
||
<li>
|
||
The optional <code>iothread</code> attribute assigns the
|
||
disk to an IOThread as defined by the range for the domain
|
||
<a href="#elementsIOThreadsAllocation"><code>iothreads</code></a>
|
||
            value. IOThreads are numbered from 1 to the domain iothreads value,
            and multiple disks may be assigned to the same IOThread. Available
|
||
for a disk device <code>target</code> configured to use "virtio"
|
||
<code>bus</code> and "pci" or "ccw" <code>address</code> types.
|
||
<span class='since'>Since 1.2.8 (QEMU 2.1)</span>
|
||
</li>
|
||
<li>
|
||
The optional <code>queues</code> attribute specifies the number of
|
||
virt queues for virtio-blk. (<span class="since">Since 3.9.0</span>)
|
||
</li>
|
||
<li>
|
||
For virtio disks,
|
||
<a href="#elementsVirtio">Virtio-specific options</a> can also be
|
||
set. (<span class="since">Since 3.5.0</span>)
|
||
</li>
|
||
</ul>
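        <p>
          Two hedged examples of the <code>driver</code> element for QEMU
          guests follow. In the first, write errors caused by lack of space
          use the "enospace" policy while read errors stop the guest, since
          "enospace" is not valid for reads. The second assumes a virtio
          disk in a domain that defines at least two IOThreads. The
          attribute values are illustrative rather than recommended
          settings.
        </p>
<pre>
<driver name='qemu' type='qcow2' cache='none' io='native'
        error_policy='enospace' rerror_policy='stop'
        discard='unmap' detect_zeroes='unmap'/>

<driver name='qemu' type='raw' iothread='2' queues='4'/></pre>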
|
||
</dd>
|
||
<dt><code>backenddomain</code></dt>
|
||
<dd>The optional <code>backenddomain</code> element allows specifying a
|
||
backend domain (aka driver domain) hosting the disk. Use the
|
||
<code>name</code> attribute to specify the backend domain name.
|
||
<span class="since">Since 1.2.13 (Xen only)</span>
|
||
</dd>
|
||
<dt><code>boot</code></dt>
|
||
<dd>Specifies that the disk is bootable. The <code>order</code>
|
||
attribute determines the order in which devices will be tried during
|
||
boot sequence. On the S390 architecture only the first boot device is
|
||
used. The optional <code>loadparm</code> attribute is an 8 character
|
||
string which can be queried by guests on S390 via sclp or diag 308.
|
||
Linux guests on S390 can use <code>loadparm</code> to select a boot
|
||
entry. <span class="since">Since 3.5.0</span>
|
||
The per-device <code>boot</code> elements cannot be used together
|
||
with general boot elements in
|
||
<a href="#elementsOSBIOS">BIOS bootloader</a> section.
|
||
<span class="since">Since 0.8.8</span>
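        <p>
          A sketch of a bootable disk using the per-device
          <code>boot</code> element is given below. The image path is a
          placeholder, and the second fragment shows how a
          <code>loadparm</code> value might be added for an S390 guest.
        </p>
<pre>
<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2'/>
  <source file='/var/lib/libvirt/images/guest.qcow2'/>
  <target dev='vda' bus='virtio'/>
  <boot order='1'/>
</disk>

<!-- S390 only: pass a load parameter to the guest -->
<boot order='1' loadparm='2'/></pre>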
|
||
</dd>
|
||
<dt><code>encryption</code></dt>
|
||
<dd>Starting with <span class="since">libvirt 3.9.0</span> the
|
||
<code>encryption</code> element is preferred to be a sub-element
|
||
of the <code>source</code> element. If present, specifies how the
|
||
volume is encrypted using "qcow". See the
|
||
<a href="formatstorageencryption.html">Storage Encryption</a> page
|
||
for more information.
|
||
</dd>
|
||
<dt><code>readonly</code></dt>
|
||
<dd>If present, this indicates the device cannot be modified by
|
||
the guest. For now, this is the default for disks with
|
||
attribute <code>device='cdrom'</code>.
|
||
</dd>
|
||
<dt><code>shareable</code></dt>
|
||
<dd>If present, this indicates the device is expected to be shared
|
||
between domains (assuming the hypervisor and OS support this),
|
||
which means that caching should be deactivated for that device.
|
||
</dd>
|
||
<dt><code>transient</code></dt>
|
||
<dd>If present, this indicates that changes to the device
|
||
contents should be reverted automatically when the guest
|
||
exits. With some hypervisors, marking a disk transient
|
||
prevents the domain from participating in migration or
|
||
snapshots. <span class="since">Since 0.9.5</span>
|
||
</dd>
|
||
<dt><code>serial</code></dt>
|
||
      <dd>If present, this specifies the serial number of the virtual hard drive.
|
||
For example, it may look
|
||
like <code><serial>WD-WMAP9A966149</serial></code>.
|
||
Not supported for scsi-block devices, that is those using
|
||
disk <code>type</code> 'block' using <code>device</code> 'lun'
|
||
on <code>bus</code> 'scsi'.
|
||
<span class="since">Since 0.7.1</span>
|
||
</dd>
|
||
<dt><code>wwn</code></dt>
|
||
<dd>If present, this element specifies the WWN (World Wide Name)
|
||
of a virtual hard disk or CD-ROM drive. It must be composed
|
||
of 16 hexadecimal digits.
|
||
<span class='since'>Since 0.10.1</span>
|
||
</dd>
|
||
<dt><code>vendor</code></dt>
|
||
<dd>If present, this element specifies the vendor of a virtual hard
|
||
disk or CD-ROM device. It must not be longer than 8 printable
|
||
characters.
|
||
<span class='since'>Since 1.0.1</span>
|
||
</dd>
|
||
<dt><code>product</code></dt>
|
||
<dd>If present, this element specifies the product of a virtual hard
|
||
disk or CD-ROM device. It must not be longer than 16 printable
|
||
characters.
|
||
<span class='since'>Since 1.0.1</span>
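        <p>
          Taken together, the identification elements
          <code>serial</code>, <code>wwn</code>, <code>vendor</code> and
          <code>product</code> could be used as in this sketch of a SCSI
          disk. All values are made up for the example; the WWN has 16
          hexadecimal digits, and the vendor and product strings stay within
          8 and 16 characters respectively.
        </p>
<pre>
<disk type='file' device='disk'>
  <driver name='qemu' type='raw'/>
  <source file='/var/lib/libvirt/images/data.img'/>
  <target dev='sda' bus='scsi'/>
  <serial>WD-WMAP9A966149</serial>
  <wwn>5000c50015ea71ad</wwn>
  <vendor>EXAMPLE</vendor>
  <product>VIRTUAL-DISK</product>
</disk></pre>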
|
||
</dd>
|
||
<dt><code>address</code></dt>
|
||
<dd>If present, the <code>address</code> element ties the disk
|
||
to a given slot of a controller (the
|
||
actual <code><controller></code> device can often be
|
||
inferred by libvirt, although it can
|
||
be <a href="#elementsControllers">explicitly specified</a>).
|
||
The <code>type</code> attribute is mandatory, and is typically
|
||
"pci" or "drive". For a "pci" controller, additional
|
||
attributes for <code>bus</code>, <code>slot</code>,
|
||
and <code>function</code> must be present, as well as
|
||
optional <code>domain</code> and <code>multifunction</code>.
|
||
Multifunction defaults to 'off'; any other value requires
|
||
QEMU 0.1.3 and <span class="since">libvirt 0.9.7</span>. For a
|
||
"drive" controller, additional attributes
|
||
<code>controller</code>, <code>bus</code>, <code>target</code>
|
||
(<span class="since">libvirt 0.9.11</span>), and <code>unit</code>
|
||
are available, each defaulting to 0.
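        <p>
          The two common address types might be written as follows; the
          numeric values are arbitrary examples.
        </p>
<pre>
<!-- disk attached directly to the PCI bus, e.g. a virtio disk -->
<address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>

<!-- disk attached to a slot of a disk controller -->
<address type='drive' controller='0' bus='0' target='0' unit='1'/></pre>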
|
||
</dd>
|
||
<dt><code>auth</code></dt>
|
||
<dd>Starting with <span class="since">libvirt 3.9.0</span> the
|
||
<code>auth</code> element is preferred to be a sub-element of
|
||
the <code>source</code> element. The element is still read and
|
||
managed as a <code>disk</code> sub-element. It is invalid to use
|
||
<code>auth</code> as both a sub-element of <code>disk</code>
|
||
and <code>source</code>. The <code>auth</code> element was
|
||
introduced as a <code>disk</code> sub-element in
|
||
<span class="since">libvirt 0.9.7.</span>
|
||
</dd>
|
||
<dt><code>geometry</code></dt>
|
||
<dd>The optional <code>geometry</code> element provides the
|
||
        ability to override geometry settings. This is mostly useful for
        S390 DASD-disks or older DOS-disks. <span class="since">Since 0.10.0</span>
|
||
<dl>
|
||
<dt><code>cyls</code></dt>
|
||
<dd>The <code>cyls</code> attribute is the
|
||
number of cylinders. </dd>
|
||
<dt><code>heads</code></dt>
|
||
<dd>The <code>heads</code> attribute is the
|
||
number of heads. </dd>
|
||
<dt><code>secs</code></dt>
|
||
<dd>The <code>secs</code> attribute is the
|
||
number of sectors per track. </dd>
|
||
<dt><code>trans</code></dt>
|
||
<dd>The optional <code>trans</code> attribute is the
|
||
            BIOS translation mode (none, lba or auto).</dd>
|
||
</dl>
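        <p>
          For instance, a geometry override could look like this; the
          values are example numbers only.
        </p>
<pre>
<geometry cyls='16383' heads='16' secs='63' trans='lba'/></pre>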
|
||
</dd>
|
||
<dt><code>blockio</code></dt>
|
||
      <dd>If present, the <code>blockio</code> element allows
        overriding any of the block device properties listed below.
|
||
<span class="since">Since 0.10.2 (QEMU and KVM)</span>
|
||
<dl>
|
||
<dt><code>logical_block_size</code></dt>
|
||
<dd>The logical block size the disk will report to the guest
|
||
OS. For Linux this would be the value returned by the
|
||
BLKSSZGET ioctl and describes the smallest units for disk
|
||
I/O.
|
||
</dd>
|
||
<dt><code>physical_block_size</code></dt>
|
||
<dd>The physical block size the disk will report to the guest
|
||
OS. For Linux this would be the value returned by the
|
||
BLKPBSZGET ioctl and describes the disk's hardware sector
|
||
size which can be relevant for the alignment of disk data.
|
||
</dd>
|
||
</dl>
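        <p>
          As a sketch, a disk that reports 512-byte logical blocks on top
          of 4096-byte physical sectors could be described as follows; the
          sizes are example values.
        </p>
<pre>
<blockio logical_block_size='512' physical_block_size='4096'/></pre>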
|
||
</dd>
|
||
</dl>
|
||
|
||
<h4><a id="elementsFilesystems">Filesystems</a></h4>
|
||
|
||
<p>
|
||
A directory on the host that can be accessed directly from the guest.
|
||
      <span class="since">Since 0.3.3, since 0.8.5 for QEMU/KVM</span>
|
||
</p>
|
||
|
||
<pre>
...
<devices>
  <filesystem type='template'>
    <source name='my-vm-template'/>
    <target dir='/'/>
  </filesystem>
  <filesystem type='mount' accessmode='passthrough'>
    <driver type='path' wrpolicy='immediate'/>
    <source dir='/export/to/guest'/>
    <target dir='/import/from/host'/>
    <readonly/>
  </filesystem>
  <filesystem type='file' accessmode='passthrough'>
    <driver name='loop' type='raw'/>
    <driver type='path' wrpolicy='immediate'/>
    <source file='/export/to/guest.img'/>
    <target dir='/import/from/host'/>
    <readonly/>
  </filesystem>
  ...
</devices>
...</pre>
|
||
|
||
<dl>
|
||
<dt><code>filesystem</code></dt>
|
||
<dd>
|
||
|
||
The filesystem attribute <code>type</code> specifies the type of the
|
||
<code>source</code>. The possible values are:
|
||
|
||
<dl>
|
||
<dt><code>mount</code></dt>
|
||
<dd>
|
||
A host directory to mount in the guest. Used by LXC,
|
||
OpenVZ <span class="since">(since 0.6.2)</span>
|
||
and QEMU/KVM <span class="since">(since 0.8.5)</span>.
|
||
This is the default <code>type</code> if one is not specified.
|
||
This mode also has an optional
|
||
sub-element <code>driver</code>, with an
|
||
attribute <code>type='path'</code>
|
||
or <code>type='handle'</code> <span class="since">(since
|
||
0.9.7)</span>. The driver block has an optional attribute
|
||
<code>wrpolicy</code> that further controls interaction with
|
||
the host page cache; omitting the attribute gives default behavior,
|
||
while the value <code>immediate</code> means that a host writeback
|
||
is immediately triggered for all pages touched during a guest file
|
||
write operation <span class="since">(since 0.9.10)</span>.
|
||
</dd>
|
||
<dt><code>template</code></dt>
|
||
<dd>
|
||
OpenVZ filesystem template. Only used by OpenVZ driver.
|
||
</dd>
|
||
<dt><code>file</code></dt>
|
||
<dd>
|
||
A host file will be treated as an image and mounted in
|
||
the guest. The filesystem format will be autodetected.
|
||
Only used by LXC driver.
|
||
</dd>
|
||
<dt><code>block</code></dt>
|
||
<dd>
|
||
A host block device to mount in the guest. The filesystem
|
||
format will be autodetected. Only used by LXC driver
|
||
<span class="since">(since 0.9.5)</span>.
|
||
</dd>
|
||
<dt><code>ram</code></dt>
|
||
<dd>
|
||
An in-memory filesystem, using memory from the host OS.
|
||
The source element has an attribute <code>usage</code>
|
||
which gives the memory usage limit in KiB, unless units
|
||
are specified by the <code>units</code> attribute. Only used
|
||
by LXC driver.
|
||
<span class="since"> (since 0.9.13)</span></dd>
|
||
<dt><code>bind</code></dt>
|
||
<dd>
|
||
A directory inside the guest will be bound to another
|
||
directory inside the guest. Only used by LXC driver
|
||
<span class="since"> (since 0.9.13)</span></dd>
|
||
</dl>
|
||
|
||
The filesystem block has an optional attribute <code>accessmode</code>
|
||
which specifies the security mode for accessing the source
|
||
<span class="since">(since 0.8.5)</span>. Currently this only works
|
||
with <code>type='mount'</code> for the QEMU/KVM driver. The possible
|
||
values are:
|
||
|
||
<dl>
|
||
<dt><code>passthrough</code></dt>
|
||
<dd>
|
||
The <code>source</code> is accessed with the permissions of the
|
||
user inside the guest. This is the default <code>accessmode</code> if
|
||
one is not specified.
|
||
<a href="http://lists.gnu.org/archive/html/qemu-devel/2010-05/msg02673.html">More info</a>
|
||
</dd>
|
||
<dt><code>mapped</code></dt>
|
||
<dd>
|
||
The <code>source</code> is accessed with the permissions of the
|
||
hypervisor (QEMU process).
|
||
<a href="http://lists.gnu.org/archive/html/qemu-devel/2010-05/msg02673.html">More info</a>
|
||
</dd>
|
||
<dt><code>squash</code></dt>
|
||
<dd>
|
||
Similar to 'passthrough', except that failures of privileged
operations like 'chown' are ignored. This makes a
passthrough-like mode usable for people who run the hypervisor
as non-root.
|
||
<a href="http://lists.gnu.org/archive/html/qemu-devel/2010-09/msg00121.html">More info</a>
|
||
</dd>
|
||
</dl>
|
||
|
||
</dd>
|
||
|
||
<dt><code>driver</code></dt>
|
||
<dd>
|
||
The optional driver element allows specifying further details
|
||
related to the hypervisor driver used to provide the filesystem.
|
||
<span class="since">Since 1.0.6</span>
|
||
<ul>
|
||
<li>
|
||
If the hypervisor supports multiple backend drivers, then
|
||
the <code>type</code> attribute selects the primary
|
||
backend driver name, while the <code>format</code>
|
||
attribute provides the format type. For example, LXC
|
||
supports a type of "loop", with a format of "raw" or
|
||
"nbd" with any format. QEMU supports a type of "path"
|
||
or "handle", but no formats. Virtuozzo driver supports
|
||
a type of "ploop" with a format of "ploop".
|
||
</li>
|
||
<li>
|
||
For virtio-backed devices,
|
||
<a href="#elementsVirtio">Virtio-specific options</a> can also be
|
||
set. (<span class="since">Since 3.5.0</span>)
|
||
</li>
|
||
</ul>
|
||
</dd>
|
||
|
||
<dt><code>source</code></dt>
|
||
<dd>
|
||
The resource on the host that is being accessed in the guest. The
<code>name</code> attribute must be used with
<code>type='template'</code>, the <code>dir</code> attribute must
be used with <code>type='mount'</code>, and the <code>file</code>
attribute is used with <code>type='file'</code>. The <code>usage</code> attribute
is used with <code>type='ram'</code> to set the memory limit in KiB,
unless units are specified by the <code>units</code> attribute.
|
||
</dd>
|
||
|
||
<dt><code>target</code></dt>
|
||
<dd>
|
||
Where the <code>source</code> can be accessed in the guest. For
|
||
most drivers this is an automatic mount point, but for QEMU/KVM
|
||
this is merely an arbitrary string tag that is exported to the
|
||
guest as a hint for where to mount.
|
||
</dd>
|
||
|
||
<dt><code>readonly</code></dt>
|
||
<dd>
|
||
Exports the filesystem to the guest as a read-only mount. By
default, read-write access is given (currently only works for
the QEMU/KVM driver).
|
||
</dd>
|
||
|
||
<dt><code>space_hard_limit</code></dt>
|
||
<dd>
|
||
Maximum space available to this guest's filesystem.
|
||
<span class="since">Since 0.9.13</span>
|
||
</dd>
|
||
|
||
<dt><code>space_soft_limit</code></dt>
|
||
<dd>
|
||
Soft limit on the space available to this guest's filesystem. The container is
permitted to exceed its soft limit for a grace period; afterwards the
hard limit is enforced.
|
||
<span class="since">Since 0.9.13</span>
|
||
</dd>
|
||
</dl>
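<p>
As an illustration of the LXC-only source types described above, the
following sketch shows a <code>ram</code>, a <code>block</code> and a
<code>bind</code> filesystem. The paths, the device name and the size are
placeholders chosen for this example. The <code>dev</code> attribute used
with <code>type='block'</code> is an assumption that the text above does
not spell out, so check it against your libvirt version.
</p>

<pre>
...
<devices>
  <filesystem type='ram'>
    <source usage='524288' units='KiB'/>
    <target dir='/scratch'/>
  </filesystem>
  <filesystem type='block'>
    <source dev='/dev/example-vg/guest-data'/>
    <target dir='/data'/>
  </filesystem>
  <filesystem type='bind'>
    <source dir='/data/shared'/>
    <target dir='/srv/shared'/>
  </filesystem>
</devices>
...</pre>

<p>
Here the <code>ram</code> filesystem is limited to 524288 KiB (512 MiB).
Omitting <code>units</code> would leave the value interpreted as KiB, as
described above.
</p>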
|
||
|
||
<h4><a id="elementsAddress">Device Addresses</a></h4>
|
||
|
||
<p>
|
||
Many devices have an optional <code><address></code>
|
||
sub-element to describe where the device is placed on the
|
||
virtual bus presented to the guest. If an address (or any
|
||
optional attribute within an address) is omitted on
|
||
input, libvirt will generate an appropriate address; an
explicit address is only needed when more control over the layout is
required. See below for device examples including an address
element.
|
||
</p>
|
||
|
||
<p>
|
||
Every address has a mandatory attribute <code>type</code> that
|
||
describes which bus the device is on. The choice of which
|
||
address to use for a given device is constrained in part by the
|
||
device and the architecture of the guest. For example,
|
||
a <code><disk></code> device
|
||
uses <code>type='drive'</code>, while
|
||
a <code><console></code> device would
|
||
use <code>type='pci'</code> on i686 or x86_64 guests,
|
||
or <code>type='spapr-vio'</code> on PowerPC64 pseries guests.
|
||
Each address type has further optional attributes that control
|
||
where on the bus the device will be placed:
|
||
</p>
|
||
|
||
<dl>
|
||
<dt><code>pci</code></dt>
|
||
<dd>PCI addresses have the following additional
|
||
attributes: <code>domain</code> (a 2-byte hex integer, not
|
||
currently used by qemu), <code>bus</code> (a hex value between
|
||
0 and 0xff, inclusive), <code>slot</code> (a hex value between
|
||
0x0 and 0x1f, inclusive), and <code>function</code> (a value
|
||
between 0 and 7, inclusive). Also available is
|
||
the <code>multifunction</code> attribute, which controls
|
||
turning on the multifunction bit for a particular
|
||
slot/function in the PCI control register
|
||
(<span class="since">since 0.9.7, requires QEMU
|
||
0.13</span>). <code>multifunction</code> defaults to 'off',
|
||
but should be set to 'on' for function 0 of a slot that will
|
||
have multiple functions used.<br/>
|
||
<span class="since">Since 1.3.5</span>, some hypervisor
|
||
drivers may accept an <code><address type='pci'/></code>
|
||
element with no other attributes as an explicit request to
|
||
assign a PCI address for the device rather than some other
|
||
type of address that may also be appropriate for that same
|
||
device (e.g. virtio-mmio).
|
||
</dd>
|
||
<dt><code>drive</code></dt>
|
||
<dd>Drive addresses have the following additional
|
||
attributes: <code>controller</code> (a 2-digit controller
|
||
number), <code>bus</code> (a 2-digit bus number),
|
||
<code>target</code> (a 2-digit target number),
|
||
and <code>unit</code> (a 2-digit unit number on the bus).
|
||
</dd>
|
||
<dt><code>virtio-serial</code></dt>
|
||
<dd>Each virtio-serial address has the following additional
|
||
attributes: <code>controller</code> (a 2-digit controller
|
||
number), <code>bus</code> (a 2-digit bus number),
|
||
and <code>slot</code> (a 2-digit slot within the bus).
|
||
</dd>
|
||
<dt><code>ccid</code></dt>
|
||
<dd>A CCID address, for smart-cards, has the following
|
||
additional attributes: <code>bus</code> (a 2-digit bus
|
||
number), and <code>slot</code> attribute (a 2-digit slot
|
||
within the bus). <span class="since">Since 0.8.8.</span>
|
||
</dd>
|
||
<dt><code>usb</code></dt>
|
||
<dd>USB addresses have the following additional
|
||
attributes: <code>bus</code> (a hex value between 0 and 0xfff,
|
||
inclusive), and <code>port</code> (a dotted notation of up to
|
||
four octets, such as 1.2 or 2.1.3.1).
|
||
</dd>
|
||
<dt><code>spapr-vio</code></dt>
|
||
<dd>On PowerPC pseries guests, devices can be assigned to the
|
||
SPAPR-VIO bus. It has a flat 64-bit address space; by
|
||
convention, devices are generally assigned at a non-zero
|
||
multiple of 0x1000, but other addresses are valid and
|
||
permitted by libvirt. Each address has the following
|
||
additional attribute: <code>reg</code> (the hex value address
|
||
of the starting register). <span class="since">Since
|
||
0.9.9.</span>
|
||
</dd>
|
||
<dt><code>ccw</code></dt>
|
||
<dd>S390 guests with a <code>machine</code> value of
|
||
s390-ccw-virtio use the native CCW bus for I/O devices.
|
||
CCW bus addresses have the following additional attributes:
|
||
<code>cssid</code> (a hex value between 0 and 0xfe, inclusive),
|
||
<code>ssid</code> (a value between 0 and 3, inclusive) and
|
||
<code>devno</code> (a hex value between 0 and 0xffff, inclusive).
|
||
Partially specified bus addresses are not allowed.
|
||
If omitted, libvirt will assign a free bus address with
|
||
cssid=0xfe and ssid=0. Virtio-ccw devices must have their cssid
|
||
set to 0xfe.
|
||
<span class="since">Since 1.0.4</span>
|
||
</dd>
|
||
<dt><code>virtio-mmio</code></dt>
|
||
<dd>This places the device on the virtio-mmio transport, which is
|
||
currently only available for some <code>armv7l</code> and
|
||
<code>aarch64</code> virtual machines. virtio-mmio addresses
|
||
do not have any additional attributes.
|
||
<span class="since">Since 1.1.3</span><br/>
|
||
If the guest architecture is <code>aarch64</code> and the machine
|
||
type is <code>virt</code>, libvirt will automatically assign PCI
|
||
addresses to devices; however, the presence of a single device
|
||
with virtio-mmio address in the guest configuration will cause
|
||
libvirt to assign virtio-mmio addresses to all further devices.
|
||
<span class="since">Since 3.0.0</span>
|
||
</dd>
|
||
<dt><code>isa</code></dt>
|
||
<dd>ISA addresses have the following additional
|
||
attributes: <code>iobase</code> and <code>irq</code>.
|
||
<span class="since">Since 1.2.1</span>
|
||
</dd>
|
||
</dl>
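<p>
The following sketch shows how a few of the address types listed above
might appear on devices. The device definitions are abbreviated and all
numeric values are placeholders; in most cases the addresses can be
omitted so that libvirt assigns them. The two interfaces share PCI slot
0x03, so function 0 sets <code>multifunction='on'</code> as described
above.
</p>

<pre>
...
<devices>
  <disk type='file' device='disk'>
    ...
    <address type='drive' controller='0' bus='0' target='0' unit='1'/>
  </disk>
  <interface type='network'>
    ...
    <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0' multifunction='on'/>
  </interface>
  <interface type='network'>
    ...
    <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x1'/>
  </interface>
  <input type='tablet' bus='usb'>
    <address type='usb' bus='0' port='1'/>
  </input>
</devices>
...</pre>

<p>
On other architectures the same devices would carry a different address
type, for example (again with placeholder values):
</p>

<pre>
...
<address type='spapr-vio' reg='0x3000'/>
...
<address type='ccw' cssid='0xfe' ssid='0x0' devno='0x0001'/>
...
<address type='virtio-mmio'/>
...</pre>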
|
||
|
||
<h4><a id="elementsVirtio">Virtio-related options</a></h4>
|
||
|
||
<p>
|
||
QEMU's virtio devices have some attributes related to the virtio transport under
|
||
the <code>driver</code> element:
|
||
The <code>iommu</code> attribute enables the use of emulated IOMMU
|
||
by the device. The attribute <code>ats</code> controls the Address
|
||
Translation Service support for PCIe devices. This is needed to make use
|
||
of IOTLB support (see <a href="#elementsIommu">IOMMU device</a>).
|
||
Possible values are <code>on</code> or <code>off</code>.
|
||
<span class="since">Since 3.5.0</span>
|
||
</p>
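<p>
As a minimal sketch, the two attributes could be set on a virtio network
interface as follows. The choice of device and the network name
<code>default</code> are assumptions made for this example; the same
<code>driver</code> attributes apply to other virtio-backed devices.
</p>

<pre>
...
<devices>
  <interface type='network'>
    <source network='default'/>
    <model type='virtio'/>
    <driver iommu='on' ats='on'/>
  </interface>
</devices>
...</pre>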
|
||
|
||
<h4><a id="elementsControllers">Controllers</a></h4>
|
||
|
||
<p>
|
||
Depending on the guest architecture, some device buses can
|
||
appear more than once, with a group of virtual devices tied to a
|
||
virtual controller. Normally, libvirt can automatically infer such
|
||
controllers without requiring explicit XML markup, but sometimes
|
||
it is necessary to provide an explicit controller element, notably
|
||
when planning the <a href="pci-hotplug.html">PCI topology</a>
|
||
for guests where device hotplug is expected.
|
||
</p>
|
||
|
||
<pre>
...
<devices>
  <controller type='ide' index='0'/>
  <controller type='virtio-serial' index='0' ports='16' vectors='4'/>
  <controller type='virtio-serial' index='1'>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/>
  </controller>
  <controller type='scsi' index='0' model='virtio-scsi'>
    <driver iothread='4'/>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x0b' function='0x0'/>
  </controller>
  ...
</devices>
...</pre>
|
||
|
||
<p>
|
||
Each controller has a mandatory attribute <code>type</code>,
which must be one of 'ide', 'fdc', 'scsi', 'sata', 'usb',
'ccid', 'virtio-serial' or 'pci', and an
attribute <code>index</code>, a decimal integer
describing in which order the bus controller is encountered (for
use in <code>controller</code> attributes of
<code><address></code> elements).
The index was mandatory before 1.3.5; <span class="since">since 1.3.5</span> it is optional and, if
not specified, is auto-assigned to be the lowest unused
index for the given controller type. Some controller types have
additional attributes that control specific features, such as:
|
||
</p>
|
||
|
||
<dl>
|
||
<dt><code>virtio-serial</code></dt>
|
||
<dd>The <code>virtio-serial</code> controller has two additional
|
||
optional attributes <code>ports</code> and <code>vectors</code>,
|
||
which control how many devices can be connected through the
|
||
controller.</dd>
|
||
<dt><code>scsi</code></dt>
|
||
<dd>A <code>scsi</code> controller has an optional attribute
|
||
<code>model</code>, which is one of 'auto', 'buslogic', 'ibmvscsi',
|
||
'lsilogic', 'lsisas1068', 'lsisas1078', 'virtio-scsi' or
|
||
'vmpvscsi'.</dd>
|
||
<dt><code>usb</code></dt>
|
||
<dd>A <code>usb</code> controller has an optional attribute
|
||
<code>model</code>, which is one of "piix3-uhci", "piix4-uhci",
|
||
"ehci", "ich9-ehci1", "ich9-uhci1", "ich9-uhci2", "ich9-uhci3",
|
||
"vt82c686b-uhci", "pci-ohci", "nec-xhci", "qusb1" (xen pvusb
|
||
with qemu backend, version 1.1), "qusb2" (xen pvusb with qemu
|
||
backend, version 2.0) or "qemu-xhci". Additionally,
|
||
<span class="since">since 0.10.0</span>, if the USB bus needs to
|
||
be explicitly disabled for the guest, <code>model='none'</code>
|
||
may be used. <span class="since">Since 1.0.5</span>, no default
|
||
USB controller will be built on s390.
|
||
<span class="since">Since 1.3.5</span>, USB controllers accept a
|
||
<code>ports</code> attribute to configure how many devices can be
|
||
connected to the controller.</dd>
|
||
<dt><code>ide</code></dt>
|
||
<dd><span class="since">Since 3.10.0</span> for the vbox driver, the
|
||
<code>ide</code> controller has an optional attribute
|
||
<code>model</code>, which is one of "piix3", "piix4" or "ich6".</dd>
|
||
</dl>
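<p>
For illustration, a SCSI controller with an explicit model and a USB
controller with a limited number of ports might be written as below. The
model choices and the port count are arbitrary values picked for this
sketch.
</p>

<pre>
...
<devices>
  <controller type='scsi' index='0' model='virtio-scsi'/>
  <controller type='usb' index='0' model='qemu-xhci' ports='8'/>
</devices>
...</pre>

<p>or, to explicitly disable the USB bus for the guest:</p>

<pre>
...
<devices>
  <controller type='usb' index='0' model='none'/>
</devices>
...</pre>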
|
||
|
||
<p>
|
||
Note: The PowerPC64 "spapr-vio" addresses do not have an
|
||
associated controller.
|
||
</p>
|
||
|
||
<p>
|
||
For controllers that are themselves devices on a PCI or USB bus,
|
||
an optional sub-element <code><address></code> can specify
|
||
the exact relationship of the controller to its master bus, with
|
||
semantics <a href="#elementsAddress">given above</a>.
|
||
</p>
|
||
|
||
<p>
|
||
An optional sub-element <code>driver</code> can specify the driver
|
||
specific options:
|
||
</p>
|
||
<dl>
|
||
<dt><code>queues</code></dt>
|
||
<dd>
|
||
The optional <code>queues</code> attribute specifies the number of
|
||
queues for the controller. For best performance, it's recommended to
|
||
specify a value matching the number of vCPUs.
|
||
<span class="since">Since 1.0.5 (QEMU and KVM only)</span>
|
||
</dd>
|
||
<dt><code>cmd_per_lun</code></dt>
|
||
<dd>
|
||
The optional <code>cmd_per_lun</code> attribute specifies the maximum
|
||
number of commands that can be queued on devices controlled by the
|
||
host.
|
||
<span class="since">Since 1.2.7 (QEMU and KVM only)</span>
|
||
</dd>
|
||
<dt><code>max_sectors</code></dt>
|
||
<dd>
|
||
The optional <code>max_sectors</code> attribute specifies the maximum
amount of data that will be transferred to or from the device
in a single command. The transfer length is measured in sectors, where
a sector is 512 bytes.
|
||
<span class="since">Since 1.2.7 (QEMU and KVM only)</span>
|
||
</dd>
|
||
<dt><code>ioeventfd</code></dt>
|
||
<dd>
|
||
The optional <code>ioeventfd</code> attribute specifies
|
||
whether the controller should use
|
||
<a href='https://patchwork.kernel.org/patch/43390/'>
|
||
I/O asynchronous handling</a> or not. Accepted values are
|
||
"on" and "off". <span class="since">Since 1.2.18</span>
|
||
</dd>
|
||
<dt><code>iothread</code></dt>
|
||
<dd>
|
||
Supported for controller type <code>scsi</code> using model
|
||
<code>virtio-scsi</code> for <code>address</code> types
|
||
<code>pci</code> and <code>ccw</code>
|
||
<span class="since">since 1.3.5 (QEMU 2.4)</span>.
|
||
|
||
The optional <code>iothread</code> attribute assigns the controller
|
||
to an IOThread as defined by the range for the domain
|
||
<a href="#elementsIOThreadsAllocation"><code>iothreads</code></a>
|
||
value. Each SCSI <code>disk</code> assigned to use the specified
|
||
<code>controller</code> will utilize the same IOThread. If a specific
|
||
IOThread is desired for a specific SCSI <code>disk</code>, then
|
||
multiple controllers must be defined each having a specific
|
||
<code>iothread</code> value. The <code>iothread</code> value
|
||
must be within the range 1 to the domain iothreads value.
|
||
</dd>
|
||
<dt>virtio options</dt>
|
||
<dd>
|
||
For virtio controllers,
|
||
<a href="#elementsVirtio">Virtio-specific options</a> can also be
|
||
set. (<span class="since">Since 3.5.0</span>)
|
||
</dd>
|
||
</dl>
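<p>
The following sketch combines several of the driver options above on
<code>virtio-scsi</code> controllers. It assumes a guest with four vCPUs
and two IOThreads; all values are illustrative rather than recommended
settings. With <code>max_sectors='1024'</code>, a single command can
transfer at most 1024 × 512 bytes = 512 KiB. Each disk is tied to its
controller, and therefore to that controller's IOThread, through the
<code>controller</code> attribute of its address.
</p>

<pre>
<domain type='kvm'>
  ...
  <vcpu>4</vcpu>
  <iothreads>2</iothreads>
  ...
  <devices>
    <controller type='scsi' index='0' model='virtio-scsi'>
      <driver queues='4' cmd_per_lun='128' max_sectors='1024' ioeventfd='on' iothread='1'/>
    </controller>
    <controller type='scsi' index='1' model='virtio-scsi'>
      <driver iothread='2'/>
    </controller>
    <disk type='file' device='disk'>
      ...
      <target dev='sda' bus='scsi'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <disk type='file' device='disk'>
      ...
      <target dev='sdb' bus='scsi'/>
      <address type='drive' controller='1' bus='0' target='0' unit='0'/>
    </disk>
  </devices>
</domain></pre>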
|
||
<p>
|
||
USB companion controllers have an optional
|
||
sub-element <code><master></code> to specify the exact
|
||
relationship of the companion to its master controller.
|
||
A companion controller is on the same bus as its master, so
the companion <code>index</code> value should be equal to that of its master.
Not all controller models can be used as companion controllers,
and libvirt might provide some sensible defaults (settings
of <code>master startport</code> and <code>function</code> of an
address) for some particular models.
Preferred companion controllers are <code>ich9-uhci[123]</code>.
|
||
</p>
|
||
|
||
<pre>
...
<devices>
  <controller type='usb' index='0' model='ich9-ehci1'>
    <address type='pci' domain='0' bus='0' slot='4' function='7'/>
  </controller>
  <controller type='usb' index='0' model='ich9-uhci1'>
    <master startport='0'/>
    <address type='pci' domain='0' bus='0' slot='4' function='0' multifunction='on'/>
  </controller>
  ...
</devices>
...</pre>
|
||
|
||
<p>
|
||
PCI controllers have an optional <code>model</code> attribute; possible
|
||
values for this attribute are
|
||
</p>
|
||
<ul>
|
||
<li>
|
||
<code>pci-root</code>, <code>pci-bridge</code>
|
||
(<span class="since">since 1.0.5</span>)
|
||
</li>
|
||
<li>
|
||
<code>pcie-root</code>, <code>dmi-to-pci-bridge</code>
|
||
(<span class="since">since 1.1.2</span>)
|
||
</li>
|
||
<li>
|
||
<code>pcie-root-port</code>, <code>pcie-switch-upstream-port</code>,
|
||
<code>pcie-switch-downstream-port</code>
|
||
(<span class="since">since 1.2.19</span>)
|
||
</li>
|
||
<li>
|
||
<code>pci-expander-bus</code>, <code>pcie-expander-bus</code>
|
||
(<span class="since">since 1.3.4</span>)
|
||
</li>
|
||
<li>
|
||
<code>pcie-to-pci-bridge</code>
|
||
(<span class="since">since 4.3.0</span>)
|
||
</li>
|
||
</ul>
|
||
<p>
|
||
The root controllers (<code>pci-root</code>
|
||
and <code>pcie-root</code>) have an
|
||
optional <code>pcihole64</code> element specifying how big (in
|
||
kilobytes, or in the unit specified by <code>pcihole64</code>'s
|
||
<code>unit</code> attribute) the 64-bit PCI hole should be. Some guests (like
|
||
Windows XP or Windows Server 2003) might crash when QEMU and Seabios
|
||
are recent enough to support 64-bit PCI holes, unless this is disabled
|
||
(set to 0). <span class="since">Since 1.1.2 (QEMU only)</span>
|
||
</p>
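<p>
As a sketch, the 64-bit PCI hole could be disabled for a guest that
cannot cope with it by setting the size to 0 on the root controller. The
<code>unit</code> attribute is written out here only for clarity.
</p>

<pre>
...
<devices>
  <controller type='pci' index='0' model='pci-root'>
    <pcihole64 unit='KiB'>0</pcihole64>
  </controller>
</devices>
...</pre>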
|
||
<p>
|
||
PCI controllers also have an optional
|
||
subelement <code><model></code> with an attribute
|
||
<code>name</code>. The name attribute holds the name of the
|
||
specific device that qemu is emulating (e.g. "i82801b11-bridge")
|
||
rather than simply the class of device ("dmi-to-pci-bridge",
|
||
"pci-bridge"), which is set in the controller element's
|
||
model <b>attribute</b>. In almost all cases, you should not
|
||
manually add a <code><model></code> subelement to a
|
||
controller, nor should you modify one that is automatically
|
||
generated by libvirt. <span class="since">Since 1.2.19 (QEMU
|
||
only).</span>
|
||
</p>
|
||
<p>
|
||
PCI controllers also have an optional
|
||
subelement <code><target></code> with the attributes and
|
||
subelements listed below. These are configurable items that 1)
|
||
are visible to the guest OS so must be preserved for guest ABI
|
||
compatibility, and 2) are usually left to default values or
|
||
derived automatically by libvirt. In almost all cases, you
|
||
should not manually add a <code><target></code> subelement
|
||
to a controller, nor should you modify the values in those
that are automatically generated by
|
||
libvirt. <span class="since">Since 1.2.19 (QEMU only).</span>
|
||
</p>
|
||
<dl>
|
||
<dt><code>chassisNr</code></dt>
|
||
<dd>
|
||
PCI controllers that have attribute model="pci-bridge" can
|
||
also have a <code>chassisNr</code> attribute in
|
||
the <code><target></code> subelement, which is used to
|
||
control QEMU's "chassis_nr" option for the pci-bridge device
|
||
(normally libvirt automatically sets this to the same value as
|
||
the index attribute of the pci controller). If set, chassisNr
|
||
must be between 1 and 255.
|
||
</dd>
|
||
<dt><code>chassis</code></dt>
|
||
<dd>
|
||
pcie-root-port and pcie-switch-downstream-port controllers can
|
||
also have a <code>chassis</code> attribute in
|
||
the <code><target></code> subelement, which is used to
|
||
set the controller's "chassis" configuration value, which is
|
||
visible to the virtual machine. If set, chassis must be
|
||
between 0 and 255.
|
||
</dd>
|
||
<dt><code>port</code></dt>
|
||
<dd>
|
||
pcie-root-port and pcie-switch-downstream-port controllers can
|
||
also have a <code>port</code> attribute in
|
||
the <code><target></code> subelement, which
|
||
is used to set the controller's "port" configuration value,
|
||
which is visible to the virtual machine. If set, port must be
|
||
between 0 and 255.
|
||
</dd>
|
||
<dt><code>busNr</code></dt>
|
||
<dd>
|
||
pci-expander-bus and pcie-expander-bus controllers can have an
|
||
optional <code>busNr</code> attribute (1-254). This will be
the bus number of the new bus; all bus numbers between that
specified and 255 will be available only for assignment to
PCI/PCIe controllers plugged into the hierarchy starting with
this expander bus, and bus numbers less than the specified
value will be available to the next lower expander-bus (or the
root-bus if there are no lower expander buses). If you do not
specify a <code>busNr</code>, libvirt will find the lowest existing
<code>busNr</code> in all other expander buses (or use 256 if there are
no others) and auto-assign that value minus 2,
|
||
which provides one bus number for the pci-expander-bus and one
|
||
for the pci-bridge that is automatically attached to it (if
|
||
you plan on adding more pci-bridges to the hierarchy of the
|
||
bus, you should manually set busNr to a lower value).
|
||
<p>
|
||
A similar algorithm is used for automatically determining
|
||
the busNr attribute for pcie-expander-bus, but since the
|
||
pcie-expander-bus doesn't have any built-in pci-bridge, the
|
||
2nd bus-number is just being reserved for the pcie-root-port
|
||
that must necessarily be connected to the bus in order to
|
||
actually plug in an endpoint device. If you intend to plug
|
||
multiple devices into a pcie-expander-bus, you must connect
|
||
a pcie-switch-upstream-port to the pcie-root-port that is
|
||
plugged into the pcie-expander-bus, and multiple
|
||
pcie-switch-downstream-ports to the
|
||
pcie-switch-upstream-port, and of course for this to work
|
||
properly, you will need to decrease the pcie-expander-bus'
|
||
busNr accordingly so that there are enough unused bus
|
||
numbers above it to accommodate giving out one bus number for
|
||
the upstream-port and one for each downstream-port (in
|
||
addition to the pcie-root-port and the pcie-expander-bus
|
||
itself).
|
||
</p>
|
||
</dd>
|
||
<dt><code>node</code></dt>
|
||
<dd>
|
||
Some PCI controllers (<code>pci-expander-bus</code> for the pc
|
||
machine type, <code>pcie-expander-bus</code> for the q35 machine
|
||
type and, <span class="since">since 3.6.0</span>,
|
||
<code>pci-root</code> for the pseries machine type) can have an
|
||
optional <code><node></code> subelement within
|
||
the <code><target></code> subelement, which is used to
|
||
set the NUMA node reported to the guest OS for that bus - the
|
||
guest OS will then know that all devices on that bus are a
|
||
part of the specified NUMA node (it is up to the user of the
|
||
libvirt API to attach host devices to the correct
|
||
pci-expander-bus when assigning them to the domain).
|
||
</dd>
|
||
<dt><code>index</code></dt>
|
||
<dd>
|
||
pci-root controllers for pSeries guests use this attribute to
|
||
record the order they will show up in the guest.
|
||
<span class="since">Since 3.6.0</span>
|
||
</dd>
|
||
</dl>
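<p>
To make the <code><target></code> attributes above more concrete,
the following sketch shows the kind of XML that might appear for a guest
using the pc machine type. As noted above, these values are normally
generated by libvirt and rarely need to be written by hand. The index
numbers and the NUMA node are placeholders, the example assumes that the
guest defines a NUMA node 1, and the controller addresses are left out so
that libvirt assigns them.
</p>

<pre>
...
<devices>
  <controller type='pci' index='0' model='pci-root'/>
  <controller type='pci' index='1' model='pci-expander-bus'>
    <target busNr='254'>
      <node>1</node>
    </target>
  </controller>
  <controller type='pci' index='2' model='pci-bridge'>
    <target chassisNr='2'/>
  </controller>
</devices>
...</pre>

<p>or, for a pcie-root-port on a q35 guest:</p>

<pre>
...
<devices>
  <controller type='pci' index='1' model='pcie-root-port'>
    <target chassis='1' port='0x10'/>
  </controller>
</devices>
...</pre>

<p>
The value 254 matches the automatic choice described for the first
expander bus: 256 - 2 = 254, leaving bus numbers 254 and 255 for the
expander bus and the bridge attached to it. Following the same reasoning,
a pcie-expander-bus that carries one pcie-root-port, one
pcie-switch-upstream-port and four pcie-switch-downstream-ports would
need 1 + 1 + 1 + 4 = 7 bus numbers, so its <code>busNr</code> should be
no higher than 256 - 7 = 249.
</p>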
|
||
<p>
|
||
For machine types which provide an implicit PCI bus, the pci-root
|
||
controller with index=0 is auto-added and required to use PCI devices.
|
||
pci-root has no address.
|
||
PCI bridges are auto-added if there are too many devices to fit on
|
||
the one bus provided by pci-root, or a PCI bus number greater than zero
|
||
was specified.
|
||
PCI bridges can also be specified manually, but their addresses should
|
||
only refer to PCI buses provided by already specified PCI controllers.
|
||
Leaving gaps in the PCI controller indexes might lead to an invalid
|
||
configuration.
|
||
</p>
|
||
<pre>
...
<devices>
  <controller type='pci' index='0' model='pci-root'/>
  <controller type='pci' index='1' model='pci-bridge'>
    <address type='pci' domain='0' bus='0' slot='5' function='0' multifunction='off'/>
  </controller>
</devices>
...</pre>
|
||
|
||
<p>
|
||
For machine types which provide an implicit PCI Express (PCIe)
|
||
bus (for example, the machine types based on the Q35 chipset),
|
||
the pcie-root controller with index=0 is auto-added to the
|
||
domain's configuration. pcie-root also has no address, and provides
|
||
31 slots (numbered 1-31) that can be used to attach PCIe or PCI
|
||
devices (although libvirt will never auto-assign a PCI device to
|
||
a PCIe slot, it will allow manual specification of such an
|
||
assignment). Devices connected to pcie-root cannot be
|
||
hotplugged. If traditional PCI devices are present in the guest
|
||
configuration, a <code>pcie-to-pci-bridge</code> controller will
|
||
automatically be added: this controller, which plugs into a
|
||
<code>pcie-root-port</code>, provides 31 usable PCI slots (1-31) with
|
||
hotplug support (<span class="since">since 4.3.0</span>). If the QEMU
|
||
binary doesn't support the corresponding device, then a
|
||
<code>dmi-to-pci-bridge</code> controller will be added instead,
|
||
usually at the de facto standard location of slot=0x1e. A
|
||
dmi-to-pci-bridge controller plugs into a PCIe slot (as provided
|
||
by pcie-root), and itself provides 31 standard PCI slots (which
|
||
also do not support device hotplug). In order to have
|
||
hot-pluggable PCI slots in the guest system, a pci-bridge
|
||
controller will also be automatically created and connected to
|
||
one of the slots of the auto-created dmi-to-pci-bridge
|
||
controller; all guest PCI devices with addresses that are
|
||
auto-determined by libvirt will be placed on this pci-bridge
|
||
device. (<span class="since">since 1.1.2</span>).
|
||
</p>
|
||
<p>
|
||
Domains with an implicit pcie-root can also add controllers
|
||
with <code>model='pcie-root-port'</code>,
|
||
<code>model='pcie-switch-upstream-port'</code>,
|
||
and <code>model='pcie-switch-downstream-port'</code>. pcie-root-port
|
||
is a simple type of bridge device that can connect only to one
|
||
of the 31 slots on the pcie-root bus on its upstream side, and
|
||
makes a single (PCIe, hotpluggable) port available on the
|
||
downstream side (at slot='0'). pcie-root-port can be used to
|
||
provide a single slot to later hotplug a PCIe device (but is not
|
||
itself hotpluggable - it must be in the configuration when the
|
||
domain is started).
|
||
(<span class="since">since 1.2.19</span>)
|
||
</p>
|
||
<p>
|
||
pcie-switch-upstream-port is a more flexible (but also more
|
||
complex) device that can only plug into a pcie-root-port or
|
||
pcie-switch-downstream-port on the upstream side (and only
|
||
before the domain is started - it is not hot-pluggable), and
|
||
provides 32 ports on the downstream side (slot='0' - slot='31')
|
||
that accept only pcie-switch-downstream-port devices; each
|
||
pcie-switch-downstream-port device can only plug into a
|
||
pcie-switch-upstream-port on its upstream side (again, not
|
||
hot-pluggable), and on its downstream side provides a single
|
||
hotpluggable pcie port that can accept any standard pci or pcie
|
||
device (or another pcie-switch-upstream-port), i.e. identical in
|
||
function to a pcie-root-port. (<span class="since">since
|
||
1.2.19</span>)
|
||
</p>
|
||
<pre>
...
<devices>
  <controller type='pci' index='0' model='pcie-root'/>
  <controller type='pci' index='1' model='dmi-to-pci-bridge'>
    <address type='pci' domain='0' bus='0' slot='0xe' function='0'/>
  </controller>
  <controller type='pci' index='2' model='pci-bridge'>
    <address type='pci' domain='0' bus='1' slot='1' function='0'/>
  </controller>
</devices>
...</pre>
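<p>
The PCIe port and switch controllers described above can be combined as
in the following sketch. The slot numbers and the attached interface are
placeholders chosen for this example. The <code>bus</code> attribute of
each address refers to the <code>index</code> of the controller that
provides the bus: the pcie-root-port (index 1) plugs into pcie-root, the
upstream port (index 2) plugs into slot 0 of the root port, the two
downstream ports (indexes 3 and 4) plug into slots of the upstream port,
and the endpoint device uses the single slot 0 provided by the first
downstream port.
</p>

<pre>
...
<devices>
  <controller type='pci' index='0' model='pcie-root'/>
  <controller type='pci' index='1' model='pcie-root-port'>
    <address type='pci' domain='0' bus='0' slot='2' function='0'/>
  </controller>
  <controller type='pci' index='2' model='pcie-switch-upstream-port'>
    <address type='pci' domain='0' bus='1' slot='0' function='0'/>
  </controller>
  <controller type='pci' index='3' model='pcie-switch-downstream-port'>
    <address type='pci' domain='0' bus='2' slot='0' function='0'/>
  </controller>
  <controller type='pci' index='4' model='pcie-switch-downstream-port'>
    <address type='pci' domain='0' bus='2' slot='1' function='0'/>
  </controller>
  <interface type='network'>
    ...
    <address type='pci' domain='0' bus='3' slot='0' function='0'/>
  </interface>
</devices>
...</pre>

<p>
The second downstream port (bus 4) is left empty so that a PCIe device
can be hotplugged into it later.
</p>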
|
||
|
||
<h4><a id="elementsLease">Device leases</a></h4>
|
||
|
||
<p>
|
||
When using a lock manager, it may be desirable to record device leases
|
||
against a VM. The lock manager will ensure the VM won't start unless
|
||
the leases can be acquired.
|
||
</p>
|
||
|
||
<pre>
...
<devices>
  ...
  <lease>
    <lockspace>somearea</lockspace>
    <key>somekey</key>
    <target path='/some/lease/path' offset='1024'/>
  </lease>
  ...
</devices>
...</pre>
|
||
|
||
<dl>
|
||
<dt><code>lockspace</code></dt>
|
||
<dd>This is an arbitrary string, identifying the lockspace
|
||
within which the key is held. Lock managers may impose
|
||
extra restrictions on the format, or length of the lockspace
|
||
name.</dd>
|
||
<dt><code>key</code></dt>
|
||
<dd>This is an arbitrary string, uniquely identifying the
|
||
lease to be acquired. Lock managers may impose extra
|
||
restrictions on the format, or length of the key.
|
||
</dd>
|
||
<dt><code>target</code></dt>
|
||
<dd>This is the fully qualified path of the file associated
|
||
with the lockspace. The offset specifies where the lease
|
||
is stored within the file. If the lock manager does not
|
||
require an offset, just pass 0.
|
||
</dd>
|
||
</dl>
|
||
|
||
<h4><a id="elementsHostDev">Host device assignment</a></h4>
|
||
|
||
<h5><a id="elementsHostDevSubsys">USB / PCI / SCSI devices</a></h5>
|
||
|
||
<p>
|
||
USB, PCI and SCSI devices attached to the host can be passed through
|
||
to the guest using the <code>hostdev</code> element.
|
||
<span class="since">Available after 0.4.4 for USB, since 0.6.0 for PCI (KVM only)
and since 1.0.6 for SCSI (KVM only)</span>:
|
||
</p>
|
||
|
||
<pre>
...
<devices>
  <hostdev mode='subsystem' type='usb'>
    <source startupPolicy='optional'>
      <vendor id='0x1234'/>
      <product id='0xbeef'/>
    </source>
    <boot order='2'/>
  </hostdev>
</devices>
...</pre>
|
||
|
||
<p>or:</p>
|
||
|
||
<pre>
...
<devices>
  <hostdev mode='subsystem' type='pci' managed='yes'>
    <source>
      <address domain='0x0000' bus='0x06' slot='0x02' function='0x0'/>
    </source>
    <boot order='1'/>
    <rom bar='on' file='/etc/fake/boot.bin'/>
  </hostdev>
</devices>
...</pre>
|
||
|
||
<p>or:</p>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<hostdev mode='subsystem' type='scsi' sgio='filtered' rawio='yes'>
|
||
<source>
|
||
<adapter name='scsi_host0'/>
|
||
<address bus='0' target='0' unit='0'/>
|
||
</source>
|
||
<readonly/>
|
||
<address type='drive' controller='0' bus='0' target='0' unit='0'/>
|
||
</hostdev>
|
||
</devices>
|
||
...</pre>
|
||
|
||
|
||
<p>or:</p>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<hostdev mode='subsystem' type='scsi'>
|
||
<source protocol='iscsi' name='iqn.2014-08.com.example:iscsi-nopool/1'>
|
||
<host name='example.com' port='3260'/>
|
||
<auth username='myuser'>
|
||
<secret type='iscsi' usage='libvirtiscsi'/>
|
||
</auth>
|
||
</source>
|
||
<address type='drive' controller='0' bus='0' target='0' unit='0'/>
|
||
</hostdev>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<p>or:</p>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<hostdev mode='subsystem' type='scsi_host'>
|
||
<source protocol='vhost' wwpn='naa.50014057667280d8'/>
|
||
</hostdev>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<p>or:</p>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<hostdev mode='subsystem' type='mdev' model='vfio-pci'>
|
||
<source>
|
||
<address uuid='c2177883-f1bb-47f0-914d-32a22e3a8804'/>
|
||
</source>
|
||
</hostdev>
|
||
<hostdev mode='subsystem' type='mdev' model='vfio-ccw'>
|
||
<source>
|
||
<address uuid='9063cba3-ecef-47b6-abcf-3fef4fdcad85'/>
|
||
</source>
|
||
<address type='ccw' cssid='0xfe' ssid='0x0' devno='0x0001'/>
|
||
</hostdev>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<dl>
|
||
<dt><code>hostdev</code></dt>
|
||
<dd>The <code>hostdev</code> element is the main container for describing
|
||
host devices. For each device, the <code>mode</code> is always
|
||
"subsystem" and the <code>type</code> is one of the following values
|
||
with additional attributes noted.
|
||
<dl>
|
||
<dt><code>usb</code></dt>
|
||
<dd>USB devices are detached from the host on guest startup
|
||
and reattached after the guest exits or the device is
|
||
hot-unplugged.
|
||
</dd>
|
||
<dt><code>pci</code></dt>
|
||
<dd>For PCI devices, when <code>managed</code> is "yes" it is
|
||
detached from the host before being passed on to the guest
|
||
and reattached to the host after the guest exits. If
|
||
<code>managed</code> is omitted or "no", the user is
|
||
responsible for calling <code>virNodeDeviceDetachFlags</code>
(or <code>virsh nodedev-detach</code>) before starting the guest
|
||
or hot-plugging the device and <code>virNodeDeviceReAttach</code>
|
||
(or <code>virsh nodedev-reattach</code>) after hot-unplug or
|
||
stopping the guest.
|
||
</dd>
|
||
<dt><code>scsi</code></dt>
|
||
<dd>For SCSI devices, the user is responsible for making sure the device
is not used by the host. If supported by the hypervisor and OS, the
|
||
optional <code>sgio</code> (<span class="since">since 1.0.6</span>)
|
||
attribute indicates whether unprivileged SG_IO commands are
|
||
filtered for the disk. Valid settings are "filtered" or
|
||
"unfiltered", where the default is "filtered".
|
||
The optional <code>rawio</code>
|
||
(<span class="since">since 1.2.9</span>) attribute indicates
|
||
whether the lun needs the rawio capability. Valid settings are
|
||
"yes" or "no". See the rawio description within the
|
||
<a href="#elementsDisks">disk</a> section.
|
||
If a disk lun in the domain already has the rawio capability,
|
||
then this setting is not required.
|
||
</dd>
|
||
<dt><code>scsi_host</code></dt>
|
||
<dd><span class="since">Since 2.5.0</span>. For SCSI devices, the user
is responsible for making sure the device is not used by the host. This
|
||
<code>type</code> passes all LUNs presented by a single HBA to
|
||
the guest.
|
||
</dd>
|
||
<dt><code>mdev</code></dt>
|
||
<dd>For mediated devices (<span class="since">Since 3.2.0</span>)
|
||
the <code>model</code> attribute specifies the device API which
|
||
determines how the host's vfio driver will expose the device to the
|
||
guest. Currently, <code>model='vfio-pci'</code> and
|
||
<code>model='vfio-ccw'</code> (<span class="since">Since 4.4.0</span>)
|
||
are supported. The <a href="drvnodedev.html#MDEV">MDEV</a> section
|
||
provides more information about mediated devices as well as how to
|
||
create mediated devices on the host.
|
||
<span class="since">Since 4.6.0 (QEMU 2.12)</span> an optional
|
||
<code>display</code> attribute may be used to enable or disable
|
||
support for an accelerated remote desktop backed by a mediated
|
||
device (such as NVIDIA vGPU or Intel GVT-g) as an alternative to
|
||
emulated <a href="#elementsVideo">video devices</a>. This attribute
|
||
is limited to <code>model='vfio-pci'</code> only. Supported values
|
||
are either <code>on</code> or <code>off</code> (default is 'off').
|
||
A <a href="#elementsGraphics">graphical framebuffer</a> is required in order to
use this attribute; currently only VNC, Spice and
egl-headless graphics devices are supported.
|
||
<p>
|
||
Note: There are also some implications on the usage of guest's
|
||
address type depending on the <code>model</code> attribute,
|
||
see the <code>address</code> element below.
|
||
</p>
|
||
</dd>
|
||
</dl>
|
||
<p>
|
||
Note: The <code>managed</code> attribute is only used with
|
||
<code>type='pci'</code> and is ignored by all the other device types,
|
||
thus setting <code>managed</code> explicitly with other than a PCI
|
||
device has the same effect as omitting it. Similarly,
|
||
<code>model</code> attribute is only supported by mediated devices and
|
||
ignored by all other device types.
|
||
</p>
|
||
</dd>
|
||
<dt><code>source</code></dt>
|
||
<dd>The source element describes the device as seen from the host using
|
||
one of the following mechanisms, depending on the device type:
|
||
<dl>
|
||
<dt><code>usb</code></dt>
|
||
<dd>The USB device can either be addressed by vendor / product id
|
||
using the <code>vendor</code> and <code>product</code> elements
|
||
or by the device's address on the host using the
|
||
<code>address</code> element.
|
||
<p>
|
||
<span class="since">Since 1.0.0</span>, the <code>source</code>
|
||
element of USB devices may contain a <code>startupPolicy</code>
attribute, which defines what to do if the
|
||
specified host USB device is not found. The attribute accepts
|
||
the following values:
|
||
</p>
|
||
<table class="top_table">
|
||
<tr>
|
||
<td> mandatory </td>
|
||
<td> fail if missing for any reason (the default) </td>
|
||
</tr>
|
||
<tr>
|
||
<td> requisite </td>
|
||
<td> fail if missing on boot up,
|
||
drop if missing on migrate/restore/revert </td>
|
||
</tr>
|
||
<tr>
|
||
<td> optional </td>
|
||
<td> drop if missing at any start attempt </td>
|
||
</tr>
|
||
</table>
|
||
</dd>
|
||
<dt><code>pci</code></dt>
|
||
<dd>PCI devices can only be described by their <code>address</code>.
|
||
</dd>
|
||
<dt><code>scsi</code></dt>
|
||
<dd>SCSI devices are described by both the <code>adapter</code>
|
||
and <code>address</code> elements. The <code>address</code>
|
||
element includes a <code>bus</code> attribute (a 2-digit bus
|
||
number), a <code>target</code> attribute (a 10-digit target
|
||
number), and a <code>unit</code> attribute (a 20-digit unit
|
||
number on the bus). Not all hypervisors support larger
|
||
<code>target</code> and <code>unit</code> values. It is up
|
||
to each hypervisor to determine the maximum value supported
|
||
for the adapter.
|
||
<p>
|
||
<span class="since">Since 1.2.8</span>, the <code>source</code>
|
||
element of a SCSI device may contain the <code>protocol</code>
|
||
attribute. When the attribute is set to "iscsi", the host
|
||
device XML follows the network <a href="#elementsDisks">disk</a>
|
||
device using the same <code>name</code> attribute and optionally
|
||
using the <code>auth</code> element to provide the authentication
|
||
credentials to the iSCSI server.
|
||
</p>
|
||
</dd>
|
||
<dt><code>scsi_host</code></dt>
|
||
<dd><span class="since">Since 2.5.0</span>, multiple LUNs behind a
|
||
single SCSI HBA are described by a <code>protocol</code>
|
||
attribute set to "vhost" and a <code>wwpn</code> attribute that
|
||
is the vhost_scsi wwpn (16 hexadecimal digits with a prefix of
|
||
"naa.") established in the host configfs.
|
||
</dd>
|
||
<dt><code>mdev</code></dt>
|
||
<dd>Mediated devices (<span class="since">Since 3.2.0</span>) are
|
||
described by the <code>address</code> element. The
|
||
<code>address</code> element contains a single mandatory attribute
|
||
<code>uuid</code>.
|
||
</dd>
|
||
</dl>
|
||
</dd>
|
||
<dt><code>vendor</code>, <code>product</code></dt>
|
||
<dd>The <code>vendor</code> and <code>product</code> elements each have an
|
||
<code>id</code> attribute that specifies the USB vendor and product id.
|
||
The ids can be given in decimal, hexadecimal (starting with 0x) or
|
||
octal (starting with 0) form.</dd>
|
||
<dt><code>boot</code></dt>
|
||
<dd>Specifies that the device is bootable. The <code>order</code>
|
||
attribute determines the order in which devices will be tried during
|
||
boot sequence. The per-device <code>boot</code> elements cannot be
|
||
used together with general boot elements in
|
||
<a href="#elementsOSBIOS">BIOS bootloader</a> section.
|
||
<span class="since">Since 0.8.8</span> for PCI devices,
|
||
<span class="since">Since 1.0.1</span> for USB devices.
|
||
</dd>
|
||
<dt><code>rom</code></dt>
|
||
<dd>The <code>rom</code> element is used to change how a PCI
|
||
device's ROM is presented to the guest. The optional <code>bar</code>
|
||
attribute can be set to "on" or "off", and determines whether
|
||
or not the device's ROM will be visible in the guest's memory
|
||
map. (In PCI documentation, the "rombar" setting controls the
|
||
presence of the Base Address Register for the ROM). If no rom
|
||
bar is specified, the qemu default will be used (older
|
||
versions of qemu used a default of "off", while newer qemus
|
||
have a default of "on"). <span class="since">Since
|
||
0.9.7 (QEMU and KVM only)</span>. The optional
|
||
<code>file</code> attribute contains an absolute path to a binary file
|
||
to be presented to the guest as the device's ROM BIOS. This
|
||
can be useful, for example, to provide a PXE boot ROM for a
|
||
virtual function of an sr-iov capable ethernet device (which
|
||
has no boot ROMs for the VFs).
|
||
<span class="since">Since 0.9.10 (QEMU and KVM only)</span>.
|
||
The optional <code>enabled</code> attribute can be set to
|
||
<code>no</code> to disable PCI ROM loading completely for the device;
|
||
if PCI ROM loading is disabled through this attribute, attempts to
|
||
tweak the loading process further using the <code>bar</code> or
|
||
<code>file</code> attributes will be rejected.
|
||
<span class="since">Since 4.3.0 (QEMU and KVM only)</span>.
|
||
</dd>
|
||
<dt><code>address</code></dt>
|
||
<dd>The <code>address</code> element for USB devices has a
|
||
<code>bus</code> and <code>device</code> attribute to specify the
|
||
USB bus and device number the device appears at on the host.
|
||
The values of these attributes can be given in decimal, hexadecimal
|
||
(starting with 0x) or octal (starting with 0) form.
|
||
For PCI devices the element carries 4 attributes (<code>domain</code>,
<code>bus</code>, <code>slot</code> and <code>function</code>) that
designate the device, as reported by <code>lspci</code> or
<code>virsh nodedev-list</code>. For SCSI devices a 'drive'
|
||
address type must be used. For mediated devices, which are software-only
|
||
devices defining an allocation of resources on the physical parent device,
|
||
the address type used must conform to the <code>model</code> attribute
|
||
of element <code>hostdev</code>, e.g. any address type other than PCI for
|
||
<code>vfio-pci</code> device API or any address type other than CCW for
|
||
<code>vfio-ccw</code> device API will result in an error.
|
||
<a href="#elementsAddress">See above</a> for more details on the address
|
||
element.</dd>
|
||
<dt><code>driver</code></dt>
|
||
<dd>
|
||
PCI devices can have an optional <code>driver</code>
|
||
subelement that specifies which backend driver to use for PCI
|
||
device assignment. Use the <code>name</code> attribute to
|
||
select either "vfio" (for the new VFIO device assignment
|
||
backend, which is compatible with UEFI SecureBoot) or "kvm"
|
||
(the legacy device assignment handled directly by the KVM
|
||
kernel module) <span class="since">Since 1.0.5 (QEMU and KVM
|
||
only, requires kernel 3.6 or newer)</span>. When specified,
|
||
device assignment will fail if the requested method of device
|
||
assignment isn't available on the host. When not specified,
|
||
the default is "vfio" on systems where the VFIO driver is
|
||
available and loaded, and "kvm" on older systems, or those
|
||
where the VFIO driver hasn't been
|
||
loaded <span class="since">Since 1.1.3</span> (prior to that
|
||
the default was always "kvm").
|
||
</dd>
|
||
<dt><code>readonly</code></dt>
|
||
<dd>Indicates that the device is read-only. Currently this is only supported
for SCSI host devices. <span class="since">Since 1.0.6 (QEMU and KVM only)</span>
|
||
</dd>
|
||
<dt><code>shareable</code></dt>
|
||
<dd>If present, this indicates the device is expected to be shared
|
||
between domains (assuming the hypervisor and OS support this).
|
||
Only supported by SCSI host device.
|
||
<span class="since">Since 1.0.6</span>
|
||
<p>
|
||
Note: Although <code>shareable</code> was introduced
|
||
<span class="since">in 1.0.6</span>, it did not work as
|
||
expected until <span class="since">1.2.2</span>.
|
||
</p>
|
||
</dd>
|
||
</dl>
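
<p>
A few of the options described above have no example of their own. The
following is a sketch only, assuming a USB device at bus 1, device 3 on the
host (hypothetical numbers) and reusing the PCI address from the earlier
example. It shows a USB device selected by its host address, and a PCI
device that explicitly requests the "vfio" backend and disables ROM loading:
</p>

<pre>
...
<devices>
  <!-- USB device selected by host bus/device number (values are assumptions) -->
  <hostdev mode='subsystem' type='usb'>
    <source>
      <address bus='1' device='3'/>
    </source>
  </hostdev>
  <!-- PCI device using the VFIO backend, with PCI ROM loading disabled -->
  <hostdev mode='subsystem' type='pci' managed='yes'>
    <driver name='vfio'/>
    <source>
      <address domain='0x0000' bus='0x06' slot='0x02' function='0x0'/>
    </source>
    <rom enabled='no'/>
  </hostdev>
</devices>
...</pre>

<p>
Similarly, a mediated device with the optional <code>display</code>
attribute might be combined with a graphics device as sketched below. The
UUID is the placeholder from the earlier example, and the choice of a VNC
graphics device is an assumption made for illustration:
</p>

<pre>
...
<devices>
  <graphics type='vnc' autoport='yes'/>
  <hostdev mode='subsystem' type='mdev' model='vfio-pci' display='on'>
    <source>
      <address uuid='c2177883-f1bb-47f0-914d-32a22e3a8804'/>
    </source>
  </hostdev>
</devices>
...</pre>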
|
||
|
||
|
||
<h5><a id="elementsHostDevCaps">Block / character devices</a></h5>
|
||
|
||
<p>
|
||
Block / character devices from the host can be passed through
|
||
to the guest using the <code>hostdev</code> element. This is
|
||
only possible with container based virtualization. Devices are specified
|
||
by a fully qualified path.
|
||
<span class="since">since after 1.0.1 for LXC</span>:
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<hostdev mode='capabilities' type='storage'>
|
||
<source>
|
||
<block>/dev/sdf1</block>
|
||
</source>
|
||
</hostdev>
|
||
...
|
||
</pre>
|
||
|
||
<pre>
|
||
...
|
||
<hostdev mode='capabilities' type='misc'>
|
||
<source>
|
||
<char>/dev/input/event3</char>
|
||
</source>
|
||
</hostdev>
|
||
...
|
||
</pre>
|
||
|
||
<pre>
|
||
...
|
||
<hostdev mode='capabilities' type='net'>
|
||
<source>
|
||
<interface>eth0</interface>
|
||
</source>
|
||
</hostdev>
|
||
...
|
||
</pre>
|
||
|
||
<dl>
|
||
<dt><code>hostdev</code></dt>
|
||
<dd>The <code>hostdev</code> element is the main container for describing
|
||
host devices. For block/character device passthrough <code>mode</code> is
|
||
always "capabilities" and <code>type</code> is "storage" for a block
|
||
device, "misc" for a character device and "net" for a host network
|
||
interface.
|
||
</dd>
|
||
<dt><code>source</code></dt>
|
||
<dd>The source element describes the device as seen from the host.
|
||
For block devices, the path to the block device in the host
|
||
OS is provided in the nested "block" element, while for character
|
||
devices the "char" element is used. For network interfaces, the
|
||
name of the interface is provided in the "interface" element.
|
||
</dd>
|
||
</dl>
|
||
|
||
<h4><a id="elementsRedir">Redirected devices</a></h4>
|
||
|
||
<p>
|
||
USB device redirection through a character device is
|
||
supported <span class="since">since after 0.9.5 (KVM
|
||
only)</span>:
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<redirdev bus='usb' type='tcp'>
|
||
<source mode='connect' host='localhost' service='4000'/>
|
||
<boot order='1'/>
|
||
</redirdev>
|
||
<redirfilter>
|
||
<usbdev class='0x08' vendor='0x1234' product='0xbeef' version='2.56' allow='yes'/>
|
||
<usbdev allow='no'/>
|
||
</redirfilter>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<dl>
|
||
<dt><code>redirdev</code></dt>
|
||
<dd>The <code>redirdev</code> element is the main container for
|
||
describing redirected devices. <code>bus</code> must be "usb"
|
||
for a USB device.
|
||
|
||
An additional attribute <code>type</code> is required,
|
||
matching one of the
|
||
supported <a href="#elementsConsole">serial device</a> types,
|
||
to describe the host side of the
|
||
tunnel; <code>type='tcp'</code>
|
||
or <code>type='spicevmc'</code> (which uses the usbredir
|
||
channel of a <a href="#elementsGraphics">SPICE graphics
|
||
device</a>) are typical. The redirdev element has an optional
|
||
sub-element <code><address></code> which can tie the
|
||
device to a particular controller. Further sub-elements,
|
||
such as <code><source></code>, may be required according
|
||
to the given type, although a <code><target></code> sub-element
|
||
is not required (since the consumer of the character device is
|
||
the hypervisor itself, rather than a device visible in the guest).
|
||
</dd>
|
||
<dt><code>boot</code></dt>
|
||
|
||
<dd>Specifies that the device is bootable.
|
||
The <code>order</code> attribute determines the order in which
|
||
devices will be tried during boot sequence. The per-device
|
||
<code>boot</code> elements cannot be used together with general
|
||
boot elements in <a href="#elementsOSBIOS">BIOS bootloader</a> section.
|
||
(<span class="since">Since 1.0.1</span>)
|
||
</dd>
|
||
<dt><code>redirfilter</code></dt>
|
||
<dd>The <code>redirfilter</code> element is used for creating the
|
||
filter rule to filter out certain devices from redirection.
|
||
It uses sub-element <code><usbdev></code> to define each filter rule.
|
||
<code>class</code> attribute is the USB Class code, for example,
|
||
0x08 represents mass storage devices. The USB device can be addressed by
|
||
vendor / product id using the <code>vendor</code> and <code>product</code> attributes.
|
||
<code>version</code> is the device revision from the bcdDevice field (not
|
||
the version of the USB protocol).
|
||
These four attributes are optional and <code>-1</code> can be used to allow
|
||
any value for them. The <code>allow</code> attribute is mandatory:
'yes' means allow, 'no' means deny.
|
||
</dd>
|
||
</dl>
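
<p>
As a further sketch, redirection over the usbredir channel of a SPICE
graphics device, restricted to mass storage devices from any vendor, might
look like the following. The USB address values (bus 0, port 4) are
assumptions chosen for illustration; the <code>-1</code> values are the
wildcard described above.
</p>

<pre>
...
<devices>
  <redirdev bus='usb' type='spicevmc'>
    <!-- optional: tie the device to a particular controller (values assumed) -->
    <address type='usb' bus='0' port='4'/>
  </redirdev>
  <redirfilter>
    <!-- allow any mass storage device (class 0x08), whatever its vendor,
         product or version -->
    <usbdev class='0x08' vendor='-1' product='-1' version='-1' allow='yes'/>
    <!-- deny everything else -->
    <usbdev allow='no'/>
  </redirfilter>
</devices>
...</pre>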
|
||
|
||
<h4><a id="elementsSmartcard">Smartcard devices</a></h4>
|
||
|
||
<p>
|
||
A virtual smartcard device can be supplied to the guest via the
|
||
<code>smartcard</code> element. A USB smartcard reader device on
|
||
the host cannot be used on a guest with simple device
|
||
passthrough, since it will then not be available on the host,
|
||
possibly locking the host computer when it is "removed".
|
||
Therefore, some hypervisors provide a specialized virtual device
|
||
that can present a smartcard interface to the guest, with
|
||
several modes for describing how credentials are obtained from
|
||
the host or even from a channel created to a third-party
|
||
smartcard provider. <span class="since">Since 0.8.8</span>
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<smartcard mode='host'/>
|
||
<smartcard mode='host-certificates'>
|
||
<certificate>cert1</certificate>
|
||
<certificate>cert2</certificate>
|
||
<certificate>cert3</certificate>
|
||
<database>/etc/pki/nssdb/</database>
|
||
</smartcard>
|
||
<smartcard mode='passthrough' type='tcp'>
|
||
<source mode='bind' host='127.0.0.1' service='2001'/>
|
||
<protocol type='raw'/>
|
||
<address type='ccid' controller='0' slot='0'/>
|
||
</smartcard>
|
||
<smartcard mode='passthrough' type='spicevmc'/>
|
||
</devices>
|
||
...
|
||
</pre>
|
||
|
||
<p>
|
||
The <code><smartcard></code> element has a mandatory
|
||
attribute <code>mode</code>. The following modes are supported;
|
||
in each mode, the guest sees a device on its USB bus that
|
||
behaves like a physical USB CCID (Chip/Smart Card Interface
|
||
Device) card.
|
||
</p>
|
||
|
||
<dl>
|
||
<dt><code>host</code></dt>
|
||
<dd>The simplest operation, where the hypervisor relays all
|
||
requests from the guest into direct access to the host's
|
||
smartcard via NSS. No other attributes or sub-elements are
|
||
required. See below about the use of an
|
||
optional <code><address></code> sub-element.</dd>
|
||
|
||
<dt><code>host-certificates</code></dt>
|
||
<dd>Rather than requiring a smartcard to be plugged into the
|
||
host, it is possible to provide three NSS certificate names
|
||
residing in a database on the host. These certificates can be
|
||
generated via the command <code>certutil -d /etc/pki/nssdb -x -t
|
||
CT,CT,CT -S -s CN=cert1 -n cert1</code>, and the resulting three
|
||
certificate names must be supplied as the content of each of
|
||
three <code><certificate></code> sub-elements. An
|
||
additional sub-element <code><database></code> can specify
|
||
the absolute path to an alternate directory (matching
|
||
the <code>-d</code> option of the <code>certutil</code> command
|
||
when creating the certificates); if not present, it defaults to
|
||
/etc/pki/nssdb.</dd>
|
||
|
||
<dt><code>passthrough</code></dt>
|
||
<dd>Rather than having the hypervisor directly communicate with
|
||
the host, it is possible to tunnel all requests through a
|
||
secondary character device to a third-party provider (which may
|
||
in turn be talking to a smartcard or using three certificate
|
||
files). In this mode of operation, an additional
|
||
attribute <code>type</code> is required, matching one of the
|
||
supported <a href="#elementsConsole">serial device</a> types, to
|
||
describe the host side of the tunnel; <code>type='tcp'</code>
|
||
or <code>type='spicevmc'</code> (which uses the smartcard
|
||
channel of a <a href="#elementsGraphics">SPICE graphics
|
||
device</a>) are typical. Further sub-elements, such
|
||
as <code><source></code>, may be required according to the
|
||
given type, although a <code><target></code> sub-element
|
||
is not required (since the consumer of the character device is
|
||
the hypervisor itself, rather than a device visible in the
|
||
guest).</dd>
|
||
</dl>
|
||
|
||
<p>
|
||
Each mode supports an optional
|
||
sub-element <code><address></code>, which fine-tunes the
|
||
correlation between the smartcard and a ccid bus
|
||
controller, <a href="#elementsAddress">documented above</a>.
|
||
For now, qemu only supports at most one
|
||
smartcard, with an address of bus=0 slot=0.
|
||
</p>
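
<p>
For instance, a <code>host</code> mode smartcard with an explicit address
could be sketched as follows, using the same controller and slot values as
the passthrough example above:
</p>

<pre>
...
<devices>
  <smartcard mode='host'>
    <address type='ccid' controller='0' slot='0'/>
  </smartcard>
</devices>
...</pre>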
|
||
|
||
<h4><a id="elementsNICS">Network interfaces</a></h4>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<interface type='direct' trustGuestRxFilters='yes'>
|
||
<source dev='eth0'/>
|
||
<mac address='52:54:00:5d:c7:9e'/>
|
||
<boot order='1'/>
|
||
<rom bar='off'/>
|
||
</interface>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<p>
|
||
There are several possibilities for specifying a network
|
||
interface visible to the guest. Each subsection below provides
|
||
more details about common setup options.
|
||
</p>
|
||
<p>
|
||
<span class="since">Since 1.2.10</span>,
|
||
the <code>interface</code> element
|
||
property <code>trustGuestRxFilters</code> provides the
|
||
capability for the host to detect and trust reports from the
|
||
guest regarding changes to the interface mac address and receive
|
||
filters by setting the attribute to <code>yes</code>. The default
|
||
setting for the attribute is <code>no</code> for security
|
||
reasons and support depends on the guest network device model as
|
||
well as the type of connection on the host - currently it is
|
||
only supported for the virtio device model and for macvtap
|
||
connections on the host.
|
||
</p>
|
||
<p>
|
||
Each <code><interface></code> element has an
|
||
optional <code><address></code> sub-element that can tie
|
||
the interface to a particular pci slot, with
|
||
attribute <code>type='pci'</code>
|
||
as <a href="#elementsAddress">documented above</a>.
|
||
</p>
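
<p>
A sketch of such an address, reusing the <code>direct</code> interface from
the example above, is shown below. The PCI slot chosen here is an arbitrary
assumption; any free slot on the guest's PCI bus could be used instead.
</p>

<pre>
...
<devices>
  <interface type='direct'>
    <source dev='eth0'/>
    <mac address='52:54:00:5d:c7:9e'/>
    <!-- pin the guest NIC to PCI slot 0x05 (hypothetical value) -->
    <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
  </interface>
</devices>
...</pre>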
|
||
|
||
<h5><a id="elementsNICSVirtual">Virtual network</a></h5>
|
||
|
||
<p>
|
||
<strong><em>
|
||
This is the recommended config for general guest connectivity on
|
||
hosts with dynamic / wireless networking configs (or multi-host
|
||
environments where the host hardware details are described
|
||
separately in a <code><network></code>
|
||
definition <span class="since">Since 0.9.4</span>).
|
||
</em></strong>
|
||
</p>
|
||
|
||
<p>
|
||
|
||
Provides a connection whose details are described by the named
|
||
network definition. Depending on the virtual network's "forward
|
||
mode" configuration, the network may be totally isolated
|
||
(no <code><forward></code> element given), NAT'ing to an
|
||
explicit network device or to the default route
|
||
(<code><forward mode='nat'></code>), routed with no NAT
|
||
(<code><forward mode='route'/></code>), or connected
|
||
directly to one of the host's network interfaces (via macvtap)
|
||
or bridge devices (<code><forward
mode='bridge|private|vepa|passthrough'/></code> <span class="since">Since
0.9.4</span>).
|
||
</p>
|
||
<p>
|
||
For networks with a forward mode of bridge, private, vepa, and
|
||
passthrough, it is assumed that the host has any necessary DNS
|
||
and DHCP services already set up outside the scope of libvirt. In
|
||
the case of isolated, nat, and routed networks, DHCP and DNS are
|
||
provided on the virtual network by libvirt, and the IP range can
|
||
be determined by examining the virtual network config with
|
||
'<code>virsh net-dumpxml [networkname]</code>'. There is one
|
||
virtual network called 'default' set up out of the box which does
|
||
NAT'ing to the default route and has an IP range
|
||
of <code>192.168.122.0/255.255.255.0</code>. Each guest will
|
||
have an associated tun device created with a name of vnetN,
|
||
which can also be overridden with the <target> element
|
||
(see
|
||
<a href="#elementsNICSTargetOverride">overriding the target element</a>).
|
||
</p>
|
||
<p>
|
||
When the source of an interface is a network,
|
||
a <code>portgroup</code> can be specified along with the name of
|
||
the network; one network may have multiple portgroups defined,
|
||
with each portgroup containing slightly different configuration
|
||
information for different classes of network
|
||
connections. <span class="since">Since 0.9.4</span>.
|
||
</p>
|
||
<p>
|
||
Also, similar to <code>direct</code> network connections
|
||
(described below), a connection of type <code>network</code> may
|
||
specify a <code>virtualport</code> element, with configuration
|
||
data to be forwarded to a vepa (802.1Qbg) or 802.1Qbh compliant
|
||
switch (<span class="since">Since 0.8.2</span>), or to an
|
||
Open vSwitch virtual switch (<span class="since">Since
|
||
0.9.11</span>).
|
||
</p>
|
||
<p>
|
||
Since the actual type of switch may vary depending on the
|
||
configuration in the <code><network></code> on the host,
|
||
it is acceptable to omit the virtualport <code>type</code>
|
||
attribute, and specify attributes from multiple different
|
||
virtualport types (and also to leave out certain attributes); at
|
||
domain startup time, a complete <code><virtualport></code>
|
||
element will be constructed by merging together the type and
|
||
attributes defined in the network and the portgroup referenced
|
||
by the interface. The newly-constructed virtualport is a combination
of them. Attributes from a lower-priority virtualport cannot override
those defined in a higher-priority virtualport.
The interface takes the highest priority and the portgroup the lowest.
|
||
(<span class="since">Since 0.10.0</span>). For example, in order
|
||
to work properly with both an 802.1Qbh switch and an Open vSwitch
|
||
switch, you may choose to specify no type, but both
|
||
a <code>profileid</code> (in case the switch is 802.1Qbh) and
|
||
an <code>interfaceid</code> (in case the switch is Open vSwitch)
|
||
(you may also omit the other attributes, such as managerid,
|
||
typeid, or profileid, to be filled in from the
|
||
network's <code><virtualport></code>). If you want to
|
||
limit a guest to connecting only to certain types of switches,
|
||
you can specify the virtualport type, but still omit some/all of
|
||
the parameters - in this case if the host's network has a
|
||
different type of virtualport, connection of the interface will
|
||
fail.
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<interface type='network'>
|
||
<source network='default'/>
|
||
</interface>
|
||
...
|
||
<interface type='network'>
|
||
<source network='default' portgroup='engineering'/>
|
||
<target dev='vnet7'/>
|
||
<mac address="00:11:22:33:44:55"/>
|
||
<virtualport>
|
||
<parameters instanceid='09b11c53-8b5c-4eeb-8f00-d84eaa0aaa4f'/>
|
||
</virtualport>
|
||
|
||
</interface>
|
||
</devices>
|
||
...</pre>
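
<p>
The case of a virtualport with no <code>type</code>, intended to work with
either an 802.1Qbh switch or an Open vSwitch switch, might be sketched as
follows. The <code>profileid</code> value 'finance' is a hypothetical
profile name, and the <code>interfaceid</code> is a placeholder UUID; the
remaining attributes are left to be filled in from the network definition.
</p>

<pre>
...
<devices>
  <interface type='network'>
    <source network='default'/>
    <!-- no type given: the type is taken from the network's virtualport -->
    <virtualport>
      <parameters profileid='finance' interfaceid='09b11c53-8b5c-4eeb-8f00-d84eaa0aaa4f'/>
    </virtualport>
  </interface>
</devices>
...</pre>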
|
||
|
||
<h5><a id="elementsNICSBridge">Bridge to LAN</a></h5>
|
||
|
||
<p>
|
||
<strong><em>
|
||
This is the recommended config for general guest connectivity on
|
||
hosts with static wired networking configs.
|
||
</em></strong>
|
||
</p>
|
||
|
||
<p>
|
||
Provides a bridge from the VM directly to the LAN. This assumes
|
||
there is a bridge device on the host which has one or more of the host's
|
||
physical NICs enslaved. The guest VM will have an associated tun device
|
||
created with a name of vnetN, which can also be overridden with the
|
||
<target> element (see
|
||
<a href="#elementsNICSTargetOverride">overriding the target element</a>).
|
||
The tun device will be enslaved to the bridge. The IP range / network
|
||
configuration is whatever is used on the LAN. This provides the guest VM
|
||
full incoming & outgoing net access just like a physical machine.
|
||
</p>
|
||
<p>
|
||
On Linux systems, the bridge device is normally a standard Linux
|
||
host bridge. On hosts that support Open vSwitch, it is also
|
||
possible to connect to an Open vSwitch bridge device by adding
|
||
a <code><virtualport type='openvswitch'/></code> to the
|
||
interface definition. (<span class="since">Since
|
||
0.9.11</span>). The Open vSwitch type virtualport accepts two
|
||
parameters in its <code><parameters></code> element -
|
||
an <code>interfaceid</code> which is a standard uuid used to
|
||
uniquely identify this particular interface to Open vSwitch (if
|
||
you do not specify one, a random interfaceid will be generated
|
||
for you when you first define the interface), and an
|
||
optional <code>profileid</code> which is sent to Open vSwitch as
|
||
the interface's "port-profile".
|
||
</p>
|
||
<pre>
|
||
...
|
||
<devices>
|
||
...
|
||
<interface type='bridge'>
|
||
<source bridge='br0'/>
|
||
</interface>
|
||
<interface type='bridge'>
|
||
<source bridge='br1'/>
|
||
<target dev='vnet7'/>
|
||
<mac address="00:11:22:33:44:55"/>
|
||
</interface>
|
||
<interface type='bridge'>
|
||
<source bridge='ovsbr'/>
|
||
<virtualport type='openvswitch'>
|
||
<parameters profileid='menial' interfaceid='09b11c53-8b5c-4eeb-8f00-d84eaa0aaa4f'/>
|
||
</virtualport>
|
||
</interface>
|
||
...
|
||
</devices>
|
||
...</pre>
|
||
|
||
<p>
|
||
On hosts that support Open vSwitch on the kernel side and have the
|
||
Midonet Host Agent configured, it is also possible to connect to the
|
||
'midonet' bridge device by adding a
|
||
<code><virtualport type='midonet'/></code> to the
|
||
interface definition. (<span class="since">Since
|
||
1.2.13</span>). The Midonet virtualport type requires an
|
||
<code>interfaceid</code> attribute in its
|
||
<code><parameters></code> element. This interface id is the UUID
|
||
that specifies which port in the virtual network topology will be bound
|
||
to the interface.
|
||
</p>
|
||
<pre>
|
||
...
|
||
<devices>
|
||
...
|
||
<interface type='bridge'>
|
||
<source bridge='br0'/>
|
||
</interface>
|
||
<interface type='bridge'>
|
||
<source bridge='br1'/>
|
||
<target dev='vnet7'/>
|
||
<mac address="00:11:22:33:44:55"/>
|
||
</interface>
|
||
<interface type='bridge'>
|
||
<source bridge='midonet'/>
|
||
<virtualport type='midonet'>
|
||
<parameters interfaceid='0b2d64da-3d0e-431e-afdd-804415d6ebbb'/>
|
||
</virtualport>
|
||
</interface>
|
||
...
|
||
</devices>
|
||
...</pre>
|
||
|
||
<h5><a id="elementsNICSSlirp">Userspace SLIRP stack</a></h5>
|
||
|
||
<p>
|
||
Provides a virtual LAN with NAT to the outside world. The virtual
|
||
network has DHCP & DNS services and will give the guest VM addresses
|
||
starting from <code>10.0.2.15</code>. The default router will be
|
||
<code>10.0.2.2</code> and the DNS server will be <code>10.0.2.3</code>.
|
||
This networking is the only option for unprivileged users who need their
|
||
VMs to have outgoing access. <span class="since">Since 3.8.0</span>
|
||
it is possible to override the default network address by
|
||
including an <code>ip</code> element specifying an IPv4
|
||
address in its one mandatory attribute, <code>address</code>.
|
||
Optionally, a second <code>ip</code> element with a
|
||
<code>family</code> attribute set to "ipv6" can be
|
||
specified to add an IPv6 address to the interface.
|
||
Optionally, an address
|
||
<code>prefix</code> can be specified.
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<interface type='user'/>
|
||
...
|
||
<interface type='user'>
|
||
<mac address="00:11:22:33:44:55"/>
|
||
<ip family='ipv4' address='172.17.2.0' prefix='24'/>
|
||
<ip family='ipv6' address='2001:db8:ac10:fd01::' prefix='64'/>
|
||
</interface>
|
||
</devices>
|
||
...</pre>
|
||
|
||
|
||
<h5><a id="elementsNICSEthernet">Generic ethernet connection</a></h5>
|
||
|
||
<p>
|
||
Provides a means for the administrator to execute an arbitrary script
|
||
to connect the guest's network to the LAN. The guest will have a tun
|
||
device created with a name of vnetN, which can also be overridden with the
|
||
<target> element. After creating the tun device a shell script will
|
||
be run which is expected to do whatever host network integration is
|
||
required. By default this script is called /etc/qemu-ifup but can be
|
||
overridden.
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<interface type='ethernet'/>
|
||
...
|
||
<interface type='ethernet'>
|
||
<target dev='vnet7'/>
|
||
<script path='/etc/qemu-ifup-mynet'/>
|
||
</interface>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<h5><a id="elementsNICSDirect">Direct attachment to physical interface</a></h5>
|
||
|
||
<p>
|
||
Provides direct attachment of the virtual machine's NIC to the given
|
||
physical interface of the host.
|
||
<span class="since">Since 0.7.7 (QEMU and KVM only)</span><br/>
|
||
This setup requires the Linux macvtap
|
||
driver to be available. <span class="since">(Since Linux 2.6.34.)</span>
|
||
One of the modes 'vepa'
|
||
( <a href="http://www.ieee802.org/1/files/public/docs2009/new-evb-congdon-vepa-modular-0709-v01.pdf">
|
||
'Virtual Ethernet Port Aggregator'</a>), 'bridge', 'private' or 'passthrough'
|
||
can be chosen for the operation mode of the macvtap device, 'vepa'
|
||
being the default mode. The individual modes cause the delivery of
|
||
packets to behave as described in the list of modes further below.
|
||
</p>
|
||
<p>
|
||
If the model type is set to <code>virtio</code> and
|
||
interface's <code>trustGuestRxFilters</code> attribute is set
|
||
to <code>yes</code>, changes made to the interface mac address,
|
||
unicast/multicast receive filters, and vlan settings in the
|
||
guest will be monitored and propagated to the associated macvtap
|
||
device on the host (<span class="since">Since
|
||
1.2.10</span>). If <code>trustGuestRxFilters</code> is not set,
|
||
or is not supported for the device model in use, an attempted
|
||
change to the mac address originating from the guest side will
|
||
result in a non-working network connection.
|
||
</p>
|
||
|
||
<dl>
|
||
<dt><code>vepa</code></dt>
|
||
<dd>All VMs' packets are sent to the external bridge. Packets
|
||
whose destination is a VM on the same host as where the
|
||
packet originates from are sent back to the host by the VEPA
|
||
capable bridge (today's bridges are typically not VEPA capable).</dd>
|
||
<dt><code>bridge</code></dt>
|
||
<dd>Packets whose destination is on the same host as where they
|
||
originate from are directly delivered to the target macvtap device.
|
||
Both origin and destination devices need to be in bridge mode
|
||
for direct delivery. If either one of them is in <code>vepa</code> mode,
|
||
a VEPA capable bridge is required.</dd>
|
||
<dt><code>private</code></dt>
|
||
<dd>All packets are sent to the external bridge and will only be
|
||
delivered to a target VM on the same host if they are sent through an
|
||
external router or gateway and that device sends them back to the
|
||
host. This procedure is followed if either the source or destination
|
||
device is in <code>private</code> mode.</dd>
|
||
<dt><code>passthrough</code></dt>
|
||
<dd>This feature attaches a virtual function of a SRIOV capable
|
||
NIC directly to a VM without losing the migration capability.
|
||
All packets are sent to the VF/IF of the configured network device.
|
||
Depending on the capabilities of the device additional prerequisites or
|
||
limitations may apply; for example, on Linux this requires
|
||
kernel 2.6.38 or newer. <span class="since">Since 0.9.2</span></dd>
|
||
</dl>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
...
|
||
<interface type='direct' trustGuestRxFilters='no'>
|
||
<source dev='eth0' mode='vepa'/>
|
||
</interface>
|
||
</devices>
|
||
...</pre>
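<p>
As a further illustration, the following sketch combines
<code>passthrough</code> mode with a <code>virtio</code> model and
<code>trustGuestRxFilters='yes'</code>, so that MAC address and receive
filter changes made in the guest are propagated to the macvtap device.
The host device name <code>eth1</code> is only a placeholder; it is
assumed to be a device suitable for passthrough mode, such as an SRIOV
virtual function.
</p>
<pre>
...
<devices>
  ...
  <interface type='direct' trustGuestRxFilters='yes'>
    <source dev='eth1' mode='passthrough'/>
    <model type='virtio'/>
  </interface>
</devices>
...</pre>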
|
||
|
||
<p>
|
||
The network access of direct attached virtual machines can be
|
||
managed by the hardware switch to which the physical interface
|
||
of the host machine is connected.
|
||
</p>
|
||
<p>
|
||
The interface can have additional parameters as shown below,
|
||
if the switch conforms to the IEEE 802.1Qbg standard.
|
||
The parameters of the virtualport element are documented in more detail
|
||
in the IEEE 802.1Qbg standard. The values are network specific and
|
||
should be provided by the network administrator. In 802.1Qbg terms,
|
||
the Virtual Station Interface (VSI) represents the virtual interface
|
||
of a virtual machine. <span class="since">Since 0.8.2</span>
|
||
</p>
|
||
<p>
|
||
Please note that IEEE 802.1Qbg requires a non-zero value for the
|
||
VLAN ID.
|
||
</p>
|
||
<dl>
|
||
<dt><code>managerid</code></dt>
|
||
<dd>The VSI Manager ID identifies the database containing the VSI type
|
||
and instance definitions. This is an integer value and the
|
||
value 0 is reserved.</dd>
|
||
<dt><code>typeid</code></dt>
|
||
<dd>The VSI Type ID identifies a VSI type characterizing the network
|
||
access. VSI types are typically managed by the network administrator.
|
||
This is an integer value.
|
||
</dd>
|
||
<dt><code>typeidversion</code></dt>
|
||
<dd>The VSI Type Version allows multiple versions of a VSI Type.
|
||
This is an integer value.
|
||
</dd>
|
||
<dt><code>instanceid</code></dt>
|
||
<dd>The VSI Instance ID is generated when a VSI instance
|
||
(i.e. a virtual interface of a virtual machine) is created.
|
||
This is a globally unique identifier.
|
||
</dd>
|
||
</dl>
|
||
<pre>
|
||
...
|
||
<devices>
|
||
...
|
||
<interface type='direct'>
|
||
<source dev='eth0.2' mode='vepa'/>
|
||
<virtualport type="802.1Qbg">
|
||
<parameters managerid="11" typeid="1193047" typeidversion="2" instanceid="09b11c53-8b5c-4eeb-8f00-d84eaa0aaa4f"/>
|
||
</virtualport>
|
||
</interface>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<p>
|
||
The interface can have additional parameters as shown below
|
||
if the switch conforms to the IEEE 802.1Qbh standard.
|
||
The values are network specific and should be provided by the
|
||
network administrator. <span class="since">Since 0.8.2</span>
|
||
</p>
|
||
<dl>
|
||
<dt><code>profileid</code></dt>
|
||
<dd>The profile ID contains the name of the port profile that is to
|
||
be applied to this interface. This name is resolved by the port
|
||
profile database into the network parameters from the port profile,
|
||
and those network parameters will be applied to this interface.
|
||
</dd>
|
||
</dl>
|
||
<pre>
|
||
...
|
||
<devices>
|
||
...
|
||
<interface type='direct'>
|
||
<source dev='eth0' mode='private'/>
|
||
<virtualport type='802.1Qbh'>
|
||
<parameters profileid='finance'/>
|
||
</virtualport>
|
||
</interface>
|
||
</devices>
|
||
...
|
||
</pre>
|
||
|
||
|
||
<h5><a id="elementsNICSHostdev">PCI Passthrough</a></h5>
|
||
|
||
<p>
|
||
A PCI network device (specified by the <source> element)
|
||
is directly assigned to the guest using generic device
|
||
passthrough, after first optionally setting the device's MAC
|
||
address to the configured value, and associating the device with
|
||
an 802.1Qbh capable switch using an optionally specified
|
||
<virtualport> element (see the examples of virtualport
|
||
given above for type='direct' network devices). Note that - due
|
||
to limitations in standard single-port PCI ethernet card driver
|
||
design - only SR-IOV (Single Root I/O Virtualization) virtual
|
||
function (VF) devices can be assigned in this manner; to assign
|
||
a standard single-port PCI or PCIe ethernet card to a guest, use
|
||
the traditional <hostdev> device definition.
|
||
<span class="since">Since 0.9.11</span>
|
||
</p>
|
||
|
||
<p>
|
||
To use VFIO device assignment rather than traditional/legacy KVM
|
||
device assignment (VFIO is a new method of device assignment
|
||
that is compatible with UEFI Secure Boot), a type='hostdev'
|
||
interface can have an optional <code>driver</code> sub-element
|
||
with a <code>name</code> attribute set to "vfio". To use legacy
|
||
KVM device assignment you can set <code>name</code> to "kvm" (or
|
||
simply omit the <code><driver></code> element, since "kvm"
|
||
is currently the default).
|
||
<span class="since">Since 1.0.5 (QEMU and KVM only, requires kernel 3.6 or newer)</span>
|
||
</p>
|
||
|
||
<p>
|
||
Note that this "intelligent passthrough" of network devices is
|
||
very similar to the functionality of a standard <hostdev>
|
||
device, the difference being that this method allows specifying
|
||
a MAC address and <virtualport> for the passed-through
|
||
device. If these capabilities are not required, if you have a
|
||
standard single-port PCI, PCIe, or USB network card that doesn't
|
||
support SR-IOV (and hence would anyway lose the configured MAC
|
||
address during reset after being assigned to the guest domain),
|
||
or if you are using a version of libvirt older than 0.9.11, you
|
||
should use standard <hostdev> to assign the device to the
|
||
guest instead of <interface type='hostdev'/>.
|
||
</p>
|
||
|
||
<p>
|
||
Similar to the functionality of a standard <hostdev> device,
|
||
when <code>managed</code> is "yes", it is detached from the host
|
||
before being passed on to the guest, and reattached to the host
|
||
after the guest exits. If <code>managed</code> is omitted or "no",
|
||
the user is responsible for calling <code>virNodeDeviceDettach</code>
|
||
(or <code>virsh nodedev-detach</code>) before starting the guest
|
||
or hot-plugging the device, and <code>virNodeDeviceReAttach</code>
|
||
(or <code>virsh nodedev-reattach</code>) after hot-unplug or
|
||
stopping the guest.
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<interface type='hostdev' managed='yes'>
|
||
<driver name='vfio'/>
|
||
<source>
|
||
<address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
|
||
</source>
|
||
<mac address='52:54:00:6d:90:02'/>
|
||
<virtualport type='802.1Qbh'>
|
||
<parameters profileid='finance'/>
|
||
</virtualport>
|
||
</interface>
|
||
</devices>
|
||
...</pre>
|
||
|
||
|
||
<h5><a id="elementsNICSMulticast">Multicast tunnel</a></h5>
|
||
|
||
<p>
|
||
A multicast group is set up to represent a virtual network. Any VMs
|
||
whose network devices are in the same multicast group can talk to each
|
||
other even across hosts. This mode is also available to unprivileged
|
||
users. There is no default DNS or DHCP support and no outgoing network
|
||
access. To provide outgoing network access, one of the VMs should have a
|
||
2nd NIC which is connected to one of the first 4 network types and do the
|
||
appropriate routing. The multicast protocol is compatible with that used
|
||
by user mode linux guests too. The source address used must be from the
|
||
multicast address block.
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<interface type='mcast'>
|
||
<mac address='52:54:00:6d:90:01'/>
|
||
<source address='230.0.0.1' port='5558'/>
|
||
</interface>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<h5><a id="elementsNICSTCP">TCP tunnel</a></h5>
|
||
|
||
<p>
|
||
A TCP client/server architecture provides a virtual network. One VM
|
||
provides the server end of the network; all other VMs are configured as
|
||
clients. All network traffic is routed between the VMs via the server.
|
||
This mode is also available to unprivileged users. There is no default
|
||
DNS or DHCP support and no outgoing network access. To provide outgoing
|
||
network access, one of the VMs should have a 2nd NIC which is connected
|
||
to one of the first 4 network types and do the appropriate routing.</p>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<interface type='server'>
|
||
<mac address='52:54:00:22:c9:42'/>
|
||
<source address='192.168.0.1' port='5558'/>
|
||
</interface>
|
||
...
|
||
<interface type='client'>
|
||
<mac address='52:54:00:8b:c9:51'/>
|
||
<source address='192.168.0.1' port='5558'/>
|
||
</interface>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<h5><a id="elementsNICSUDP">UDP unicast tunnel</a></h5>
|
||
|
||
<p>
|
||
A UDP unicast architecture provides a virtual network which enables
|
||
connections between QEMU instances using QEMU's UDP infrastructure.
|
||
|
||
The XML "source" address is the endpoint address to which the UDP socket
|
||
packets will be sent from the host running QEMU.
|
||
The XML "local" address is the address of the interface from which the
|
||
UDP socket packets will originate from the QEMU host.
|
||
<span class="since">Since 1.2.20</span></p>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<interface type='udp'>
|
||
<mac address='52:54:00:22:c9:42'/>
|
||
<source address='127.0.0.1' port='11115'>
|
||
<local address='127.0.0.1' port='11116'/>
|
||
</source>
|
||
</interface>
|
||
</devices>
|
||
...</pre>
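<p>
Because the "source" and "local" addresses describe the two ends of the
same tunnel, the peer QEMU instance would normally mirror them. The
following is a sketch of the matching interface on the peer, assuming
both instances run on the same host as in the example above. The MAC
address is a placeholder chosen for illustration.
</p>
<pre>
...
<devices>
  <interface type='udp'>
    <mac address='52:54:00:22:c9:43'/>
    <source address='127.0.0.1' port='11116'>
      <local address='127.0.0.1' port='11115'/>
    </source>
  </interface>
</devices>
...</pre>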
|
||
|
||
<h5><a id="elementsNICSModel">Setting the NIC model</a></h5>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<interface type='network'>
|
||
<source network='default'/>
|
||
<target dev='vnet1'/>
|
||
<b><model type='ne2k_pci'/></b>
|
||
</interface>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<p>
|
||
For hypervisors which support this, you can set the model of
|
||
the emulated network interface card.
|
||
</p>
|
||
|
||
<p>
|
||
The values for <code>type</code> aren't defined specifically by
|
||
libvirt, but by what the underlying hypervisor supports (if
|
||
any). For QEMU and KVM you can get a list of supported models
|
||
with these commands:
|
||
</p>
|
||
|
||
<pre>
|
||
qemu -net nic,model=? /dev/null
|
||
qemu-kvm -net nic,model=? /dev/null
|
||
</pre>
|
||
|
||
<p>
|
||
Typical values for QEMU and KVM include:
|
||
ne2k_isa i82551 i82557b i82559er ne2k_pci pcnet rtl8139 e1000 virtio
|
||
</p>
|
||
|
||
<h5><a id="elementsDriverBackendOptions">Setting NIC driver-specific options</a></h5>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<interface type='network'>
|
||
<source network='default'/>
|
||
<target dev='vnet1'/>
|
||
<model type='virtio'/>
|
||
<b><driver name='vhost' txmode='iothread' ioeventfd='on' event_idx='off' queues='5' rx_queue_size='256' tx_queue_size='256'>
|
||
<host csum='off' gso='off' tso4='off' tso6='off' ecn='off' ufo='off' mrg_rxbuf='off'/>
|
||
<guest csum='off' tso4='off' tso6='off' ecn='off' ufo='off'/>
|
||
</driver>
|
||
</b>
|
||
</interface>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<p>
|
||
Some NICs may have tunable driver-specific options. These are
|
||
set as attributes of the <code>driver</code> sub-element of the
|
||
interface definition. Currently the following attributes are
|
||
available for the <code>"virtio"</code> NIC driver:
|
||
</p>
|
||
|
||
<dl>
|
||
<dt><code>name</code></dt>
|
||
<dd>
|
||
The optional <code>name</code> attribute forces which type of
|
||
backend driver to use. The value can be either 'qemu' (a
|
||
user-space backend) or 'vhost' (a kernel backend, which
|
||
requires the vhost module to be provided by the kernel); an
|
||
attempt to require the vhost driver without kernel support
|
||
will be rejected. If this attribute is not present, then the
|
||
domain defaults to 'vhost' if present, but silently falls back
|
||
to 'qemu' without error.
|
||
<span class="since">Since 0.8.8 (QEMU and KVM only)</span>
|
||
</dd>
|
||
<dd>
|
||
For interfaces of type='hostdev' (PCI passthrough devices)
|
||
the <code>name</code> attribute can optionally be set to
|
||
"vfio" or "kvm". "vfio" tells libvirt to use VFIO device
|
||
assignment rather than traditional KVM device assignment (VFIO
|
||
is a new method of device assignment that is compatible with
|
||
UEFI Secure Boot), and "kvm" tells libvirt to use the legacy
|
||
device assignment performed directly by the kvm kernel module
|
||
(the default is currently "kvm", but is subject to change).
|
||
<span class="since">Since 1.0.5 (QEMU and KVM only, requires
|
||
kernel 3.6 or newer)</span>
|
||
</dd>
|
||
<dd>
|
||
For interfaces of type='vhostuser', the <code>name</code>
|
||
attribute is ignored. The backend driver used is always
|
||
vhost-user.
|
||
</dd>
|
||
|
||
<dt><code>txmode</code></dt>
|
||
<dd>
|
||
The <code>txmode</code> attribute specifies how to handle
|
||
transmission of packets when the transmit buffer is full. The
|
||
value can be either 'iothread' or 'timer'.
|
||
<span class="since">Since 0.8.8 (QEMU and KVM only)</span><br/><br/>
|
||
|
||
If set to 'iothread', packet tx is all done in an iothread in
|
||
the bottom half of the driver (this option translates into
|
||
adding "tx=bh" to the qemu commandline -device virtio-net-pci
|
||
option).<br/><br/>
|
||
|
||
If set to 'timer', tx work is done in qemu, and if there is
|
||
more tx data than can be sent at the present time, a timer is
|
||
set before qemu moves on to do other things; when the timer
|
||
fires, another attempt is made to send more data.<br/><br/>
|
||
|
||
The resulting difference, according to the qemu developer who
|
||
added the option is: "bh makes tx more asynchronous and reduces
|
||
latency, but potentially causes more processor bandwidth
|
||
contention since the CPU doing the tx isn't necessarily the
|
||
CPU where the guest generated the packets."<br/><br/>
|
||
|
||
<b>In general you should leave this option alone, unless you
|
||
are very certain you know what you are doing.</b>
|
||
</dd>
|
||
<dt><code>ioeventfd</code></dt>
|
||
<dd>
|
||
This optional attribute allows users to set
|
||
<a href='https://patchwork.kernel.org/patch/43390/'>
|
||
domain I/O asynchronous handling</a> for the interface device.
|
||
The default is left to the discretion of the hypervisor.
|
||
Accepted values are "on" and "off". Enabling this allows
|
||
qemu to execute the VM while a separate thread handles I/O.
|
||
Typically guests experiencing high system CPU utilization
|
||
during I/O will benefit from this. On the other hand,
|
||
on an overloaded host it could increase guest I/O latency.
|
||
<span class="since">Since 0.9.3 (QEMU and KVM only)</span><br/><br/>
|
||
|
||
<b>In general you should leave this option alone, unless you
|
||
are very certain you know what you are doing.</b>
|
||
</dd>
|
||
<dt><code>event_idx</code></dt>
|
||
<dd>
|
||
The <code>event_idx</code> attribute controls some aspects of
|
||
device event processing. The value can be either 'on' or 'off'
|
||
- if it is on, it will reduce the number of interrupts and
|
||
exits for the guest. The default is determined by QEMU;
|
||
usually if the feature is supported, the default is on. In case
|
||
there is a situation where this behavior is suboptimal, this
|
||
attribute provides a way to force the feature off.
|
||
<span class="since">Since 0.9.5 (QEMU and KVM only)</span><br/><br/>
|
||
|
||
<b>In general you should leave this option alone, unless you
|
||
are very certain you know what you are doing.</b>
|
||
</dd>
|
||
<dt><code>queues</code></dt>
|
||
<dd>
|
||
The optional <code>queues</code> attribute controls the number
|
||
of queues to be used for either
|
||
<a href="https://www.linux-kvm.org/page/Multiqueue"> Multiqueue
|
||
virtio-net</a> or <a href="#elementVhostuser">vhost-user</a> network
|
||
interfaces. Use of multiple packet processing queues requires the
|
||
interface having the <code><model type='virtio'/></code>
|
||
element. Each queue will potentially be handled by a different
|
||
processor, resulting in much higher throughput.
|
||
<span class="since">virtio-net since 1.0.6 (QEMU and KVM only)</span>
|
||
<span class="since">vhost-user since 1.2.17 (QEMU and KVM only)</span>
|
||
</dd>
|
||
<dt><code>rx_queue_size</code></dt>
|
||
<dd>
|
||
The optional <code>rx_queue_size</code> attribute controls
|
||
the size of virtio ring for each queue as described above.
|
||
The default value is hypervisor dependent and may change
|
||
across its releases. Moreover, some hypervisors may pose
|
||
some restrictions on the actual value. For instance, the latest
|
||
QEMU (as of 2016-09-01) requires the value to be a power of two
|
||
from [256, 1024] range.
|
||
<span class="since">Since 2.3.0 (QEMU and KVM only)</span><br/><br/>
|
||
|
||
<b>In general you should leave this option alone, unless you
|
||
are very certain you know what you are doing.</b>
|
||
</dd>
|
||
<dt><code>tx_queue_size</code></dt>
|
||
<dd>
|
||
The optional <code>tx_queue_size</code> attribute controls
|
||
the size of virtio ring for each queue as described above.
|
||
The default value is hypervisor dependent and may change
|
||
across its releases. Moreover, some hypervisors may pose
|
||
some restrictions on the actual value. For instance, QEMU
|
||
v2.9 requires the value to be a power of two from [256, 1024]
|
||
range. In addition to that, this may work only for a subset of
|
||
interface types, e.g. aforementioned QEMU enables this option
|
||
only for <code>vhostuser</code> type.
|
||
<span class="since">Since 3.7.0 (QEMU and KVM only)</span><br/><br/>
|
||
|
||
<b>In general you should leave this option alone, unless you
|
||
are very certain you know what you are doing.</b>
|
||
</dd>
|
||
<dt>virtio options</dt>
|
||
<dd>
|
||
For virtio interfaces,
|
||
<a href="#elementsVirtio">Virtio-specific options</a> can also be
|
||
set. (<span class="since">Since 3.5.0</span>)
|
||
</dd>
|
||
</dl>
|
||
<p>
|
||
Offloading options for the host and guest can be configured using
|
||
the following sub-elements:
|
||
</p>
|
||
<dl>
|
||
<dt><code>host</code></dt>
|
||
<dd>
|
||
The <code>csum</code>, <code>gso</code>, <code>tso4</code>,
|
||
<code>tso6</code>, <code>ecn</code> and <code>ufo</code>
|
||
attributes with possible values <code>on</code>
|
||
and <code>off</code> can be used to turn off host offloading options.
|
||
By default, the supported offloads are enabled by QEMU.
|
||
<span class="since">Since 1.2.9 (QEMU only)</span>
|
||
The <code>mrg_rxbuf</code> attribute can be used to control
|
||
mergeable rx buffers on the host side. Possible values are
|
||
<code>on</code> (default) and <code>off</code>.
|
||
<span class="since">Since 1.2.13 (QEMU only)</span>
|
||
</dd>
|
||
<dt><code>guest</code></dt>
|
||
<dd>
|
||
The <code>csum</code>, <code>tso4</code>,
|
||
<code>tso6</code>, <code>ecn</code> and <code>ufo</code>
|
||
attributes with possible values <code>on</code>
|
||
and <code>off</code> can be used to turn off guest offloading options.
|
||
By default, the supported offloads are enabled by QEMU.
|
||
<span class="since">Since 1.2.9 (QEMU only)</span>
|
||
</dd>
|
||
</dl>
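<p>
Only the attributes that should differ from the hypervisor defaults need
to be listed. As a sketch with purely illustrative values, the following
enables four queues and turns off only the IPv4 and IPv6 TSO offloads on
the host side, leaving every other option at its default:
</p>
<pre>
...
<devices>
  <interface type='network'>
    <source network='default'/>
    <model type='virtio'/>
    <driver name='vhost' queues='4'>
      <host tso4='off' tso6='off'/>
    </driver>
  </interface>
</devices>
...</pre>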
|
||
|
||
<h5><a id="elementsBackendOptions">Setting network backend-specific options</a></h5>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<interface type='network'>
|
||
<source network='default'/>
|
||
<target dev='vnet1'/>
|
||
<model type='virtio'/>
|
||
<b><backend tap='/dev/net/tun' vhost='/dev/vhost-net'/></b>
|
||
<driver name='vhost' txmode='iothread' ioeventfd='on' event_idx='off' queues='5'/>
|
||
<b><tune>
|
||
<sndbuf>1600</sndbuf>
|
||
</tune></b>
|
||
</interface>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<p>
|
||
For tuning the backend of the network, the <code>backend</code> element
|
||
can be used. The <code>vhost</code> attribute can override the default vhost
|
||
device path (<code>/dev/vhost-net</code>) for devices with <code>virtio</code> model.
|
||
The <code>tap</code> attribute overrides the tun/tap device path (default:
|
||
<code>/dev/net/tun</code>) for network and bridge interfaces. This does not work
|
||
in session mode. <span class="since">Since 1.2.9</span>
|
||
</p>
|
||
<p>
|
||
For tap devices there is also <code>sndbuf</code> element which can
|
||
adjust the size of send buffer in the host. <span class="since">Since
|
||
0.8.8</span>
|
||
</p>
|
||
<h5><a id="elementsNICSTargetOverride">Overriding the target element</a></h5>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<interface type='network'>
|
||
<source network='default'/>
|
||
<b><target dev='vnet1'/></b>
|
||
</interface>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<p>
|
||
If no target is specified, certain hypervisors will
|
||
automatically generate a name for the created tun device. This
|
||
name can be manually specified, however the name <i>should not
|
||
start with either 'vnet', 'vif', 'macvtap', or 'macvlan'</i>,
|
||
which are prefixes reserved by libvirt and certain hypervisors.
|
||
Manually specified targets using these prefixes may be ignored.
|
||
</p>
|
||
|
||
<p>
|
||
Note that for LXC containers, this defines the name of the interface
|
||
on the host side. <span class="since">Since 1.2.7</span>, to define
|
||
the name of the device on the guest side, the <code>guest</code>
|
||
element should be used, as in the following snippet:
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<interface type='network'>
|
||
<source network='default'/>
|
||
<b><guest dev='myeth'/></b>
|
||
</interface>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<h5><a id="elementsNICSBoot">Specifying boot order</a></h5>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<interface type='network'>
|
||
<source network='default'/>
|
||
<target dev='vnet1'/>
|
||
<b><boot order='1'/></b>
|
||
</interface>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<p>
|
||
For hypervisors which support this, you can set a specific NIC to
|
||
be used for network boot. The <code>order</code> attribute determines
|
||
the order in which devices will be tried during boot sequence. The
|
||
per-device <code>boot</code> elements cannot be used together with
|
||
general boot elements in
|
||
<a href="#elementsOSBIOS">BIOS bootloader</a> section.
|
||
<span class="since">Since 0.8.8</span>
|
||
</p>
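<p>
When several devices carry a <code>boot</code> element, the one with
the lowest <code>order</code> value is tried first. A minimal sketch
with two NICs follows; the network and bridge names are placeholders:
</p>
<pre>
...
<devices>
  <interface type='network'>
    <source network='default'/>
    <boot order='1'/>
  </interface>
  <interface type='bridge'>
    <source bridge='br0'/>
    <boot order='2'/>
  </interface>
</devices>
...</pre>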
|
||
|
||
<h5><a id="elementsNICSROM">Interface ROM BIOS configuration</a></h5>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<interface type='network'>
|
||
<source network='default'/>
|
||
<target dev='vnet1'/>
|
||
<b><rom bar='on' file='/etc/fake/boot.bin'/></b>
|
||
</interface>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<p>
|
||
For hypervisors which support this, you can change how a PCI Network
|
||
device's ROM is presented to the guest. The <code>bar</code>
|
||
attribute can be set to "on" or "off", and determines whether
|
||
or not the device's ROM will be visible in the guest's memory
|
||
map. (In PCI documentation, the "rombar" setting controls the
|
||
presence of the Base Address Register for the ROM). If no rom
|
||
bar is specified, the qemu default will be used (older
|
||
versions of qemu used a default of "off", while newer qemus
|
||
have a default of "on").
|
||
The optional <code>file</code> attribute is used to point to a
|
||
binary file to be presented to the guest as the device's ROM
|
||
BIOS. This can be useful to provide an alternative boot ROM for a
|
||
network device.
|
||
<span class="since">Since 0.9.10 (QEMU and KVM only)</span>.
|
||
</p>
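<p>
Since <code>file</code> is optional, the <code>bar</code> attribute can
also be given on its own. The following sketch hides the device's ROM
from the guest's memory map without supplying an alternative ROM image:
</p>
<pre>
...
<devices>
  <interface type='network'>
    <source network='default'/>
    <rom bar='off'/>
  </interface>
</devices>
...</pre>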
|
||
<h5><a id="elementDomain">Setting up a network backend in a driver domain</a></h5>
|
||
<pre>
|
||
...
|
||
<devices>
|
||
...
|
||
<interface type='bridge'>
|
||
<source bridge='br0'/>
|
||
<b><backenddomain name='netvm'/></b>
|
||
</interface>
|
||
...
|
||
</devices>
|
||
...</pre>
|
||
|
||
<p>
|
||
The optional <code>backenddomain</code> element allows specifying a
|
||
backend domain (aka driver domain) for the interface. Use the
|
||
<code>name</code> attribute to specify the backend domain name. You
|
||
can use it to create a direct network link between domains (so data
|
||
will not go through host system). Use with type 'ethernet' to create
|
||
plain network link, or with type 'bridge' to connect to a bridge inside
|
||
the backend domain.
|
||
<span class="since">Since 1.2.13 (Xen only)</span>
|
||
</p>
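<p>
For the plain network link case mentioned above, a sketch using
type 'ethernet' might look as follows. The backend domain name
<code>netvm</code> is reused from the previous example and is only
illustrative:
</p>
<pre>
...
<devices>
  ...
  <interface type='ethernet'>
    <backenddomain name='netvm'/>
  </interface>
  ...
</devices>
...</pre>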
|
||
|
||
<h5><a id="elementQoS">Quality of service</a></h5>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<interface type='network'>
|
||
<source network='default'/>
|
||
<target dev='vnet0'/>
|
||
<b><bandwidth>
|
||
<inbound average='1000' peak='5000' floor='200' burst='1024'/>
|
||
<outbound average='128' peak='256' burst='256'/>
|
||
</bandwidth></b>
|
||
</interface>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<p>
|
||
This part of the interface XML configures quality of service. Incoming
|
||
and outgoing traffic can be shaped independently.
|
||
The <code>bandwidth</code> element and its child elements are described
|
||
in the <a href="formatnetwork.html#elementQoS">QoS</a> section of
|
||
the Network XML.
|
||
</p>
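<p>
Because the two directions are shaped independently, either child
element may be given on its own. A minimal sketch that limits only
outgoing traffic is shown here; the values are illustrative, and their
units are defined in the Network XML documentation referenced above:
</p>
<pre>
...
<devices>
  <interface type='network'>
    <source network='default'/>
    <bandwidth>
      <outbound average='128' peak='256' burst='256'/>
    </bandwidth>
  </interface>
</devices>
...</pre>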
|
||
|
||
<h5><a id="elementVlanTag">Setting VLAN tag (on supported network types only)</a></h5>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<interface type='bridge'>
|
||
<b><vlan></b>
|
||
<b><tag id='42'/></b>
|
||
<b></vlan></b>
|
||
<source bridge='ovsbr0'/>
|
||
<virtualport type='openvswitch'>
|
||
<parameters interfaceid='09b11c53-8b5c-4eeb-8f00-d84eaa0aaa4f'/>
|
||
</virtualport>
|
||
</interface>
|
||
<interface type='bridge'>
|
||
<b><vlan trunk='yes'></b>
|
||
<b><tag id='42'/></b>
|
||
<b><tag id='123' nativeMode='untagged'/></b>
|
||
<b></vlan></b>
|
||
...
|
||
</interface>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<p>
|
||
If (and only if) the network connection used by the guest
|
||
supports VLAN tagging transparent to the guest, an
|
||
optional <code><vlan></code> element can specify one or
|
||
more VLAN tags to apply to the guest's network
|
||
traffic <span class="since">Since 0.10.0</span>. Network
|
||
connections that support guest-transparent VLAN tagging include
|
||
1) type='bridge' interfaces connected to an Open vSwitch bridge
|
||
<span class="since">Since 0.10.0</span>, 2) SRIOV Virtual
|
||
Functions (VF) used via type='hostdev' (direct device
|
||
assignment) <span class="since">Since 0.10.0</span>, and 3)
|
||
SRIOV VFs used via type='direct' with mode='passthrough'
|
||
(macvtap "passthru" mode) <span class="since">Since
|
||
1.3.5</span>. All other connection types, including standard
|
||
linux bridges and libvirt's own virtual networks, <b>do not</b>
|
||
support it. 802.1Qbh (vn-link) and 802.1Qbg (VEPA) switches
|
||
provide their own way (outside of libvirt) to tag guest traffic
|
||
onto a specific VLAN. Each tag is given in a
|
||
separate <code><tag></code> subelement
|
||
of <code><vlan></code> (for example: <code><tag
|
||
id='42'/></code>). For VLAN trunking of multiple tags (which
|
||
is supported only on Open vSwitch connections),
|
||
multiple <code><tag></code> subelements can be specified,
|
||
which implies that the user wants to do VLAN trunking on the
|
||
interface for all the specified tags. In the case that VLAN
|
||
trunking of a single tag is desired, the optional
|
||
attribute <code>trunk='yes'</code> can be added to the toplevel
|
||
<code><vlan></code> element to differentiate trunking of a
|
||
single tag from normal tagging.
|
||
</p>
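<p>
As a sketch of the single-tag trunking case described above, assuming
an Open vSwitch bridge named <code>ovsbr0</code> as in the earlier
example (the bridge name and tag id are illustrative only), the
interface might be configured like this:
</p>
<pre>
...
<devices>
  <interface type='bridge'>
    <source bridge='ovsbr0'/>
    <virtualport type='openvswitch'/>
    <b><vlan trunk='yes'></b>
      <b><tag id='42'/></b>
    <b></vlan></b>
  </interface>
</devices>
...</pre>
<p>
Without <code>trunk='yes'</code>, the same single
<code><tag></code> would be treated as normal tagging rather than
as a trunk.
</p>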
|
||
<p>
|
||
For network connections using Open vSwitch it is also possible
|
||
to configure 'native-tagged' and 'native-untagged' VLAN modes
|
||
<span class="since">Since 1.1.0.</span> This is done with the
|
||
optional <code>nativeMode</code> attribute on
|
||
the <code><tag></code> subelement: <code>nativeMode</code>
|
||
may be set to 'tagged' or 'untagged'. The <code>id</code>
|
||
attribute of the <code><tag></code> subelement
|
||
containing <code>nativeMode</code> sets which VLAN is considered
|
||
to be the "native" VLAN for this interface, and
|
||
the <code>nativeMode</code> attribute determines whether or not
|
||
traffic for that VLAN will be tagged.
|
||
</p>
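<p>
For comparison with the 'untagged' case shown in the example at the
start of this section, a 'native-tagged' configuration might look like
the following sketch (the bridge name and tag ids are illustrative
assumptions). Here VLAN 42 is the native VLAN and its traffic is
tagged:
</p>
<pre>
...
<devices>
  <interface type='bridge'>
    <source bridge='ovsbr0'/>
    <virtualport type='openvswitch'/>
    <b><vlan trunk='yes'></b>
      <b><tag id='42' nativeMode='tagged'/></b>
      <b><tag id='123'/></b>
    <b></vlan></b>
  </interface>
</devices>
...</pre>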
|
||
|
||
<h5><a id="elementLink">Modifying virtual link state</a></h5>
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<interface type='network'>
|
||
<source network='default'/>
|
||
<target dev='vnet0'/>
|
||
<b><link state='down'/></b>
|
||
</interface>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<p>
|
||
This element provides a means of setting the state of the virtual network link.
|
||
Possible values for attribute <code>state</code> are <code>up</code> and
|
||
<code>down</code>. If <code>down</code> is specified as the value, the interface
|
||
behaves as if it had the network cable disconnected. Default behavior if this
|
||
element is unspecified is to have the link state <code>up</code>.
|
||
<span class="since">Since 0.9.5</span>
|
||
</p>
|
||
|
||
<h5><a id="mtu">MTU configuration</a></h5>
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<interface type='network'>
|
||
<source network='default'/>
|
||
<target dev='vnet0'/>
|
||
<b><mtu size='1500'/></b>
|
||
</interface>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<p>
|
||
This element provides a means of setting the MTU of the virtual network link.
|
||
Currently there is just one attribute <code>size</code> which accepts a
|
||
non-negative integer which specifies the MTU size for the interface.
|
||
<span class="since">Since 3.1.0</span>
|
||
</p>
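<p>
As a further illustration, a larger ("jumbo frame") MTU could be
requested in the same way. This is only a sketch: the value 9000 is an
example, and it assumes the host-side network is configured to carry
frames of that size.
</p>
<pre>
...
<devices>
  <interface type='network'>
    <source network='default'/>
    <target dev='vnet0'/>
    <b><mtu size='9000'/></b>
  </interface>
</devices>
...</pre>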
|
||
|
||
<h5><a id="coalesce">Coalesce settings</a></h5>
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<interface type='network'>
|
||
<source network='default'/>
|
||
<target dev='vnet0'/>
|
||
<b><coalesce>
|
||
<rx>
|
||
<frames max='7'/>
|
||
</rx>
|
||
</coalesce></b>
|
||
</interface>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<p>
|
||
This element provides a means of configuring coalesce settings for
some interface devices (currently only types <code>network</code>
and <code>bridge</code>). Currently there is just one attribute,
<code>max</code>, to tweak, in element <code>frames</code> for
the <code>rx</code> group, which accepts a non-negative integer
that specifies the maximum number of packets that will be
received before an interrupt.
|
||
<span class="since">Since 3.3.0</span>
|
||
</p>
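<p>
The same element can be used on a <code>bridge</code> interface. The
following is a minimal sketch; the bridge name <code>br0</code> and the
frame count are assumptions chosen for illustration.
</p>
<pre>
...
<devices>
  <interface type='bridge'>
    <source bridge='br0'/>
    <b><coalesce>
      <rx>
        <frames max='32'/>
      </rx>
    </coalesce></b>
  </interface>
</devices>
...</pre>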
|
||
|
||
<h5><a id="ipconfig">IP configuration</a></h5>
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<interface type='network'>
|
||
<source network='default'/>
|
||
<target dev='vnet0'/>
|
||
<b><ip address='192.168.122.5' prefix='24'/></b>
|
||
<b><ip address='192.168.122.5' prefix='24' peer='10.0.0.10'/></b>
|
||
<b><route family='ipv4' address='192.168.122.0' prefix='24' gateway='192.168.122.1'/></b>
|
||
<b><route family='ipv4' address='192.168.122.8' gateway='192.168.122.1'/></b>
|
||
</interface>
|
||
...
|
||
<hostdev mode='capabilities' type='net'>
|
||
<source>
|
||
<interface>eth0</interface>
|
||
</source>
|
||
<b><ip address='192.168.122.6' prefix='24'/></b>
|
||
<b><route family='ipv4' address='192.168.122.0' prefix='24' gateway='192.168.122.1'/></b>
|
||
<b><route family='ipv4' address='192.168.122.8' gateway='192.168.122.1'/></b>
|
||
</hostdev>
|
||
|
||
</devices>
|
||
...
|
||
</pre>
|
||
|
||
<p>
|
||
<span class="since">Since 1.2.12</span> network devices and
|
||
hostdev devices with network capabilities can optionally be provided
|
||
one or more IP addresses to set on the network device in the
|
||
guest. Note that some hypervisors or network device types will
|
||
simply ignore them or only use the first one.
|
||
The <code>family</code> attribute can be set to
|
||
either <code>ipv4</code> or <code>ipv6</code>, and the
|
||
<code>address</code> attribute contains the IP address. The
|
||
optional <code>prefix</code> is the number of 1 bits in the
|
||
netmask, and will be automatically set if not specified - for
|
||
IPv4 the default prefix is determined according to the network
|
||
"class" (A, B, or C - see RFC870), and for IPv6 the default
|
||
prefix is 64. The optional <code>peer</code> attribute holds the
|
||
IP address of the other end of a point-to-point network
|
||
device <span class="since">(since 2.1.0)</span>.
|
||
</p>
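<p>
The example above does not show the <code>family</code> attribute on
<code><ip></code>. A sketch with one IPv4 and one IPv6 address
might look like this (the addresses are illustrative; the IPv6 address
is taken from the 2001:db8::/32 documentation range):
</p>
<pre>
...
<devices>
  <interface type='network'>
    <source network='default'/>
    <target dev='vnet0'/>
    <b><ip family='ipv4' address='192.168.122.5' prefix='24'/></b>
    <b><ip family='ipv6' address='2001:db8::5' prefix='64'/></b>
  </interface>
</devices>
...</pre>
<p>
As noted above, some hypervisors or device types may ignore these
addresses or use only the first one.
</p>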
|
||
|
||
<p>
|
||
<span class="since">Since 1.2.12</span> route elements can also be
|
||
added to define IP routes to add in the guest. The attributes of
|
||
this element are described in the documentation for
|
||
the <code>route</code> element
|
||
in <a href="formatnetwork.html#elementsStaticroute">network
|
||
definitions</a>. This is used by the LXC driver.
|
||
</p>
|
||
|
||
<pre>
...
<devices>
  <interface type='ethernet'>
    <b><source></b>
      <b><ip address='192.168.123.1' prefix='24'/></b>
      <b><ip address='10.0.0.10' prefix='24' peer='192.168.122.5'/></b>
      <b><route family='ipv4' address='192.168.42.0' prefix='24' gateway='192.168.123.4'/></b>
    <b></source></b>
    ...
  </interface>
  ...
</devices>
...
</pre>
|
||
|
||
<p>
|
||
<span class="since">Since 2.1.0</span> network devices of type
|
||
"ethernet" can optionally be provided one or more IP addresses
|
||
and one or more routes to set on the <b>host</b> side of the
|
||
network device. These are configured as subelements of
|
||
the <code><source></code> element of the interface, and
|
||
have the same attributes as the similarly named elements used to
|
||
configure the guest side of the interface (described above).
|
||
</p>
|
||
|
||
<h5><a id="elementVhostuser">vhost-user interface</a></h5>
|
||
|
||
<p>
|
||
<span class="since">Since 1.2.7</span> the vhost-user interface enables
communication between a QEMU virtual machine and another userspace process
|
||
using the Virtio transport protocol. A char dev (e.g. Unix socket) is used
|
||
for the control plane, while the data plane is based on shared memory.
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<interface type='vhostuser'>
|
||
<mac address='52:54:00:3b:83:1a'/>
|
||
<source type='unix' path='/tmp/vhost1.sock' mode='server'/>
|
||
<model type='virtio'/>
|
||
</interface>
|
||
<interface type='vhostuser'>
|
||
<mac address='52:54:00:3b:83:1b'/>
|
||
<source type='unix' path='/tmp/vhost2.sock' mode='client'>
|
||
<reconnect enabled='yes' timeout='10'/>
|
||
</source>
|
||
<model type='virtio'/>
|
||
<driver queues='5'/>
|
||
</interface>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<p>
|
||
The <code><source></code> element has to be specified
|
||
along with the type of char device.
|
||
Currently, only type='unix' is supported, where the path (the
|
||
directory path of the socket) and mode attributes are required.
|
||
Both <code>mode='server'</code> and <code>mode='client'</code>
|
||
are supported.
|
||
vhost-user requires the virtio model type, thus the
|
||
<code><model></code> element is mandatory.
|
||
<span class="since">Since 4.1.0</span> the <code><source></code> element has an
optional child element <code>reconnect</code> which
configures the reconnect timeout if the connection is lost. It
has two attributes: <code>enabled</code> (which accepts
<code>yes</code> and <code>no</code>) and
<code>timeout</code>, which specifies the number of seconds
after which the hypervisor tries to reconnect.
|
||
</p>
|
||
|
||
<h5><a id="elementNwfilter">Traffic filtering with NWFilter</a></h5>
|
||
|
||
<p>
|
||
<span class="since">Since 0.8.0</span> an <code>nwfilter</code> profile
|
||
can be assigned to a domain interface, which allows configuring
|
||
traffic filter rules for the virtual machine.
|
||
|
||
See the <a href="formatnwfilter.html">nwfilter</a> documentation for more
|
||
complete details.
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<interface ...>
|
||
...
|
||
<filterref filter='clean-traffic'/>
|
||
</interface>
|
||
<interface ...>
|
||
...
|
||
<filterref filter='myfilter'>
|
||
<parameter name='IP' value='104.207.129.11'/>
|
||
<parameter name='IP6_ADDR' value='2001:19f0:300:2102::'/>
|
||
<parameter name='IP6_MASK' value='64'/>
|
||
...
|
||
</filterref>
|
||
</interface>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<p>
|
||
The <code>filter</code> attribute specifies the name of the nwfilter
|
||
to use. Optional <code><parameter></code> elements may be
|
||
specified for passing additional info to the nwfilter via the
|
||
<code>name</code> and <code>value</code> attributes. See
|
||
the <a href="formatnwfilter.html#nwfconceptsvars">nwfilter</a>
|
||
docs for info on parameters.
|
||
</p>
|
||
|
||
|
||
<h4><a id="elementsInput">Input devices</a></h4>
|
||
|
||
<p>
|
||
Input devices allow interaction with the graphical framebuffer
|
||
in the guest virtual machine. When enabling the framebuffer, an
|
||
input device is automatically provided. It may be possible to
|
||
add additional devices explicitly, for example,
|
||
to provide a graphics tablet for absolute cursor movement.
|
||
</p>
|
||
|
||
<pre>
...
<devices>
  <input type='mouse' bus='usb'/>
  <input type='keyboard' bus='usb'/>
  <input type='mouse' bus='virtio'/>
  <input type='keyboard' bus='virtio'/>
  <input type='tablet' bus='virtio'/>
  <input type='passthrough' bus='virtio'>
    <source evdev='/dev/input/event1'/>
  </input>
</devices>
...</pre>
|
||
|
||
<dl>
|
||
<dt><code>input</code></dt>
|
||
<dd>The <code>input</code> element has one mandatory attribute,
|
||
the <code>type</code> whose value can be 'mouse', 'tablet',
|
||
(<span class="since">since 1.2.2</span>) 'keyboard' or
|
||
(<span class="since">since 1.3.0</span>) 'passthrough'.
|
||
The tablet provides absolute cursor movement,
|
||
while the mouse uses relative movement. The optional
|
||
<code>bus</code> attribute can be used to refine the exact device type.
|
||
It takes values "xen" (paravirtualized), "ps2" and "usb" or
|
||
(<span class="since">since 1.3.0</span>) "virtio".</dd>
|
||
</dl>
|
||
|
||
<p>
|
||
The <code>input</code> element has an optional
|
||
sub-element <code><address></code> which can tie the
|
||
device to a particular PCI
|
||
slot, <a href="#elementsAddress">documented above</a>.
|
||
On S390, <code>address</code> can be used to provide a CCW address for
|
||
an input device (<span class="since">since 4.2.0</span>).
|
||
|
||
For type <code>passthrough</code>, the mandatory sub-element <code>source</code>
|
||
must have an <code>evdev</code> attribute containing the absolute path to the
|
||
event device passed through to guests. (KVM only)
|
||
</p>
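<p>
As a sketch of the optional <code><address></code> sub-element,
the following ties a virtio keyboard to a PCI slot and a USB tablet to
a USB port. All address values here are illustrative assumptions; the
available attributes are described in the address documentation
referenced above.
</p>
<pre>
...
<devices>
  <input type='keyboard' bus='virtio'>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
  </input>
  <input type='tablet' bus='usb'>
    <address type='usb' bus='0' port='1'/>
  </input>
</devices>
...</pre>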
|
||
|
||
<p>
|
||
The subelement <code>driver</code> can be used to tune the virtio
|
||
options of the device:
|
||
<a href="#elementsVirtio">Virtio-specific options</a> can also be
|
||
set. (<span class="since">Since 3.5.0</span>)
|
||
</p>
|
||
|
||
<h4><a id="elementsHub">Hub devices</a></h4>
|
||
|
||
<p>
|
||
A hub is a device that expands a single port into several so
|
||
that there are more ports available to connect devices to a host
|
||
system.
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<hub type='usb'/>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<dl>
|
||
<dt><code>hub</code></dt>
|
||
<dd>The <code>hub</code> element has one mandatory attribute,
|
||
the <code>type</code> whose value can only be 'usb'.</dd>
|
||
</dl>
|
||
|
||
<p>
|
||
The <code>hub</code> element has an optional
|
||
sub-element <code><address></code>
|
||
with <code>type='usb'</code> which can tie the device to a
|
||
particular controller, <a href="#elementsAddress">documented
|
||
above</a>.
|
||
</p>
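<p>
For instance, a hub attached to a specific controller port might be
written as in the following sketch (the bus and port numbers are
illustrative assumptions):
</p>
<pre>
...
<devices>
  <hub type='usb'>
    <address type='usb' bus='0' port='1'/>
  </hub>
</devices>
...</pre>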
|
||
|
||
<h4><a id="elementsGraphics">Graphical framebuffers</a></h4>
|
||
|
||
<p>
|
||
A graphics device allows for graphical interaction with the
|
||
guest OS. A guest will typically have either a framebuffer
|
||
or a text console configured to allow interaction with the
|
||
admin.
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<graphics type='sdl' display=':0.0'/>
|
||
<graphics type='vnc' port='5904' sharePolicy='allow-exclusive'>
|
||
<listen type='address' address='1.2.3.4'/>
|
||
</graphics>
|
||
<graphics type='rdp' autoport='yes' multiUser='yes' />
|
||
<graphics type='desktop' fullscreen='yes'/>
|
||
<graphics type='spice'>
|
||
<listen type='network' network='rednet'/>
|
||
</graphics>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<dl>
|
||
<dt><code>graphics</code></dt>
|
||
<dd>
|
||
<p>
|
||
The <code>graphics</code> element has a mandatory <code>type</code>
|
||
attribute which takes the value <code>sdl</code>, <code>vnc</code>,
|
||
<code>spice</code>, <code>rdp</code>, <code>desktop</code> or
|
||
<code>egl-headless</code>:
|
||
</p>
|
||
<dl>
|
||
<dt><code>sdl</code></dt>
|
||
<dd>
|
||
<p>
|
||
This displays a window on the host desktop. It can take 3 optional
|
||
arguments: a <code>display</code> attribute for the display to use,
|
||
an <code>xauth</code> attribute for the authentication identifier,
|
||
and an optional <code>fullscreen</code> attribute accepting values
|
||
<code>yes</code> or <code>no</code>.
|
||
</p>
|
||
|
||
<p>
|
||
You can use a <code>gl</code> sub-element with the <code>enable="yes"</code>
|
||
property to enable OpenGL support in SDL. Likewise you can
|
||
explicitly disable OpenGL support with <code>enable="no"</code>.
|
||
</p>
|
||
</dd>
|
||
<dt><code>vnc</code></dt>
|
||
<dd>
|
||
<p>
|
||
Starts a VNC server. The <code>port</code> attribute specifies
|
||
the TCP port number (with -1 as legacy syntax indicating that it
|
||
should be auto-allocated). The <code>autoport</code> attribute is
|
||
the new preferred syntax for indicating auto-allocation of the TCP
|
||
port to use. The <code>passwd</code> attribute provides a VNC
|
||
password in clear text. If the <code>passwd</code> attribute is
|
||
set to an empty string, then VNC access is disabled. The
|
||
<code>keymap</code> attribute specifies the keymap to use. It is
|
||
possible to set a limit on the validity of the password by giving
|
||
a timestamp <code>passwdValidTo='2010-04-09T15:51:00'</code>
|
||
assumed to be in UTC. The <code>connected</code> attribute controls
what happens to a connected client during password changes. VNC accepts
only the <code>keep</code> value <span class="since">since 0.9.3</span>.
NB, this may not be supported by all hypervisors.
|
||
</p>
|
||
<p>
|
||
The optional <code>sharePolicy</code> attribute specifies vnc
|
||
server display sharing policy. <code>allow-exclusive</code> allows
|
||
clients to ask for exclusive access by dropping other connections.
|
||
Connecting multiple clients in parallel requires all clients asking
|
||
for a shared session (vncviewer: -Shared switch). This is
|
||
the default value. <code>force-shared</code> disables exclusive
|
||
client access, every connection has to specify -Shared switch for
|
||
vncviewer. <code>ignore</code> welcomes every connection
|
||
unconditionally <span class="since">since 1.0.6</span>.
|
||
</p>
|
||
<p>
|
||
Rather than using listen/port, QEMU supports a <code>socket</code>
|
||
attribute for listening on a unix domain socket path
|
||
<span class="since">Since 0.8.8</span>.
|
||
</p>
|
||
<p>
|
||
For VNC WebSocket functionality, the <code>websocket</code> attribute
may be used to specify the port to listen on (with -1 meaning
auto-allocation and <code>autoport</code> having no effect due to
security reasons) <span class="since">Since 1.0.6</span>.
|
||
</p>
|
||
<p>
|
||
Although VNC doesn't support OpenGL natively, it can be paired
|
||
with graphics type <code>egl-headless</code> (see below) which
|
||
will instruct QEMU to open and use drm nodes for OpenGL rendering.
|
||
</p>
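<p>
The attributes described above can be combined as in the following
sketch. The password, timestamp and socket path are placeholders
chosen for illustration, and the two elements are alternatives rather
than parts of one configuration: the first listens on auto-allocated
TCP and WebSocket ports, the second on a unix domain socket.
</p>
<pre>
<graphics type='vnc' port='-1' autoport='yes' websocket='-1'
          sharePolicy='force-shared'
          passwd='example-password' passwdValidTo='2010-04-09T15:51:00'/>
...
<graphics type='vnc' socket='/tmp/guest-vnc.sock'/></pre>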
|
||
</dd>
|
||
<dt><code>spice</code> <span class="since">Since 0.8.6</span></dt>
|
||
<dd>
|
||
<p>
|
||
Starts a SPICE server. The <code>port</code> attribute specifies
|
||
the TCP port number (with -1 as legacy syntax indicating that it
|
||
should be auto-allocated), while <code>tlsPort</code> gives
|
||
an alternative secure port number. The <code>autoport</code>
|
||
attribute is the new preferred syntax for indicating
|
||
auto-allocation of needed port numbers. The <code>passwd</code>
|
||
attribute provides a SPICE password in clear text. If the
|
||
<code>passwd</code> attribute is set to an empty string, then
|
||
SPICE access is disabled. The <code>keymap</code> attribute
|
||
specifies the keymap to use. It is possible to set a limit on
|
||
the validity of the password by giving a timestamp
|
||
<code>passwdValidTo='2010-04-09T15:51:00'</code> assumed to be
|
||
in UTC.
|
||
</p>
|
||
<p>
|
||
The <code>connected</code> attribute controls what happens to a connected
client during password changes. SPICE accepts <code>keep</code> to
keep the client connected, <code>disconnect</code> to disconnect the client
and <code>fail</code> to fail changing the password. NB, this may not
be supported by all hypervisors.
|
||
<span class="since">Since 0.9.3</span>
|
||
</p>
|
||
<p>
|
||
The <code>defaultMode</code> attribute sets the default channel
|
||
security policy, valid values are <code>secure</code>,
|
||
<code>insecure</code> and the default <code>any</code> (which is
|
||
secure if possible, but falls back to insecure rather than erroring
|
||
out if no secure path is available).
|
||
<span class="since">Since 0.9.12</span>
|
||
</p>
|
||
<p>
|
||
When SPICE has both a normal and TLS secured TCP port configured,
|
||
it can be desirable to restrict what channels can be run on each
|
||
port. This is achieved by adding one or more <code><channel>
|
||
</code> elements inside the main <code><graphics></code>
|
||
element and setting the <code>mode</code> attribute to either
|
||
<code>secure</code> or <code>insecure</code>. Setting the mode
|
||
attribute overrides the default value as set by
|
||
the <code>defaultMode</code> attribute. (Note that specifying
|
||
<code>any</code> as mode discards the entry as the channel would
|
||
inherit the default mode anyways.) Valid channel names include
|
||
<code>main</code>, <code>display</code>, <code>inputs</code>,
|
||
<code>cursor</code>, <code>playback</code>, <code>record</code>
|
||
(all <span class="since"> since 0.8.6</span>);
|
||
<code>smartcard</code> (<span class="since">since 0.8.8</span>);
|
||
and <code>usbredir</code> (<span class="since">since 0.9.12</span>).
|
||
</p>
|
||
<pre>
|
||
<graphics type='spice' port='-1' tlsPort='-1' autoport='yes'>
|
||
<channel name='main' mode='secure'/>
|
||
<channel name='record' mode='insecure'/>
|
||
<image compression='auto_glz'/>
|
||
<streaming mode='filter'/>
|
||
<clipboard copypaste='no'/>
|
||
<mouse mode='client'/>
|
||
<filetransfer enable='no'/>
|
||
<gl enable='yes' rendernode='/dev/dri/by-path/pci-0000:00:02.0-render'/>
|
||
</graphics></pre>
|
||
<p>
|
||
Spice supports variable compression settings for audio, images and
|
||
streaming. These settings are accessible via the <code>compression
|
||
</code> attribute in all following elements: <code>image</code> to
|
||
set image compression (accepts <code>auto_glz</code>,
|
||
<code>auto_lz</code>, <code>quic</code>, <code>glz</code>,
|
||
<code>lz</code>, <code>off</code>), <code>jpeg</code> for JPEG
|
||
compression for images over wan (accepts <code>auto</code>,
|
||
<code>never</code>, <code>always</code>), <code>zlib</code> for
|
||
configuring wan image compression (accepts <code>auto</code>,
|
||
<code>never</code>, <code>always</code>) and <code>playback</code>
|
||
for enabling audio stream compression (accepts <code>on</code> or
|
||
<code>off</code>). <span class="since">Since 0.9.1</span>
|
||
</p>
|
||
<p>
|
||
Streaming mode is set by the <code>streaming</code> element,
|
||
setting its <code>mode</code> attribute to one of
|
||
<code>filter</code>, <code>all</code> or <code>off</code>.
|
||
<span class="since">Since 0.9.2</span>
|
||
</p>
|
||
<p>
|
||
Copy & Paste functionality (via Spice agent) is set by the
|
||
<code>clipboard</code> element. It is enabled by default, and can
|
||
be disabled by setting the <code>copypaste</code> property to
|
||
<code>no</code>. <span class="since">Since 0.9.3</span>
|
||
</p>
|
||
<p>
|
||
Mouse mode is set by the <code>mouse</code> element, setting its
|
||
<code>mode</code> attribute to one of <code>server</code> or
|
||
<code>client</code>. If no mode is specified, the qemu default will
|
||
be used (client mode). <span class="since">Since 0.9.11</span>
|
||
</p>
|
||
<p>
|
||
File transfer functionality (via Spice agent) is set using the
|
||
<code>filetransfer</code> element. It is enabled by default, and
|
||
can be disabled by setting the <code>enable</code> property to
|
||
<code>no</code>. <span class="since">Since 1.2.2</span>
|
||
</p>
|
||
<p>
|
||
Spice may provide accelerated server-side rendering with OpenGL.
|
||
You can enable or disable OpenGL support explicitly with
|
||
the <code>gl</code> element, by setting the <code>enable</code>
|
||
property. (QEMU only, <span class="since">since 1.3.3</span>).
|
||
Note that this only works locally, since this requires usage of
|
||
UNIX sockets, i.e. using <code>listen</code> types 'socket' or
|
||
'none'. For accelerated OpenGL with remote support, consider
|
||
pairing this element with type <code>egl-headless</code>
|
||
(see below). However, this will deliver weaker performance
|
||
compared to native Spice OpenGL support.
|
||
</p>
|
||
<p>
|
||
By default, QEMU will pick the first available GPU DRM render node.
|
||
You may specify a DRM render node path to use instead. (QEMU only,
|
||
<span class="since">since 3.1.0</span>).
|
||
</p>
|
||
</dd>
|
||
<dt><code>rdp</code></dt>
|
||
<dd>
|
||
<p>
|
||
Starts an RDP server. The <code>port</code> attribute specifies the
|
||
TCP port number (with -1 as legacy syntax indicating that it should
|
||
be auto-allocated). The <code>autoport</code> attribute is the new
|
||
preferred syntax for indicating auto-allocation of the TCP port to
|
||
use. In the VirtualBox driver, <code>autoport</code> will make
the hypervisor pick an available port from the 3389-3689 range when the VM
|
||
is started. The chosen port will be reflected in the <code>port</code>
|
||
attribute. The <code>multiUser</code> attribute is a boolean deciding
|
||
whether multiple simultaneous connections to the VM are permitted.
|
||
The <code>replaceUser</code> attribute is a boolean deciding whether
|
||
the existing connection must be dropped and a new connection must
|
||
be established by the VRDP server, when a new client connects in
|
||
single connection mode.
|
||
</p>
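<p>
As an illustration of the single connection mode described above, the
following sketch lets the hypervisor choose the port, permits only one
connection at a time, and has a newly connecting client replace the
existing one:
</p>
<pre>
<graphics type='rdp' autoport='yes' multiUser='no' replaceUser='yes'/></pre>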
|
||
</dd>
|
||
<dt><code>desktop</code></dt>
|
||
<dd>
|
||
<p>
|
||
This value is reserved for VirtualBox domains for the moment. It
|
||
displays a window on the host desktop, similarly to "sdl", but
|
||
using the VirtualBox viewer. Just like "sdl", it accepts
|
||
the optional attributes <code>display</code> and
|
||
<code>fullscreen</code>.
|
||
</p>
|
||
</dd>
|
||
<dt><code>egl-headless</code> <span class="since">Since 4.6.0</span></dt>
|
||
<dd>
|
||
<p>
|
||
This display type provides support for an OpenGL accelerated
|
||
display accessible both locally and remotely (for comparison,
|
||
Spice's native OpenGL support only works locally using UNIX
|
||
sockets at the moment, but has better performance). Since this
|
||
display type doesn't provide any window or graphical console like
|
||
the other types, for practical reasons it should be paired with
|
||
either <code>vnc</code> or <code>spice</code> graphics types.
|
||
This display type is only supported by QEMU domains
|
||
(needs QEMU <span class="since">2.10</span> or newer) and doesn't
|
||
accept any attributes.
|
||
</p>
|
||
<pre>
|
||
<graphics type='spice' autoport='yes'/>
|
||
<graphics type='egl-headless'/>
|
||
</pre>
|
||
</dd>
|
||
</dl>
|
||
</dd>
|
||
</dl>
|
||
|
||
<p>
|
||
A graphics device uses a <code><listen></code> element to set up where
the device should listen for clients. It has a mandatory attribute
<code>type</code> which specifies the listen type. Only <code>vnc</code>,
<code>spice</code> and <code>rdp</code> support the <code><listen></code>
element. <span class="since">Since 0.9.4</span>.
Available types are:
|
||
</p>
|
||
<dl>
|
||
<dt><code>address</code></dt>
|
||
<dd>
|
||
<p>
|
||
Tells a graphics device to use an address specified in the
|
||
<code>address</code> attribute, which will contain either an IP address
|
||
or hostname (which will be resolved to an IP address via a DNS query)
|
||
to listen on.
|
||
</p>
|
||
<p>
|
||
It is possible to omit the <code>address</code> attribute in order to
|
||
use an address from config files <span class="since">Since 1.3.5</span>.
|
||
</p>
|
||
<p>
|
||
The <code>address</code> attribute is duplicated as <code>listen</code>
|
||
attribute in <code>graphics</code> element for backward compatibility.
|
||
If both are provided they must be equal.
|
||
</p>
|
||
</dd>
|
||
<dt><code>network</code></dt>
|
||
<dd>
|
||
<p>
|
||
This is used to specify an existing network in the <code>network</code>
|
||
attribute from libvirt's list of configured networks. The named network
|
||
configuration will be examined to determine an appropriate listen
|
||
address and the address will be stored in live XML in <code>address
|
||
</code> attribute. For example, if the network has an IPv4 address in
|
||
its configuration (e.g. if it has a forward type of <code>route</code>,
|
||
<code>nat</code>, or no forward type (isolated)), the first IPv4
|
||
address listed in the network's configuration will be used.
|
||
If the network is describing a host bridge, the first IPv4 address
|
||
associated with that bridge device will be used, and if the network is
|
||
describing one of the 'direct' (macvtap) modes, the first IPv4 address
|
||
of the first forward dev will be used.
|
||
</p>
|
||
</dd>
|
||
<dt><code>socket</code> <span class="since">since 2.0.0 (QEMU only)</span></dt>
|
||
<dd>
|
||
<p>
|
||
This listen type tells a graphics server to listen on a unix socket.
The <code>socket</code> attribute contains the path to the unix socket. If this
|
||
attribute is omitted libvirt will generate this path for you.
|
||
Supported by graphics type <code>vnc</code> and <code>spice</code>.
|
||
</p>
|
||
<p>
|
||
To be backward compatible, for <code>vnc</code> graphics
the <code>socket</code> attribute of the first <code>listen</code> element
is duplicated as the <code>socket</code> attribute in the <code>graphics</code>
element. If the <code>graphics</code> element contains a <code>socket</code>
attribute, all <code>listen</code> elements are ignored.
|
||
</p>
|
||
</dd>
|
||
<dt><code>none</code> <span class="since">since 2.0.0 (QEMU only)</span></dt>
|
||
<dd>
|
||
<p>
|
||
This listen type doesn't have any other attributes. The graphics device
doesn't listen anywhere; instead, libvirt supports passing a file
descriptor through the APIs virDomainOpenGraphics() and
virDomainOpenGraphicsFD(), and you need to use one of these two APIs to
pass an FD to QEMU in order to connect to this graphics device. No other
listen types are allowed if this one is used. Supported by graphics
types <code>vnc</code> and <code>spice</code>.
|
||
</p>
|
||
</dd>
|
||
</dl>
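<p>
The four listen types can be sketched as follows. These are
alternatives shown together for comparison, not a single
configuration; the address, network name and socket path are
illustrative assumptions.
</p>
<pre>
<graphics type='vnc' port='5904'>
  <listen type='address' address='127.0.0.1'/>
</graphics>
...
<graphics type='spice' autoport='yes'>
  <listen type='network' network='default'/>
</graphics>
...
<graphics type='vnc'>
  <listen type='socket' socket='/tmp/guest-vnc.sock'/>
</graphics>
...
<graphics type='spice'>
  <listen type='none'/>
</graphics></pre>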
|
||
|
||
<h4><a id="elementsVideo">Video devices</a></h4>
|
||
<p>
|
||
A video device.
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<video>
|
||
<model type='vga' vram='16384' heads='1'>
|
||
<acceleration accel3d='yes' accel2d='yes'/>
|
||
</model>
|
||
</video>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<dl>
|
||
<dt><code>video</code></dt>
|
||
<dd>
|
||
<p>
|
||
The <code>video</code> element is the container for describing
|
||
video devices. For backwards compatibility, if no <code>video</code>
|
||
is set but there is a <code>graphics</code> in domain xml, then
|
||
libvirt will add a default <code>video</code> according to the guest
|
||
type.
|
||
</p>
|
||
<p>
|
||
For a guest of type "kvm", the default <code>video</code> is:
|
||
<code>type</code> with value "cirrus", <code>vram</code> with value
|
||
"16384" and <code>heads</code> with value "1". By default, the first
|
||
video device in domain xml is the primary one, but the optional
|
||
attribute <code>primary</code> (<span class="since">since 1.0.2</span>)
|
||
with value 'yes' can be used to mark the primary one when there are multiple
video devices. A non-primary video device must be of type "qxl" or
(<span class="since">since 2.4.0</span>) "virtio".
|
||
</p>
|
||
</dd>
|
||
|
||
<dt><code>model</code></dt>
|
||
<dd>
|
||
<p>
|
||
The <code>model</code> element has a mandatory <code>type</code>
|
||
attribute which takes the value "vga", "cirrus", "vmvga", "xen",
|
||
"vbox", "qxl" (<span class="since">since 0.8.6</span>),
|
||
"virtio" (<span class="since">since 1.3.0</span>),
|
||
"gop" (<span class="since">since 3.2.0</span>), or
|
||
"none" (<span class="since">since 4.6.0</span>)
|
||
depending on the hypervisor features available.
|
||
The purpose of the type <code>none</code> is to instruct libvirt not
|
||
to add a default video device in the guest (see the paragraph above).
|
||
This legacy behaviour can be inconvenient in cases where GPU mediated
devices are meant to be the only rendering device within a guest; in
such cases type <code>none</code> is used instead of specifying
another <code>video</code> device.
Refer to <a href="#elementsHostDev">Host device assignment</a> to see
how to add a mediated device into a guest.
|
||
</p>
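<p>
A minimal sketch of a guest that should receive no default video
device would therefore contain:
</p>
<pre>
...
<devices>
  <video>
    <model type='none'/>
  </video>
</devices>
...</pre>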
|
||
<p>
|
||
You can provide the amount of video memory in kibibytes (blocks of
|
||
1024 bytes) using <code>vram</code>. This is supported only for guest
types "libxl", "vz", "qemu", "vbox", "vmx" and "xen". If no
value is provided the default is used. If the size is not a power of
two it will be rounded to the closest one.
|
||
</p>
|
||
<p>
|
||
The number of screens can be set using <code>heads</code>. This is
supported only for guest types "vz", "kvm", "vbox" and "vmx".
|
||
</p>
|
||
<p>
|
||
For guest type of "kvm" or "qemu" and model type "qxl" there are
|
||
optional attributes. Attribute <code>ram</code> (<span class="since">
|
||
since 1.0.2</span>) specifies the size of the primary bar, while the
|
||
attribute <code>vram</code> specifies the secondary bar size.
|
||
If <code>ram</code> or <code>vram</code> are not supplied a default
|
||
value is used. As with <code>vram</code>, the <code>ram</code> value should
also be rounded to a power of two. There is also an optional attribute
|
||
<code>vgamem</code> (<span class="since">since 1.2.11</span>) to set
|
||
the size of VGA framebuffer for fallback mode of QXL device.
|
||
Attribute <code>vram64</code> (<span class="since">since 1.3.3</span>)
|
||
extends secondary bar and makes it addressable as 64bit memory.
|
||
</p>
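<p>
As a sketch of the QXL attributes described above, the following
defines two QXL devices and marks the first as primary. The memory
sizes are illustrative power-of-two values, not recommendations.
</p>
<pre>
...
<devices>
  <video>
    <model type='qxl' ram='65536' vram='65536' vgamem='16384' heads='1' primary='yes'/>
  </video>
  <video>
    <model type='qxl' ram='65536' vram='65536' vgamem='16384' heads='1'/>
  </video>
</devices>
...</pre>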
|
||
</dd>
|
||
|
||
<dt><code>acceleration</code></dt>
|
||
<dd>
|
||
Configures whether video acceleration should be enabled.
|
||
<dl>
|
||
<dt><code>accel2d</code></dt>
|
||
<dd>Enable 2D acceleration (for vbox driver only,
|
||
<span class="since">since 0.7.1</span>)</dd>
|
||
|
||
<dt><code>accel3d</code></dt>
|
||
<dd>Enable 3D acceleration (for vbox driver
|
||
<span class="since">since 0.7.1</span>, qemu driver
|
||
<span class="since">since 1.3.0</span>)</dd>
|
||
</dl>
|
||
</dd>
|
||
|
||
<dt><code>address</code></dt>
|
||
<dd>
|
||
The optional <code>address</code> sub-element can be used to
|
||
tie the video device to a particular PCI slot.
|
||
On S390, <code>address</code> can be used to provide the
|
||
CCW address for the video device (<span class="since">
|
||
since 4.2.0</span>).
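<p>
A possible example tying the video device to a PCI slot is shown
below. The slot number and model type are arbitrary assumptions made
for this sketch.
</p>
<pre>
...
<devices>
  <video>
    <model type='virtio' heads='1'/>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
  </video>
</devices>
...</pre>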
|
||
</dd>
|
||
|
||
<dt><code>driver</code></dt>
|
||
<dd>
|
||
The subelement <code>driver</code> can be used to tune the device:
|
||
<dl>
|
||
<dt>virtio options</dt>
|
||
<dd>
|
||
<a href="#elementsVirtio">Virtio-specific options</a> can also be
|
||
set (<span class="since">Since 3.5.0</span>)
|
||
</dd>
|
||
<dt>VGA configuration</dt>
|
||
<dd>
|
||
Control how the video device is exposed to the guest using the
|
||
<code>vgaconf</code> attribute which takes the value "io", "on" or "off".
|
||
At present, it is only applicable to bhyve's "gop" video model type
|
||
(<span class="since">Since 3.5.0</span>)
|
||
</dd>
|
||
</dl>
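<p>
As a sketch, assuming a bhyve guest using the "gop" model type, the
<code>vgaconf</code> attribute could be set like this. The choice of
"io" is purely illustrative.
</p>
<pre>
...
<devices>
  <video>
    <driver vgaconf='io'/>
    <model type='gop' heads='1' primary='yes'/>
  </video>
</devices>
...</pre>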
|
||
</dd>
|
||
</dl>
|
||
|
||
<h4><a id="elementsConsole">Consoles, serial, parallel & channel devices</a></h4>
|
||
|
||
<p>
|
||
A character device provides a way to interact with the virtual machine.
|
||
Paravirtualized consoles, serial ports, parallel ports and channels are
|
||
all classed as character devices and so represented using the same syntax.
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<parallel type='pty'>
|
||
<source path='/dev/pts/2'/>
|
||
<target port='0'/>
|
||
</parallel>
|
||
<serial type='pty'>
|
||
<source path='/dev/pts/3'/>
|
||
<target port='0'/>
|
||
</serial>
|
||
<serial type='file'>
|
||
<source path='/tmp/file' append='on'>
|
||
<seclabel model='dac' relabel='no'/>
|
||
</source>
|
||
<target port='0'/>
|
||
</serial>
|
||
<console type='pty'>
|
||
<source path='/dev/pts/4'/>
|
||
<target port='0'/>
|
||
</console>
|
||
<channel type='unix'>
|
||
<source mode='bind' path='/tmp/guestfwd'/>
|
||
<target type='guestfwd' address='10.0.2.1' port='4600'/>
|
||
</channel>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<p>
|
||
In each of these directives, the top-level element name (parallel, serial,
|
||
console, channel) describes how the device is presented to the guest. The
|
||
guest interface is configured by the <code>target</code> element.
|
||
</p>
|
||
|
||
<p>
|
||
The interface presented to the host is given in the <code>type</code>
|
||
attribute of the top-level element. The host interface is
|
||
configured by the <code>source</code> element.
|
||
</p>
|
||
|
||
<p>
|
||
The <code>source</code> element may contain an optional
|
||
<code>seclabel</code> to override the way that labelling
|
||
is done on the socket path. If this element is not present,
|
||
the <a href="#seclabel">security label is inherited from
|
||
the per-domain setting</a>.
|
||
</p>
|
||
|
||
<p>
|
||
If the interface <code>type</code> presented to the host is "file",
|
||
then the <code>source</code> element may contain an optional attribute
|
||
<code>append</code> that specifies whether or not the information in
|
||
the file should be preserved on domain restart. Allowed values are
|
||
"on" and "off" (default). <span class="since">Since 1.3.1</span>.
|
||
</p>
|
||
|
||
<p>
|
||
Regardless of the <code>type</code>, character devices can
|
||
have an optional log file associated with them. This is
|
||
expressed via a <code>log</code> sub-element, with a
|
||
<code>file</code> attribute. There can also be an <code>append</code>
|
||
attribute which takes the same values described above.
|
||
<span class="since">Since 1.3.3</span>.
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<log file="/var/log/libvirt/qemu/guestname-serial0.log" append="off"/>
|
||
...</pre>
|
||
|
||
<p>
|
||
Each character device element has an optional
|
||
sub-element <code><address></code> which can tie the
|
||
device to a
|
||
particular <a href="#elementsControllers">controller</a> or PCI
|
||
slot.
|
||
</p>
|
||
|
||
<p>
|
||
For character devices with type <code>unix</code> or <code>tcp</code>
|
||
the <code>source</code> has an optional element <code>reconnect</code>
|
||
which configures the reconnect timeout used if the connection is lost.
|
||
There are two attributes, <code>enabled</code> where possible
|
||
values are "yes" and "no" and <code>timeout</code> which is in
|
||
seconds. The <code>reconnect</code> element is valid only
|
||
for <code>connect</code> mode.
|
||
<span class="since">Since 3.7.0 (QEMU driver only)</span>.
|
||
</p>
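<p>
A minimal sketch of a <code>reconnect</code> configuration follows.
The socket path, the channel name and the 10 second timeout are
hypothetical values chosen for illustration; note that the source
uses <code>connect</code> mode.
</p>
<pre>
...
<devices>
  <channel type='unix'>
    <source mode='connect' path='/tmp/example-channel.sock'>
      <!-- retry the connection every 10 seconds if it is lost -->
      <reconnect enabled='yes' timeout='10'/>
    </source>
    <target type='virtio' name='org.example.channel.0'/>
  </channel>
</devices>
...</pre>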
|
||
|
||
<h5><a id="elementsCharGuestInterface">Guest interface</a></h5>
|
||
|
||
<p>
|
||
A character device presents itself to the guest as one of the following
|
||
types.
|
||
</p>
|
||
|
||
<h6><a id="elementCharParallel">Parallel port</a></h6>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<parallel type='pty'>
|
||
<source path='/dev/pts/2'/>
|
||
<target port='0'/>
|
||
</parallel>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<p>
|
||
<code>target</code> can have a <code>port</code> attribute, which
|
||
specifies the port number. Ports are numbered starting from 0. There are
|
||
usually 0, 1 or 2 parallel ports.
|
||
</p>
|
||
|
||
<h6><a id="elementCharSerial">Serial port</a></h6>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<!-- Serial port -->
|
||
<serial type='pty'>
|
||
<source path='/dev/pts/3'/>
|
||
<target port='0'/>
|
||
</serial>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<!-- USB serial port -->
|
||
<serial type='pty'>
|
||
<target type='usb-serial' port='0'>
|
||
<model name='usb-serial'/>
|
||
</target>
|
||
<address type='usb' bus='0' port='1'/>
|
||
</serial>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<p>
|
||
The <code>target</code> element can have an optional <code>port</code>
|
||
attribute, which specifies the port number (starting from 0), and an
|
||
optional <code>type</code> attribute: valid values are,
|
||
<span class="since">since 1.0.2</span>, <code>isa-serial</code> (usable
|
||
with x86 guests), <code>usb-serial</code> (usable whenever USB support
|
||
is available) and <code>pci-serial</code> (usable whenever PCI support
|
||
is available); <span class="since">since 3.10.0</span>,
|
||
<code>spapr-vio-serial</code> (usable with ppc64/pseries guests),
|
||
<code>system-serial</code> (usable with aarch64/virt and,
|
||
<span class="since">since 4.7.0</span>, riscv/virt guests) and
|
||
<code>sclp-serial</code> (usable with s390 and s390x guests) are
|
||
available as well.
|
||
</p>
|
||
|
||
<p>
|
||
<span class="since">Since 3.10.0</span>, the <code>target</code>
|
||
element can have an optional <code>model</code> subelement;
|
||
valid values for its <code>name</code> attribute are:
|
||
<code>isa-serial</code> (usable with the <code>isa-serial</code> target
|
||
type); <code>usb-serial</code> (usable with the <code>usb-serial</code>
|
||
target type); <code>pci-serial</code>
|
||
(usable with the <code>pci-serial</code> target type);
|
||
<code>spapr-vty</code> (usable with the <code>spapr-vio-serial</code>
|
||
target type); <code>pl011</code> and,
|
||
<span class="since">since 4.7.0</span>, <code>16550a</code> (usable
|
||
with the <code>system-serial</code> target type);
|
||
<code>sclpconsole</code> and <code>sclplmconsole</code> (usable with
|
||
the <code>sclp-serial</code> target type).
|
||
</p>
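<p>
For example, a serial port for an aarch64/virt guest might be written
as in the following sketch, which pairs the <code>system-serial</code>
target type with the <code>pl011</code> model as described above.
</p>
<pre>
...
<devices>
  <serial type='pty'>
    <target type='system-serial' port='0'>
      <model name='pl011'/>
    </target>
  </serial>
</devices>
...</pre>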
|
||
|
||
<p>
|
||
If any of the attributes is not specified by the user, libvirt will
|
||
choose a value suitable for most users.
|
||
</p>
|
||
|
||
<p>
|
||
Most target types support configuring the guest-visible device
|
||
address as <a href="#elementsAddress">documented above</a>; more
|
||
specifically, acceptable address types are <code>isa</code> (for
|
||
<code>isa-serial</code>), <code>usb</code> (for <code>usb-serial</code>),
|
||
<code>pci</code> (for <code>pci-serial</code>) and <code>spapr-vio</code>
|
||
(for <code>spapr-vio-serial</code>). The <code>system-serial</code>
|
||
and <code>sclp-serial</code> target types don't support specifying an
|
||
address.
|
||
</p>
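<p>
The pairing of target type and address type can be sketched as
follows for a PCI serial port. The PCI slot used here is an arbitrary
assumption.
</p>
<pre>
...
<devices>
  <serial type='pty'>
    <target type='pci-serial' port='0'>
      <model name='pci-serial'/>
    </target>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
  </serial>
</devices>
...</pre>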
|
||
|
||
<p>
|
||
For the relationship between serial ports and consoles,
|
||
<a href="#elementCharSerialAndConsole">see below</a>.
|
||
</p>
|
||
|
||
<h6><a id="elementCharConsole">Console</a></h6>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<!-- Serial console -->
|
||
<console type='pty'>
|
||
<source path='/dev/pts/2'/>
|
||
<target type='serial' port='0'/>
|
||
</console>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<!-- KVM virtio console -->
|
||
<console type='pty'>
|
||
<source path='/dev/pts/5'/>
|
||
<target type='virtio' port='0'/>
|
||
</console>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<p>
|
||
The <code>console</code> element is used to represent interactive
|
||
serial consoles. Depending on the type of guest in use and the specifics
|
||
of the configuration, the <code>console</code> element might represent
|
||
the same device as an existing <code>serial</code> element or a separate
|
||
device.
|
||
</p>
|
||
|
||
<p>
|
||
A <code>target</code> subelement is supported and works the same
|
||
way as with the <code>serial</code> element
|
||
(<a href="#elementCharSerial">see above</a> for details).
|
||
Valid values for the <code>type</code> attribute are:
|
||
<code>serial</code> (described below);
|
||
<code>virtio</code> (usable whenever VirtIO support is available);
|
||
<code>xen</code>, <code>lxc</code>, <code>uml</code> and
|
||
<code>openvz</code> (available when the corresponding hypervisor is in
|
||
use). <code>sclp</code> and <code>sclplm</code> (usable for s390 and
|
||
s390x QEMU guests) are supported for compatibility reasons but should
|
||
not be used for new guests: use the <code>sclpconsole</code> and
|
||
<code>sclplmconsole</code> target models, respectively, with the
|
||
<code>serial</code> element instead.
|
||
</p>
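<p>
As a sketch of the recommended replacement for the legacy
<code>sclp</code> console target on s390 and s390x guests, a
<code>serial</code> element using the <code>sclpconsole</code> target
model might look like this:
</p>
<pre>
...
<devices>
  <serial type='pty'>
    <target type='sclp-serial' port='0'>
      <model name='sclpconsole'/>
    </target>
  </serial>
</devices>
...</pre>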
|
||
|
||
<p>
|
||
Of the target types listed above, <code>serial</code> is special in
|
||
that it doesn't represent a separate device, but rather the same
|
||
device as the first <code>serial</code> element. Due to this, there can
|
||
only be a single <code>console</code> element with target type
|
||
<code>serial</code> per guest.
|
||
</p>
|
||
|
||
<p>
|
||
Virtio consoles are usually accessible as <code>/dev/hvc[0-7]</code>
|
||
from inside the guest; for more information, see
|
||
<a href="http://fedoraproject.org/wiki/Features/VirtioSerial">http://fedoraproject.org/wiki/Features/VirtioSerial</a>.
|
||
<span class="since">Since 0.8.3</span>
|
||
</p>
|
||
|
||
<p>
|
||
For the relationship between serial ports and consoles,
|
||
<a href="#elementCharSerialAndConsole">see below</a>.
|
||
</p>
|
||
|
||
<h6><a id="elementCharSerialAndConsole">Relationship between serial ports and consoles</a></h6>
|
||
|
||
<p>
|
||
Due to historical reasons, the <code>serial</code> and
|
||
<code>console</code> elements have partially overlapping scopes.
|
||
</p>
|
||
|
||
<p>
|
||
In general, both elements are used to configure one or more serial
|
||
consoles to be used for interacting with the guest. The main difference
|
||
between the two is that <code>serial</code> is used for emulated,
|
||
usually native, serial consoles, whereas <code>console</code> is used
|
||
for paravirtualized ones.
|
||
</p>
|
||
|
||
<p>
|
||
Both emulated and paravirtualized serial consoles have advantages and
|
||
disadvantages:
|
||
</p>
|
||
|
||
<ul>
|
||
<li>
|
||
emulated serial consoles are usually initialized much earlier than
|
||
paravirtualized ones, so they can be used to control the bootloader
|
||
and display both firmware and early boot messages;
|
||
</li>
|
||
<li>
|
||
on several platforms, there can only be a single emulated serial
|
||
console per guest but paravirtualized consoles don't suffer from the
|
||
same limitation.
|
||
</li>
|
||
</ul>
|
||
|
||
<p>
|
||
A configuration such as:
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<console type='pty'>
|
||
<target type='serial'/>
|
||
</console>
|
||
<console type='pty'>
|
||
<target type='virtio'/>
|
||
</console>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<p>
|
||
will work on any platform and will result in one emulated serial console
|
||
for early boot logging / interactive / recovery use, and one
|
||
paravirtualized serial console to be used e.g. as a side channel. Most
|
||
people will be fine with having just the first <code>console</code>
|
||
element in their configuration.
|
||
</p>
|
||
|
||
<p>
|
||
Note that, due to the compatibility concerns mentioned earlier, all the
|
||
following configurations:
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<serial type='pty'/>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<console type='pty'/>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<serial type='pty'/>
|
||
<console type='pty'/>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<p>
|
||
will be treated the same and will result in a single emulated serial
|
||
console being available to the guest.
|
||
</p>
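<p>
For illustration only, and assuming an x86 guest, the three
equivalent configurations above would typically correspond to a
fully specified form along these lines. The exact target type and
model depend on the guest architecture and the libvirt version.
</p>
<pre>
...
<devices>
  <serial type='pty'>
    <target type='isa-serial' port='0'>
      <model name='isa-serial'/>
    </target>
  </serial>
  <!-- same device as the serial element above, not a second one -->
  <console type='pty'>
    <target type='serial' port='0'/>
  </console>
</devices>
...</pre>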
|
||
|
||
<h6><a id="elementCharChannel">Channel</a></h6>
|
||
|
||
<p>
|
||
This represents a private communication channel between the host and the
|
||
guest.
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<channel type='unix'>
|
||
<source mode='bind' path='/tmp/guestfwd'/>
|
||
<target type='guestfwd' address='10.0.2.1' port='4600'/>
|
||
</channel>
|
||
|
||
<!-- KVM virtio channel -->
|
||
<channel type='pty'>
|
||
<target type='virtio' name='arbitrary.virtio.serial.port.name'/>
|
||
</channel>
|
||
<channel type='unix'>
|
||
<source mode='bind' path='/var/lib/libvirt/qemu/f16x86_64.agent'/>
|
||
<target type='virtio' name='org.qemu.guest_agent.0' state='connected'/>
|
||
</channel>
|
||
<channel type='spicevmc'>
|
||
<target type='virtio' name='com.redhat.spice.0'/>
|
||
</channel>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<p>
|
||
This can be implemented in a variety of ways. The specific type of
|
||
channel is given in the <code>type</code> attribute of the
|
||
<code>target</code> element. Different channel types have different
|
||
<code>target</code> attributes.
|
||
</p>
|
||
|
||
<dl>
|
||
<dt><code>guestfwd</code></dt>
|
||
<dd>TCP traffic sent by the guest to a given IP address and port is
|
||
forwarded to the channel device on the host. The <code>target</code>
|
||
element must have <code>address</code> and <code>port</code> attributes.
|
||
<span class="since">Since 0.7.3</span></dd>
|
||
|
||
<dt><code>virtio</code></dt>
|
||
<dd>Paravirtualized virtio channel. Channel is exposed in the guest under
|
||
/dev/vport*, and if the optional attribute <code>name</code> is specified,
|
||
/dev/virtio-ports/$name (for more info, please see
|
||
<a href="http://fedoraproject.org/wiki/Features/VirtioSerial">http://fedoraproject.org/wiki/Features/VirtioSerial</a>). The
|
||
optional element <code>address</code> can tie the channel to a
|
||
particular <code>type='virtio-serial'</code>
|
||
controller, <a href="#elementsAddress">documented above</a>.
|
||
With qemu, if <code>name</code> is "org.qemu.guest_agent.0",
|
||
then libvirt can interact with a guest agent installed in the
|
||
guest, for actions such as guest shutdown or file system quiescing.
|
||
<span class="since">Since 0.7.7, guest agent interaction
|
||
since 0.9.10</span>. Moreover, <span class="since">since 1.0.6</span>
|
||
it is possible to have the source path auto-generated for virtio unix
channels. This is very useful for the qemu guest agent, where users don't
usually care about the source path since it is libvirt that talks to
the guest agent. To use this feature, leave the
<code><source></code> element out. <span class="since">Since
|
||
1.2.11</span> the active XML for a virtio channel may contain an optional
|
||
<code>state</code> attribute that reflects whether a process in the
|
||
guest is active on the channel. This is an output-only attribute.
|
||
Possible values for the <code>state</code> attribute are
|
||
<code>connected</code> and <code>disconnected</code>.
|
||
</dd>
|
||
<dt><code>xen</code></dt>
|
||
<dd> Paravirtualized Xen channel. Channel is exposed in the guest as a
|
||
Xen console but identified with a name. Setup and consumption of a Xen
|
||
channel depends on software and configuration in the guest
|
||
(for more info, please see <a href="http://xenbits.xen.org/docs/unstable/misc/channel.txt">http://xenbits.xen.org/docs/unstable/misc/channel.txt</a>).
|
||
Channel source path semantics are the same as the virtio target type.
|
||
The <code>state</code> attribute is not supported since Xen channels
|
||
lack the necessary probing mechanism.
|
||
<span class="since">Since 2.3.0</span>
|
||
</dd>
|
||
<dt><code>spicevmc</code></dt>
|
||
<dd>Paravirtualized SPICE channel. The domain must also have a
|
||
SPICE server as a <a href="#elementsGraphics">graphics
|
||
device</a>, at which point the host piggy-backs messages
|
||
across the <code>main</code> channel. The <code>target</code>
|
||
element must be present, with
|
||
attribute <code>type='virtio'</code>; an optional
|
||
attribute <code>name</code> controls how the guest will have
|
||
access to the channel, and defaults
|
||
to <code>name='com.redhat.spice.0'</code>. The
|
||
optional <code>address</code> element can tie the channel to a
|
||
particular <code>type='virtio-serial'</code> controller.
|
||
<span class="since">Since 0.8.8</span></dd>
|
||
</dl>
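<p>
As a minimal sketch of the auto-generated source path described for
the <code>virtio</code> target type, a guest agent channel on a qemu
guest can omit the <code><source></code> element entirely:
</p>
<pre>
...
<devices>
  <!-- no source element: the socket path is chosen automatically -->
  <channel type='unix'>
    <target type='virtio' name='org.qemu.guest_agent.0'/>
  </channel>
</devices>
...</pre>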
|
||
|
||
<h5><a id="elementsCharHostInterface">Host interface</a></h5>
|
||
|
||
<p>
|
||
A character device presents itself to the host as one of the following
|
||
types.
|
||
</p>
|
||
|
||
<h6><a id="elementsCharSTDIO">Domain logfile</a></h6>
|
||
|
||
<p>
|
||
This disables all input on the character device, and sends output
|
||
into the virtual machine's logfile.
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<console type='stdio'>
|
||
<target port='1'/>
|
||
</console>
|
||
</devices>
|
||
...</pre>
|
||
|
||
|
||
<h6><a id="elementsCharFle">Device logfile</a></h6>
|
||
|
||
<p>
|
||
A file is opened and all data sent to the character
|
||
device is written to the file.
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<serial type="file">
|
||
<source path="/var/log/vm/vm-serial.log"/>
|
||
<target port="1"/>
|
||
</serial>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<h6><a id="elementsCharVC">Virtual console</a></h6>
|
||
|
||
<p>
|
||
Connects the character device to the graphical framebuffer in
|
||
a virtual console. This is typically accessed via a special
|
||
hotkey sequence such as "ctrl+alt+3".
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<serial type='vc'>
|
||
<target port="1"/>
|
||
</serial>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<h6><a id="elementsCharNull">Null device</a></h6>
|
||
|
||
<p>
|
||
Connects the character device to the void. No data is ever
|
||
provided to the input. All data written is discarded.
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<serial type='null'>
|
||
<target port="1"/>
|
||
</serial>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<h6><a id="elementsCharPTY">Pseudo TTY</a></h6>
|
||
|
||
<p>
|
||
A Pseudo TTY is allocated using /dev/ptmx. A suitable client
|
||
such as 'virsh console' can connect to interact with the
|
||
serial port locally.
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<serial type="pty">
|
||
<source path="/dev/pts/3"/>
|
||
<target port="1"/>
|
||
</serial>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<p>
|
||
Note: as a special case, if <console type='pty'> is used, the TTY
path is also duplicated as an attribute tty='/dev/pts/3'
on the top-level <console> tag. This provides compatibility
with the existing syntax for <console> tags.
|
||
</p>
|
||
|
||
<h6><a id="elementsCharHost">Host device proxy</a></h6>
|
||
|
||
<p>
|
||
The character device is passed through to the underlying
|
||
physical character device. The device types must match,
|
||
e.g. the emulated serial port should only be connected to
|
||
a host serial port - don't connect a serial port to a parallel
|
||
port.
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<serial type="dev">
|
||
<source path="/dev/ttyS0"/>
|
||
<target port="1"/>
|
||
</serial>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<h6><a id="elementsCharPipe">Named pipe</a></h6>
|
||
|
||
<p>
|
||
The character device writes output to a named pipe. See pipe(7) for
|
||
more info.
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<serial type="pipe">
|
||
<source path="/tmp/mypipe"/>
|
||
<target port="1"/>
|
||
</serial>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<h6><a id="elementsCharTCP">TCP client/server</a></h6>
|
||
|
||
<p>
|
||
The character device acts as a TCP client connecting to a
|
||
remote server.
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<serial type="tcp">
|
||
<source mode="connect" host="0.0.0.0" service="2445"/>
|
||
<protocol type="raw"/>
|
||
<target port="1"/>
|
||
</serial>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<p>
|
||
Or as a TCP server waiting for a client connection.
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<serial type="tcp">
|
||
<source mode="bind" host="127.0.0.1" service="2445"/>
|
||
<protocol type="raw"/>
|
||
<target port="1"/>
|
||
</serial>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<p>
|
||
Alternatively you can use <code>telnet</code> instead
|
||
of <code>raw</code> TCP in order to utilize the telnet protocol
|
||
for the connection.
|
||
</p>
|
||
<p>
|
||
<span class="since">Since 0.8.5,</span> some hypervisors support
|
||
use of either <code>telnets</code> (secure telnet) or <code>tls</code>
|
||
(via secure sockets layer) as the transport protocol for connections.
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<serial type="tcp">
|
||
<source mode="connect" host="0.0.0.0" service="2445"/>
|
||
<protocol type="telnet"/>
|
||
<target port="1"/>
|
||
</serial>
|
||
...
|
||
<serial type="tcp">
|
||
<source mode="bind" host="127.0.0.1" service="2445"/>
|
||
<protocol type="telnet"/>
|
||
<target port="1"/>
|
||
</serial>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<p>
|
||
<span class="since">Since 2.4.0,</span> the optional attribute
|
||
<code>tls</code> can be used to control whether a chardev
|
||
TCP communication channel would utilize a hypervisor configured
|
||
TLS X.509 certificate environment in order to encrypt the data
|
||
channel. For the QEMU hypervisor, usage of a TLS environment can
|
||
be controlled on the host by the <code>chardev_tls</code> and
|
||
<code>chardev_tls_x509_cert_dir</code> or
|
||
<code>default_tls_x509_cert_dir</code> settings in the file
|
||
/etc/libvirt/qemu.conf. If <code>chardev_tls</code> is enabled,
|
||
then unless the <code>tls</code> attribute is set to "no", libvirt
|
||
will use the host configured TLS environment.
|
||
If <code>chardev_tls</code> is disabled, but the <code>tls</code>
|
||
attribute is set to "yes", then libvirt will attempt to use the
|
||
host TLS environment if either the <code>chardev_tls_x509_cert_dir</code>
|
||
or <code>default_tls_x509_cert_dir</code> TLS directory structure exists.
|
||
</p>
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<serial type="tcp">
|
||
<source mode='connect' host="127.0.0.1" service="5555" tls="yes"/>
|
||
<protocol type="raw"/>
|
||
<target port="0"/>
|
||
</serial>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<h6><a id="elementsCharUDP">UDP network console</a></h6>
|
||
|
||
<p>
|
||
The character device acts as a UDP netconsole service,
|
||
sending and receiving packets. This is a lossy service.
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<serial type="udp">
|
||
<source mode="bind" host="0.0.0.0" service="2445"/>
|
||
<source mode="connect" host="0.0.0.0" service="2445"/>
|
||
<target port="1"/>
|
||
</serial>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<h6><a id="elementsCharUNIX">UNIX domain socket client/server</a></h6>
|
||
|
||
<p>
|
||
The character device acts as a UNIX domain socket server,
|
||
accepting connections from local clients.
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<serial type="unix">
|
||
<source mode="bind" path="/tmp/foo"/>
|
||
<target port="1"/>
|
||
</serial>
|
||
</devices>
|
||
...</pre>
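<p>
The client case can be sketched in the same way, assuming that
<code>mode="connect"</code> selects client behaviour here as it does
for TCP. The device then connects to a socket that some other local
process is already listening on; the path is reused from the example
above purely for illustration.
</p>
<pre>
...
<devices>
  <serial type="unix">
    <source mode="connect" path="/tmp/foo"/>
    <target port="1"/>
  </serial>
</devices>
...</pre>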
|
||
|
||
<h6><a id="elementsCharSpiceport">Spice channel</a></h6>
|
||
|
||
<p>
|
||
The character device is accessible through a spice connection
|
||
under a channel name specified in the <code>channel</code>
|
||
attribute. <span class="since">Since 1.2.2</span>
|
||
</p>
|
||
<p>
|
||
Note: depending on the hypervisor, spiceports might (or might not)
|
||
be enabled on domains with or without <a href="#elementsGraphics">spice
|
||
graphics</a>.
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<serial type="spiceport">
|
||
<source channel="org.qemu.console.serial.0"/>
|
||
<target port="1"/>
|
||
</serial>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<h6><a id="elementsNmdm">Nmdm device</a></h6>
|
||
|
||
<p>
|
||
The nmdm device driver, available on FreeBSD, provides two
|
||
tty devices connected together by a virtual null modem cable.
|
||
<span class="since">Since 1.2.4</span>
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<serial type="nmdm">
|
||
<source master="/dev/nmdm0A" slave="/dev/nmdm0B"/>
|
||
</serial>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<p>
|
||
The <code>source</code> element has these attributes:
|
||
</p>
|
||
|
||
<dl>
|
||
<dt><code>master</code></dt>
|
||
<dd>Master device of the pair, that is passed to the hypervisor.
|
||
Device is specified by a fully qualified path.</dd>
|
||
|
||
<dt><code>slave</code></dt>
|
||
<dd>Slave device of the pair, that is passed to the clients for connection
|
||
to the guest console. Device is specified by a fully qualified path.</dd>
|
||
</dl>
|
||
|
||
<h4><a id="elementsSound">Sound devices</a></h4>
|
||
|
||
<p>
|
||
A virtual sound card can be attached to the guest via the
|
||
<code>sound</code> element. <span class="since">Since 0.4.3</span>
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<sound model='es1370'/>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<dl>
|
||
<dt><code>sound</code></dt>
|
||
<dd>
|
||
The <code>sound</code> element has one mandatory attribute,
|
||
<code>model</code>, which specifies what real sound device is emulated.
|
||
Valid values are specific to the underlying hypervisor, though typical
|
||
choices are 'es1370', 'sb16', 'ac97', 'ich6' and 'usb'.
|
||
(<span class="since">
|
||
'ac97' only since 0.6.0, 'ich6' only since 0.8.8,
|
||
'usb' only since 1.2.7</span>)
|
||
</dd>
|
||
</dl>
|
||
|
||
<p>
|
||
<span class="since">Since 0.9.13</span>, a sound element
|
||
with <code>ich6</code> model can have optional
|
||
sub-elements <code><codec></code> to attach various audio
|
||
codecs to the audio device. If not specified, a default codec
|
||
will be attached to allow playback and recording.
|
||
</p>
|
||
<p>
|
||
Valid values for the <code>type</code> attribute of <code>codec</code> are:
|
||
</p>
|
||
<ul>
<li>'duplex' - advertise a line-in and a line-out </li>
<li>'micro' - advertise a speaker and a microphone </li>
<li>'output' - advertise a line-out
<span class="since">Since 4.4.0</span></li>
</ul>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<sound model='ich6'>
|
||
<codec type='micro'/>
|
||
</sound>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<p>
|
||
Each <code>sound</code> element has an optional
|
||
sub-element <code><address></code> which can tie the
|
||
device to a particular PCI
|
||
slot, <a href="#elementsAddress">documented above</a>.
|
||
</p>
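<p>
A sketch combining a codec with a fixed PCI address might look like
the following. The slot number is an arbitrary assumption.
</p>
<pre>
...
<devices>
  <sound model='ich6'>
    <codec type='duplex'/>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
  </sound>
</devices>
...</pre>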
|
||
|
||
<h4><a id="elementsWatchdog">Watchdog device</a></h4>
|
||
|
||
<p>
|
||
A virtual hardware watchdog device can be added to the guest via
|
||
the <code>watchdog</code> element.
|
||
<span class="since">Since 0.7.3, QEMU and KVM only</span>
|
||
</p>
|
||
|
||
<p>
|
||
The watchdog device requires an additional driver and management
|
||
daemon in the guest. Just enabling the watchdog in the libvirt
|
||
configuration does not do anything useful on its own.
|
||
</p>
|
||
|
||
<p>
|
||
Currently libvirt does not support notification when the
|
||
watchdog fires. This feature is planned for a future version of
|
||
libvirt.
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<watchdog model='i6300esb'/>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<watchdog model='i6300esb' action='poweroff'/>
|
||
</devices>
|
||
</domain></pre>
|
||
|
||
<dl>
|
||
<dt><code>model</code></dt>
|
||
<dd>
|
||
<p>
|
||
The required <code>model</code> attribute specifies what real
|
||
watchdog device is emulated. Valid values are specific to the
|
||
underlying hypervisor.
|
||
</p>
|
||
<p>
|
||
QEMU and KVM support:
|
||
</p>
|
||
<ul>
|
||
<li>'i6300esb' - the recommended device,
|
||
emulating a PCI Intel 6300ESB </li>
|
||
<li>'ib700' - emulating an ISA iBase IB700 </li>
|
||
<li>'diag288' - emulating an S390 DIAG288 device
|
||
<span class="since">Since 1.2.17</span></li>
|
||
</ul>
|
||
</dd>
|
||
<dt><code>action</code></dt>
|
||
<dd>
|
||
<p>
|
||
The optional <code>action</code> attribute describes what
|
||
action to take when the watchdog expires. Valid values are
|
||
specific to the underlying hypervisor.
|
||
</p>
|
||
<p>
|
||
QEMU and KVM support:
|
||
</p>
|
||
<ul>
|
||
<li>'reset' - default, forcefully reset the guest</li>
|
||
<li>'shutdown' - gracefully shut down the guest
|
||
(not recommended) </li>
|
||
<li>'poweroff' - forcefully power off the guest</li>
|
||
<li>'pause' - pause the guest</li>
|
||
<li>'none' - do nothing</li>
|
||
<li>'dump' - automatically dump the guest
|
||
<span class="since">Since 0.8.7</span></li>
|
||
<li>'inject-nmi' - inject a non-maskable interrupt
|
||
into the guest
|
||
<span class="since">Since 1.2.17</span></li>
|
||
</ul>
|
||
<p>
|
||
Note 1: the 'shutdown' action requires that the guest
|
||
is responsive to ACPI signals. In the sort of situations
|
||
where the watchdog has expired, guests are usually unable
|
||
to respond to ACPI signals. Therefore using 'shutdown'
|
||
is not recommended.
|
||
</p>
|
||
<p>
|
||
Note 2: the directory to save dump files can be configured
|
||
by <code>auto_dump_path</code> in file /etc/libvirt/qemu.conf.
|
||
</p>
|
||
</dd>
|
||
</dl>
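<p>
As an illustrative sketch for a QEMU or KVM guest, the following
watchdog dumps the guest when it expires. The dump file would be
written to the directory configured by <code>auto_dump_path</code>,
as described in Note 2.
</p>
<pre>
...
<devices>
  <watchdog model='i6300esb' action='dump'/>
</devices>
...</pre>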
|
||
|
||
<h4><a id="elementsMemBalloon">Memory balloon device</a></h4>
|
||
|
||
<p>
|
||
A virtual memory balloon device is added to all Xen and KVM/QEMU
|
||
guests. It will be seen as a <code>memballoon</code> element.
|
||
It will be automatically added when appropriate, so there is no
|
||
need to explicitly add this element in the guest XML unless a
|
||
specific PCI slot needs to be assigned.
|
||
<span class="since">Since 0.8.3, Xen, QEMU and KVM only</span>
|
||
Additionally, <span class="since">since 0.8.4</span>, if the
|
||
memballoon device needs to be explicitly disabled,
|
||
<code>model='none'</code> may be used.
|
||
</p>
|
||
|
||
<p>
|
||
Example: automatically added device with KVM
|
||
</p>
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<memballoon model='virtio'/>
|
||
</devices>
|
||
...</pre>
|
||
|
||
<p>
|
||
Example: manually added device with static PCI slot 2 requested
|
||
</p>
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<memballoon model='virtio'>
|
||
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
|
||
<stats period='10'/>
|
||
<driver iommu='on' ats='on'/>
|
||
</memballoon>
|
||
</devices>
|
||
</domain></pre>
|
||
|
||
<dl>
|
||
<dt><code>model</code></dt>
|
||
<dd>
|
||
<p>
|
||
The required <code>model</code> attribute specifies what type
|
||
of balloon device is provided. Valid values are specific to
|
||
the virtualization platform:
|
||
</p>
|
||
<ul>
|
||
<li>'virtio' - default with QEMU/KVM</li>
|
||
<li>'xen' - default with Xen</li>
|
||
</ul>
|
||
</dd>
|
||
<dt><code>autodeflate</code></dt>
|
||
<dd>
|
||
<p>
|
||
The optional <code>autodeflate</code> attribute enables or disables
(values "on" and "off", respectively) the ability of the
QEMU virtio memory balloon to release some memory at the last moment
before a guest process gets killed by the Out of Memory killer.
|
||
<span class="since">Since 1.3.1, QEMU and KVM only</span>
|
||
</p>
|
||
</dd>
|
||
<dt><code>period</code></dt>
|
||
<dd>
|
||
<p>
|
||
The optional <code>period</code> attribute of the <code>stats</code>
subelement sets the interval, in seconds, at which the QEMU virtio memory balloon
driver collects statistics, which are reported by the <code>virsh dommemstat
[domain]</code> command. By default, collection is not enabled. To
enable it, use the <code>virsh dommemstat [domain] --period
[number]</code> command, or use <code>virsh edit</code> to add the
option to the XML definition. The <code>virsh dommemstat</code> command
accepts the options <code>--live</code>, <code>--current</code>,
or <code>--config</code>. If no option is provided, the change
for a running domain is made only to the active guest. If the
QEMU driver is not at the right revision, the attempt to set the
period will fail. Large values (e.g. many years) might be ignored.
|
||
<span class='since'>Since 1.1.1, requires QEMU 1.5</span>
|
||
</p>
|
||
</dd>
|
||
<dt><code>driver</code></dt>
|
||
<dd>
|
||
For model <code>virtio</code> memballoon,
|
||
<a href="#elementsVirtio">Virtio-specific options</a> can also be
|
||
set. (<span class="since">Since 3.5.0</span>)
|
||
</dd>
|
||
</dl>
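<p>
As a minimal sketch of the attributes described above (the values are
illustrative, not recommendations), the first fragment below enables
<code>autodeflate</code> and a statistics period of 5 seconds on a virtio
balloon. The alternative fragment disables the balloon entirely with
<code>model='none'</code>. The two fragments are alternatives, not meant to be
used together.
</p>
<pre>
  ...
  <devices>
    <!-- illustrative values: autodeflate on, statistics period of 5 seconds -->
    <memballoon model='virtio' autodeflate='on'>
      <stats period='5'/>
    </memballoon>
    <!-- OR: explicitly disable the balloon device -->
    <memballoon model='none'/>
  </devices>
  ...</pre>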
|
||
<h4><a id="elementsRng">Random number generator device</a></h4>
|
||
|
||
<p>
|
||
The virtual random number generator device allows the host to pass
|
||
through entropy to guest operating systems.
|
||
<span class="since">Since 1.0.3</span>
|
||
</p>
|
||
|
||
<p>
|
||
Example: usage of the RNG device:
|
||
</p>
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<rng model='virtio'>
|
||
<rate period="2000" bytes="1234"/>
|
||
<backend model='random'>/dev/random</backend>
|
||
<!-- OR -->
|
||
<backend model='egd' type='udp'>
|
||
<source mode='bind' service='1234'/>
|
||
<source mode='connect' host='1.2.3.4' service='1234'/>
|
||
</backend>
|
||
</rng>
|
||
</devices>
|
||
...
|
||
</pre>
|
||
<dl>
|
||
<dt><code>model</code></dt>
|
||
<dd>
|
||
<p>
|
||
The required <code>model</code> attribute specifies what type
|
||
of RNG device is provided. Valid values are specific to
|
||
the virtualization platform:
|
||
</p>
|
||
<ul>
|
||
<li>'virtio' - supported by qemu and virtio-rng kernel module</li>
|
||
</ul>
|
||
</dd>
|
||
<dt><code>rate</code></dt>
|
||
<dd>
|
||
<p>
|
||
The optional <code>rate</code> element allows limiting the rate at
|
||
which entropy can be consumed from the source. The mandatory
|
||
attribute <code>bytes</code> specifies how many bytes are permitted
|
||
to be consumed per period. An optional <code>period</code> attribute
|
||
specifies the duration of a period in milliseconds; if omitted, the
|
||
period is taken as 1000 milliseconds (1 second).
|
||
<span class='since'>Since 1.0.4</span>
|
||
</p>
|
||
</dd>
|
||
<dt><code>backend</code></dt>
|
||
<dd>
|
||
<p>
|
||
The <code>backend</code> element specifies the source of entropy
|
||
to be used for the domain. The source model is configured using the
|
||
<code>model</code> attribute. Supported source models are:
|
||
</p>
|
||
<dl>
|
||
<dt><code>random</code></dt>
|
||
<dd>
|
||
<p>
|
||
This backend type expects a non-blocking character device
|
||
as input. The file name is specified as contents of the
|
||
<code>backend</code> element. <span class='since'>Since
|
||
1.3.4</span> any path is accepted. Before that
|
||
<code>/dev/random</code> and <code>/dev/hwrng</code> were
|
||
the only accepted paths. When no file name is specified,
|
||
the hypervisor default is used. For QEMU, the default is
|
||
<code>/dev/random</code>. However, the recommended source
|
||
of entropy is <code>/dev/urandom</code> (as it doesn't
|
||
have the limitations of <code>/dev/random</code>).
|
||
</p>
|
||
</dd>
|
||
<dt><code>egd</code></dt>
|
||
<dd>
|
||
<p>
|
||
This backend connects to a source using the EGD protocol.
|
||
The source is specified as a character device. Refer to
|
||
<a href='#elementsCharHostInterface'>character device host interface</a>
|
||
for more information.
|
||
</p>
|
||
</dd>
|
||
</dl>
|
||
</dd>
|
||
<dt><code>driver</code></dt>
|
||
<dd>
|
||
The subelement <code>driver</code> can be used to tune the device:
|
||
<dl>
|
||
<dt>virtio options</dt>
|
||
<dd>
|
||
<a href="#elementsVirtio">Virtio-specific options</a> can also be
|
||
set. (<span class="since">Since 3.5.0</span>)
|
||
</dd>
|
||
</dl>
|
||
</dd>
|
||
|
||
</dl>
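<p>
A sketch of the recommendation above, assuming the host provides
<code>/dev/urandom</code>: the guest reads entropy from
<code>/dev/urandom</code>, limited to 1024 bytes per period. Because the
<code>period</code> attribute is omitted, the default of 1000 milliseconds
applies. The byte limit is an arbitrary illustrative value.
</p>
<pre>
  ...
  <devices>
    <rng model='virtio'>
      <!-- period omitted: defaults to 1000 ms; 1024 is an illustrative limit -->
      <rate bytes='1024'/>
      <backend model='random'>/dev/urandom</backend>
    </rng>
  </devices>
  ...</pre>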
|
||
|
||
<h4><a id="elementsTpm">TPM device</a></h4>
|
||
|
||
<p>
|
||
The TPM device enables a QEMU guest to have access to TPM
|
||
functionality. The TPM device may either be a TPM 1.2 or
|
||
a TPM 2.0.
|
||
</p>
|
||
<p>
|
||
The TPM passthrough device type provides access to the host's TPM
|
||
for one QEMU guest. No other software may be using the TPM device,
|
||
typically /dev/tpm0, at the time the QEMU guest is started.
|
||
<span class="since">'passthrough' since 1.0.5</span>
|
||
</p>
|
||
|
||
<p>
|
||
Example: usage of the TPM passthrough device
|
||
</p>
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<tpm model='tpm-tis'>
|
||
<backend type='passthrough'>
|
||
<device path='/dev/tpm0'/>
|
||
</backend>
|
||
</tpm>
|
||
</devices>
|
||
...
|
||
</pre>
|
||
|
||
<p>
|
||
The emulator device type gives access to a TPM emulator providing
|
||
TPM functionality for each VM. QEMU talks to it over a Unix socket. With
|
||
the emulator device type each guest gets its own private TPM.
|
||
<span class="since">'emulator' since 4.5.0</span>
|
||
</p>
|
||
<p>
|
||
Example: usage of the TPM Emulator
|
||
</p>
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<tpm model='tpm-tis'>
|
||
<backend type='emulator' version='2.0'>
|
||
</backend>
|
||
</tpm>
|
||
</devices>
|
||
...
|
||
</pre>
|
||
<dl>
|
||
<dt><code>model</code></dt>
|
||
<dd>
|
||
<p>
|
||
The <code>model</code> attribute specifies what device
|
||
model QEMU provides to the guest. If no model name is provided,
|
||
<code>tpm-tis</code> will automatically be chosen.
|
||
<span class="since">Since 4.4.0</span>, another available choice
|
||
is the <code>tpm-crb</code>, which should only be used when the
|
||
backend device is a TPM 2.0.
|
||
</p>
|
||
</dd>
|
||
<dt><code>backend</code></dt>
|
||
<dd>
|
||
<p>
|
||
The <code>backend</code> element specifies the type of
|
||
TPM device. The following types are supported:
|
||
</p>
|
||
<dl>
|
||
<dt><code>passthrough</code></dt>
|
||
<dd>
|
||
<p>
|
||
Use the host's TPM device.
|
||
</p>
|
||
<p>
|
||
This backend type requires exclusive access to a TPM device on
the host. An example of such a device is /dev/tpm0. The fully
qualified file name is specified by the <code>path</code> attribute of the
<code>device</code> element. If no file name is specified,
/dev/tpm0 is automatically used.
|
||
</p>
|
||
</dd>
|
||
</dl>
|
||
<dl>
|
||
<dt><code>emulator</code></dt>
|
||
<dd>
|
||
<p>
|
||
For this backend type the 'swtpm' TPM Emulator must be installed on the
|
||
host. Libvirt will automatically start an independent TPM emulator
|
||
for each QEMU guest requesting access to it.
|
||
</p>
|
||
</dd>
|
||
</dl>
|
||
</dd>
|
||
<dt><code>version</code></dt>
|
||
<dd>
|
||
<p>
|
||
The <code>version</code> attribute indicates the version
|
||
of the TPM. By default a TPM 1.2 is created. This attribute
|
||
only works with the <code>emulator</code> backend. The following
|
||
versions are supported:
|
||
</p>
|
||
<ul>
|
||
<li>'1.2' : creates a TPM 1.2</li>
|
||
<li>'2.0' : creates a TPM 2.0</li>
|
||
</ul>
|
||
</dd>
|
||
</dl>
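<p>
A sketch combining the options above: the <code>tpm-crb</code> device model,
which should only be paired with a TPM 2.0, backed by the emulator with
<code>version='2.0'</code>. This assumes libvirt 4.5.0 or later and that
'swtpm' is installed on the host.
</p>
<pre>
  ...
  <devices>
    <tpm model='tpm-crb'>
      <!-- tpm-crb needs a TPM 2.0 backend, so the version is set explicitly -->
      <backend type='emulator' version='2.0'/>
    </tpm>
  </devices>
  ...</pre>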
|
||
|
||
<h4><a id="elementsNVRAM">NVRAM device</a></h4>
|
||
<p>
|
||
An nvram device is always added to pSeries guests on PPC64, and its address
can be changed. The <code>nvram</code> element (only valid for
pSeries guests, <span class="since">since 1.0.5</span>) is provided to
set the address.
|
||
</p>
|
||
<p>
|
||
Example: usage of NVRAM configuration
|
||
</p>
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<nvram>
|
||
<address type='spapr-vio' reg='0x3000'/>
|
||
</nvram>
|
||
</devices>
|
||
...
|
||
</pre>
|
||
<dl>
|
||
<dt><code>spapr-vio</code></dt>
|
||
<dd>
|
||
<p>
|
||
VIO device address type, only valid for PPC64.
|
||
</p>
|
||
</dd>
|
||
<dt><code>reg</code></dt>
|
||
<dd>
|
||
<p>
|
||
Device address
|
||
</p>
|
||
</dd>
|
||
</dl>
|
||
|
||
<h4><a id="elementsPanic">Panic device</a></h4>
|
||
<p>
|
||
The panic device enables libvirt to receive panic notifications from a QEMU
guest.
|
||
<span class="since">Since 1.2.1, QEMU and KVM only</span>
|
||
</p>
|
||
<p>
|
||
This feature is always enabled for:
|
||
</p>
|
||
<ul>
|
||
<li>pSeries guests, since it's implemented by the guest firmware</li>
|
||
<li>S390 guests, since it's an integral part of the S390 architecture</li>
|
||
</ul>
|
||
<p>
|
||
For the guest types listed above, libvirt automatically adds a
|
||
<code>panic</code> element to the domain XML.
|
||
</p>
|
||
<p>
|
||
Example: usage of panic configuration
|
||
</p>
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<panic model='hyperv'/>
|
||
<panic model='isa'>
|
||
<address type='isa' iobase='0x505'/>
|
||
</panic>
|
||
</devices>
|
||
...
|
||
</pre>
|
||
<dl>
|
||
<dt><code>model</code></dt>
|
||
<dd>
|
||
<p>
|
||
The optional <code>model</code> attribute specifies what type
|
||
of panic device is provided. The panic model used when this attribute
|
||
is missing depends on the hypervisor and guest arch.
|
||
</p>
|
||
<ul>
|
||
<li>'isa' - for ISA pvpanic device</li>
|
||
<li>'pseries' - default and valid only for pSeries guests.</li>
|
||
<li>'hyperv' - for Hyper-V crash CPU feature.
|
||
<span class="since">Since 1.3.0, QEMU and KVM only</span></li>
|
||
<li>'s390' - default for S390 guests.
|
||
<span class="since">Since 1.3.5</span></li>
|
||
</ul>
|
||
</dd>
|
||
<dt><code>address</code></dt>
|
||
<dd>
|
||
<p>
|
||
The address of the panic device. The default ioport is 0x505. Most users
don't need to specify an address, and doing so is forbidden
altogether for s390, pseries and hyperv models.
|
||
</p>
|
||
</dd>
|
||
</dl>
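<p>
For comparison with the example above, a sketch of an ISA panic device that
relies on the default ioport. Omitting the <code>address</code> element is the
common case; for the s390, pseries and hyperv models it is the only valid form.
</p>
<pre>
  ...
  <devices>
    <!-- no address element: the default ioport (0x505) is used -->
    <panic model='isa'/>
  </devices>
  ...</pre>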
|
||
|
||
<h4><a id="elementsShmem">Shared memory device</a></h4>
|
||
|
||
<p>
|
||
A shared memory device allows sharing a memory region between
different virtual machines and the host.
|
||
<span class="since">Since 1.2.10, QEMU and KVM only</span>
|
||
</p>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<shmem name='my_shmem0'>
|
||
<model type='ivshmem-plain'/>
|
||
<size unit='M'>4</size>
|
||
</shmem>
|
||
<shmem name='shmem_server'>
|
||
<model type='ivshmem-doorbell'/>
|
||
<size unit='M'>2</size>
|
||
<server path='/tmp/socket-shmem'/>
|
||
<msi vectors='32' ioeventfd='on'/>
|
||
</shmem>
|
||
</devices>
|
||
...
|
||
</pre>
|
||
|
||
<dl>
|
||
<dt><code>shmem</code></dt>
|
||
<dd>
|
||
The <code>shmem</code> element has one mandatory attribute,
<code>name</code>, which identifies the shared memory. The name must not
be <code>.</code> or <code>..</code>, and
it must not contain the path separator <code>/</code>.
|
||
</dd>
|
||
<dt><code>model</code></dt>
|
||
<dd>
|
||
Attribute <code>type</code> of the optional element <code>model</code>
|
||
specifies the model of the underlying device providing the
|
||
<code>shmem</code> device. The models currently supported are
|
||
<code>ivshmem</code> (supports both server and server-less shmem, but is
|
||
deprecated by newer QEMU in favour of the -plain and -doorbell variants),
|
||
<code>ivshmem-plain</code> (only for server-less shmem) and
|
||
<code>ivshmem-doorbell</code> (only for shmem with the server).
|
||
</dd>
|
||
<dt><code>size</code></dt>
|
||
<dd>
|
||
The optional <code>size</code> element specifies the size of the shared
memory. This must be a power of 2 and greater than or equal to 1 MiB.
|
||
</dd>
|
||
<dt><code>server</code></dt>
|
||
<dd>
|
||
The optional <code>server</code> element can be used to configure a server
|
||
socket the device is supposed to connect to. The optional
|
||
<code>path</code> attribute specifies the absolute path to the unix socket
|
||
and defaults to <code>/var/lib/libvirt/shmem/$shmem-$name-sock</code>.
|
||
</dd>
|
||
<dt><code>msi</code></dt>
|
||
<dd>
|
||
The optional <code>msi</code> element enables/disables (values "on"/"off",
|
||
respectively) MSI interrupts. This option can currently be used only
|
||
together with the <code>server</code> element. The <code>vectors</code>
|
||
attribute can be used to specify the number of interrupt
|
||
vectors. The <code>ioeventfd</code> attribute enables/disables (values
|
||
"on"/"off", respectively) ioeventfd.
|
||
</dd>
|
||
</dl>
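<p>
A sketch of a doorbell device that relies on the default server socket path.
The name <code>shmem_default</code> and the vector count are hypothetical
values chosen for illustration. With the <code>path</code> attribute omitted,
the socket path is derived from the name as described above.
</p>
<pre>
  ...
  <devices>
    <shmem name='shmem_default'>
      <model type='ivshmem-doorbell'/>
      <!-- 4 MiB: a power of 2 and at least 1 MiB -->
      <size unit='M'>4</size>
      <!-- path omitted: the default socket path is used -->
      <server/>
      <msi vectors='16' ioeventfd='on'/>
    </shmem>
  </devices>
  ...</pre>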
|
||
|
||
<h4><a id="elementsMemory">Memory devices</a></h4>
|
||
|
||
<p>
|
||
In addition to the initial memory assigned to the guest, memory devices
|
||
allow additional memory to be assigned to the guest in the form of
|
||
memory modules.
|
||
|
||
A memory device can be hot-plugged or hot-unplugged depending on the
guest's memory resource needs.
|
||
|
||
Some hypervisors may require NUMA configured for the guest.
|
||
</p>
|
||
|
||
<p>
|
||
Example: usage of the memory devices
|
||
</p>
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<memory model='dimm' access='private' discard='yes'>
|
||
<target>
|
||
<size unit='KiB'>524287</size>
|
||
<node>0</node>
|
||
</target>
|
||
</memory>
|
||
<memory model='dimm'>
|
||
<source>
|
||
<pagesize unit='KiB'>4096</pagesize>
|
||
<nodemask>1-3</nodemask>
|
||
</source>
|
||
<target>
|
||
<size unit='KiB'>524287</size>
|
||
<node>1</node>
|
||
</target>
|
||
</memory>
|
||
<memory model='nvdimm'>
|
||
<source>
|
||
<path>/tmp/nvdimm</path>
|
||
</source>
|
||
<target>
|
||
<size unit='KiB'>524288</size>
|
||
<node>1</node>
|
||
<label>
|
||
<size unit='KiB'>128</size>
|
||
</label>
|
||
</target>
|
||
</memory>
|
||
</devices>
|
||
...
|
||
</pre>
|
||
<dl>
|
||
<dt><code>model</code></dt>
|
||
<dd>
|
||
<p>
|
||
Provide <code>dimm</code> to add a virtual DIMM module to the guest.
<span class="since">Since 1.2.14</span>
Provide <code>nvdimm</code> to add a Non-Volatile DIMM
module. <span class="since">Since 3.2.0</span>
|
||
</p>
|
||
</dd>
|
||
|
||
<dt><code>access</code></dt>
|
||
<dd>
|
||
<p>
|
||
The optional <code>access</code> attribute
(<span class="since">since 3.2.0</span>) allows
fine-tuning the mapping of the memory on a per-module
basis. Values are the same as for
<a href="#elementsMemoryBacking">Memory Backing</a>:
<code>shared</code> and <code>private</code>.
|
||
</p>
|
||
</dd>
|
||
|
||
<dt><code>discard</code></dt>
|
||
<dd>
|
||
<p>
|
||
The optional <code>discard</code> attribute
(<span class="since">since 4.4.0</span>) allows
fine-tuning the discard of data on a per-module
basis. Accepted values are <code>yes</code> and
<code>no</code>. The feature is described under
<a href="#elementsMemoryBacking">Memory Backing</a>.
This attribute is allowed only for
<code>model='dimm'</code>.
|
||
</p>
|
||
</dd>
|
||
|
||
<dt><code>source</code></dt>
|
||
<dd>
|
||
<p>
|
||
For model <code>dimm</code> this element is optional and allows
fine-tuning the source of the memory used for the given memory device.
If the element is not provided, defaults configured via
<code>numatune</code> are used. If the element is provided,
the following optional elements can be provided as well:
|
||
</p>
|
||
|
||
<dl>
|
||
<dt><code>pagesize</code></dt>
|
||
<dd>
|
||
<p>
|
||
This element can be used to override the default
|
||
host page size used for backing the memory device.
|
||
The configured value must correspond to a page size
|
||
supported by the host.
|
||
</p>
|
||
</dd>
|
||
|
||
<dt><code>nodemask</code></dt>
|
||
<dd>
|
||
<p>
|
||
This element can be used to override the default
|
||
set of NUMA nodes where the memory would be
|
||
allocated.
|
||
</p>
|
||
</dd>
|
||
</dl>
|
||
|
||
<p>
|
||
For model <code>nvdimm</code> this element is mandatory and has a
|
||
single child element <code>path</code> that represents a path
|
||
in the host that backs the nvdimm module in the guest.
|
||
</p>
|
||
</dd>
|
||
|
||
<dt><code>target</code></dt>
|
||
<dd>
|
||
<p>
|
||
The mandatory <code>target</code> element configures the placement and
|
||
sizing of the added memory from the perspective of the guest.
|
||
</p>
|
||
<p>
|
||
The mandatory <code>size</code> subelement configures the size of the
|
||
added memory as a scaled integer.
|
||
</p>
|
||
<p>
|
||
The <code>node</code> subelement configures the guest NUMA node to
|
||
attach the memory to. The element shall be used only if the guest has
|
||
NUMA nodes configured.
|
||
</p>
|
||
<p>
|
||
For NVDIMM type devices one can optionally use
<code>label</code> and its subelement <code>size</code>
to configure the size of the namespace label storage
within the NVDIMM module. The <code>size</code> element
has the usual meaning described
<a href="#elementsMemoryAllocation">here</a>.
For QEMU domains the following restrictions apply:
|
||
</p>
|
||
<ol>
|
||
<li>the minimum label size is 128KiB,</li>
|
||
<li>the remaining size (total-size - label-size) has to be aligned to
|
||
4KiB</li>
|
||
</ol>
|
||
</dd>
|
||
</dl>
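<p>
As a worked check of the QEMU label restrictions, consider a hypothetical
1 GiB NVDIMM (1048576 KiB) with a 256 KiB label. The label meets the 128 KiB
minimum, and the remaining size is 1048576 - 256 = 1048320 KiB, which equals
262080 x 4 KiB and is therefore aligned. The sketch below assumes the guest
has a NUMA node 0 configured and reuses the backing path from the example above.
</p>
<pre>
  ...
  <devices>
    <memory model='nvdimm'>
      <source>
        <path>/tmp/nvdimm</path>
      </source>
      <target>
        <size unit='KiB'>1048576</size>
        <!-- assumes guest NUMA node 0 exists -->
        <node>0</node>
        <label>
          <!-- 1048576 - 256 = 1048320 KiB remains, a multiple of 4 KiB -->
          <size unit='KiB'>256</size>
        </label>
      </target>
    </memory>
  </devices>
  ...</pre>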
|
||
|
||
<h4><a id="elementsIommu">IOMMU devices</a></h4>
|
||
|
||
<p>
|
||
The <code>iommu</code> element can be used to add an IOMMU device.
|
||
<span class="since">Since 2.1.0</span>
|
||
</p>
|
||
|
||
<p>
|
||
Example:
|
||
</p>
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<iommu model='intel'>
|
||
<driver intremap='on'/>
|
||
</iommu>
|
||
</devices>
|
||
...
|
||
</pre>
|
||
<dl>
|
||
<dt><code>model</code></dt>
|
||
<dd>
|
||
<p>
|
||
Currently only the <code>intel</code> model is supported.
|
||
</p>
|
||
</dd>
|
||
<dt><code>driver</code></dt>
|
||
<dd>
|
||
<p>
|
||
The <code>driver</code> subelement can be used to configure
|
||
additional options:
|
||
</p>
|
||
<dl>
|
||
<dt><code>intremap</code></dt>
|
||
<dd>
|
||
<p>
|
||
The <code>intremap</code> attribute with possible values
|
||
<code>on</code> and <code>off</code> can be used to
|
||
turn on interrupt remapping, a part of the VT-d functionality.
|
||
Currently this requires split I/O APIC
|
||
(<code><ioapic driver='qemu'/></code>).
|
||
<span class="since">Since 3.4.0</span> (QEMU/KVM only)
|
||
</p>
|
||
</dd>
|
||
<dt><code>caching_mode</code></dt>
|
||
<dd>
|
||
<p>
|
||
The <code>caching_mode</code> attribute with possible values
|
||
<code>on</code> and <code>off</code> can be used to
|
||
turn on the VT-d caching mode (useful for assigned devices).
|
||
<span class="since">Since 3.4.0</span> (QEMU/KVM only)
|
||
</p>
|
||
</dd>
|
||
<dt><code>eim</code></dt>
|
||
<dd>
|
||
<p>
|
||
The <code>eim</code> attribute (with possible values
|
||
<code>on</code> and <code>off</code>) can be used to
|
||
configure Extended Interrupt Mode. A q35 domain with
|
||
split I/O APIC (as described in
|
||
<a href="#elementsFeatures">hypervisor features</a>),
|
||
and both interrupt remapping and EIM turned on for
|
||
the IOMMU, will be able to use more than 255 vCPUs.
|
||
<span class="since">Since 3.4.0</span> (QEMU/KVM only)
|
||
</p>
|
||
</dd>
|
||
<dt><code>iotlb</code></dt>
|
||
<dd>
|
||
<p>
|
||
The <code>iotlb</code> attribute with possible values
|
||
<code>on</code> and <code>off</code> can be used to
|
||
turn on the IOTLB used to cache address translation
|
||
requests from devices.
|
||
<span class="since">Since 3.5.0</span> (QEMU/KVM only)
|
||
</p>
|
||
</dd>
|
||
</dl>
|
||
</dd>
|
||
</dl>
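<p>
A sketch of the combination described for <code>eim</code>: a split I/O APIC
configured under <code>features</code>, with interrupt remapping and EIM both
enabled on the IOMMU. A q35 machine type is assumed, and all other domain
settings are elided.
</p>
<pre>
<domain>
  ...
  <features>
    <!-- split I/O APIC, required for interrupt remapping -->
    <ioapic driver='qemu'/>
  </features>
  ...
  <devices>
    <iommu model='intel'>
      <driver intremap='on' eim='on'/>
    </iommu>
  </devices>
  ...
</domain>
</pre>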
|
||
|
||
<h3><a id="vsock">Vsock</a></h3>
|
||
|
||
<p>A vsock host/guest interface. The <code>model</code> attribute
|
||
defaults to <code>virtio</code>.
|
||
The optional attribute <code>address</code> of the <code>cid</code>
|
||
element specifies the CID assigned to the guest. If the attribute
|
||
<code>auto</code> is set to <code>yes</code>, libvirt
|
||
will assign a free CID automatically on domain startup.
|
||
<span class="since">Since 4.4.0</span></p>
|
||
|
||
<pre>
|
||
...
|
||
<devices>
|
||
<vsock model='virtio'>
|
||
<cid auto='no' address='3'/>
|
||
</vsock>
|
||
</devices>
|
||
...</pre>
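<p>
A sketch of the automatic variant: with <code>auto='yes'</code> and no
<code>address</code>, libvirt picks a free CID when the domain starts.
</p>
<pre>
  ...
  <devices>
    <vsock model='virtio'>
      <!-- no address: a free CID is assigned on domain startup -->
      <cid auto='yes'/>
    </vsock>
  </devices>
  ...</pre>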
|
||
|
||
|
||
<h3><a id="seclabel">Security label</a></h3>
|
||
|
||
<p>
|
||
The <code>seclabel</code> element allows control over the
|
||
operation of the security drivers. There are three basic
|
||
modes of operation, 'dynamic' where libvirt automatically
|
||
generates a unique security label, 'static' where the
|
||
application/administrator chooses the labels, or 'none'
|
||
where confinement is disabled. With dynamic
|
||
label generation, libvirt will always automatically
|
||
relabel any resources associated with the virtual machine.
|
||
With static label assignment, by default, the administrator
or application must ensure labels are set correctly on any
resources; however, automatic relabeling can be enabled
if desired. <span class="since">'dynamic' since 0.6.1, 'static'
since 0.6.2, and 'none' since 0.9.10.</span>
|
||
</p>
|
||
|
||
<p>
|
||
If more than one security driver is used by libvirt, multiple
<code>seclabel</code> tags can be used, one for each driver. The
security driver referenced by each tag can be defined using
the <code>model</code> attribute.
|
||
</p>
|
||
|
||
<p>
|
||
Valid input XML configurations for the top-level security label
|
||
are:
|
||
</p>
|
||
|
||
<pre>
|
||
<seclabel type='dynamic' model='selinux'/>
|
||
|
||
<seclabel type='dynamic' model='selinux'>
|
||
<baselabel>system_u:system_r:my_svirt_t:s0</baselabel>
|
||
</seclabel>
|
||
|
||
<seclabel type='static' model='selinux' relabel='no'>
|
||
<label>system_u:system_r:svirt_t:s0:c392,c662</label>
|
||
</seclabel>
|
||
|
||
<seclabel type='static' model='selinux' relabel='yes'>
|
||
<label>system_u:system_r:svirt_t:s0:c392,c662</label>
|
||
</seclabel>
|
||
|
||
<seclabel type='none'/>
|
||
</pre>
|
||
|
||
<p>
|
||
If no 'type' attribute is provided in the input XML, then
|
||
the security driver default setting will be used, which
|
||
may be either 'none' or 'dynamic'. If a 'baselabel' is set
|
||
but no 'type' is set, then the type is presumed to be 'dynamic'.
|
||
</p>
|
||
|
||
<p>
|
||
When viewing the XML for a running guest with automatic
|
||
resource relabeling active, an additional XML element,
|
||
<code>imagelabel</code>, will be included. This is an
output-only element, so it will be ignored in user-supplied
XML documents.
|
||
</p>
|
||
<dl>
|
||
<dt><code>type</code></dt>
|
||
<dd>Either <code>static</code>, <code>dynamic</code> or <code>none</code>
|
||
to determine whether libvirt automatically generates a unique security
|
||
label or not.
|
||
</dd>
|
||
<dt><code>model</code></dt>
|
||
<dd>A valid security model name, matching the currently
|
||
activated security model
|
||
</dd>
|
||
<dt><code>relabel</code></dt>
|
||
<dd>Either <code>yes</code> or <code>no</code>. This must always
|
||
be <code>yes</code> if dynamic label assignment is used. With
|
||
static label assignment it will default to <code>no</code>.
|
||
</dd>
|
||
<dt><code>label</code></dt>
|
||
<dd>If static labelling is used, this must specify the full
|
||
security label to assign to the virtual domain. The format
|
||
of the content depends on the security driver in use:
|
||
<ul>
|
||
<li>SELinux: a SELinux context.</li>
|
||
<li>AppArmor: an AppArmor profile.</li>
|
||
<li>
|
||
DAC: owner and group separated by a colon. They can be
specified either as user/group names or as uid/gid. The driver will first
try to parse these values as names, but a leading plus sign can be
used to force the driver to parse them as uid or gid.
|
||
</li>
|
||
</ul>
|
||
</dd>
|
||
<dt><code>baselabel</code></dt>
|
||
<dd>If dynamic labelling is used, this can optionally be
|
||
used to specify the base security label that will be used to generate
|
||
the actual label. The format of the content depends on the security
|
||
driver in use.
|
||
|
||
The SELinux driver uses only the <code>type</code> field of the
|
||
baselabel in the generated label. Other fields are inherited from
|
||
the parent process when using SELinux baselabels.
|
||
|
||
(The example above demonstrates the use of <code>my_svirt_t</code>
|
||
as the value for the <code>type</code> field.)
|
||
</dd>
|
||
<dt><code>imagelabel</code></dt>
|
||
<dd>This is an output only element, which shows the
|
||
security label used on resources associated with the virtual domain.
|
||
The format of the content depends on the security driver in use.
|
||
</dd>
|
||
</dl>
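<p>
As an illustration of the DAC label format, the sketch below forces numeric
parsing with leading plus signs. The model name <code>dac</code> and the
uid/gid value 107 are assumptions chosen for the example; use the model
name and IDs that apply on the host.
</p>
<pre>
<seclabel type='static' model='dac' relabel='yes'>
  <!-- "+107:+107" is parsed as uid 107 and gid 107, not as names -->
  <label>+107:+107</label>
</seclabel>
</pre>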
|
||
|
||
<p>When relabeling is in effect, it is also possible to fine-tune
|
||
the labeling done for specific source file names, by either
|
||
disabling the labeling (useful if the file lives on NFS or other
|
||
file system that lacks security labeling) or requesting an
|
||
alternate label (useful when a management application creates a
|
||
special label to allow sharing of some, but not all, resources
|
||
between domains), <span class="since">since 0.9.9</span>. When
|
||
a <code>seclabel</code> element is attached to a specific path
|
||
rather than the top-level domain assignment, only the
|
||
attribute <code>relabel</code> or the
|
||
sub-element <code>label</code> are supported. Additionally,
|
||
<span class="since">since 1.1.2</span>, an output-only
|
||
element <code>labelskip</code> will be present for active
|
||
domains on disks where labeling was skipped due to the image
|
||
being on a file system that lacks security labeling.
|
||
</p>
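<p>
A sketch of a per-path override, assuming a file-backed disk whose
<code>source</code> element accepts a nested <code>seclabel</code>. The image
path and target are hypothetical. Here labeling is disabled for an image that
lives on a file system without security labeling.
</p>
<pre>
  ...
  <disk type='file' device='disk'>
    <!-- hypothetical image on NFS -->
    <source file='/mnt/nfs/guest.img'>
      <seclabel relabel='no'/>
    </source>
    <target dev='vda' bus='virtio'/>
  </disk>
  ...</pre>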
|
||
|
||
<h3><a id="keywrap">Key Wrap</a></h3>
|
||
|
||
<p>The content of the optional <code>keywrap</code> element specifies
|
||
whether the guest will be allowed to perform the S390 cryptographic key
|
||
management operations. A clear key can be protected by encrypting it
|
||
under a unique wrapping key that is generated for each guest VM running
|
||
on the host. Two variations of wrapping keys are generated: one version
|
||
for encrypting protected keys using the DEA/TDEA algorithm, and another
|
||
version for keys encrypted using the AES algorithm. If a
|
||
<code>keywrap</code> element is not included, the guest will be granted
|
||
access to both AES and DEA/TDEA key wrapping by default.</p>
|
||
|
||
<pre>
|
||
<domain>
|
||
...
|
||
<keywrap>
|
||
<cipher name='aes' state='off'/>
|
||
</keywrap>
|
||
...
|
||
</domain>
|
||
</pre>
|
||
<p>
|
||
At least one <code>cipher</code> element must be nested within the
|
||
<code>keywrap</code> element.
|
||
</p>
|
||
<dl>
|
||
<dt><code>cipher</code></dt>
|
||
<dd>The <code>name</code> attribute identifies the algorithm
|
||
for encrypting a protected key. The values supported for this attribute
|
||
are <code>aes</code> for encryption under the AES wrapping key, or
|
||
<code>dea</code> for encryption under the DEA/TDEA wrapping key. The
|
||
<code>state</code> attribute indicates whether the cryptographic key
|
||
management operations should be turned on for the specified encryption
|
||
algorithm. The value can be set to <code>on</code> or <code>off</code>.
|
||
</dd>
|
||
</dl>
|
||
|
||
<p>Note: DEA/TDEA is synonymous with DES/TDES.</p>
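<p>
A sketch with both ciphers listed explicitly: AES key wrapping stays enabled
while DEA/TDEA key wrapping is turned off.
</p>
<pre>
<domain>
  ...
  <keywrap>
    <cipher name='aes' state='on'/>
    <cipher name='dea' state='off'/>
  </keywrap>
  ...
</domain>
</pre>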
|
||
|
||
<h3><a id="sev">Launch Security</a></h3>
|
||
|
||
<p>
|
||
The contents of the <code><launchSecurity type='sev'></code> element
are used to provide the guest owner's input used for creating an encrypted
VM using the AMD SEV feature (Secure Encrypted Virtualization).

SEV is an extension to the AMD-V architecture which supports running
encrypted virtual machines (VMs) under the control of KVM. Encrypted
VMs have their pages (code and data) secured such that only the guest
itself has access to the unencrypted version. Each encrypted VM is
associated with a unique encryption key; if its data is accessed by a
different entity using a different key, the encrypted guest's data will
be incorrectly decrypted, leading to unintelligible data.

For more information on the various input parameters and their format, see the SEV API spec
<a href="https://support.amd.com/TechDocs/55766_SEV-KM%20API_Specification.pdf">https://support.amd.com/TechDocs/55766_SEV-KM%20API_Specification.pdf</a>
|
||
<span class="since">Since 4.4.0</span>
|
||
</p>
|
||
<pre>
|
||
<domain>
|
||
...
|
||
<launchSecurity type='sev'>
|
||
<policy>0x0001</policy>
|
||
<cbitpos>47</cbitpos>
|
||
<reducedPhysBits>1</reducedPhysBits>
|
||
<dhCert>RBBBSDDD=FDDCCCDDDG</dhCert>
|
||
<session>AAACCCDD=FFFCCCDSDS</session>
|
||
</launchSecurity>
|
||
...
|
||
</domain>
|
||
</pre>
|
||
|
||
<dl>
|
||
<dt><code>cbitpos</code></dt>
|
||
<dd>The required <code>cbitpos</code> element provides the C-bit (aka encryption bit)
location in the guest page table entry. The value of <code>cbitpos</code> is
hypervisor dependent and can be obtained through the <code>sev</code> element
from the domain capabilities.
|
||
</dd>
|
||
<dt><code>reducedPhysBits</code></dt>
|
||
<dd>The required <code>reducedPhysBits</code> element provides the physical
address bit reduction. Similar to <code>cbitpos</code>, the value of
<code>reducedPhysBits</code> is hypervisor dependent and can be obtained
through the <code>sev</code> element from the domain capabilities.
|
||
</dd>
|
||
<dt><code>policy</code></dt>
|
||
<dd>The required <code>policy</code> element provides the guest policy
|
||
which must be maintained by the SEV firmware. This policy is enforced by
|
||
the firmware and restricts what configuration and operational commands
|
||
can be performed on this guest by the hypervisor. The guest policy
|
||
provided during guest launch is bound to the guest and cannot be changed
|
||
throughout the lifetime of the guest. The policy is also transmitted
|
||
during snapshot and migration flows and enforced on the destination platform.
|
||
|
||
The guest policy is a 4-byte unsigned value with the fields shown in the table below:
|
||
|
||
<table class="top_table">
|
||
<tr>
|
||
<th> Bit(s) </th>
|
||
<th> Description </th>
|
||
</tr>
|
||
<tr>
|
||
<td> 0 </td>
|
||
<td> Debugging of the guest is disallowed when set </td>
|
||
</tr>
|
||
<tr>
|
||
<td> 1 </td>
|
||
<td> Sharing keys with other guests is disallowed when set </td>
|
||
</tr>
|
||
<tr>
|
||
<td> 2 </td>
|
||
<td> SEV-ES is required when set</td>
|
||
</tr>
|
||
<tr>
|
||
<td> 3 </td>
|
||
<td> Sending the guest to another platform is disallowed when set</td>
|
||
</tr>
|
||
<tr>
|
||
<td> 4 </td>
|
||
<td> The guest must not be transmitted to another platform that is
|
||
not in the domain when set. </td>
|
||
</tr>
|
||
<tr>
|
||
<td> 5 </td>
|
||
<td> The guest must not be transmitted to another platform that is
|
||
not SEV capable when set. </td>
|
||
</tr>
|
||
<tr>
|
||
<td> 6:15 </td>
|
||
<td> reserved </td>
|
||
</tr>
|
||
<tr>
|
||
<td> 16:31 </td>
|
||
<td> The guest must not be transmitted to another platform with a
|
||
lower firmware version. </td>
|
||
</tr>
|
||
</table>
|
||
|
||
</dd>
|
||
<dt><code>dhCert</code></dt>
|
||
<dd>The optional <code>dhCert</code> element provides the guest owner's
|
||
base64 encoded Diffie-Hellman (DH) key. The key is used to negotiate a
|
||
master secret key between the SEV firmware and guest owner. This master
|
||
secret key is then used to establish a trusted channel between SEV
|
||
firmware and guest owner.
|
||
</dd>
|
||
<dt><code>session</code></dt>
|
||
<dd>The optional <code>session</code> element provides the guest owner's
|
||
base64 encoded session blob defined in the SEV API spec.
|
||
|
||
See SEV spec LAUNCH_START section for the session blob format.
|
||
</dd>
|
||
</dl>
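<p>
As a worked example of the policy bit field: setting bit 0 (value 0x1,
debugging disallowed) and bit 1 (value 0x2, key sharing disallowed) gives
0x1 + 0x2 = 0x0003. The <code>cbitpos</code> and
<code>reducedPhysBits</code> values in this sketch are copied from the example
above and must be taken from the domain capabilities of the actual host. The
optional <code>dhCert</code> and <code>session</code> elements are omitted.
</p>
<pre>
<domain>
  ...
  <launchSecurity type='sev'>
    <!-- 0x0003 = bit 0 (no debugging) + bit 1 (no key sharing) -->
    <policy>0x0003</policy>
    <!-- host dependent: read both values from the domain capabilities -->
    <cbitpos>47</cbitpos>
    <reducedPhysBits>1</reducedPhysBits>
  </launchSecurity>
  ...
</domain>
</pre>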
|
||
|
||
<h2><a id="examples">Example configs</a></h2>
|
||
|
||
<p>
|
||
Example configurations for each driver are provided on the
driver-specific pages listed below:
|
||
</p>
|
||
|
||
<ul>
|
||
<li><a href="drvxen.html#xmlconfig">Xen examples</a></li>
|
||
<li><a href="drvqemu.html#xmlconfig">QEMU/KVM examples</a></li>
|
||
</ul>
|
||
</body>
|
||
</html>
|