mirror of
https://gitlab.com/libvirt/libvirt.git
synced 2025-01-08 22:15:21 +00:00
347de9b3c0
Signed-off-by: Han Han <hhan@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
9850 lines
416 KiB
XML
<?xml version="1.0" encoding="UTF-8"?>
|
|
<!DOCTYPE html>
|
|
<html xmlns="http://www.w3.org/1999/xhtml">
|
|
<body>
|
|
<h1>Domain XML format</h1>
|
|
|
|
<ul id="toc"></ul>
|
|
|
|
<p>
|
|
This section describes the XML format used to represent domains. There are
|
|
variations on the format based on the kind of domains run and the options
|
|
used to launch them. For hypervisor-specific details, consult the
<a href="drivers.html">driver docs</a>.
|
|
</p>
|
|
|
|
|
|
<h2><a id="elements">Element and attribute overview</a></h2>
|
|
|
|
<p>
|
|
The root element required for all virtual machines is
|
|
named <code>domain</code>. It has two attributes. The
|
|
<a id="attributeDomainType"><code>type</code></a>
|
|
specifies the hypervisor used for running
|
|
the domain. The allowed values are driver-specific, but
|
|
include "xen", "kvm", "qemu" and "lxc". The
|
|
second attribute is <code>id</code> which is a unique
|
|
integer identifier for the running guest machine. Inactive
|
|
machines have no id value.
|
|
</p>
|
|
|
|
|
|
<h3><a id="elementsMetadata">General metadata</a></h3>
|
|
|
|
<pre>
|
|
<domain type='kvm' id='1'>
|
|
<name>MyGuest</name>
|
|
<uuid>4dea22b3-1d52-d8f3-2516-782e98ab3fa0</uuid>
|
|
<genid>43dc0cf8-809b-4adb-9bea-a9abb5f3d90e</genid>
|
|
<title>A short description - title - of the domain</title>
|
|
<description>Some human readable description</description>
|
|
<metadata>
|
|
<app1:foo xmlns:app1="http://app1.org/app1/">..</app1:foo>
|
|
<app2:bar xmlns:app2="http://app1.org/app2/">..</app2:bar>
|
|
</metadata>
|
|
...</pre>
|
|
|
|
<dl>
|
|
<dt><code>name</code></dt>
|
|
<dd>The content of the <code>name</code> element provides
|
|
a short name for the virtual machine. This name should
|
|
consist only of alpha-numeric characters and is required
|
|
to be unique within the scope of a single host. It is
|
|
often used to form the filename for storing the persistent
|
|
configuration file. <span class="since">Since 0.0.1</span></dd>
|
|
<dt><code>uuid</code></dt>
|
|
<dd>The content of the <code>uuid</code> element provides
|
|
a globally unique identifier for the virtual machine.
|
|
The format must be RFC 4122 compliant,
|
|
e.g. <code>3e3fce45-4f53-4fa7-bb32-11f34168b82b</code>.
|
|
If omitted when defining/creating a new machine, a random
|
|
UUID is generated. It is also possible to provide the UUID
|
|
via a <a href="#elementsSysinfo"><code>sysinfo</code></a>
|
|
specification. <span class="since">Since 0.0.1, sysinfo
|
|
since 0.8.7</span></dd>
|
|
|
|
<dt><code>genid</code></dt>
|
|
<dd><span class="since">Since 4.4.0</span>, the <code>genid</code>
|
|
element can be used to add a Virtual Machine Generation ID which
|
|
exposes a 128-bit, cryptographically random, integer value identifier,
|
|
referred to as a Globally Unique Identifier (GUID) using the same
|
|
format as the <code>uuid</code>. The value is used to help notify
|
|
the guest operating system when the virtual machine is re-executing
|
|
something that has already executed before, such as:
|
|
|
|
<ul>
|
|
<li>VM starts executing a snapshot</li>
|
|
<li>VM is recovered from backup</li>
|
|
<li>VM fails over in a disaster recovery environment</li>
|
|
<li>VM is imported, copied, or cloned</li>
|
|
</ul>
|
|
|
|
The guest operating system notices the change and is then able to
|
|
react as appropriate by marking its copies of distributed databases
|
|
as dirty, re-initializing its random number generator, etc.
|
|
|
|
<p>
|
|
The libvirt XML parser will accept either a provided GUID value
|
|
or just <genid/> in which case a GUID will be generated
|
|
and saved in the XML. For the transitions such as above, libvirt
|
|
will change the GUID before re-executing.</p></dd>
|
|
|
|
<dt><code>title</code></dt>
|
|
<dd>The optional element <code>title</code> provides space for a
|
|
short description of the domain. The title should not contain
|
|
any newlines. <span class="since">Since 0.9.10</span>.</dd>
|
|
|
|
<dt><code>description</code></dt>
|
|
<dd>The content of the <code>description</code> element provides a
|
|
human readable description of the virtual machine. This data is not
|
|
used by libvirt in any way; it can contain any information the user
|
|
wants. <span class="since">Since 0.7.2</span></dd>
|
|
|
|
<dt><code>metadata</code></dt>
|
|
<dd>The <code>metadata</code> node can be used by applications
|
|
to store custom metadata in the form of XML
|
|
nodes/trees. Applications must use custom namespaces on their
|
|
XML nodes/trees, with only one top-level element per namespace
|
|
(if the application needs structure, it should add
sub-elements under its namespace
|
|
element). <span class="since">Since 0.9.10</span></dd>
|
|
</dl>
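<p>
As a minimal sketch of the rules above (not taken from a real deployment),
the following fragment lets libvirt generate the generation ID and keeps
structured application data under a single namespaced top-level element.
The guest name, the <code>myapp</code> prefix, its namespace URI and the
child element names are all hypothetical.
</p>
<pre>
<domain type='kvm'>
  <name>DemoGuest1</name>
  <!-- empty element: libvirt generates a GUID and saves it in the XML -->
  <genid/>
  <title>Demo guest for the metadata sketch</title>
  <metadata>
    <!-- hypothetical namespace; exactly one top-level element for it -->
    <myapp:settings xmlns:myapp="http://example.org/myapp/1.0">
      <myapp:owner>ops-team</myapp:owner>
      <myapp:tier>staging</myapp:tier>
    </myapp:settings>
  </metadata>
  ...</pre>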
|
|
|
|
<h3><a id="elementsOS">Operating system booting</a></h3>
|
|
|
|
<p>
|
|
There are a number of different ways to boot virtual machines
|
|
each with their own pros and cons.
|
|
</p>
|
|
|
|
<h4><a id="elementsOSBIOS">BIOS bootloader</a></h4>
|
|
|
|
<p>
|
|
Booting via the BIOS is available for hypervisors supporting
|
|
full virtualization. In this case the BIOS has a boot order
|
|
priority (floppy, harddisk, cdrom, network) determining where
|
|
to obtain/find the boot image.
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<os firmware='efi'>
|
|
<type>hvm</type>
|
|
<loader readonly='yes' secure='no' type='rom'>/usr/lib/xen/boot/hvmloader</loader>
|
|
<nvram template='/usr/share/OVMF/OVMF_VARS.fd'>/var/lib/libvirt/nvram/guest_VARS.fd</nvram>
|
|
<boot dev='hd'/>
|
|
<boot dev='cdrom'/>
|
|
<bootmenu enable='yes' timeout='3000'/>
|
|
<smbios mode='sysinfo'/>
|
|
<bios useserial='yes' rebootTimeout='0'/>
|
|
</os>
|
|
...</pre>
|
|
|
|
<dl>
|
|
<dt><code>firmware</code></dt>
|
|
<dd>The <code>firmware</code> attribute allows management
|
|
applications to automatically fill <code><loader/></code>
|
|
and <code><nvram/></code> elements and possibly enable
|
|
some features required by selected firmware. Accepted values are
|
|
<code>bios</code> and <code>efi</code>.<br/>
|
|
The selection process scans for files describing installed
|
|
firmware images in the specified locations and uses the most specific
|
|
one which fulfils domain requirements. The locations in order of
|
|
preference (from generic to most specific one) are:
|
|
<ul>
|
|
<li><code>/usr/share/qemu/firmware</code></li>
|
|
<li><code>/etc/qemu/firmware</code></li>
|
|
<li><code>$XDG_CONFIG_HOME/qemu/firmware</code></li>
|
|
</ul>
|
|
For more information, refer to the firmware metadata specification as
described in <code>docs/interop/firmware.json</code> in the QEMU
|
|
repository. Regular users do not need to bother.
|
|
<span class="since">Since 5.2.0 (QEMU and KVM only)</span><br/>
|
|
For VMware guests, this is set to <code>efi</code> when the guest
|
|
uses UEFI, and it is not set when using BIOS.
|
|
<span class="since">Since 5.3.0 (VMware ESX and Workstation/Player)</span>
|
|
</dd>
|
|
<dt><code>type</code></dt>
|
|
<dd>The content of the <code>type</code> element specifies the
|
|
type of operating system to be booted in the virtual machine.
|
|
<code>hvm</code> indicates that the OS is one designed to run
|
|
on bare metal, so requires full virtualization. <code>linux</code>
|
|
(badly named!) refers to an OS that supports the Xen 3 hypervisor
|
|
guest ABI. There are also two optional attributes, <code>arch</code>
|
|
specifying the CPU architecture to virtualize,
|
|
and <a id="attributeOSTypeMachine"><code>machine</code></a> referring
|
|
to the machine type. The <a href="formatcaps.html">Capabilities XML</a>
|
|
provides details on allowed values for
|
|
these. If <code>arch</code> is omitted then for most hypervisor
|
|
drivers, the host native arch will be chosen. For the <code>test</code>,
|
|
<code>ESX</code> and <code>VMWare</code> hypervisor drivers, however,
|
|
the <code>i686</code> arch will always be chosen even on an
|
|
<code>x86_64</code> host. <span class="since">Since 0.0.1</span></dd>
|
|
<dt><a id="elementLoader"><code>loader</code></a></dt>
|
|
<dd>The optional <code>loader</code> tag refers to a firmware blob,
|
|
which is specified by absolute path,
|
|
used to assist the domain creation process. It is used by Xen
|
|
fully virtualized domains as well as setting the QEMU BIOS file
|
|
path for QEMU/KVM domains. <span class="since">Xen since 0.1.0,
|
|
QEMU/KVM since 0.9.12</span> Then, <span class="since">since
|
|
1.2.8</span> it's possible for the element to have two
|
|
optional attributes: <code>readonly</code> (accepted values are
|
|
<code>yes</code> and <code>no</code>) to indicate whether the
image should be writable or read-only. The second attribute
|
|
<code>type</code> accepts values <code>rom</code> and
|
|
<code>pflash</code>. It tells the hypervisor where in the guest
|
|
memory the file should be mapped. For instance, if the loader
|
|
path points to a UEFI image, <code>type</code> should be
|
|
<code>pflash</code>. Moreover, some firmwares may
|
|
implement the Secure boot feature. Attribute
|
|
<code>secure</code> can then be used to control it.
|
|
<span class="since">Since 2.1.0</span></dd>
|
|
<dt><code>nvram</code></dt>
|
|
<dd>Some UEFI firmwares may want to use a non-volatile memory to store
|
|
some variables. In the host, this is represented as a file and the
|
|
absolute path to the file is stored in this element. Moreover, when the
|
|
domain is started up, libvirt copies the so-called master NVRAM store file
defined in <code>qemu.conf</code>. If needed, the <code>template</code>
attribute can be used to override, per domain, the map of master NVRAM stores
from the config file. Note that for transient domains, if the NVRAM file
has been created by libvirt, it is left behind, and it is the management
application's responsibility to save and remove the file (if it needs to be
|
|
persistent). <span class="since">Since 1.2.8</span></dd>
|
|
<dt><code>boot</code></dt>
|
|
<dd>The <code>dev</code> attribute takes one of the values "fd", "hd",
|
|
"cdrom" or "network" and is used to specify the next boot device
|
|
to consider. The <code>boot</code> element can be repeated multiple
|
|
times to set up a priority list of boot devices to try in turn.
|
|
Multiple devices of the same type are sorted according to their
|
|
targets while preserving the order of buses. After defining the
|
|
domain, its XML configuration returned by libvirt (through
|
|
virDomainGetXMLDesc) lists devices in the sorted order. Once sorted,
|
|
the first device is marked as bootable. Thus, e.g., a domain
|
|
configured to boot from "hd" with vdb, hda, vda, and hdc disks
|
|
assigned to it will boot from vda (the sorted list is vda, vdb, hda,
|
|
hdc). A similar domain with hdc, vda, vdb, and hda disks will boot from
|
|
hda (sorted disks are: hda, hdc, vda, vdb). It can be tricky to
|
|
configure in the desired way, which is why per-device boot elements
|
|
(see <a href="#elementsDisks">disks</a>,
|
|
<a href="#elementsNICS">network interfaces</a>, and
|
|
<a href="#elementsHostDev">USB and PCI devices</a> sections below) were
|
|
introduced and they are the preferred way, providing full control over
|
|
booting order. The <code>boot</code> element and per-device boot
|
|
elements are mutually exclusive. <span class="since">Since 0.1.3,
|
|
per-device boot since 0.8.8</span>
|
|
</dd>
|
|
<dt><code>smbios</code></dt>
|
|
<dd>How to populate SMBIOS information visible in the guest.
|
|
The <code>mode</code> attribute must be specified, and is either
|
|
"emulate" (let the hypervisor generate all values), "host" (copy
|
|
all of Block 0 and Block 1, except for the UUID, from the host's
|
|
SMBIOS values;
|
|
the <a href="html/libvirt-libvirt-host.html#virConnectGetSysinfo">
|
|
<code>virConnectGetSysinfo</code></a> call can be
|
|
used to see what values are copied), or "sysinfo" (use the values in
|
|
the <a href="#elementsSysinfo">sysinfo</a> element). If the
<code>smbios</code> element is not specified, the hypervisor default is
used. <span class="since">
|
|
Since 0.8.7</span>
|
|
</dd>
|
|
</dl>
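<p>
The elements above can be combined as in the following sketch, which
assumes a QEMU/KVM guest using a UEFI image mapped as <code>pflash</code>
and per-device boot elements instead of <code>boot dev</code> entries.
The <code>OVMF_CODE.fd</code> path and both disk image paths are assumed
example locations, not values mandated by libvirt.
</p>
<pre>
...
<os>
  <type arch='x86_64'>hvm</type>
  <!-- UEFI firmware image, mapped read-only as flash (assumed path) -->
  <loader readonly='yes' type='pflash'>/usr/share/OVMF/OVMF_CODE.fd</loader>
  <!-- per-domain variable store, created from the given template -->
  <nvram template='/usr/share/OVMF/OVMF_VARS.fd'>/var/lib/libvirt/nvram/guest_VARS.fd</nvram>
  <!-- no boot dev entries here: they are mutually exclusive with
       the per-device boot elements used below -->
</os>
...
<devices>
  <disk type='file' device='disk'>
    <source file='/var/lib/libvirt/images/guest.qcow2'/>
    <target dev='vda' bus='virtio'/>
    <boot order='1'/>
  </disk>
  <disk type='file' device='cdrom'>
    <source file='/var/lib/libvirt/images/install.iso'/>
    <target dev='sda' bus='sata'/>
    <boot order='2'/>
  </disk>
</devices>
...</pre>
<p>
With explicit <code>order</code> values the boot sequence does not depend
on how targets and buses happen to sort.
</p>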
|
|
<p>Up to this point, the BIOS/UEFI configuration knobs are generic enough to
be implemented by the majority of (if not all) firmware out there. However,
|
|
from now on not every single setting makes sense to all firmwares. For
|
|
instance, <code>rebootTimeout</code> doesn't make sense for UEFI,
|
|
<code>useserial</code> might not be usable with a BIOS firmware that
|
|
doesn't produce any output onto serial line, etc. Moreover, firmwares
|
|
don't usually export their capabilities for libvirt (or users) to check.
|
|
And the set of their capabilities can change with every new release.
|
|
Hence users are advised to try the settings they use before relying on
|
|
them in production.</p>
|
|
<dl>
|
|
<dt><code>bootmenu</code></dt>
|
|
<dd> Whether or not to enable an interactive boot menu prompt on guest
|
|
startup. The <code>enable</code> attribute can be either "yes" or "no".
|
|
If not specified, the hypervisor default is used. <span class="since">
|
|
Since 0.8.3</span>
|
|
The additional attribute <code>timeout</code> takes the number of milliseconds
|
|
the boot menu should wait until it times out. Allowed values are numbers
|
|
in range [0, 65535] inclusive and it is ignored unless <code>enable</code>
|
|
is set to "yes". <span class="since">Since 1.2.8</span>
|
|
</dd>
|
|
<dt><code>bios</code></dt>
|
|
<dd>This element has attribute <code>useserial</code> with possible
|
|
values <code>yes</code> or <code>no</code>. It enables or disables
|
|
the Serial Graphics Adapter, which allows users to see BIOS messages
on a serial port. Therefore, one needs to have a
|
|
<a href="#elementCharSerial">serial port</a> defined.
|
|
<span class="since">Since 0.9.4</span>.
|
|
<span class="since">Since 0.10.2 (QEMU only)</span> there is
|
|
another attribute, <code>rebootTimeout</code> that controls
|
|
whether and after how long the guest should start booting
|
|
again in case the boot fails (according to BIOS). The value is
|
|
in milliseconds with a maximum of <code>65535</code>; the special
value <code>-1</code> disables the reboot.
|
|
</dd>
|
|
</dl>
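<p>
As a hedged illustration of these firmware-dependent settings, the sketch
below enables a three second boot menu, routes BIOS messages to a serial
port and retries a failed boot after five seconds. The
<code>pty</code>-backed serial device is an assumed choice; any defined
serial port should serve, and whether the firmware honours each setting
needs to be tested as advised above.
</p>
<pre>
...
<os>
  <type>hvm</type>
  <boot dev='hd'/>
  <!-- timeout is in milliseconds and only applies when enable='yes' -->
  <bootmenu enable='yes' timeout='3000'/>
  <!-- rebootTimeout is in milliseconds; -1 would disable the reboot -->
  <bios useserial='yes' rebootTimeout='5000'/>
</os>
...
<devices>
  <!-- useserial='yes' needs a serial port to be defined -->
  <serial type='pty'>
    <target port='0'/>
  </serial>
</devices>
...</pre>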
|
|
|
|
<h4><a id="elementsOSBootloader">Host bootloader</a></h4>
|
|
|
|
<p>
|
|
Hypervisors employing paravirtualization do not usually emulate
|
|
a BIOS, and instead the host is responsible for kicking off the
|
|
operating system boot. This may use a pseudo-bootloader in the
|
|
host to provide an interface to choose a kernel for the guest.
|
|
An example is <code>pygrub</code> with Xen. The Bhyve hypervisor
|
|
also uses a host bootloader, either <code>bhyveload</code> or
|
|
<code>grub-bhyve</code>.
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<bootloader>/usr/bin/pygrub</bootloader>
|
|
<bootloader_args>--append single</bootloader_args>
|
|
...</pre>
|
|
|
|
<dl>
|
|
<dt><code>bootloader</code></dt>
|
|
<dd>The content of the <code>bootloader</code> element provides
|
|
a fully qualified path to the bootloader executable in the
|
|
host OS. This bootloader will be run to choose which kernel
|
|
to boot. The required output of the bootloader is dependent
|
|
on the hypervisor in use. <span class="since">Since 0.1.0</span></dd>
|
|
<dt><code>bootloader_args</code></dt>
|
|
<dd>The optional <code>bootloader_args</code> element allows
|
|
command line arguments to be passed to the bootloader.
|
|
<span class="since">Since 0.2.3</span>
|
|
</dd>
|
|
|
|
</dl>
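<p>
For context, the sketch below shows where these elements sit in a complete
definition: they are children of <code>domain</code>, alongside the
<code>os</code> element rather than inside it. It assumes a paravirtualized
Xen guest; the guest name is hypothetical.
</p>
<pre>
<domain type='xen'>
  <name>PvGuest1</name>
  <!-- run in the host to choose the guest kernel -->
  <bootloader>/usr/bin/pygrub</bootloader>
  <bootloader_args>--append single</bootloader_args>
  <os>
    <type>linux</type>
  </os>
  ...</pre>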
|
|
|
|
<h4><a id="elementsOSKernel">Direct kernel boot</a></h4>
|
|
|
|
<p>
|
|
When installing a new guest OS it is often useful to boot directly
|
|
from a kernel and initrd stored in the host OS, allowing command
|
|
line arguments to be passed directly to the installer. This capability
|
|
is usually available for both paravirtualized and fully virtualized guests.
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<os>
|
|
<type>hvm</type>
|
|
<loader>/usr/lib/xen/boot/hvmloader</loader>
|
|
<kernel>/root/f8-i386-vmlinuz</kernel>
|
|
<initrd>/root/f8-i386-initrd</initrd>
|
|
<cmdline>console=ttyS0 ks=http://example.com/f8-i386/os/</cmdline>
|
|
<dtb>/root/ppc.dtb</dtb>
|
|
<acpi>
|
|
<table type='slic'>/path/to/slic.dat</table>
|
|
</acpi>
|
|
</os>
|
|
...</pre>
|
|
|
|
<dl>
|
|
<dt><code>type</code></dt>
|
|
<dd>This element has the same semantics as described earlier in the
|
|
<a href="#elementsOSBIOS">BIOS boot section</a>.</dd>
|
|
<dt><code>loader</code></dt>
|
|
<dd>This element has the same semantics as described earlier in the
|
|
<a href="#elementsOSBIOS">BIOS boot section</a>.</dd>
|
|
<dt><code>kernel</code></dt>
|
|
<dd>The contents of this element specify the fully-qualified path
|
|
to the kernel image in the host OS.</dd>
|
|
<dt><code>initrd</code></dt>
|
|
<dd>The contents of this element specify the fully-qualified path
|
|
to the (optional) ramdisk image in the host OS.</dd>
|
|
<dt><code>cmdline</code></dt>
|
|
<dd>The contents of this element specify arguments to be passed to
|
|
the kernel (or installer) at boot time. This is often used to
|
|
specify an alternate primary console (e.g. serial port), or the
installation media source / kickstart file.</dd>
|
|
<dt><code>dtb</code></dt>
|
|
<dd>The contents of this element specify the fully-qualified path
|
|
to the (optional) device tree binary (dtb) image in the host OS.
|
|
<span class="since">Since 1.0.4</span></dd>
|
|
<dt><code>acpi</code></dt>
|
|
<dd>The <code>table</code> element contains a fully-qualified path
|
|
to the ACPI table. The <code>type</code> attribute contains the
|
|
ACPI table type (currently only <code>slic</code> is supported).
|
|
<span class="since">Since 1.3.5 (QEMU)</span>
|
|
<span class="since">Since 5.9.0 (Xen)</span></dd>
|
|
</dl>
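<p>
A smaller variant of the example above, assuming a QEMU/KVM guest that only
needs a kernel, a ramdisk and a serial console for an unattended install,
could look like the following sketch. The file locations under
<code>/var/lib/libvirt/boot</code> are hypothetical.
</p>
<pre>
...
<os>
  <type arch='x86_64'>hvm</type>
  <!-- kernel and ramdisk are read from the host file system -->
  <kernel>/var/lib/libvirt/boot/vmlinuz</kernel>
  <initrd>/var/lib/libvirt/boot/initrd.img</initrd>
  <!-- send the installer console to the first serial port -->
  <cmdline>console=ttyS0</cmdline>
</os>
...</pre>
<p>
The optional <code>loader</code>, <code>dtb</code> and <code>acpi</code>
elements are simply left out when they are not needed.
</p>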
|
|
|
|
<h4><a id="elementsOSContainer">Container boot</a></h4>
|
|
|
|
<p>
|
|
When booting a domain using container based virtualization, instead
|
|
of a kernel / boot image, a path to the init binary is required, using
|
|
the <code>init</code> element. By default this will be launched with
|
|
no arguments. To specify the initial argv, use the <code>initarg</code>
|
|
element, repeated as many times as required. The <code>cmdline</code>
element, if set, will be used to provide an equivalent to <code>/proc/cmdline</code>
|
|
but will not affect init argv.
|
|
</p>
|
|
<p>
|
|
To set environment variables, use the <code>initenv</code> element, one
|
|
for each variable.
|
|
</p>
|
|
<p>
|
|
To set a custom work directory for the init, use the <code>initdir</code>
|
|
element.
|
|
</p>
|
|
<p>
|
|
To run the init command as a given user or group, use the <code>inituser</code>
|
|
or <code>initgroup</code> elements respectively. Both elements accept
either a user (resp. group) ID or a name. Prefixing the user or group ID with
a <code>+</code> forces it to be treated as a numeric value. Without
this, it will first be tried as a user or group name.
|
|
</p>
|
|
|
|
<pre>
|
|
<os>
|
|
<type arch='x86_64'>exe</type>
|
|
<init>/bin/systemd</init>
|
|
<initarg>--unit</initarg>
|
|
<initarg>emergency.service</initarg>
|
|
<initenv name='MYENV'>some value</initenv>
|
|
<initdir>/my/custom/cwd</initdir>
|
|
<inituser>tester</inituser>
|
|
<initgroup>1000</initgroup>
|
|
</os>
|
|
</pre>
|
|
|
|
|
|
<p>
|
|
To enable a user namespace, add the <code>idmap</code> element.
|
|
The <code>uid</code> and <code>gid</code> elements have three attributes:
|
|
</p>
|
|
|
|
<dl>
|
|
<dt><code>start</code></dt>
|
|
<dd>First user ID in container. It must be '0'.</dd>
|
|
<dt><code>target</code></dt>
|
|
<dd>The first user ID in the container will be mapped to this target user
ID in the host.</dd>
|
|
<dt><code>count</code></dt>
|
|
<dd>How many users in the container are allowed to map to the host's users.</dd>
|
|
</dl>
|
|
|
|
<pre>
|
|
<idmap>
|
|
<uid start='0' target='1000' count='10'/>
|
|
<gid start='0' target='1000' count='10'/>
|
|
</idmap>
|
|
</pre>
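<p>
Putting the container pieces together, the following sketch assumes an LXC
guest whose init runs as numeric user 1000 and whose IDs are shifted into an
unprivileged host range. The guest name, the init path and the chosen ID
range are hypothetical.
</p>
<pre>
<domain type='lxc'>
  <name>Container1</name>
  <os>
    <type arch='x86_64'>exe</type>
    <init>/sbin/init</init>
    <!-- the leading + forces 1000 to be read as a numeric ID -->
    <inituser>+1000</inituser>
    <initgroup>+1000</initgroup>
  </os>
  <idmap>
    <!-- container IDs 0..65535 map to host IDs 100000..165535 -->
    <uid start='0' target='100000' count='65536'/>
    <gid start='0' target='100000' count='65536'/>
  </idmap>
  ...</pre>
<p>
In general, container ID <code>n</code> (with <code>n</code> less than
<code>count</code>) corresponds to host ID <code>target + n</code>, so
the example in the preceding block maps container IDs 0 to 9 onto host IDs
1000 to 1009.
</p>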
|
|
|
|
|
|
<h3><a id="elementsSysinfo">SMBIOS System Information</a></h3>
|
|
|
|
<p>
|
|
Some hypervisors allow control over what system information is
|
|
presented to the guest (for example, SMBIOS fields can be
|
|
populated by a hypervisor and inspected via
|
|
the <code>dmidecode</code> command in the guest). The
|
|
optional <code>sysinfo</code> element covers all such categories
|
|
of information. <span class="since">Since 0.8.7</span>
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<os>
|
|
<smbios mode='sysinfo'/>
|
|
...
|
|
</os>
|
|
<sysinfo type='smbios'>
|
|
<bios>
|
|
<entry name='vendor'>LENOVO</entry>
|
|
</bios>
|
|
<system>
|
|
<entry name='manufacturer'>Fedora</entry>
|
|
<entry name='product'>Virt-Manager</entry>
|
|
<entry name='version'>0.9.4</entry>
|
|
</system>
|
|
<baseBoard>
|
|
<entry name='manufacturer'>LENOVO</entry>
|
|
<entry name='product'>20BE0061MC</entry>
|
|
<entry name='version'>0B98401 Pro</entry>
|
|
<entry name='serial'>W1KS427111E</entry>
|
|
</baseBoard>
|
|
<chassis>
|
|
<entry name='manufacturer'>Dell Inc.</entry>
|
|
<entry name='version'>2.12</entry>
|
|
<entry name='serial'>65X0XF2</entry>
|
|
<entry name='asset'>40000101</entry>
|
|
<entry name='sku'>Type3Sku1</entry>
|
|
</chassis>
|
|
<oemStrings>
|
|
<entry>myappname:some arbitrary data</entry>
|
|
<entry>otherappname:more arbitrary data</entry>
|
|
</oemStrings>
|
|
</sysinfo>
|
|
<sysinfo type='fwcfg'>
|
|
<entry name='opt/com.example/name'>example value</entry>
|
|
<entry name='opt/com.coreos/config' file='/tmp/provision.ign'/>
|
|
</sysinfo>
|
|
...</pre>
|
|
|
|
<p>
|
|
The <code>sysinfo</code> element has a mandatory
|
|
attribute <code>type</code> that determines the layout of
|
|
sub-elements, with supported values of:
|
|
</p>
|
|
|
|
<dl>
|
|
<dt><code>smbios</code></dt>
|
|
<dd>Sub-elements call out specific SMBIOS values, which will
|
|
affect the guest if used in conjunction with
|
|
the <code>smbios</code> sub-element of
|
|
the <a href="#elementsOS"><code>os</code></a> element. Each
|
|
sub-element of <code>sysinfo</code> names an SMBIOS block, and
|
|
within those elements can be a list of <code>entry</code>
|
|
elements that describe a field within the block. The following
|
|
blocks and entries are recognized:
|
|
<dl>
|
|
<dt><code>bios</code></dt>
|
|
<dd>
|
|
This is block 0 of SMBIOS, with entry names drawn from:
|
|
<dl>
|
|
<dt><code>vendor</code></dt>
|
|
<dd>BIOS Vendor's Name</dd>
|
|
<dt><code>version</code></dt>
|
|
<dd>BIOS Version</dd>
|
|
<dt><code>date</code></dt>
|
|
<dd>BIOS release date. If supplied, is in either mm/dd/yy or
|
|
mm/dd/yyyy format. If the year portion of the string is
|
|
two digits, the year is assumed to be 19yy.</dd>
|
|
<dt><code>release</code></dt>
|
|
<dd>System BIOS Major and Minor release number values
|
|
concatenated together as one string separated by
|
|
a period, for example, 10.22.</dd>
|
|
</dl>
|
|
</dd>
|
|
<dt><code>system</code></dt>
|
|
<dd>
|
|
This is block 1 of SMBIOS, with entry names drawn from:
|
|
<dl>
|
|
<dt><code>manufacturer</code></dt>
|
|
<dd>Manufacturer of the system</dd>
|
|
<dt><code>product</code></dt>
|
|
<dd>Product Name</dd>
|
|
<dt><code>version</code></dt>
|
|
<dd>Version of the product</dd>
|
|
<dt><code>serial</code></dt>
|
|
<dd>Serial number</dd>
|
|
<dt><code>uuid</code></dt>
|
|
<dd>Universally Unique ID number. If this entry is provided
|
|
alongside a top-level
|
|
<a href="#elementsMetadata"><code>uuid</code></a> element,
|
|
then the two values must match.</dd>
|
|
<dt><code>sku</code></dt>
|
|
<dd>SKU number to identify a particular configuration.</dd>
|
|
<dt><code>family</code></dt>
|
|
<dd>Identify the family a particular computer belongs to.</dd>
|
|
</dl>
|
|
</dd>
|
|
<dt><code>baseBoard</code></dt>
|
|
<dd>
|
|
This is block 2 of SMBIOS. This element can be repeated multiple
|
|
times to describe all the base boards; however, not all
|
|
hypervisors necessarily support the repetition. The element can
|
|
have the following children:
|
|
<dl>
|
|
<dt><code>manufacturer</code></dt>
|
|
<dd>Manufacturer of the base board</dd>
|
|
<dt><code>product</code></dt>
|
|
<dd>Product Name</dd>
|
|
<dt><code>version</code></dt>
|
|
<dd>Version of the product</dd>
|
|
<dt><code>serial</code></dt>
|
|
<dd>Serial number</dd>
|
|
<dt><code>asset</code></dt>
|
|
<dd>Asset tag</dd>
|
|
<dt><code>location</code></dt>
|
|
<dd>Location in chassis</dd>
|
|
</dl>
|
|
NB: Incorrectly supplied entries for the
|
|
<code>bios</code>, <code>system</code> or <code>baseBoard</code>
|
|
blocks will be ignored without error. Other than <code>uuid</code>
|
|
validation and <code>date</code> format checking, all values are
|
|
passed as strings to the hypervisor driver.
|
|
</dd>
|
|
<dt><code>chassis</code></dt>
|
|
<dd>
|
|
<span class="since">Since 4.1.0,</span> this is block 3 of
|
|
SMBIOS, with entry names drawn from:
|
|
<dl>
|
|
<dt><code>manufacturer</code></dt>
|
|
<dd>Manufacturer of Chassis</dd>
|
|
<dt><code>version</code></dt>
|
|
<dd>Version of the Chassis</dd>
|
|
<dt><code>serial</code></dt>
|
|
<dd>Serial number</dd>
|
|
<dt><code>asset</code></dt>
|
|
<dd>Asset tag</dd>
|
|
<dt><code>sku</code></dt>
|
|
<dd>SKU number</dd>
|
|
</dl>
|
|
</dd>
|
|
<dt><code>oemStrings</code></dt>
|
|
<dd>
|
|
This is block 11 of SMBIOS. This element should appear once and
|
|
can have multiple <code>entry</code> child elements, each providing
|
|
arbitrary string data. There are no restrictions on what data can
|
|
be provided in the entries, however, if the data is intended to be
|
|
consumed by an application in the guest, it is recommended to use
|
|
the application name as a prefix in the string. (<span class="since">Since 4.1.0</span>)
|
|
</dd>
|
|
</dl>
|
|
</dd>
|
|
|
|
<dt><code>fwcfg</code></dt>
|
|
<dd>
|
|
Some hypervisors provide a unified way to tweak how firmware configures
|
|
itself, or may contain tables to be installed for the guest OS, for
|
|
instance boot order, ACPI, SMBIOS, etc. It even allows users to define
|
|
their own config blobs. In the case of QEMU, these then appear under the domain's
sysfs, under <code>/sys/firmware/qemu_fw_cfg</code>. Note that these
values apply regardless of the <smbios/> mode under <os/>.
|
|
<span class="since">Since 6.5.0</span>
|
|
|
|
<pre>
|
|
<sysinfo type='fwcfg'>
  <entry name='opt/com.example/name'>example value</entry>
  <entry name='opt/com.coreos/config' file='/tmp/provision.ign'/>
</sysinfo>
|
|
</pre>
|
|
|
|
The <code>sysinfo</code> element can have multiple <code>entry</code>
child elements. Each element then has a mandatory <code>name</code>
attribute, which defines the name of the blob and must begin with
<code>"opt/"</code>. To avoid clashing with other names, it is advised to
be in the form <code>"opt/$RFQDN/$name"</code> where <code>$RFQDN</code> is a
reverse fully qualified domain name you control.
Then, the element can either contain the value (to set the blob value
directly), or a <code>file</code> attribute (to set the blob value from
the file).
|
|
</dd>
|
|
</dl>
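<p>
Two of the constraints above are easy to get wrong, so here is a small
sketch of them: the <code>uuid</code> entry in the <code>system</code>
block repeats the top-level <code>uuid</code> exactly, and the BIOS
<code>date</code> uses the mm/dd/yyyy form. The date and release values
are made-up sample data.
</p>
<pre>
<domain type='kvm'>
  <name>MyGuest</name>
  <uuid>4dea22b3-1d52-d8f3-2516-782e98ab3fa0</uuid>
  <os>
    <type>hvm</type>
    <!-- without mode='sysinfo' the smbios values below do not reach the guest -->
    <smbios mode='sysinfo'/>
  </os>
  <sysinfo type='smbios'>
    <bios>
      <entry name='date'>01/31/2020</entry>
      <entry name='release'>10.22</entry>
    </bios>
    <system>
      <!-- must match the top-level uuid element -->
      <entry name='uuid'>4dea22b3-1d52-d8f3-2516-782e98ab3fa0</entry>
    </system>
  </sysinfo>
  ...</pre>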
|
|
|
|
<h3><a id="elementsCPUAllocation">CPU Allocation</a></h3>
|
|
|
|
<pre>
|
|
<domain>
|
|
...
|
|
<vcpu placement='static' cpuset="1-4,^3,6" current="1">2</vcpu>
|
|
<vcpus>
|
|
<vcpu id='0' enabled='yes' hotpluggable='no' order='1'/>
|
|
<vcpu id='1' enabled='no' hotpluggable='yes'/>
|
|
</vcpus>
|
|
...
|
|
</domain>
|
|
</pre>
|
|
|
|
<dl>
|
|
<dt><code>vcpu</code></dt>
|
|
<dd>The content of this element defines the maximum number of virtual
|
|
CPUs allocated for the guest OS, which must be between 1 and
|
|
the maximum supported by the hypervisor.
|
|
<dl>
|
|
<dt><code>cpuset</code></dt>
|
|
<dd>
|
|
The optional attribute <code>cpuset</code> is a comma-separated
|
|
list of physical CPU numbers that domain process and virtual CPUs
|
|
can be pinned to by default. (NB: The pinning policy of domain
|
|
process and virtual CPUs can be specified separately by
|
|
<code>cputune</code>. If the attribute <code>emulatorpin</code>
|
|
of <code>cputune</code> is specified, the <code>cpuset</code>
|
|
specified by <code>vcpu</code> here will be ignored. Similarly,
|
|
for virtual CPUs which have the <code>vcpupin</code> specified,
|
|
the <code>cpuset</code> specified by <code>cpuset</code> here
|
|
will be ignored. For virtual CPUs which don't have
|
|
<code>vcpupin</code> specified, each will be pinned to the physical
|
|
CPUs specified by <code>cpuset</code> here).
|
|
Each element in that list is either a single CPU number,
|
|
a range of CPU numbers, or a caret followed by a CPU number to
|
|
be excluded from a previous range.
|
|
<span class="since">Since 0.4.4</span>
|
|
</dd>
|
|
<dt><code>current</code></dt>
|
|
<dd>
|
|
The optional attribute <code>current</code> can
|
|
be used to specify whether fewer than the maximum number of
|
|
virtual CPUs should be enabled.
|
|
<span class="since">Since 0.8.5</span>
|
|
</dd>
|
|
<dt><code>placement</code></dt>
|
|
<dd>
|
|
The optional attribute <code>placement</code> can be used to
|
|
indicate the CPU placement mode for domain process. The value can
|
|
be either "static" or "auto", but defaults to <code>placement</code>
|
|
of <code>numatune</code> or "static" if <code>cpuset</code> is
|
|
specified. Using "auto" indicates the domain process will be pinned
|
|
to the advisory nodeset from querying numad and the value of
|
|
attribute <code>cpuset</code> will be ignored if it's specified.
|
|
If neither <code>cpuset</code> nor <code>placement</code> is
|
|
specified or if <code>placement</code> is "static", but no
|
|
<code>cpuset</code> is specified, the domain process will be
|
|
pinned to all the available physical CPUs.
|
|
<span class="since">Since 0.9.11 (QEMU and KVM only)</span>
|
|
</dd>
|
|
</dl>
|
|
</dd>
|
|
<dt><code>vcpus</code></dt>
|
|
<dd>
|
|
The <code>vcpus</code> element allows controlling the state of individual vCPUs.
|
|
|
|
The <code>id</code> attribute specifies the vCPU id as used by libvirt
|
|
in other places such as vCPU pinning, scheduler information and NUMA
|
|
assignment. Note that the vCPU ID as seen in the guest may differ from
|
|
libvirt ID in certain cases. Valid IDs are from 0 to the maximum vCPU
|
|
count as set by the <code>vcpu</code> element minus 1.
|
|
|
|
The <code>enabled</code> attribute controls the state of the
|
|
vCPU. Valid values are <code>yes</code> and <code>no</code>.
|
|
|
|
<code>hotpluggable</code> controls whether given vCPU can be hotplugged
|
|
and hotunplugged in cases when the CPU is enabled at boot. Note that
|
|
all disabled vCPUs must be hotpluggable. Valid values are
|
|
<code>yes</code> and <code>no</code>.
|
|
|
|
<code>order</code> specifies the order in which the online vCPUs are added.
For hypervisors/platforms that require inserting multiple vCPUs at once,
the order may be duplicated across all vCPUs that need to be
enabled at once. Specifying order is not necessary; vCPUs are then
|
|
added in an arbitrary order. If order info is used, it must be used for
|
|
all online vCPUs. Hypervisors may clear or update ordering information
|
|
during certain operations to assure valid configuration.
|
|
|
|
Note that hypervisors may create hotpluggable vCPUs differently from
|
|
boot vCPUs, thus special initialization may be necessary.
|
|
|
|
Hypervisors may require that vCPUs enabled on boot which are not
|
|
hotpluggable are clustered at the beginning starting with ID 0. It may
|
|
be also required that vCPU 0 is always present and non-hotpluggable.
|
|
|
|
Note that providing state for individual CPUs may be necessary to enable
|
|
support of addressable vCPU hotplug and this feature may not be
|
|
supported by all hypervisors.
|
|
|
|
For QEMU the following conditions are required. vCPU 0 needs to be
|
|
enabled and non-hotpluggable. On PPC64 along with it vCPUs that are in
|
|
the same core need to be enabled as well. All non-hotpluggable CPUs
|
|
present at boot need to be grouped after vCPU 0.
|
|
<span class="since">Since 2.2.0 (QEMU only)</span>
|
|
</dd>
|
|
</dl>
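<p>
As an illustration of the QEMU conditions described above, the following
sketch shows a four-vCPU guest that boots with two vCPUs online. The vCPU
counts, the IDs and the <code>order</code> values are assumptions chosen
for this example rather than recommended settings, and the
<code>current</code> attribute is assumed to match the number of enabled
vCPUs. vCPU 0 is enabled and non-hotpluggable, vCPU 1 is enabled but may
be unplugged later, and both disabled vCPUs are marked hotpluggable.
</p>
<pre>
<domain>
  ...
  <vcpu current='2'>4</vcpu>
  <vcpus>
    <!-- vCPU 0: enabled and non-hotpluggable (illustrative order value) -->
    <vcpu id='0' enabled='yes' hotpluggable='no' order='1'/>
    <!-- vCPU 1: online at boot, may be hotunplugged later -->
    <vcpu id='1' enabled='yes' hotpluggable='yes' order='2'/>
    <!-- disabled vCPUs must be hotpluggable; no order is given for them -->
    <vcpu id='2' enabled='no' hotpluggable='yes'/>
    <vcpu id='3' enabled='no' hotpluggable='yes'/>
  </vcpus>
  ...
</domain>
</pre>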
|
|
|
|
<h3><a id="elementsIOThreadsAllocation">IOThreads Allocation</a></h3>
|
|
<p>
|
|
IOThreads are dedicated event loop threads for supported disk
|
|
devices to perform block I/O requests in order to improve
|
|
scalability especially on an SMP host/guest with many LUNs.
|
|
<span class="since">Since 1.2.8 (QEMU only)</span>
|
|
</p>
|
|
|
|
<pre>
|
|
<domain>
|
|
...
|
|
<iothreads>4</iothreads>
|
|
...
|
|
</domain>
|
|
</pre>
|
|
<pre>
|
|
<domain>
|
|
...
|
|
<iothreadids>
|
|
<iothread id="2"/>
|
|
<iothread id="4"/>
|
|
<iothread id="6"/>
|
|
<iothread id="8"/>
|
|
</iothreadids>
|
|
...
|
|
</domain>
|
|
</pre>
|
|
|
|
<dl>
|
|
<dt><code>iothreads</code></dt>
|
|
<dd>
|
|
The content of this optional element defines the number
|
|
of IOThreads to be assigned to the domain for use by
|
|
supported target storage devices. There
|
|
should be only 1 or 2 IOThreads per host CPU. There may be more
|
|
than one supported device assigned to each IOThread.
|
|
<span class="since">Since 1.2.8</span>
|
|
</dd>
|
|
<dt><code>iothreadids</code></dt>
|
|
<dd>
|
|
The optional <code>iothreadids</code> element provides the capability
|
|
to specifically define the IOThread IDs for the domain. By default,
|
|
IOThread IDs are sequentially numbered starting from 1 through the
|
|
number of <code>iothreads</code> defined for the domain. The
|
|
<code>id</code> attribute is used to define the IOThread ID. The
|
|
<code>id</code> attribute must be a positive integer.
|
|
If there are fewer <code>iothreadids</code> defined than
|
|
<code>iothreads</code> defined for the domain, then libvirt will
|
|
sequentially fill <code>iothreadids</code> starting at 1 avoiding
|
|
any predefined <code>id</code>. If there are more
|
|
<code>iothreadids</code> defined than <code>iothreads</code>
|
|
defined for the domain, then the <code>iothreads</code> value
|
|
will be adjusted accordingly.
|
|
<span class="since">Since 1.2.15</span>
|
|
</dd>
|
|
</dl>
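<p>
The interaction between the two elements can be sketched as follows. The
values are illustrative assumptions: four IOThreads are requested but only
two IDs are predefined. Following the fill rule described above, libvirt
would be expected to add the missing IDs starting at 1 and skipping the
predefined ones, giving the set 1, 2, 3 and 4.
</p>
<pre>
<domain>
  ...
  <iothreads>4</iothreads>
  <iothreadids>
    <!-- predefined IDs; the two remaining IDs (1 and 3) are filled in -->
    <iothread id="2"/>
    <iothread id="4"/>
  </iothreadids>
  ...
</domain>
</pre>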
|
|
|
|
<h3><a id="elementsCPUTuning">CPU Tuning</a></h3>
|
|
|
|
<pre>
|
|
<domain>
|
|
...
|
|
<cputune>
|
|
<vcpupin vcpu="0" cpuset="1-4,^2"/>
|
|
<vcpupin vcpu="1" cpuset="0,1"/>
|
|
<vcpupin vcpu="2" cpuset="2,3"/>
|
|
<vcpupin vcpu="3" cpuset="0,4"/>
|
|
<emulatorpin cpuset="1-3"/>
|
|
<iothreadpin iothread="1" cpuset="5,6"/>
|
|
<iothreadpin iothread="2" cpuset="7,8"/>
|
|
<shares>2048</shares>
|
|
<period>1000000</period>
|
|
<quota>-1</quota>
|
|
<global_period>1000000</global_period>
|
|
<global_quota>-1</global_quota>
|
|
<emulator_period>1000000</emulator_period>
|
|
<emulator_quota>-1</emulator_quota>
|
|
<iothread_period>1000000</iothread_period>
|
|
<iothread_quota>-1</iothread_quota>
|
|
<vcpusched vcpus='0-4,^3' scheduler='fifo' priority='1'/>
|
|
<iothreadsched iothreads='2' scheduler='batch'/>
|
|
<cachetune vcpus='0-3'>
|
|
<cache id='0' level='3' type='both' size='3' unit='MiB'/>
|
|
<cache id='1' level='3' type='both' size='3' unit='MiB'/>
|
|
<monitor level='3' vcpus='1'/>
|
|
<monitor level='3' vcpus='0-3'/>
|
|
</cachetune>
|
|
<cachetune vcpus='4-5'>
|
|
<monitor level='3' vcpus='4'/>
|
|
<monitor level='3' vcpus='5'/>
|
|
</cachetune>
|
|
<memorytune vcpus='0-3'>
|
|
<node id='0' bandwidth='60'/>
|
|
</memorytune>
|
|
|
|
</cputune>
|
|
...
|
|
</domain>
|
|
</pre>
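<p>
The <code>cpuset</code> and <code>vcpus</code> values in the example use
the list syntax of the <code>cpuset</code> attribute of the
<code>vcpu</code> element, assumed here to be a comma-separated list of
single numbers and dash-separated ranges, where a leading caret excludes a
previously included number. The annotated fragment below repeats three
lines of the example with the sets they would select.
</p>
<pre>
<cputune>
  <!-- "1-4,^2": range 1 to 4 with 2 excluded, i.e. host CPUs 1, 3 and 4 -->
  <vcpupin vcpu="0" cpuset="1-4,^2"/>
  <!-- "0,1": plain list, i.e. host CPUs 0 and 1 -->
  <vcpupin vcpu="1" cpuset="0,1"/>
  <!-- '0-4,^3': vCPUs 0, 1, 2 and 4 -->
  <vcpusched vcpus='0-4,^3' scheduler='fifo' priority='1'/>
</cputune>
</pre>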
|
|
|
|
<dl>
|
|
<dt><code>cputune</code></dt>
|
|
<dd>
|
|
The optional <code>cputune</code> element provides details
|
|
regarding the CPU tunable parameters for the domain.
|
|
Note: for the qemu driver, the optional <code>vcpupin</code>
|
|
and <code>emulatorpin</code> pinning settings are honored after
|
|
the emulator is launched and NUMA constraints considered. This
|
|
means that it is expected that other physical CPUs of the host
|
|
will be used during this time by the domain, which will be
|
|
reflected by the output of <code>virsh cpu-stats</code>.
|
|
<span class="since">Since 0.9.0</span>
|
|
</dd>
|
|
<dt><code>vcpupin</code></dt>
|
|
<dd>
|
|
The optional <code>vcpupin</code> element specifies which of the host's
|
|
physical CPUs the domain vCPU will be pinned to. If this is omitted,
|
|
and attribute <code>cpuset</code> of element <code>vcpu</code> is
|
|
not specified, the vCPU is pinned to all the physical CPUs by default.
|
|
It contains two required attributes, the attribute <code>vcpu</code>
|
|
specifies the vCPU ID, and the attribute <code>cpuset</code> is the same as
|
|
attribute <code>cpuset</code> of element <code>vcpu</code>.
|
|
(NB: only supported by the QEMU driver)
|
|
<span class="since">Since 0.9.0</span>
|
|
</dd>
|
|
<dt><code>emulatorpin</code></dt>
|
|
<dd>
|
|
The optional <code>emulatorpin</code> element specifies which of the host
|
|
physical CPUs the "emulator", a subset of a domain not including vCPU
|
|
or iothreads, will be pinned to. If this is omitted, and attribute
|
|
<code>cpuset</code> of element <code>vcpu</code> is not specified,
|
|
"emulator" is pinned to all the physical CPUs by default. It contains
|
|
one required attribute <code>cpuset</code> specifying which physical
|
|
CPUs to pin to.
|
|
</dd>
|
|
<dt><code>iothreadpin</code></dt>
|
|
<dd>
|
|
The optional <code>iothreadpin</code> element specifies which of the host
|
|
physical CPUs the IOThreads will be pinned to. If this is omitted
|
|
and attribute <code>cpuset</code> of element <code>vcpu</code> is
|
|
not specified, the IOThreads are pinned to all the physical CPUs
|
|
by default. There are two required attributes, the attribute
|
|
<code>iothread</code> specifies the IOThread ID and the attribute
|
|
<code>cpuset</code> specifies which physical CPUs to pin to. See
|
|
the <code>iothreadids</code>
|
|
<a href="#elementsIOThreadsAllocation"><code>description</code></a>
|
|
for valid <code>iothread</code> values.
|
|
<span class="since">Since 1.2.9</span>
|
|
</dd>
|
|
<dt><code>shares</code></dt>
|
|
<dd>
|
|
The optional <code>shares</code> element specifies the proportional
|
|
weighted share for the domain. If this is omitted, it defaults to
|
|
the OS provided defaults. NB: there is no unit for the value;
|
|
it's a relative measure based on the settings of other VMs,
|
|
e.g. a VM configured with value
|
|
2048 will get twice as much CPU time as a VM configured with value 1024.
|
|
<span class="since">Since 0.9.0</span>
|
|
</dd>
|
|
<dt><code>period</code></dt>
|
|
<dd>
|
|
The optional <code>period</code> element specifies the enforcement
|
|
interval (unit: microseconds). Within <code>period</code>, each vCPU of
|
|
the domain will not be allowed to consume more than <code>quota</code>
|
|
worth of runtime. The value should be in range [1000, 1000000]. A period
|
|
with value 0 means no value.
|
|
<span class="since">Only QEMU driver support since 0.9.4, LXC since
|
|
0.9.10</span>
|
|
</dd>
|
|
<dt><code>quota</code></dt>
|
|
<dd>
|
|
The optional <code>quota</code> element specifies the maximum allowed
|
|
bandwidth (unit: microseconds). A domain with <code>quota</code> as any
|
|
negative value indicates that the domain has infinite bandwidth for
|
|
vCPU threads, which means that it is not bandwidth controlled. The value
|
|
should be in range [1000, 18446744073709551] or less than 0. A quota
|
|
with value 0 means no value. You can use this feature to ensure that all
|
|
vCPUs run at the same speed.
|
|
<span class="since">Only QEMU driver support since 0.9.4, LXC since
|
|
0.9.10</span>
|
|
</dd>
|
|
<dt><code>global_period</code></dt>
|
|
<dd>
|
|
The optional <code>global_period</code> element specifies the
|
|
enforcement CFS scheduler interval (unit: microseconds) for the whole
|
|
domain in contrast with <code>period</code> which enforces the interval
|
|
per vCPU. The value should be in range [1000, 1000000]. A
|
|
<code>global_period</code> with value 0 means no value.
|
|
<span class="since">Only QEMU driver support since 1.3.3</span>
|
|
</dd>
|
|
<dt><code>global_quota</code></dt>
|
|
<dd>
|
|
The optional <code>global_quota</code> element specifies the maximum
|
|
allowed bandwidth (unit: microseconds) within a period for the whole
|
|
domain. A domain with <code>global_quota</code> as any negative
|
|
value indicates that the domain has infinite bandwidth, which means that
|
|
it is not bandwidth controlled. The value should be in range
|
|
[1000, 18446744073709551] or less than 0. A <code>global_quota</code>
|
|
with value 0 means no value.
|
|
<span class="since">Only QEMU driver support since 1.3.3</span>
|
|
</dd>
|
|
|
|
<dt><code>emulator_period</code></dt>
|
|
<dd>
|
|
The optional <code>emulator_period</code> element specifies the enforcement
|
|
interval (unit: microseconds). Within <code>emulator_period</code>, emulator
|
|
threads (those excluding vCPUs) of the domain will not be allowed to consume
|
|
more than <code>emulator_quota</code> worth of runtime. The value should be
|
|
in range [1000, 1000000]. A period with value 0 means no value.
|
|
<span class="since">Only QEMU driver support since 0.10.0</span>
|
|
</dd>
|
|
<dt><code>emulator_quota</code></dt>
|
|
<dd>
|
|
The optional <code>emulator_quota</code> element specifies the maximum
|
|
allowed bandwidth (unit: microseconds) for domain's emulator threads (those
|
|
excluding vCPUs). A domain with <code>emulator_quota</code> as any negative
|
|
value indicates that the domain has infinite bandwidth for emulator threads
|
|
(those excluding vCPUs), which means that it is not bandwidth controlled.
|
|
The value should be in range [1000, 18446744073709551] or less than 0. A
|
|
quota with value 0 means no value.
|
|
<span class="since">Only QEMU driver support since 0.10.0</span>
|
|
</dd>
|
|
|
|
<dt><code>iothread_period</code></dt>
|
|
<dd>
|
|
The optional <code>iothread_period</code> element specifies the
|
|
enforcement interval (unit: microseconds) for IOThreads. Within
|
|
<code>iothread_period</code>, each IOThread of the domain will
|
|
not be allowed to consume more than <code>iothread_quota</code>
|
|
worth of runtime. The value should be in range [1000, 1000000].
|
|
An <code>iothread_period</code> with value 0 means no value.
|
|
<span class="since">Only QEMU driver support since 2.1.0</span>
|
|
</dd>
|
|
<dt><code>iothread_quota</code></dt>
|
|
<dd>
|
|
The optional <code>iothread_quota</code> element specifies the maximum
|
|
allowed bandwidth (unit: microseconds) for IOThreads. A domain with
|
|
<code>iothread_quota</code> as any negative value indicates that the
|
|
domain IOThreads have infinite bandwidth, which means that it is
|
|
not bandwidth controlled. The value should be in range
|
|
[1000, 18446744073709551] or less than 0. An <code>iothread_quota</code>
|
|
with value 0 means no value. You can use this feature to ensure that
|
|
all IOThreads run at the same speed.
|
|
<span class="since">Only QEMU driver support since 2.1.0</span>
|
|
</dd>
|
|
|
|
<dt><code>vcpusched</code>, <code>iothreadsched</code>
|
|
and <code>emulatorsched</code></dt>
|
|
<dd>
|
|
The optional
|
|
<code>vcpusched</code>, <code>iothreadsched</code>
|
|
and <code>emulatorsched</code> elements specify the scheduler type
|
|
(values <code>batch</code>, <code>idle</code>, <code>fifo</code>,
|
|
<code>rr</code>) for particular vCPU, IOThread and emulator threads
|
|
respectively. For <code>vcpusched</code> and <code>iothreadsched</code>
|
|
the attributes <code>vcpus</code> and <code>iothreads</code> select
|
|
which vCPUs/IOThreads this setting applies to; leaving them out sets the
|
|
default. The element <code>emulatorsched</code> does not have that
|
|
attribute. Valid <code>vcpus</code> values start at 0 through one less
|
|
than the number of vCPUs defined for the
|
|
domain. Valid <code>iothreads</code> values are described in
|
|
the <code>iothreadids</code>
|
|
<a href="#elementsIOThreadsAllocation"><code>description</code></a>.
|
|
If no <code>iothreadids</code> are defined, then libvirt numbers
|
|
IOThreads from 1 to the number of <code>iothreads</code> available
|
|
for the domain. For real-time schedulers (<code>fifo</code>,
|
|
<code>rr</code>), priority must be specified as
|
|
well (and is ignored for non-real-time ones). The value range
|
|
for the priority depends on the host kernel (usually 1-99).
|
|
<span class="since">Since 1.2.13</span>
|
|
<code>emulatorsched</code> <span class="since">since 5.3.0</span>
|
|
</dd>
|
|
|
|
<dt><code>cachetune</code><span class="since">Since 4.1.0</span></dt>
|
|
<dd>
|
|
The optional <code>cachetune</code> element can control allocations for CPU
|
|
caches using the resctrl on the host. Whether or not this is supported
|
|
can be gathered from capabilities where some limitations like minimum
|
|
size and required granularity are reported as well. The required
|
|
attribute <code>vcpus</code> specifies to which vCPUs this allocation
|
|
applies. A vCPU can only be a member of one <code>cachetune</code> element
|
|
allocation. The vCPUs specified by cachetune can be identical with those
|
|
in memorytune; otherwise they are not allowed to overlap.
|
|
Supported subelements are:
|
|
<dl>
|
|
<dt><code>cache</code></dt>
|
|
<dd>
|
|
This optional element controls the allocation of CPU cache and has
|
|
the following attributes:
|
|
<dl>
|
|
<dt><code>level</code></dt>
|
|
<dd>
|
|
Host cache level from which to allocate.
|
|
</dd>
|
|
<dt><code>id</code></dt>
|
|
<dd>
|
|
Host cache id from which to allocate.
|
|
</dd>
|
|
<dt><code>type</code></dt>
|
|
<dd>
|
|
Type of allocation. Can be <code>code</code> for code
|
|
(instructions), <code>data</code> for data or <code>both</code>
|
|
for both code and data (unified). Currently the allocation can
|
|
be done only with the same type as the host supports, meaning
|
|
you cannot request <code>both</code> for host with CDP
|
|
(code/data prioritization) enabled.
|
|
</dd>
|
|
<dt><code>size</code></dt>
|
|
<dd>
|
|
The size of the region to allocate. The value by default is in
|
|
bytes, but the <code>unit</code> attribute can be used to scale
|
|
the value.
|
|
</dd>
|
|
<dt><code>unit</code> (optional)</dt>
|
|
<dd>
|
|
If specified it is the unit such as KiB, MiB, GiB, or TiB
|
|
(described in the <code>memory</code> element
|
|
for <a href="#elementsMemoryAllocation">Memory Allocation</a>)
|
|
in which <code>size</code> is specified, defaults to bytes.
|
|
</dd>
|
|
</dl>
|
|
</dd>
|
|
<dt><code>monitor</code><span class="since">Since 4.10.0</span></dt>
|
|
<dd>
|
|
The optional element <code>monitor</code> creates the cache
|
|
monitor(s) for current cache allocation and has the following
|
|
required attributes:
|
|
<dl>
|
|
<dt><code>level</code></dt>
|
|
<dd>
|
|
Host cache level the monitor belongs to.
|
|
</dd>
|
|
<dt><code>vcpus</code></dt>
|
|
<dd>
|
|
vCPU list the monitor applies to. A monitor's vCPU list
|
|
can only contain members of the vCPU list of the associated
|
|
allocation. The default monitor has the same vCPU list as the
|
|
associated allocation. For non-default monitors, overlapping
|
|
vCPUs are not permitted.
|
|
</dd>
|
|
</dl>
|
|
</dd>
|
|
</dl>
|
|
</dd>
|
|
|
|
<dt><code>memorytune</code><span class="since">Since 4.7.0</span></dt>
|
|
<dd>
|
|
The optional <code>memorytune</code> element can control allocations for
|
|
memory bandwidth using the resctrl on the host. Whether or not this is
|
|
supported can be gathered from capabilities where some limitations like
|
|
minimum bandwidth and required granularity are reported as well. The
|
|
required attribute <code>vcpus</code> specifies to which vCPUs this
|
|
allocation applies. A vCPU can only be a member of one
|
|
<code>memorytune</code> element allocation. The <code>vcpus</code> specified
|
|
by <code>memorytune</code> can be identical to those specified by
|
|
<code>cachetune</code>; otherwise they are not allowed to overlap each other.
|
|
Supported subelements are:
|
|
<dl>
|
|
<dt><code>node</code></dt>
|
|
<dd>
|
|
This element controls the allocation of CPU memory bandwidth and has the
|
|
following attributes:
|
|
<dl>
|
|
<dt><code>id</code></dt>
|
|
<dd>
|
|
Host node id from which to allocate memory bandwidth.
|
|
</dd>
|
|
<dt><code>bandwidth</code></dt>
|
|
<dd>
|
|
The memory bandwidth to allocate from this node. The value by default
|
|
is a percentage.
|
|
</dd>
|
|
</dl>
|
|
</dd>
|
|
</dl>
|
|
</dd>
|
|
</dl>
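<p>
The relationship between the period and quota elements can be read as a
ratio. The sketch below uses illustrative values and assumes the usual
bandwidth-control interpretation in which quota divided by period is the
share of one host CPU that may be consumed: each vCPU thread is limited to
50000 / 100000 = 0.5 of a host CPU, while the domain as a whole is limited
to 200000 / 100000 = 2 host CPUs.
</p>
<pre>
<domain>
  ...
  <cputune>
    <!-- per vCPU: 50000 us of runtime in every 100000 us window (50%) -->
    <period>100000</period>
    <quota>50000</quota>
    <!-- whole domain: 200000 us in every 100000 us window (2 host CPUs) -->
    <global_period>100000</global_period>
    <global_quota>200000</global_quota>
  </cputune>
  ...
</domain>
</pre>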
|
|
|
|
|
|
<h3><a id="elementsMemoryAllocation">Memory Allocation</a></h3>
|
|
|
|
<pre>
|
|
<domain>
|
|
...
|
|
<maxMemory slots='16' unit='KiB'>1524288</maxMemory>
|
|
<memory unit='KiB'>524288</memory>
|
|
<currentMemory unit='KiB'>524288</currentMemory>
|
|
...
|
|
</domain>
|
|
</pre>
|
|
|
|
<dl>
|
|
<dt><code>memory</code></dt>
|
|
<dd>The maximum allocation of memory for the guest at boot time. The
|
|
memory allocation includes possible additional memory devices specified
|
|
at start or hotplugged later.
|
|
The units for this value are determined by the optional
|
|
attribute <code>unit</code>, which defaults to "KiB"
|
|
(kibibytes, 2<sup>10</sup> or blocks of 1024 bytes). Valid
|
|
units are "b" or "bytes" for bytes, "KB" for kilobytes
|
|
(10<sup>3</sup> or 1,000 bytes), "k" or "KiB" for kibibytes
|
|
(1024 bytes), "MB" for megabytes (10<sup>6</sup> or 1,000,000
|
|
bytes), "M" or "MiB" for mebibytes (2<sup>20</sup> or
|
|
1,048,576 bytes), "GB" for gigabytes (10<sup>9</sup> or
|
|
1,000,000,000 bytes), "G" or "GiB" for gibibytes
|
|
(2<sup>30</sup> or 1,073,741,824 bytes), "TB" for terabytes
|
|
(10<sup>12</sup> or 1,000,000,000,000 bytes), or "T" or "TiB"
|
|
for tebibytes (2<sup>40</sup> or 1,099,511,627,776 bytes).
|
|
However, the value will be rounded up to the nearest kibibyte
|
|
by libvirt, and may be further rounded to the granularity
|
|
supported by the hypervisor. Some hypervisors also enforce a
|
|
minimum, such as 4000KiB.
|
|
|
|
In case <a href="#elementsCPU">NUMA</a> is configured for the guest the
|
|
<code>memory</code> element can be omitted.
|
|
|
|
In the case of a crash, the optional attribute <code>dumpCore</code>
|
|
can be used to control whether the guest memory should be
|
|
included in the generated coredump or not (values "on", "off").
|
|
|
|
<span class='since'><code>unit</code> since 0.9.11</span>,
|
|
<span class='since'><code>dumpCore</code> since 0.10.2
|
|
(QEMU only)</span></dd>
|
|
<dt><code>maxMemory</code></dt>
|
|
<dd>The run time maximum memory allocation of the guest. The initial
|
|
memory specified by either the <code><memory></code> element or
|
|
the NUMA cell size configuration can be increased by hot-plugging of
|
|
memory to the limit specified by this element.
|
|
|
|
The <code>unit</code> attribute behaves the same as for
|
|
<code><memory></code>.
|
|
|
|
The <code>slots</code> attribute specifies the number of slots
|
|
available for adding memory to the guest. The bounds are hypervisor
|
|
specific.
|
|
|
|
Note that due to alignment of the memory chunks added via memory
|
|
hotplug the full size allocation specified by this element may be
|
|
impossible to achieve.
|
|
<span class='since'>Since 1.2.14 supported by the QEMU driver.</span>
|
|
</dd>
|
|
|
|
<dt><code>currentMemory</code></dt>
|
|
<dd>The actual allocation of memory for the guest. This value can
|
|
be less than the maximum allocation, to allow for ballooning
|
|
up the guest's memory on the fly. If this is omitted, it defaults
|
|
to the same value as the <code>memory</code> element.
|
|
The <code>unit</code> attribute behaves the same as
|
|
for <code>memory</code>.</dd>
|
|
</dl>
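<p>
The effect of the <code>unit</code> attribute can be checked with a short
worked example. The sizes below are illustrative assumptions. A value of 2
with unit "GiB" is 2 * 2<sup>30</sup> bytes, i.e. 2097152 KiB, and 1024
with unit "MiB" is 1048576 KiB. By contrast, a value of 2 with the decimal
unit "GB" would be 2,000,000,000 bytes, i.e. 1953125 KiB.
</p>
<pre>
<domain>
  ...
  <!-- 2 GiB = 2097152 KiB available to the guest at boot -->
  <memory unit='GiB'>2</memory>
  <!-- 1024 MiB = 1048576 KiB actually allocated (ballooned down) -->
  <currentMemory unit='MiB'>1024</currentMemory>
  ...
</domain>
</pre>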
|
|
|
|
|
|
<h3><a id="elementsMemoryBacking">Memory Backing</a></h3>
|
|
|
|
<pre>
|
|
<domain>
|
|
...
|
|
<memoryBacking>
|
|
<hugepages>
|
|
<page size="1" unit="G" nodeset="0-3,5"/>
|
|
<page size="2" unit="M" nodeset="4"/>
|
|
</hugepages>
|
|
<nosharepages/>
|
|
<locked/>
|
|
<source type="file|anonymous|memfd"/>
|
|
<access mode="shared|private"/>
|
|
<allocation mode="immediate|ondemand"/>
|
|
<discard/>
|
|
</memoryBacking>
|
|
...
|
|
</domain>
|
|
</pre>
|
|
|
|
<p>The optional <code>memoryBacking</code> element may contain several
|
|
elements that influence how virtual memory pages are backed by host
|
|
pages.</p>
|
|
|
|
<dl>
|
|
<dt><code>hugepages</code></dt>
|
|
<dd>This tells the hypervisor that the guest should have its memory
|
|
allocated using hugepages instead of the normal native page size.
|
|
<span class='since'>Since 1.2.5</span> it's possible to set hugepages
|
|
more specifically per NUMA node. The <code>page</code> element is
|
|
introduced. It has one compulsory attribute <code>size</code> which
|
|
specifies which hugepages should be used (especially useful on systems
|
|
supporting hugepages of different sizes). The default unit for the
|
|
<code>size</code> attribute is kibibytes (multiplier of 1024). If you
|
|
want to use a different unit, use the optional <code>unit</code> attribute.
|
|
For systems with NUMA, the optional <code>nodeset</code> attribute may
|
|
come in handy as it ties the given guest NUMA nodes to certain hugepage
|
|
sizes. From the example snippet, one gigabyte hugepages are used for
|
|
every NUMA node except node number four. For the correct syntax see
|
|
<a href="#elementsNUMATuning">this</a>.</dd>
|
|
<dt><code>nosharepages</code></dt>
|
|
<dd>Instructs the hypervisor to disable shared pages (memory merge, KSM) for
|
|
this domain. <span class="since">Since 1.0.6</span></dd>
|
|
<dt><code>locked</code></dt>
|
|
<dd>When set and supported by the hypervisor, memory pages belonging
|
|
to the domain will be locked in the host's memory and the host will not
|
|
be allowed to swap them out, which might be required for some
|
|
workloads such as real-time. For QEMU/KVM guests, the memory used by
|
|
the QEMU process itself will be locked too: unlike guest memory, this
|
|
is an amount libvirt has no way of figuring out in advance, so it has
|
|
to remove the limit on locked memory altogether. Thus, enabling this
|
|
option opens up a potential security risk: the host will be unable
|
|
to reclaim the locked memory back from the guest when it's running out
|
|
of memory, which means a malicious guest allocating large amounts of
|
|
locked memory could cause a denial-of-service attack on the host.
|
|
Because of this, using this option is discouraged unless your workload
|
|
demands it; even then, it's highly recommended to set a
|
|
<code>hard_limit</code> (see
|
|
<a href="#elementsMemoryTuning">memory tuning</a>) on memory allocation
|
|
suitable for the specific environment at the same time to mitigate
|
|
the risks described above. <span class="since">Since 1.0.6</span></dd>
|
|
<dt><code>source</code></dt>
|
|
<dd>Using the <code>type</code> attribute, it's possible to
|
|
provide "file" to utilize file memory backing or keep the
|
|
default "anonymous". <span class="since">Since 4.10.0</span>,
|
|
you may choose "memfd" backing. (QEMU/KVM only)</dd>
|
|
<dt><code>access</code></dt>
|
|
<dd>Using the <code>mode</code> attribute, specify if the memory is
|
|
to be "shared" or "private". This can be overridden per NUMA node by
|
|
<code>memAccess</code>.</dd>
|
|
<dt><code>allocation</code></dt>
|
|
<dd>Using the <code>mode</code> attribute, specify when to allocate
|
|
the memory by supplying either "immediate" or "ondemand".</dd>
|
|
<dt><code>discard</code></dt>
|
|
<dd>When set and supported by the hypervisor, the memory
|
|
content is discarded just before the guest shuts down (or
|
|
when a DIMM module is unplugged). Please note that this is
|
|
just an optimization and is not guaranteed to work in
|
|
all cases (e.g. when the hypervisor crashes).
|
|
<span class="since">Since 4.4.0</span> (QEMU/KVM only)
|
|
</dd>
|
|
</dl>
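<p>
In the example at the start of this section, values such as
<code>file|anonymous|memfd</code> list the alternatives; a real
configuration picks exactly one value per attribute. The fragment below is
a sketch of one such choice. The particular combination is an assumption
made for illustration, and whether it is accepted depends on the
hypervisor and its version.
</p>
<pre>
<domain>
  ...
  <memoryBacking>
    <!-- one value per attribute, chosen from the alternatives above -->
    <source type="memfd"/>
    <access mode="shared"/>
    <allocation mode="immediate"/>
  </memoryBacking>
  ...
</domain>
</pre>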
|
|
|
|
|
|
<h3><a id="elementsMemoryTuning">Memory Tuning</a></h3>
|
|
|
|
<pre>
|
|
<domain>
|
|
...
|
|
<memtune>
|
|
<hard_limit unit='G'>1</hard_limit>
|
|
<soft_limit unit='M'>128</soft_limit>
|
|
<swap_hard_limit unit='G'>2</swap_hard_limit>
|
|
<min_guarantee unit='bytes'>67108864</min_guarantee>
|
|
</memtune>
|
|
...
|
|
</domain>
|
|
</pre>
|
|
|
|
<dl>
|
|
<dt><code>memtune</code></dt>
|
|
<dd> The optional <code>memtune</code> element provides details
|
|
regarding the memory tunable parameters for the domain. If this is
|
|
omitted, it defaults to the OS provided defaults. For QEMU/KVM, the
|
|
parameters are applied to the QEMU process as a whole. Thus, when
|
|
counting them, one needs to add up guest RAM, guest video RAM, and
|
|
some memory overhead of QEMU itself. The last piece is hard to
|
|
determine so one needs to guess and try. For each tunable, it
|
|
is possible to designate which unit the number is in on
|
|
input, using the same values as
|
|
for <code><memory></code>. For backwards
|
|
compatibility, output is always in
|
|
KiB. <span class='since'><code>unit</code>
|
|
since 0.9.11</span>
|
|
Possible values for all *_limit parameters are in range from 0 to
|
|
VIR_DOMAIN_MEMORY_PARAM_UNLIMITED.</dd>
|
|
<dt><code>hard_limit</code></dt>
|
|
<dd> The optional <code>hard_limit</code> element is the maximum memory
|
|
the guest can use. The units for this value are kibibytes (i.e. blocks
|
|
of 1024 bytes). Users of QEMU and KVM are strongly advised not to set
|
|
this limit as the domain may get killed by the kernel if the guess is too
|
|
low, and determining the memory needed for a process to run is an
|
|
<a href="http://en.wikipedia.org/wiki/Undecidable_problem">
|
|
undecidable problem</a>; that said, if you already set
|
|
<code>locked</code> in
|
|
<a href="#elementsMemoryBacking">memory backing</a> because your
|
|
workload demands it, you'll have to take into account the specifics of
|
|
your deployment and figure out a value for <code>hard_limit</code> that
|
|
is large enough to support the memory requirements of your guest, but
|
|
small enough to protect your host against a malicious guest locking all
|
|
memory.</dd>
|
|
<dt><code>soft_limit</code></dt>
|
|
<dd> The optional <code>soft_limit</code> element is the memory limit to
|
|
enforce during memory contention. The units for this value are
|
|
kibibytes (i.e. blocks of 1024 bytes).</dd>
|
|
<dt><code>swap_hard_limit</code></dt>
|
|
<dd> The optional <code>swap_hard_limit</code> element is the maximum
|
|
memory plus swap the guest can use. The units for this value are
|
|
kibibytes (i.e. blocks of 1024 bytes). This has to be more than
|
|
the <code>hard_limit</code> value provided.</dd>
|
|
<dt><code>min_guarantee</code></dt>
|
|
<dd> The optional <code>min_guarantee</code> element is the guaranteed
|
|
minimum memory allocation for the guest. The units for this value are
|
|
kibibytes (i.e. blocks of 1024 bytes). This element is only supported
|
|
by VMware ESX and OpenVZ drivers.</dd>
|
|
</dl>
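<p>
The recommendation to pair <code>locked</code> with a
<code>hard_limit</code> can be sketched as follows. All sizes are
assumptions for illustration only: the 1 GiB of headroom over guest RAM is
a placeholder for video RAM and QEMU overhead, which has to be determined
for the specific deployment.
</p>
<pre>
<domain>
  ...
  <memory unit='GiB'>4</memory>
  <memoryBacking>
    <locked/>
  </memoryBacking>
  <memtune>
    <!-- guest RAM (4 GiB) plus an assumed 1 GiB of overhead -->
    <hard_limit unit='GiB'>5</hard_limit>
  </memtune>
  ...
</domain>
</pre>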
|
|
|
|
|
|
<h3><a id="elementsNUMATuning">NUMA Node Tuning</a></h3>
|
|
|
|
<pre>
|
|
<domain>
|
|
...
|
|
<numatune>
|
|
<memory mode="strict" nodeset="1-4,^3"/>
|
|
<memnode cellid="0" mode="strict" nodeset="1"/>
|
|
<memnode cellid="2" mode="preferred" nodeset="2"/>
|
|
</numatune>
|
|
...
|
|
</domain>
|
|
</pre>
|
|
|
|
<dl>
|
|
<dt><code>numatune</code></dt>
|
|
<dd>
|
|
The optional <code>numatune</code> element provides details of
|
|
how to tune the performance of a NUMA host via controlling NUMA policy
|
|
for the domain process. NB: only supported by the QEMU driver.
|
|
<span class='since'>Since 0.9.3</span>
|
|
</dd>
|
|
<dt><code>memory</code></dt>
|
|
<dd>
|
|
The optional <code>memory</code> element specifies how to allocate memory
|
|
for the domain process on a NUMA host. It contains several optional
|
|
attributes. Attribute <code>mode</code> is either 'interleave',
|
|
'strict', or 'preferred', defaults to 'strict'. Attribute
|
|
<code>nodeset</code> specifies the NUMA nodes, using the same syntax as
|
|
attribute <code>cpuset</code> of element <code>vcpu</code>. Attribute
|
|
<code>placement</code> (<span class='since'>since 0.9.12</span>) can be
|
|
used to indicate the memory placement mode for the domain process; its value
|
|
can be either "static" or "auto", defaults to <code>placement</code> of
|
|
<code>vcpu</code>, or "static" if <code>nodeset</code> is specified.
|
|
"auto" indicates the domain process will only allocate memory from the
|
|
advisory nodeset returned from querying numad, and the value of attribute
|
|
<code>nodeset</code> will be ignored if it's specified.
|
|
|
|
If <code>placement</code> of <code>vcpu</code> is 'auto', and
|
|
<code>numatune</code> is not specified, a default <code>numatune</code>
|
|
with <code>placement</code> 'auto' and <code>mode</code> 'strict' will
|
|
be added implicitly.
|
|
|
|
<span class='since'>Since 0.9.3</span>
|
|
</dd>
|
|
<dt><code>memnode</code></dt>
|
|
<dd>
|
|
Optional <code>memnode</code> elements can specify memory allocation
|
|
policies for each guest NUMA node. For those nodes having no
|
|
corresponding <code>memnode</code> element, the default from
|
|
element <code>memory</code> will be used. Attribute <code>cellid</code>
|
|
addresses the guest NUMA node to which the settings are applied.
|
|
Attributes <code>mode</code> and <code>nodeset</code> have the same
|
|
meaning and syntax as in <code>memory</code> element.
|
|
|
|
This setting is not compatible with automatic placement.
|
|
<span class='since'>QEMU Since 1.2.7</span>
|
|
</dd>
|
|
</dl>
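<p>
Automatic placement can be written out explicitly. The fragment below is a
sketch of the configuration that, according to the description of
<code>memory</code> above, is added implicitly when the
<code>vcpu</code> element uses <code>placement</code> 'auto' and no
<code>numatune</code> is given. The vCPU count is an arbitrary assumption.
No <code>memnode</code> elements are used, as they are not compatible with
automatic placement.
</p>
<pre>
<domain>
  ...
  <vcpu placement='auto'>4</vcpu>
  <numatune>
    <!-- nodeset is omitted: it would be ignored with placement 'auto' -->
    <memory mode='strict' placement='auto'/>
  </numatune>
  ...
</domain>
</pre>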
|
|
|
|
|
|
<h3><a id="elementsBlockTuning">Block I/O Tuning</a></h3>
|
|
<pre>
|
|
<domain>
|
|
...
|
|
<blkiotune>
|
|
<weight>800</weight>
|
|
<device>
|
|
<path>/dev/sda</path>
|
|
<weight>1000</weight>
|
|
</device>
|
|
<device>
|
|
<path>/dev/sdb</path>
|
|
<weight>500</weight>
|
|
<read_bytes_sec>10000</read_bytes_sec>
|
|
<write_bytes_sec>10000</write_bytes_sec>
|
|
<read_iops_sec>20000</read_iops_sec>
|
|
<write_iops_sec>20000</write_iops_sec>
|
|
</device>
|
|
</blkiotune>
|
|
...
|
|
</domain>
|
|
</pre>
|
|
|
|
<dl>
|
|
<dt><code>blkiotune</code></dt>
|
|
<dd> The optional <code>blkiotune</code> element provides the ability
|
|
to tune Blkio cgroup tunable parameters for the domain. If this is
|
|
omitted, it defaults to the OS provided
|
|
defaults. <span class="since">Since 0.8.8</span></dd>
|
|
<dt><code>weight</code></dt>
|
|
<dd> The optional <code>weight</code> element is the overall I/O
|
|
weight of the guest. The value should be in the range [100,
|
|
1000]. After kernel 2.6.39, the value could be in the
|
|
range [10, 1000].</dd>
|
|
<dt><code>device</code></dt>
|
|
<dd>The domain may have multiple <code>device</code> elements
|
|
that further tune the weights for each host block device in
|
|
use by the domain. Note that
|
|
multiple <a href="#elementsDisks">guest disks</a> can share a
|
|
single host block device, if they are backed by files within
|
|
the same host file system, which is why this tuning parameter
|
|
is at the global domain level rather than associated with each
|
|
guest disk device (contrast this to
|
|
the <a href="#elementsDisks"><code><iotune></code></a>
|
|
element which can apply to an
|
|
individual <code><disk></code>).
|
|
Each <code>device</code> element has two
|
|
mandatory sub-elements, <code>path</code> describing the
|
|
absolute path of the device, and <code>weight</code> giving
|
|
the relative weight of that device, in the range [100,
|
|
1000]. After kernel 2.6.39, the value could be in the
|
|
range [10, 1000]. <span class="since">Since 0.9.8</span><br/>
|
|
Additionally, the following optional sub-elements can be used:
|
|
<dl>
|
|
<dt><code>read_bytes_sec</code></dt>
|
|
<dd>Read throughput limit in bytes per second.
|
|
<span class="since">Since 1.2.2</span></dd>
|
|
<dt><code>write_bytes_sec</code></dt>
|
|
<dd>Write throughput limit in bytes per second.
|
|
<span class="since">Since 1.2.2</span></dd>
|
|
<dt><code>read_iops_sec</code></dt>
|
|
<dd>Read I/O operations per second limit.
|
|
<span class="since">Since 1.2.2</span></dd>
|
|
<dt><code>write_iops_sec</code></dt>
|
|
<dd>Write I/O operations per second limit.
|
|
<span class="since">Since 1.2.2</span></dd>
|
|
</dl></dd></dl>
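<p>
Because the throughput limits are expressed in bytes per second, larger
limits are easier to get right when written out as a product. The sketch
below uses an assumed device path and an illustrative limit of 10 MiB per
second in each direction, i.e. 10 * 1048576 = 10485760 bytes per second.
</p>
<pre>
<domain>
  ...
  <blkiotune>
    <device>
      <!-- hypothetical host block device -->
      <path>/dev/sda</path>
      <weight>500</weight>
      <!-- 10 MiB/s = 10485760 bytes per second -->
      <read_bytes_sec>10485760</read_bytes_sec>
      <write_bytes_sec>10485760</write_bytes_sec>
    </device>
  </blkiotune>
  ...
</domain>
</pre>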
|
|
|
|
|
|
<h3><a id="resPartition">Resource partitioning</a></h3>
|
|
|
|
<p>
|
|
Hypervisors may allow for virtual machines to be placed into
|
|
resource partitions, potentially with nesting of said partitions.
|
|
The <code>resource</code> element groups together configuration
|
|
related to resource partitioning. It currently supports a child
|
|
element <code>partition</code> whose content defines the absolute path
|
|
of the resource partition in which to place the domain. If no
|
|
partition is listed, then the domain will be placed in a default
|
|
partition. It is the responsibility of the application or administrator to ensure
|
|
that the partition exists prior to starting the guest. Only the
|
|
(hypervisor specific) default partition can be assumed to exist
|
|
by default.
|
|
</p>
|
|
<pre>
...
<resource>
  <partition>/virtualmachines/production</partition>
</resource>
...
</pre>
|
|
|
|
<p>
|
|
Resource partitions are currently supported by the QEMU and
|
|
LXC drivers, which map partition paths to cgroups directories,
|
|
in all mounted controllers. <span class="since">Since 1.0.5</span>
|
|
</p>
|
|
|
|
<h3><a id="elementsCPU">CPU model and topology</a></h3>
|
|
|
|
<p>
|
|
Requirements for CPU model, its features and topology can be specified
|
|
using the following collection of elements.
|
|
<span class="since">Since 0.7.5</span>
|
|
</p>
|
|
|
|
<pre>
...
<cpu match='exact'>
  <model fallback='allow'>core2duo</model>
  <vendor>Intel</vendor>
  <topology sockets='1' dies='1' cores='2' threads='1'/>
  <cache level='3' mode='emulate'/>
  <feature policy='disable' name='lahf_lm'/>
</cpu>
...</pre>
|
|
|
|
<pre>
<cpu mode='host-model'>
  <model fallback='forbid'/>
  <topology sockets='1' dies='1' cores='2' threads='1'/>
</cpu>
...</pre>
|
|
|
|
<pre>
<cpu mode='host-passthrough' migratable='off'>
  <cache mode='passthrough'/>
  <feature policy='disable' name='lahf_lm'/>
</cpu>
...</pre>
|
|
|
|
<p>
|
|
In case no restrictions need to be put on CPU model and its features, a
|
|
simpler <code>cpu</code> element can be used.
|
|
<span class="since">Since 0.7.6</span>
|
|
</p>
|
|
|
|
<pre>
...
<cpu>
  <topology sockets='1' dies='1' cores='2' threads='1'/>
</cpu>
...</pre>
|
|
|
|
<dl>
|
|
<dt><code>cpu</code></dt>
|
|
<dd>The <code>cpu</code> element is the main container for describing
|
|
guest CPU requirements. Its <code>match</code> attribute specifies how
|
|
strictly the virtual CPU provided to the guest matches these
|
|
requirements. <span class="since">Since 0.7.6</span> the
|
|
<code>match</code> attribute can be omitted if <code>topology</code>
|
|
is the only element within <code>cpu</code>. Possible values for the
|
|
<code>match</code> attribute are:
|
|
|
|
<dl>
|
|
<dt><code>minimum</code></dt>
|
|
<dd>The specified CPU model and features describe the minimum
|
|
requested CPU. A better CPU will be provided to the guest if it
|
|
is possible with the requested hypervisor on the current host.
|
|
This is a constrained <code>host-model</code> mode; the domain
|
|
will not be created if the provided virtual CPU does not meet
|
|
the requirements.</dd>
|
|
|
|
<dt><code>exact</code></dt>
|
|
<dd>The virtual CPU provided to the guest should exactly match the
|
|
specification. If such a CPU is not supported, libvirt will refuse
|
|
to start the domain.</dd>
|
|
|
|
<dt><code>strict</code></dt>
|
|
<dd>The domain will not be created unless the host CPU exactly
|
|
matches the specification. This is not very useful in practice
|
|
and should only be used if there is a real reason.</dd>
|
|
</dl>
|
|
|
|
<span class="since">Since 0.8.5</span> the <code>match</code>
|
|
attribute can be omitted and will default to <code>exact</code>.
|
|
|
|
Sometimes the hypervisor is not able to create a virtual CPU exactly
|
|
matching the specification passed by libvirt.
|
|
<span class="since">Since 3.2.0</span>, an optional <code>check</code>
|
|
attribute can be used to request a specific way of checking whether
|
|
the virtual CPU matches the specification. It is usually safe to omit
|
|
this attribute when starting a domain and stick with the default
|
|
value. Once the domain starts, libvirt will automatically change the
|
|
<code>check</code> attribute to the best supported value to ensure the
|
|
virtual CPU does not change when the domain is migrated to another
|
|
host. The following values can be used:
|
|
|
|
<dl>
|
|
<dt><code>none</code></dt>
|
|
<dd>Libvirt does no checking and it is up to the hypervisor to
|
|
refuse to start the domain if it cannot provide the requested CPU.
|
|
With QEMU this means no checking is done at all since the default
|
|
behavior of QEMU is to emit warnings, but start the domain anyway.
|
|
</dd>
|
|
|
|
<dt><code>partial</code></dt>
|
|
<dd>Libvirt will check the guest CPU specification before starting
|
|
a domain, but the rest is left to the hypervisor. It can still
|
|
provide a different virtual CPU.</dd>
|
|
|
|
<dt><code>full</code></dt>
|
|
<dd>The virtual CPU created by the hypervisor will be checked
|
|
against the CPU specification and the domain will not be started
|
|
unless the two CPUs match.</dd>
|
|
</dl>
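<p>
The <code>check</code> values above can be illustrated with a short
sketch. It assumes a QEMU/KVM host that supports the
<code>IvyBridge</code> model; the model and feature names are borrowed
from other examples in this section and are illustrative only. With
<code>check='full'</code> the domain should fail to start if the
hypervisor cannot provide exactly this virtual CPU.
</p>

```xml
<!-- Illustrative sketch only: model and feature names are assumptions -->
<cpu mode='custom' match='exact' check='full'>
  <!-- fallback='forbid' makes an unsupported model an error -->
  <model fallback='forbid'>IvyBridge</model>
  <feature policy='require' name='pcid'/>
</cpu>
```

<p>
Omitting <code>check</code> leaves the choice to libvirt, which is
usually the safer option for a newly defined domain.
</p>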
|
|
|
|
<span class="since">Since 0.9.10</span>, an optional <code>mode</code>
|
|
attribute may be used to make it easier to configure a guest CPU to be
|
|
as close to host CPU as possible. Possible values for the
|
|
<code>mode</code> attribute are:
|
|
|
|
<dl>
|
|
<dt><code>custom</code></dt>
|
|
<dd>In this mode, the <code>cpu</code> element describes the CPU
|
|
that should be presented to the guest. This is the default when no
|
|
<code>mode</code> attribute is specified. This mode makes it so that
|
|
a persistent guest will see the same hardware no matter what host
|
|
the guest is booted on.</dd>
|
|
<dt><code>host-model</code></dt>
|
|
<dd>The <code>host-model</code> mode is essentially a shortcut to
|
|
copying host CPU definition from capabilities XML into domain XML.
|
|
Since the CPU definition is copied just before starting a domain,
|
|
exactly the same XML can be used on different hosts while still
|
|
providing the best guest CPU each host supports. The
|
|
<code>match</code> attribute can't be used in this mode. Specifying
|
|
CPU model is not supported either, but <code>model</code>'s
|
|
<code>fallback</code> attribute may still be used. Using the
|
|
<code>feature</code> element, specific flags may be enabled or
|
|
disabled in addition to the host model. This may be
|
|
used to fine tune features that can be emulated.
|
|
<span class="since">(Since 1.1.1)</span>.
|
|
Libvirt does not model every aspect of each CPU so
|
|
the guest CPU will not match the host CPU exactly. On the other
|
|
hand, the ABI provided to the guest is reproducible. During
|
|
migration, complete CPU model definition is transferred to the
|
|
destination host so the migrated guest will see exactly the same CPU
|
|
model for the running instance of the guest, even if the destination
|
|
host contains more capable CPUs or newer kernel; but shutting down and restarting
|
|
the guest may present different hardware to the guest according to
|
|
the capabilities of the new host. Prior to libvirt 3.2.0 and QEMU
|
|
2.9.0 detection of the host CPU model via QEMU is not supported.
|
|
Thus the CPU configuration created using <code>host-model</code>
|
|
may not work as expected.
|
|
<span class="since">Since 3.2.0 and QEMU 2.9.0</span> this mode
|
|
works the way it was designed and it is indicated by the
|
|
<code>fallback</code> attribute set to <code>forbid</code> in the
|
|
host-model CPU definition advertised in
|
|
<a href="formatdomaincaps.html#elementsCPU">domain capabilities XML</a>.
|
|
When <code>fallback</code> attribute is set to <code>allow</code>
|
|
in the domain capabilities XML, it is recommended to use
|
|
<code>custom</code> mode with just the CPU model from the host
|
|
capabilities XML. <span class="since">Since 1.2.11</span> PowerISA
|
|
allows processors to run VMs in binary compatibility mode supporting
|
|
an older version of ISA. Libvirt on PowerPC architecture uses the
|
|
<code>host-model</code> to signify a guest mode CPU running in
|
|
binary compatibility mode. Example:
|
|
When a user needs a power7 VM to run in compatibility mode
|
|
on a Power8 host, this can be described in XML as follows:
|
|
<pre>
<cpu mode='host-model'>
  <model>power7</model>
</cpu>
...</pre>
|
|
</dd>
|
|
<dt><code>host-passthrough</code></dt>
|
|
<dd>With this mode, the CPU visible to the guest should be exactly
|
|
the same as the host CPU even in the aspects that libvirt does not
|
|
understand. The downside of this mode is that the guest
|
|
environment cannot be reproduced on different hardware. Thus, if you
|
|
hit any bugs, you are on your own. Further details of that CPU can
|
|
be changed using <code>feature</code> elements. Migration of a guest
|
|
using host-passthrough is dangerous if the source and destination hosts
|
|
are not identical in hardware, QEMU version, microcode version
|
|
and configuration. If such a migration is attempted then the guest may
|
|
hang or crash upon resuming execution on the destination host.
|
|
Depending on hypervisor version the virtual CPU may or may not
|
|
contain features which may block migration even to an identical host.
|
|
<span class="since">Since 6.5.0</span> optional
|
|
<code>migratable</code> attribute may be used to explicitly request
|
|
such features to be removed from (<code>on</code>) or kept in
|
|
(<code>off</code>) the virtual CPU. This attribute does not make
|
|
migration to another host safer: even with
|
|
<code>migratable='on'</code> migration will be dangerous unless both
|
|
hosts are identical as described above.
|
|
</dd>
|
|
</dl>
|
|
|
|
Both <code>host-model</code> and <code>host-passthrough</code> modes
|
|
make sense when a domain can run directly on the host CPUs (for
|
|
example, domains with type <code>kvm</code>). The actual host CPU is
|
|
irrelevant for domains with emulated virtual CPUs (such as domains with
|
|
type <code>qemu</code>). However, for backward compatibility
|
|
<code>host-model</code> may be implemented even for domains running on
|
|
emulated CPUs in which case the best CPU the hypervisor is able to
|
|
emulate may be used rather than trying to mimic the host CPU model.
|
|
</dd>
|
|
|
|
<dt><code>model</code></dt>
|
|
<dd>The content of the <code>model</code> element specifies CPU model
|
|
requested by the guest. The list of available CPU models and their
|
|
definition can be found in <code>cpu_map.xml</code> file installed
|
|
in libvirt's data directory. If a hypervisor is not able to use the
|
|
exact CPU model, libvirt automatically falls back to a closest model
|
|
supported by the hypervisor while maintaining the list of CPU
|
|
features. <span class="since">Since 0.9.10</span>, an optional
|
|
<code>fallback</code> attribute can be used to forbid this behavior,
|
|
in which case an attempt to start a domain requesting an unsupported
|
|
CPU model will fail. Supported values for <code>fallback</code>
|
|
attribute are: <code>allow</code> (this is the default), and
|
|
<code>forbid</code>. The optional <code>vendor_id</code> attribute
|
|
(<span class="since">Since 0.10.0</span>) can be used to set the
|
|
vendor id seen by the guest. It must be exactly 12 characters long.
|
|
If not set, the vendor id of the host is used. Typical
|
|
values are "AuthenticAMD" and "GenuineIntel".</dd>
|
|
|
|
<dt><code>vendor</code></dt>
|
|
<dd><span class="since">Since 0.8.3</span> the content of the
|
|
<code>vendor</code> element specifies CPU vendor requested by the
|
|
guest. If this element is missing, the guest can be run on a CPU
|
|
matching given features regardless of its vendor. The list of
|
|
supported vendors can be found in <code>cpu_map.xml</code>.</dd>
|
|
|
|
<dt><code>topology</code></dt>
|
|
<dd>The <code>topology</code> element specifies requested topology of
|
|
virtual CPU provided to the guest. Four attributes, <code>sockets</code>,
|
|
<code>dies</code>, <code>cores</code>, and <code>threads</code>,
|
|
accept non-zero positive integer values. They refer to the number of
|
|
CPU sockets per NUMA node, number of dies per socket, number of cores
|
|
per die, and number of threads per core, respectively. The <code>dies</code>
|
|
attribute is optional and will default to 1 if omitted, while the other
|
|
attributes are all mandatory. Hypervisors may require that the maximum
|
|
number of vCPUs specified by the <code>vcpu</code> element equal
the number of vCPUs resulting from the topology.</dd>
|
|
|
|
<dt><code>feature</code></dt>
|
|
<dd>The <code>cpu</code> element can contain zero or more
|
|
<code>feature</code> elements used to fine-tune features provided by the
|
|
selected CPU model. The list of known feature names can be found in
|
|
the same file as CPU models. The meaning of each <code>feature</code>
|
|
element depends on its <code>policy</code> attribute, which has to be
|
|
set to one of the following values:
|
|
|
|
<dl>
|
|
<dt><code>force</code></dt>
|
|
<dd>The virtual CPU will claim the feature is supported regardless
|
|
of it being supported by host CPU.</dd>
|
|
<dt><code>require</code></dt>
|
|
<dd>Guest creation will fail unless the feature is supported by the
|
|
host CPU or the hypervisor is able to emulate it.</dd>
|
|
<dt><code>optional</code></dt>
|
|
<dd>The feature will be supported by virtual CPU if and only if it
|
|
is supported by host CPU.</dd>
|
|
<dt><code>disable</code></dt>
|
|
<dd>The feature will not be supported by virtual CPU.</dd>
|
|
<dt><code>forbid</code></dt>
|
|
<dd>Guest creation will fail if the feature is supported by host
|
|
CPU.</dd>
|
|
</dl>
|
|
|
|
<span class="since">Since 0.8.5</span> the <code>policy</code>
|
|
attribute can be omitted and will default to <code>require</code>.
|
|
|
|
<p> Individual CPU feature names are specified as part of the
|
|
<code>name</code> attribute. For example, to explicitly specify
|
|
the 'pcid' feature with Intel IvyBridge CPU model:
|
|
</p>
|
|
|
|
<pre>
...
<cpu match='exact'>
  <model fallback='forbid'>IvyBridge</model>
  <vendor>Intel</vendor>
  <feature policy='require' name='pcid'/>
</cpu>
...</pre>
|
|
|
|
</dd>
|
|
|
|
<dt><code>cache</code></dt>
|
|
<dd><span class="since">Since 3.3.0</span> the <code>cache</code>
|
|
element describes the virtual CPU cache. If the element is missing,
|
|
the hypervisor will use a sensible default.
|
|
|
|
<dl>
|
|
<dt><code>level</code></dt>
|
|
<dd>This optional attribute specifies which cache level is described
|
|
by the element. Missing attribute means the element describes all
|
|
CPU cache levels at once. Mixing <code>cache</code> elements with
|
|
the <code>level</code> attribute set and those without the
|
|
attribute is forbidden.</dd>
|
|
|
|
<dt><code>mode</code></dt>
|
|
<dd>
|
|
The following values are supported:
|
|
<dl>
|
|
<dt><code>emulate</code></dt>
|
|
<dd>The hypervisor will provide fake CPU cache data.</dd>
|
|
|
|
<dt><code>passthrough</code></dt>
|
|
<dd>The real CPU cache data reported by the host CPU will be
|
|
passed through to the virtual CPU.</dd>
|
|
|
|
<dt><code>disable</code></dt>
|
|
<dd>The virtual CPU will report no CPU cache of the specified
|
|
level (or no cache at all if the <code>level</code> attribute
|
|
is missing).</dd>
|
|
</dl>
|
|
</dd>
|
|
</dl>
|
|
</dd>
|
|
</dl>
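<p>
As a worked illustration of the <code>topology</code> attributes
described above, the sketch below assumes a guest without NUMA cells
and a hypervisor that requires the topology to account for every
vCPU. The product sockets × dies × cores × threads is
2 × 1 × 4 × 2 = 16, so the maximum vCPU count is set to 16. The
specific numbers are illustrative only.
</p>

```xml
<!-- Illustrative sketch: 2 sockets x 1 die x 4 cores x 2 threads = 16 vCPUs -->
<vcpu>16</vcpu>
...
<cpu>
  <topology sockets='2' dies='1' cores='4' threads='2'/>
</cpu>
```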
|
|
|
|
<p>
|
|
Guest NUMA topology can be specified using the <code>numa</code> element.
|
|
<span class="since">Since 0.9.8</span>
|
|
</p>
|
|
|
|
<pre>
...
<cpu>
  ...
  <numa>
    <cell id='0' cpus='0-3' memory='512000' unit='KiB' discard='yes'/>
    <cell id='1' cpus='4-7' memory='512000' unit='KiB' memAccess='shared'/>
  </numa>
  ...
</cpu>
...</pre>
|
|
|
|
<p>
|
|
Each <code>cell</code> element specifies a NUMA cell or a NUMA node.
|
|
<code>cpus</code> specifies the CPU or range of CPUs that are
|
|
part of the node. <span class="since">Since 6.5.0</span> For the qemu
|
|
driver, if the emulator binary supports disjointed <code>cpus</code> ranges
|
|
in each <code>cell</code>, the sum of all CPUs declared in each <code>cell</code>
|
|
will be matched with the maximum number of virtual CPUs declared in the
|
|
<code>vcpu</code> element. This is done by filling any remaining CPUs
|
|
into the first NUMA <code>cell</code>. Users are encouraged to supply a
|
|
complete NUMA topology, where the sum of the NUMA CPUs matches the maximum
number of virtual CPUs declared in <code>vcpu</code>, to make the domain
|
|
consistent across qemu and libvirt versions.
|
|
<code>memory</code> specifies the node memory
|
|
in kibibytes (i.e. blocks of 1024 bytes).
|
|
<span class="since">Since 6.6.0</span> the <code>cpus</code> attribute
|
|
is optional and if omitted a CPU-less NUMA node is created.
|
|
<span class="since">Since 1.2.11</span> one can use an additional <a
|
|
href="#elementsMemoryAllocation"><code>unit</code></a> attribute to
|
|
define units in which <code>memory</code> is specified.
|
|
<span class="since">Since 1.2.7</span> every cell should
have an <code>id</code> attribute so that individual cells can be
referred to; otherwise the cells are
assigned <code>id</code>s in increasing order starting from
0. Mixing cells with and without the <code>id</code> attribute
is not recommended as it may result in unwanted behaviour.
|
|
|
|
<span class='since'>Since 1.2.9</span> the optional attribute
|
|
<code>memAccess</code> can control whether the memory is to be
|
|
mapped as "shared" or "private". This is valid only for
|
|
hugepages-backed memory and nvdimm modules.
|
|
|
|
Each <code>cell</code> element can have an optional
|
|
<code>discard</code> attribute which fine tunes the discard
|
|
feature for the given NUMA node as described under
|
|
<a href="#elementsMemoryBacking">Memory Backing</a>.
|
|
Accepted values are <code>yes</code> and <code>no</code>.
|
|
<span class='since'>Since 4.4.0</span>
|
|
</p>
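<p>
The following sketch combines two of the attributes described above:
the <code>unit</code> attribute and a CPU-less node created by
omitting <code>cpus</code>. It assumes libvirt 6.6.0 or newer and a
hypervisor that accepts CPU-less NUMA nodes; the sizes and the vCPU
count are illustrative only.
</p>

```xml
<!-- Illustrative sketch: all four vCPUs live in cell 0, cell 1 is memory-only -->
<vcpu>4</vcpu>
...
<cpu>
  <numa>
    <cell id='0' cpus='0-3' memory='2' unit='GiB'/>
    <!-- no cpus attribute: CPU-less NUMA node -->
    <cell id='1' memory='1' unit='GiB'/>
  </numa>
</cpu>
```

<p>
The CPUs listed in the cells add up to the value of
<code>vcpu</code>, so nothing is left for libvirt to fill into the
first cell.
</p>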
|
|
|
|
<p>
|
|
This guest NUMA specification is currently available only for
|
|
QEMU/KVM and Xen.
|
|
</p>
|
|
|
|
<p>
|
|
A NUMA hardware architecture supports the notion of distances
|
|
between NUMA cells. <span class="since">Since 3.10.0</span> it
|
|
is possible to define the distance between NUMA cells using the
|
|
<code>distances</code> element within a NUMA <code>cell</code>
|
|
description. The <code>sibling</code> sub-element is used to
|
|
specify the distance value between sibling NUMA cells. For more
|
|
details, see the chapter explaining the system's SLIT (System
|
|
Locality Information Table) within the ACPI (Advanced
|
|
Configuration and Power Interface) specification.
|
|
</p>
|
|
|
|
<pre>
...
<cpu>
  ...
  <numa>
    <cell id='0' cpus='0,4-7' memory='512000' unit='KiB'>
      <distances>
        <sibling id='0' value='10'/>
        <sibling id='1' value='21'/>
        <sibling id='2' value='31'/>
        <sibling id='3' value='41'/>
      </distances>
    </cell>
    <cell id='1' cpus='1,8-10,12-15' memory='512000' unit='KiB' memAccess='shared'>
      <distances>
        <sibling id='0' value='21'/>
        <sibling id='1' value='10'/>
        <sibling id='2' value='21'/>
        <sibling id='3' value='31'/>
      </distances>
    </cell>
    <cell id='2' cpus='2,11' memory='512000' unit='KiB' memAccess='shared'>
      <distances>
        <sibling id='0' value='31'/>
        <sibling id='1' value='21'/>
        <sibling id='2' value='10'/>
        <sibling id='3' value='21'/>
      </distances>
    </cell>
    <cell id='3' cpus='3' memory='512000' unit='KiB'>
      <distances>
        <sibling id='0' value='41'/>
        <sibling id='1' value='31'/>
        <sibling id='2' value='21'/>
        <sibling id='3' value='10'/>
      </distances>
    </cell>
  </numa>
  ...
</cpu>
...</pre>
|
|
|
|
<p>
|
|
Describing distances between NUMA cells is currently only supported
|
|
by Xen and QEMU. If no <code>distances</code> are given to describe
|
|
the SLIT data between different cells, it will default to a scheme
|
|
using 10 for local and 20 for remote distances.
|
|
</p>
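<p>
To make the default scheme concrete, the sketch below spells out what
"10 for local and 20 for remote" would look like if written explicitly
for a two-cell guest. The cell sizes and CPU ranges are illustrative
assumptions; omitting both <code>distances</code> elements is expected
to give the same result.
</p>

```xml
<!-- Illustrative sketch: explicit form of the default 10/20 distances -->
<numa>
  <cell id='0' cpus='0-1' memory='512000' unit='KiB'>
    <distances>
      <sibling id='0' value='10'/>  <!-- local -->
      <sibling id='1' value='20'/>  <!-- remote -->
    </distances>
  </cell>
  <cell id='1' cpus='2-3' memory='512000' unit='KiB'>
    <distances>
      <sibling id='0' value='20'/>  <!-- remote -->
      <sibling id='1' value='10'/>  <!-- local -->
    </distances>
  </cell>
</numa>
```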
|
|
|
|
<h4><a id="hmat">ACPI Heterogeneous Memory Attribute Table</a></h4>
|
|
|
|
<pre>
...
<cpu>
  ...
  <numa>
    <cell id='0' cpus='0-3' memory='512000' unit='KiB' discard='yes'/>
    <cell id='1' cpus='4-7' memory='512000' unit='KiB' memAccess='shared'/>
    <cell id='3' cpus='0-3' memory='2097152' unit='KiB'>
      <cache level='1' associativity='direct' policy='writeback'>
        <size value='10' unit='KiB'/>
        <line value='8' unit='B'/>
      </cache>
    </cell>
    <interconnects>
      <latency initiator='0' target='0' type='access' value='5'/>
      <latency initiator='0' target='0' cache='1' type='access' value='10'/>
      <bandwidth initiator='0' target='0' type='access' value='204800' unit='KiB'/>
    </interconnects>
  </numa>
  ...
</cpu>
...</pre>
|
|
|
|
<p>
|
|
<span class='since'>Since 6.6.0</span> the <code>cell</code> element can
|
|
have a <code>cache</code> child element which describes memory side cache
|
|
for memory proximity domains. The <code>cache</code> element has a
|
|
<code>level</code> attribute describing the cache level and thus the
|
|
element can be repeated multiple times to describe different levels of
|
|
the cache.
|
|
</p>
|
|
|
|
<p>
|
|
The <code>cache</code> element has the following mandatory attributes:
|
|
</p>
|
|
|
|
<dl>
|
|
<dt><code>level</code></dt>
|
|
<dd>
|
|
Level of the cache this description refers to.
|
|
</dd>
|
|
|
|
<dt><code>associativity</code></dt>
|
|
<dd>
|
|
Describes cache associativity (accepted values are <code>none</code>,
|
|
<code>direct</code> and <code>full</code>).
|
|
</dd>
|
|
|
|
<dt><code>policy</code></dt>
|
|
<dd>
|
|
Describes cache write policy (accepted values are
|
|
<code>none</code>, <code>writeback</code> and
|
|
<code>writethrough</code>).
|
|
</dd>
|
|
</dl>
|
|
|
|
<p>
|
|
The <code>cache</code> element also has two mandatory child elements:
|
|
<code>size</code> and <code>line</code> which describe cache size and
|
|
cache line size. Both elements accept two attributes: <code>value</code>
|
|
and <code>unit</code> which set the value of corresponding cache
|
|
attribute.
|
|
</p>
|
|
|
|
<p>
|
|
The NUMA description has an optional <code>interconnects</code> element that
|
|
describes the normalized memory read/write latency and read/write bandwidth
|
|
between Initiator Proximity Domains (Processor or I/O) and Target
|
|
Proximity Domains (Memory).
|
|
</p>
|
|
|
|
<p>
|
|
The <code>interconnects</code> element can have zero or more
|
|
<code>latency</code> child elements to describe latency between two
|
|
memory nodes and zero or more <code>bandwidth</code> child elements to
|
|
describe bandwidth between two memory nodes. Both these have the
|
|
following mandatory attributes:
|
|
</p>
|
|
|
|
<dl>
|
|
<dt><code>initiator</code></dt>
|
|
<dd>Refers to the source NUMA node</dd>
|
|
|
|
<dt><code>target</code></dt>
|
|
<dd>Refers to the target NUMA node</dd>
|
|
|
|
<dt><code>type</code></dt>
|
|
<dd>The type of the access. Accepted values: <code>access</code>,
|
|
<code>read</code>, <code>write</code></dd>
|
|
|
|
<dt><code>value</code></dt>
|
|
<dd>The actual value. For latency this is delay in nanoseconds, for
|
|
bandwidth this value is in kibibytes per second. Use additional
|
|
<code>unit</code> attribute to change the units.</dd>
|
|
</dl>
|
|
|
|
<p>
|
|
To describe latency from one NUMA node to a cache of another NUMA node
|
|
the <code>latency</code> element has an optional <code>cache</code>
attribute which, in combination with the <code>target</code> attribute, creates
a full reference to the distant NUMA node's cache level. For instance,
|
|
<code>target='0' cache='1'</code> refers to the first level cache of NUMA
|
|
node 0.
|
|
</p>
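<p>
The attributes described above can be combined as in the following
sketch, which describes separate read and write latencies and a
bandwidth given in a larger unit. It assumes a guest with NUMA nodes
0 and 1 already defined and a hypervisor that supports HMAT; all
values are invented for illustration.
</p>

```xml
<!-- Illustrative sketch: values are invented, nodes 0 and 1 assumed to exist -->
<interconnects>
  <!-- latency values are in nanoseconds -->
  <latency initiator='0' target='1' type='read' value='15'/>
  <latency initiator='0' target='1' type='write' value='20'/>
  <!-- unit overrides the default of kibibytes per second -->
  <bandwidth initiator='0' target='1' type='access' value='200' unit='MiB'/>
</interconnects>
```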
|
|
|
|
<h3><a id="elementsEvents">Events configuration</a></h3>
|
|
|
|
<p>
|
|
It is sometimes necessary to override the default actions taken
|
|
on various events. Not all hypervisors support all events and actions.
|
|
The actions may be taken as a result of calls to libvirt APIs
|
|
<a href="html/libvirt-libvirt-domain.html#virDomainReboot">
|
|
<code>virDomainReboot</code>
|
|
</a>,
|
|
<a href="html/libvirt-libvirt-domain.html#virDomainShutdown">
|
|
<code>virDomainShutdown</code>
|
|
</a>,
|
|
or
|
|
<a href="html/libvirt-libvirt-domain.html#virDomainShutdownFlags">
|
|
<code>virDomainShutdownFlags</code>
|
|
</a>.
|
|
Using <code>virsh reboot</code> or <code>virsh shutdown</code> would
|
|
also trigger the event.
|
|
</p>
|
|
|
|
<pre>
...
<on_poweroff>destroy</on_poweroff>
<on_reboot>restart</on_reboot>
<on_crash>restart</on_crash>
<on_lockfailure>poweroff</on_lockfailure>
...</pre>
|
|
|
|
<p>
|
|
The following collections of elements allow the actions to be
|
|
specified when a guest OS triggers a lifecycle operation. A
|
|
common use case is to force a reboot to be treated as a poweroff
|
|
when doing the initial OS installation. This allows the VM to be
|
|
re-configured for the first post-install bootup.
|
|
</p>
|
|
<dl>
|
|
<dt><code>on_poweroff</code></dt>
|
|
<dd>The content of this element specifies the action to take when
|
|
the guest requests a poweroff.</dd>
|
|
<dt><code>on_reboot</code></dt>
|
|
<dd>The content of this element specifies the action to take when
|
|
the guest requests a reboot.</dd>
|
|
<dt><code>on_crash</code></dt>
|
|
<dd>The content of this element specifies the action to take when
|
|
the guest crashes.</dd>
|
|
</dl>
|
|
|
|
<p>
|
|
Each of these events allows the same four possible actions.
|
|
</p>
|
|
|
|
<dl>
|
|
<dt><code>destroy</code></dt>
|
|
<dd>The domain will be terminated completely and all resources
|
|
released.</dd>
|
|
<dt><code>restart</code></dt>
|
|
<dd>The domain will be terminated and then restarted with
|
|
the same configuration.</dd>
|
|
<dt><code>preserve</code></dt>
|
|
<dd>The domain will be terminated and its resources preserved
|
|
to allow analysis.</dd>
|
|
<dt><code>rename-restart</code></dt>
|
|
<dd>The domain will be terminated and then restarted with
|
|
a new name.</dd>
|
|
</dl>
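<p>
The installation use case mentioned above, where a reboot is treated
as a poweroff, can be sketched as follows. This is an illustrative
configuration rather than a recommended default: it assumes the
management application redefines the domain with its final settings
after the installer finishes.
</p>

```xml
<!-- Illustrative sketch: stop the guest when the installer reboots -->
<on_poweroff>destroy</on_poweroff>
<on_reboot>destroy</on_reboot>
<on_crash>destroy</on_crash>
```

<p>
Once installation is complete, <code>on_reboot</code> would typically
be switched back to <code>restart</code>.
</p>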
|
|
|
|
<p>
|
|
QEMU/KVM supports the <code>on_poweroff</code> and <code>on_reboot</code>
|
|
events handling the <code>destroy</code> and <code>restart</code> actions.
|
|
The <code>preserve</code> action for an <code>on_reboot</code> event
|
|
is treated as a <code>destroy</code> and the <code>rename-restart</code>
|
|
action for an <code>on_poweroff</code> event is treated as a
|
|
<code>restart</code> action.
|
|
</p>
|
|
|
|
<p>
|
|
The <code>on_crash</code> event supports these additional
|
|
actions <span class="since">since 0.8.4</span>.
|
|
</p>
|
|
|
|
<dl>
|
|
<dt><code>coredump-destroy</code></dt>
|
|
<dd>The crashed domain's core will be dumped, and then the
|
|
domain will be terminated completely and all resources
|
|
released.</dd>
|
|
<dt><code>coredump-restart</code></dt>
|
|
<dd>The crashed domain's core will be dumped, and then the
|
|
domain will be restarted with the same configuration.</dd>
|
|
</dl>
|
|
|
|
<p>
|
|
<span class="since">Since 3.9.0</span>, the lifecycle events can
|
|
be configured via the
|
|
<a href="html/libvirt-libvirt-domain.html#virDomainSetLifecycleAction">
|
|
<code>virDomainSetLifecycleAction</code></a> API.
|
|
</p>
|
|
|
|
<p>
|
|
The <code>on_lockfailure</code> element (<span class="since">since
|
|
1.0.0</span>) may be used to configure what action should be
|
|
taken when a lock manager loses resource locks. The following
|
|
actions are recognized by libvirt, although not all of them need
|
|
to be supported by individual lock managers. When no action is
|
|
specified, each lock manager will take its default action.
|
|
</p>
|
|
<dl>
|
|
<dt><code>poweroff</code></dt>
|
|
<dd>The domain will be forcefully powered off.</dd>
|
|
<dt><code>restart</code></dt>
|
|
<dd>The domain will be powered off and started up again to
|
|
reacquire its locks.</dd>
|
|
<dt><code>pause</code></dt>
|
|
<dd>The domain will be paused so that it can be manually resumed
|
|
when lock issues are solved.</dd>
|
|
<dt><code>ignore</code></dt>
|
|
<dd>Keep the domain running as if nothing happened.</dd>
|
|
</dl>
|
|
|
|
<h3><a id="elementsPowerManagement">Power Management</a></h3>
|
|
|
|
<p>
|
|
<span class="since">Since 0.10.2</span> it is possible to
|
|
forcibly enable or disable BIOS advertisements to the guest
|
|
OS. (NB: only the QEMU driver supports this.)
|
|
</p>
|
|
|
|
<pre>
...
<pm>
  <suspend-to-disk enabled='no'/>
  <suspend-to-mem enabled='yes'/>
</pm>
...</pre>
|
|
|
|
<dl>
|
|
<dt><code>pm</code></dt>
|
|
<dd>These elements enable ('yes') or disable ('no') BIOS support
|
|
for S3 (suspend-to-mem) and S4 (suspend-to-disk) ACPI sleep
|
|
states. If nothing is specified, then the hypervisor will be
|
|
left with its default value.<br/>
|
|
Note: This setting cannot prevent the guest OS from performing
|
|
a suspend as the guest OS itself can choose to circumvent the
|
|
unavailability of the sleep states (e.g. S4 by turning off
|
|
completely).</dd>
|
|
</dl>
|
|
|
|
<h3><a id="elementsFeatures">Hypervisor features</a></h3>
|
|
|
|
<p>
|
|
Hypervisors may allow certain CPU / machine features to be
|
|
toggled on/off.
|
|
</p>
|
|
|
|
<pre>
...
<features>
  <pae/>
  <acpi/>
  <apic/>
  <hap/>
  <privnet/>
  <hyperv>
    <relaxed state='on'/>
    <vapic state='on'/>
    <spinlocks state='on' retries='4096'/>
    <vpindex state='on'/>
    <runtime state='on'/>
    <synic state='on'/>
    <stimer state='on'>
      <direct state='on'/>
    </stimer>
    <reset state='on'/>
    <vendor_id state='on' value='KVM Hv'/>
    <frequencies state='on'/>
    <reenlightenment state='on'/>
    <tlbflush state='on'/>
    <ipi state='on'/>
    <evmcs state='on'/>
  </hyperv>
  <kvm>
    <hidden state='on'/>
    <hint-dedicated state='on'/>
  </kvm>
  <xen>
    <e820_host state='on'/>
    <passthrough state='on' mode='share_pt'/>
  </xen>
  <pvspinlock state='on'/>
  <gic version='2'/>
  <ioapic driver='qemu'/>
  <hpt resizing='required'>
    <maxpagesize unit='MiB'>16</maxpagesize>
  </hpt>
  <vmcoreinfo state='on'/>
  <smm state='on'>
    <tseg unit='MiB'>48</tseg>
  </smm>
  <htm state='on'/>
  <ccf-assist state='on'/>
  <msrs unknown='ignore'/>
  <cfpc value='workaround'/>
  <sbbc value='workaround'/>
  <ibs value='fixed-na'/>
</features>
...</pre>
|
|
|
|
<p>
|
|
All features are listed within the <code>features</code>
|
|
element; omitting a togglable feature tag turns it off.
|
|
The available features can be found by asking
|
|
for the <a href="formatcaps.html">capabilities XML</a> and
|
|
<a href="formatdomaincaps.html">domain capabilities XML</a>,
|
|
but a common set for fully virtualized domains is:
|
|
</p>
|
|
|
|
<dl>
|
|
<dt><code>pae</code></dt>
|
|
<dd>Physical address extension mode allows 32-bit guests
|
|
to address more than 4 GB of memory.</dd>
|
|
<dt><code>acpi</code></dt>
|
|
<dd>ACPI is useful for power management, for example, with
|
|
KVM guests it is required for graceful shutdown to work.
|
|
</dd>
|
|
<dt><code>apic</code></dt>
|
|
<dd>APIC allows the use of programmable IRQ
|
|
management. <span class="since">Since 0.10.2 (QEMU only)</span> there is
|
|
an optional attribute <code>eoi</code> with values <code>on</code>
|
|
and <code>off</code> which toggles the availability of EOI (End of
|
|
Interrupt) for the guest.
|
|
</dd>
|
|
<dt><code>hap</code></dt>
|
|
<dd>Depending on the <code>state</code> attribute (values <code>on</code>,
|
|
<code>off</code>) enable or disable use of Hardware Assisted Paging.
|
|
The default is <code>on</code> if the hypervisor detects availability
|
|
of Hardware Assisted Paging.
|
|
</dd>
|
|
<dt><code>viridian</code></dt>
|
|
<dd>Enable Viridian hypervisor extensions for paravirtualizing
|
|
guest operating systems.
|
|
</dd>
|
|
<dt><code>privnet</code></dt>
|
|
<dd>Always create a private network namespace. This is
|
|
automatically set if any interface devices are defined.
|
|
This feature is only relevant for container based
|
|
virtualization drivers, such as LXC.
|
|
</dd>
|
|
<dt><code>hyperv</code></dt>
|
|
<dd>Enable various features improving behavior of guests
|
|
running Microsoft Windows.
|
|
<table class="top_table">
|
|
<tr>
|
|
<th>Feature</th>
|
|
<th>Description</th>
|
|
<th>Value</th>
|
|
<th>Since</th>
|
|
</tr>
|
|
<tr>
|
|
<td>relaxed</td>
|
|
<td>Relax constraints on timers</td>
|
|
<td>on, off</td>
|
|
<td><span class="since">1.0.0 (QEMU 2.0)</span></td>
|
|
</tr>
|
|
<tr>
|
|
<td>vapic</td>
|
|
<td>Enable virtual APIC</td>
|
|
<td>on, off</td>
|
|
<td><span class="since">1.1.0 (QEMU 2.0)</span></td>
|
|
</tr>
|
|
<tr>
|
|
<td>spinlocks</td>
|
|
<td>Enable spinlock support</td>
|
|
<td>on, off; retries - at least 4095</td>
|
|
<td><span class="since">1.1.0 (QEMU 2.0)</span></td>
|
|
</tr>
|
|
<tr>
|
|
<td>vpindex</td>
|
|
<td>Virtual processor index</td>
|
|
<td>on, off</td>
|
|
<td><span class="since">1.3.3 (QEMU 2.5)</span></td>
|
|
</tr>
|
|
<tr>
|
|
<td>runtime</td>
|
|
<td>Processor time spent on running guest code and on behalf of guest code</td>
|
|
<td>on, off</td>
|
|
<td><span class="since">1.3.3 (QEMU 2.5)</span></td>
|
|
</tr>
|
|
<tr>
|
|
<td>synic</td>
|
|
<td>Enable Synthetic Interrupt Controller (SynIC)</td>
|
|
<td>on, off</td>
|
|
<td><span class="since">1.3.3 (QEMU 2.6)</span></td>
|
|
</tr>
|
|
<tr>
|
|
<td>stimer</td>
|
|
<td>Enable SynIC timers, optionally with Direct Mode support</td>
|
|
<td>on, off; direct - on,off</td>
|
|
<td><span class="since">1.3.3 (QEMU 2.6), direct mode 5.7.0 (QEMU 4.1)</span></td>
|
|
</tr>
|
|
<tr>
|
|
<td>reset</td>
|
|
<td>Enable hypervisor reset</td>
|
|
<td>on, off</td>
|
|
<td><span class="since">1.3.3 (QEMU 2.5)</span></td>
|
|
</tr>
|
|
<tr>
|
|
<td>vendor_id</td>
|
|
<td>Set hypervisor vendor id</td>
|
|
<td>on, off; value - string, up to 12 characters</td>
|
|
<td><span class="since">1.3.3 (QEMU 2.5)</span></td>
|
|
</tr>
|
|
<tr>
|
|
<td>frequencies</td>
|
|
<td>Expose frequency MSRs</td>
|
|
<td>on, off</td>
|
|
<td><span class="since">4.7.0 (QEMU 2.12)</span></td>
|
|
</tr>
|
|
<tr>
|
|
<td>reenlightenment</td>
|
|
<td>Enable re-enlightenment notification on migration</td>
|
|
<td>on, off</td>
|
|
<td><span class="since">4.7.0 (QEMU 3.0)</span></td>
|
|
</tr>
|
|
<tr>
|
|
<td>tlbflush</td>
|
|
<td>Enable PV TLB flush support</td>
|
|
<td>on, off</td>
|
|
<td><span class="since">4.7.0 (QEMU 3.0)</span></td>
|
|
</tr>
|
|
<tr>
|
|
<td>ipi</td>
|
|
<td>Enable PV IPI support</td>
|
|
<td>on, off</td>
|
|
<td><span class="since">4.10.0 (QEMU 3.1)</span></td>
|
|
</tr>
|
|
<tr>
|
|
<td>evmcs</td>
|
|
<td>Enable Enlightened VMCS</td>
|
|
<td>on, off</td>
|
|
<td><span class="since">4.10.0 (QEMU 3.1)</span></td>
|
|
</tr>
|
|
</table>
|
|
</dd>
|
|
<dt><code>pvspinlock</code></dt>
|
|
<dd>Notify the guest that the host supports paravirtual spinlocks
|
|
for example by exposing the pvticketlocks mechanism. This feature
|
|
can be explicitly disabled by using <code>state='off'</code>
|
|
attribute.
|
|
</dd>
|
|
<dt><code>kvm</code></dt>
|
|
<dd>Various features to change the behavior of the KVM hypervisor.
|
|
<table class="top_table">
|
|
<tr>
|
|
<th>Feature</th>
|
|
<th>Description</th>
|
|
<th>Value</th>
|
|
<th>Since</th>
|
|
</tr>
|
|
<tr>
|
|
<td>hidden</td>
|
|
<td>Hide the KVM hypervisor from standard MSR based discovery</td>
|
|
<td>on, off</td>
|
|
<td><span class="since">1.2.8 (QEMU 2.1.0)</span></td>
|
|
</tr>
|
|
<tr>
|
|
<td>hint-dedicated</td>
|
|
<td>Allows a guest to enable optimizations when running on dedicated vCPUs</td>
|
|
<td>on, off</td>
|
|
<td><span class="since">5.7.0 (QEMU 2.12.0)</span></td>
|
|
</tr>
|
|
</table>
|
|
</dd>
|
|
<dt><code>xen</code></dt>
|
|
<dd>Various features to change the behavior of the Xen hypervisor.
|
|
<table class="top_table">
|
|
<tr>
|
|
<th>Feature</th>
|
|
<th>Description</th>
|
|
<th>Value</th>
|
|
<th>Since</th>
|
|
</tr>
|
|
<tr>
|
|
<td>e820_host</td>
|
|
<td>Expose the host e820 to the guest (PV only)</td>
|
|
<td>on, off</td>
|
|
<td><span class="since">6.3.0</span></td>
|
|
</tr>
|
|
<tr>
|
|
<td>passthrough</td>
|
|
<td>Enable IOMMU mappings allowing PCI passthrough</td>
|
|
<td>on, off; mode - optional string sync_pt or share_pt</td>
|
|
<td><span class="since">6.3.0</span></td>
|
|
</tr>
|
|
</table>
|
|
</dd>
|
|
<dt><code>pmu</code></dt>
|
|
<dd>Depending on the <code>state</code> attribute (values <code>on</code>,
|
|
<code>off</code>, default <code>on</code>) enable or disable the
|
|
performance monitoring unit for the guest.
|
|
<span class="since">Since 1.2.12</span>
|
|
</dd>
|
|
<dt><code>vmport</code></dt>
|
|
<dd>Depending on the <code>state</code> attribute (values <code>on</code>,
|
|
<code>off</code>, default <code>on</code>) enable or disable
|
|
the emulation of VMware IO port, for vmmouse etc.
|
|
<span class="since">Since 1.2.16</span>
|
|
</dd>
|
|
<dt><code>gic</code></dt>
|
|
<dd>Enable for architectures using a Generic Interrupt
|
|
Controller instead of APIC in order to handle interrupts.
|
|
For example, the 'aarch64' architecture uses
|
|
<code>gic</code> instead of <code>apic</code>. The optional
|
|
attribute <code>version</code> specifies the GIC version;
|
|
however, it may not be supported by all hypervisors. Accepted
|
|
values are <code>2</code>, <code>3</code> and <code>host</code>.
|
|
<span class="since">Since 1.2.16</span>
|
|
</dd>
|
|
<dt><code>smm</code></dt>
|
|
<dd>
|
|
<p>
|
|
Depending on the <code>state</code> attribute (values <code>on</code>,
|
|
<code>off</code>, default <code>on</code>) enable or disable
|
|
System Management Mode.
|
|
<span class="since">Since 2.1.0</span>
|
|
</p><p> Optional sub-element <code>tseg</code> can be used to specify
|
|
the amount of memory dedicated to SMM's extended TSEG. That offers a
|
|
fourth option size apart from the existing ones (1 MiB, 2 MiB and 8
|
|
MiB) that the guest OS (or rather loader) can choose from. The size
|
|
can be specified as a value of that element, optional attribute
|
|
<code>unit</code> can be used to specify the unit of the
|
|
aforementioned value (defaults to 'MiB'). If set to 0 the extended
|
|
size is not advertised and only the default ones (see above) are
|
|
available.
|
|
</p><p>
|
|
<b>If the VM boots successfully, you should leave this option alone, unless you
|
|
are very certain you know what you are doing.</b>
|
|
</p><p>
|
|
This value is configurable because the required size cannot be calculated
in a way that is guaranteed to work correctly. In
|
|
QEMU, the user-configurable extended TSEG feature was unavailable up
|
|
to and including <code>pc-q35-2.9</code>. Starting with
|
|
<code>pc-q35-2.10</code> the feature is available, with default size
|
|
16 MiB. That should suffice for up to roughly 272 vCPUs, 5 GiB guest
|
|
RAM in total, no hotplug memory range, and 32 GiB of 64-bit PCI MMIO
|
|
aperture. Or for 48 vCPUs, with 1TB of guest RAM, no hotplug DIMM
|
|
range, and 32GB of 64-bit PCI MMIO aperture. The values may also vary
|
|
based on the loader the VM is using.
|
|
</p><p>
|
|
Additional size might be needed for significantly higher vCPU counts
|
|
or increased address space (that can be memory, maxMemory, 64-bit PCI
|
|
MMIO aperture size; roughly 8 MiB of TSEG per 1 TiB of address space)
|
|
which can also be rounded up.
|
|
</p><p>
|
|
Due to the nature of this setting being similar to "how much RAM
|
|
should the guest have" users are advised to either consult the
|
|
documentation of the guest OS or loader (if there is any), or test
|
|
this by trial-and-error changing the value until the VM boots
|
|
successfully. Yet another guiding value for users might be the fact
|
|
that 48 MiB should be enough for pretty large guests (240 vCPUs and
|
|
4TB guest RAM), but it is on purpose not set as default as 48 MiB of
|
|
unavailable RAM might be too much for small guests (e.g. with 512 MiB
|
|
of RAM).
|
|
</p><p>
|
|
See <a href="#elementsMemoryAllocation">Memory Allocation</a>
|
|
for more details about the <code>unit</code> attribute.
|
|
<span class="since">Since 4.5.0</span> (QEMU only)
|
|
</p>
|
|
</dd>
|
|
<dt><code>ioapic</code></dt>
|
|
<dd>Tune the I/O APIC. Possible values for the
|
|
<code>driver</code> attribute are:
|
|
<code>kvm</code> (default for KVM domains)
|
|
and <code>qemu</code>, which puts the I/O APIC in userspace;
this is also known as split I/O APIC mode.
|
|
<span class="since">Since 3.4.0</span> (QEMU/KVM only)
|
|
</dd>
|
|
<dt><code>hpt</code></dt>
|
|
<dd>Configure the HPT (Hash Page Table) of a pSeries guest. Possible
|
|
values for the <code>resizing</code> attribute are
|
|
<code>enabled</code>, which causes HPT resizing to be enabled if
|
|
both the guest and the host support it; <code>disabled</code>, which
|
|
causes HPT resizing to be disabled regardless of guest and host
|
|
support; and <code>required</code>, which prevents the guest from
|
|
starting unless both the guest and the host support HPT resizing. If
|
|
the attribute is not defined, the hypervisor default will be used.
|
|
<span class="since">Since 3.10.0</span> (QEMU/KVM only).
|
|
|
|
<p>The optional <code>maxpagesize</code> subelement can be used to
|
|
limit the usable page size for HPT guests. Common values are 64 KiB,
|
|
16 MiB and 16 GiB; when not specified, the hypervisor default will
|
|
be used. <span class="since">Since 4.5.0</span> (QEMU/KVM only).</p>
|
|
</dd>
|
|
<dt><code>vmcoreinfo</code></dt>
|
|
<dd>Enable QEMU vmcoreinfo device to let the guest kernel save debug
|
|
details. <span class="since">Since 4.4.0</span> (QEMU only)
|
|
</dd>
|
|
<dt><code>htm</code></dt>
|
|
<dd>Configure HTM (Hardware Transactional Memory) availability for
|
|
pSeries guests. Possible values for the <code>state</code> attribute
|
|
are <code>on</code> and <code>off</code>. If the attribute is not
|
|
defined, the hypervisor default will be used.
|
|
<span class="since">Since 4.6.0</span> (QEMU/KVM only)
|
|
</dd>
|
|
<dt><code>nested-hv</code></dt>
|
|
<dd>Configure nested HV availability for pSeries guests. This needs to
|
|
be enabled from the host (L0) in order to be effective; having HV
|
|
support in the (L1) guest is very desirable if it's planned to
|
|
run nested (L2) guests inside it, because it will result in those
|
|
nested guests having much better performance than they would when
|
|
using KVM PR or TCG.
|
|
Possible values for the <code>state</code> attribute are
|
|
<code>on</code> and <code>off</code>. If the attribute is not
|
|
defined, the hypervisor default will be used.
|
|
<span class="since">Since 4.10.0</span> (QEMU/KVM only)
|
|
</dd>
|
|
<dt><code>msrs</code></dt>
|
|
<dd>Some guests might require ignoring unknown
|
|
Model Specific Registers (MSRs) reads and writes. It's possible
|
|
to switch this by setting <code>unknown</code> attribute
|
|
of <code>msrs</code> to <code>ignore</code>. If the attribute is
|
|
not defined, or set to <code>fault</code>, unknown reads and writes
|
|
will not be ignored.
|
|
<span class="since">Since 5.1.0</span> (bhyve only)
|
|
</dd>
|
|
<dt><code>ccf-assist</code></dt>
|
|
<dd>Configure ccf-assist (Count Cache Flush Assist) availability for
|
|
pSeries guests.
|
|
Possible values for the <code>state</code> attribute
|
|
are <code>on</code> and <code>off</code>. If the attribute is not
|
|
defined, the hypervisor default will be used.
|
|
<span class="since">Since 5.9.0</span> (QEMU/KVM only)
|
|
</dd>
|
|
<dt><code>cfpc</code></dt>
|
|
<dd>Configure cfpc (Cache Flush on Privilege Change) availability for
|
|
pSeries guests.
|
|
Possible values for the <code>value</code> attribute
|
|
are <code>broken</code> (no protection), <code>workaround</code>
|
|
(software workaround available) and <code>fixed</code> (fixed in
|
|
hardware). If the attribute is not defined, the hypervisor
|
|
default will be used.
|
|
<span class="since">Since 6.3.0</span> (QEMU/KVM only)
|
|
</dd>
|
|
<dt><code>sbbc</code></dt>
|
|
<dd>Configure sbbc (Speculation Barrier Bounds Checking) availability for
|
|
pSeries guests.
|
|
Possible values for the <code>value</code> attribute
|
|
are <code>broken</code> (no protection), <code>workaround</code>
|
|
(software workaround available) and <code>fixed</code> (fixed in
|
|
hardware). If the attribute is not defined, the hypervisor
|
|
default will be used.
|
|
<span class="since">Since 6.3.0</span> (QEMU/KVM only)
|
|
</dd>
|
|
<dt><code>ibs</code></dt>
|
|
<dd>Configure ibs (Indirect Branch Speculation) availability for
|
|
pSeries guests.
|
|
Possible values for the <code>value</code> attribute
|
|
are <code>broken</code> (no protection), <code>workaround</code>
|
|
(count cache flush), <code>fixed-ibs</code> (fixed by
|
|
serializing indirect branches), <code>fixed-ccd</code> (fixed by
|
|
disabling the count cache) and <code>fixed-na</code> (fixed in
hardware - no longer applicable).
|
|
If the attribute is not defined, the hypervisor
|
|
default will be used.
|
|
<span class="since">Since 6.3.0</span> (QEMU/KVM only)
|
|
</dd>
|
|
</dl>
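<p>
As a minimal sketch of how the attribute-bearing entries from the
<code>hyperv</code> and <code>kvm</code> tables above are written, the
following fragment enables a few of them. The values shown (a
<code>retries</code> count of 8191 and the vendor id string "KVM Hv") are
illustrative assumptions, not recommendations; whether a given feature is
accepted depends on the libvirt and hypervisor versions listed in the tables.
</p>
<pre>
...
<features>
  <hyperv>
    <relaxed state='on'/>
    <vapic state='on'/>
    <!-- retries must be at least 4095; 8191 is only an example value -->
    <spinlocks state='on' retries='8191'/>
    <synic state='on'/>
    <stimer state='on'>
      <direct state='on'/>
    </stimer>
    <!-- value is a string of up to 12 characters; "KVM Hv" is an example -->
    <vendor_id state='on' value='KVM Hv'/>
  </hyperv>
  <kvm>
    <hidden state='on'/>
    <hint-dedicated state='on'/>
  </kvm>
</features>
...</pre>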
|
|
|
|
<h3><a id="elementsTime">Time keeping</a></h3>
|
|
|
|
<p>
|
|
The guest clock is typically initialized from the host clock.
|
|
Most operating systems expect the hardware clock to be kept
|
|
in UTC, and this is the default. Windows, however, expects
|
|
it to be in so-called 'localtime'.
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<clock offset='localtime'>
|
|
<timer name='rtc' tickpolicy='catchup' track='guest'>
|
|
<catchup threshold='123' slew='120' limit='10000'/>
|
|
</timer>
|
|
<timer name='pit' tickpolicy='delay'/>
|
|
</clock>
|
|
...</pre>
|
|
|
|
<dl>
|
|
<dt><code>clock</code></dt>
|
|
<dd>
|
|
<p>The <code>offset</code> attribute takes four possible
|
|
values, allowing fine grained control over how the guest
|
|
clock is synchronized to the host. NB, not all hypervisors
|
|
support all modes.</p>
|
|
<dl>
|
|
<dt><code>utc</code></dt>
|
|
<dd>
|
|
The guest clock will always be synchronized to UTC when
|
|
booted.
|
|
<span class="since">Since 0.9.11</span> 'utc' mode can be converted
|
|
to 'variable' mode, which can be controlled by using the
|
|
<code>adjustment</code> attribute. If the value is 'reset', the
|
|
conversion is never done (not all hypervisors can
|
|
synchronize to UTC on each boot; use of 'reset' will cause
|
|
an error on those hypervisors). A numeric value
|
|
forces the conversion to 'variable' mode using the value as the
|
|
initial adjustment. The default <code>adjustment</code> is
|
|
hypervisor specific.
|
|
</dd>
|
|
<dt><code>localtime</code></dt>
|
|
<dd>
|
|
The guest clock will be synchronized to the host's configured
|
|
timezone when booted, if any.
|
|
<span class="since">Since 0.9.11,</span> the <code>adjustment</code>
|
|
attribute behaves the same as in 'utc' mode.
|
|
</dd>
|
|
<dt><code>timezone</code></dt>
|
|
<dd>
|
|
The guest clock will be synchronized to the requested timezone
|
|
using the <code>timezone</code> attribute.
|
|
<span class="since">Since 0.7.7</span>
|
|
</dd>
|
|
<dt><code>variable</code></dt>
|
|
<dd>
|
|
The guest clock will have an arbitrary offset applied
|
|
relative to UTC or localtime, depending on the <code>basis</code>
|
|
attribute. The delta relative to UTC (or localtime) is specified
|
|
in seconds, using the <code>adjustment</code> attribute.
|
|
The guest is free to adjust the RTC over time and expect
|
|
that it will be honored at next reboot. This is in
|
|
contrast to 'utc' and 'localtime' mode (with the optional
|
|
attribute adjustment='reset'), where the RTC adjustments are
|
|
lost at each reboot. <span class="since">Since 0.7.7</span>
|
|
<span class="since">Since 0.9.11</span> the <code>basis</code>
|
|
attribute can be either 'utc' (default) or 'localtime'.
|
|
</dd>
|
|
</dl>
|
|
<p>
|
|
A <code>clock</code> may have zero or more
|
|
<code>timer</code> sub-elements. <span class="since">Since
|
|
0.8.0</span>
|
|
</p>
|
|
</dd>
|
|
<dt><code>timer</code></dt>
|
|
<dd>
|
|
<p>
|
|
Each timer element requires a <code>name</code> attribute,
|
|
and has other optional attributes that depend on
|
|
the <code>name</code> specified. Various hypervisors
|
|
support different combinations of attributes.
|
|
</p>
|
|
<dl>
|
|
<dt><code>name</code></dt>
|
|
<dd>
|
|
The <code>name</code> attribute selects which timer is
|
|
being modified, and can be one of
|
|
"platform" (currently unsupported),
|
|
"hpet" (xen, qemu, lxc), "kvmclock" (qemu),
|
|
"pit" (qemu), "rtc" (qemu, lxc), "tsc" (xen, qemu -
|
|
<span class="since">since 3.2.0</span>), "hypervclock"
|
|
(qemu - <span class="since">since 1.2.2</span>) or
|
|
"armvtimer" (qemu - <span class="since">since 6.1.0</span>).
|
|
|
|
The <code>hypervclock</code> timer adds support for the
|
|
reference time counter and the reference page for iTSC
|
|
feature for guests running the Microsoft Windows
|
|
operating system.
|
|
</dd>
|
|
<dt><code>track</code></dt>
|
|
<dd>
|
|
The <code>track</code> attribute specifies what the timer
|
|
tracks, and can be "boot", "guest", or "wall".
|
|
Only valid for <code>name="rtc"</code>
|
|
or <code>name="platform"</code>.
|
|
</dd>
|
|
<dt><code>tickpolicy</code></dt>
|
|
<dd>
|
|
<p>
|
|
The <code>tickpolicy</code> attribute determines what
|
|
happens when QEMU misses a deadline for injecting a
|
|
tick to the guest. This can happen, for example, because the
|
|
guest was paused.
|
|
</p>
|
|
<dl>
|
|
<dt><code>delay</code></dt>
|
|
<dd>Continue to deliver ticks at the normal rate. The guest OS
|
|
will not notice anything is amiss, as from its point of view
|
|
time will have continued to flow normally. The time in the
|
|
guest should now be behind the time in the host by exactly
|
|
the amount of time during which ticks have been missed.</dd>
|
|
<dt><code>catchup</code></dt>
|
|
<dd>Deliver ticks at a higher rate to catch up with the missed
|
|
ticks. The guest OS will not notice anything is amiss, as
|
|
from its point of view time will have continued to flow
|
|
normally. Once the timer has managed to catch up with all
|
|
the missing ticks, the time in the guest and in the host
|
|
should match.</dd>
|
|
<dt><code>merge</code></dt>
|
|
<dd>Merge the missed tick(s) into one tick and
|
|
inject. The guest time may be delayed, depending
|
|
on how the OS reacts to the merging of ticks.</dd>
|
|
<dt><code>discard</code></dt>
|
|
<dd>Throw away the missed ticks and continue with future
|
|
injection normally. The guest OS will see the timer jump
|
|
ahead by a potentially quite significant amount all at once,
|
|
as if the intervening chunk of time had simply not existed;
|
|
needless to say, such a sudden jump can easily confuse a
|
|
guest OS which is not specifically prepared to deal with it.
|
|
Assuming the guest OS can deal correctly with the time jump,
|
|
the time in the guest and in the host should now match.</dd>
|
|
</dl>
|
|
<p>If the policy is "catchup", there can be further details in
|
|
the <code>catchup</code> sub-element.</p>
|
|
<dl>
|
|
<dt><code>catchup</code></dt>
|
|
<dd>
|
|
The <code>catchup</code> element has three optional
|
|
attributes, each a positive integer. The attributes
|
|
are <code>threshold</code>, <code>slew</code>,
|
|
and <code>limit</code>.
|
|
</dd>
|
|
</dl>
|
|
<p>
|
|
Note that hypervisors are not required to support all policies across all time sources.
|
|
</p>
|
|
</dd>
|
|
<dt><code>frequency</code></dt>
|
|
<dd>
|
|
The <code>frequency</code> attribute is an unsigned
|
|
integer specifying the frequency at
|
|
which <code>name="tsc"</code> runs.
|
|
</dd>
|
|
<dt><code>mode</code></dt>
|
|
<dd>
|
|
The <code>mode</code> attribute controls how
|
|
the <code>name="tsc"</code> timer is managed, and can be
|
|
"auto", "native", "emulate", "paravirt", or "smpsafe".
|
|
Other timers are always emulated.
|
|
</dd>
|
|
<dt><code>present</code></dt>
|
|
<dd>
|
|
The <code>present</code> attribute can be "yes" or "no" to
|
|
specify whether a particular timer is available to the guest.
|
|
</dd>
|
|
</dl>
|
|
</dd>
|
|
</dl>
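<p>
The attributes described above can be combined as in the following sketch,
which assumes a guest whose RTC should start one hour (3600 seconds) ahead
of UTC and which should not be offered an HPET. The concrete values are
illustrative only, and not every hypervisor supports every combination of
timer and policy.
</p>
<pre>
...
<clock offset='variable' adjustment='3600' basis='utc'>
  <!-- deliver missed RTC ticks at a higher rate until the guest catches up -->
  <timer name='rtc' tickpolicy='catchup'/>
  <!-- keep delivering PIT ticks at the normal rate after a pause -->
  <timer name='pit' tickpolicy='delay'/>
  <!-- hide the HPET from the guest -->
  <timer name='hpet' present='no'/>
</clock>
...</pre>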
|
|
|
|
<h3><a id="elementsPerf">Performance monitoring events</a></h3>
|
|
|
|
<p>
|
|
Some platforms allow monitoring of performance of the virtual machine and
|
|
the code executed inside. To enable the performance monitoring events
|
|
you can either specify them in the <code>perf</code> element or enable
|
|
them via <code>virDomainSetPerfEvents</code> API. The performance values
|
|
are then retrieved using the <code>virConnectGetAllDomainStats</code> API.
|
|
<span class="since">Since 2.0.0</span>
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<perf>
|
|
<event name='cmt' enabled='yes'/>
|
|
<event name='mbmt' enabled='no'/>
|
|
<event name='mbml' enabled='yes'/>
|
|
<event name='cpu_cycles' enabled='no'/>
|
|
<event name='instructions' enabled='yes'/>
|
|
<event name='cache_references' enabled='no'/>
|
|
<event name='cache_misses' enabled='no'/>
|
|
<event name='branch_instructions' enabled='no'/>
|
|
<event name='branch_misses' enabled='no'/>
|
|
<event name='bus_cycles' enabled='no'/>
|
|
<event name='stalled_cycles_frontend' enabled='no'/>
|
|
<event name='stalled_cycles_backend' enabled='no'/>
|
|
<event name='ref_cpu_cycles' enabled='no'/>
|
|
<event name='cpu_clock' enabled='no'/>
|
|
<event name='task_clock' enabled='no'/>
|
|
<event name='page_faults' enabled='no'/>
|
|
<event name='context_switches' enabled='no'/>
|
|
<event name='cpu_migrations' enabled='no'/>
|
|
<event name='page_faults_min' enabled='no'/>
|
|
<event name='page_faults_maj' enabled='no'/>
|
|
<event name='alignment_faults' enabled='no'/>
|
|
<event name='emulation_faults' enabled='no'/>
|
|
</perf>
|
|
...
|
|
</pre>
|
|
|
|
<table class="top_table">
|
|
<tr>
|
|
<th>event name</th>
|
|
<th>Description</th>
|
|
<th>stats parameter name</th>
|
|
</tr>
|
|
<tr>
|
|
<td><code>cmt</code></td>
|
|
<td>usage of L3 cache in bytes by applications running on the platform</td>
|
|
<td><code>perf.cmt</code></td>
|
|
</tr>
|
|
<tr>
|
|
<td><code>mbmt</code></td>
|
|
<td>total system bandwidth from one level of cache</td>
|
|
<td><code>perf.mbmt</code></td>
|
|
</tr>
|
|
<tr>
|
|
<td><code>mbml</code></td>
|
|
<td>bandwidth of memory traffic for a memory controller</td>
|
|
<td><code>perf.mbml</code></td>
|
|
</tr>
|
|
<tr>
|
|
<td><code>cpu_cycles</code></td>
|
|
<td>the count of CPU cycles (total/elapsed)</td>
|
|
<td><code>perf.cpu_cycles</code></td>
|
|
</tr>
|
|
<tr>
|
|
<td><code>instructions</code></td>
|
|
<td>the count of instructions by applications running on the platform</td>
|
|
<td><code>perf.instructions</code></td>
|
|
</tr>
|
|
<tr>
|
|
<td><code>cache_references</code></td>
|
|
<td>the count of cache hits by applications running on the platform</td>
|
|
<td><code>perf.cache_references</code></td>
|
|
</tr>
|
|
<tr>
|
|
<td><code>cache_misses</code></td>
|
|
<td>the count of cache misses by applications running on the platform</td>
|
|
<td><code>perf.cache_misses</code></td>
|
|
</tr>
|
|
<tr>
|
|
<td><code>branch_instructions</code></td>
|
|
<td>the count of branch instructions by applications running on the platform</td>
|
|
<td><code>perf.branch_instructions</code></td>
|
|
</tr>
|
|
<tr>
|
|
<td><code>branch_misses</code></td>
|
|
<td>the count of branch misses by applications running on the platform</td>
|
|
<td><code>perf.branch_misses</code></td>
|
|
</tr>
|
|
<tr>
|
|
<td><code>bus_cycles</code></td>
|
|
<td>the count of bus cycles by applications running on the platform</td>
|
|
<td><code>perf.bus_cycles</code></td>
|
|
</tr>
|
|
<tr>
|
|
<td><code>stalled_cycles_frontend</code></td>
|
|
<td>the count of stalled CPU cycles in the frontend of the instruction
|
|
processor pipeline by applications running on the platform</td>
|
|
<td><code>perf.stalled_cycles_frontend</code></td>
|
|
</tr>
|
|
<tr>
|
|
<td><code>stalled_cycles_backend</code></td>
|
|
<td>the count of stalled CPU cycles in the backend of the instruction
|
|
processor pipeline by applications running on the platform</td>
|
|
<td><code>perf.stalled_cycles_backend</code></td>
|
|
</tr>
|
|
<tr>
|
|
<td><code>ref_cpu_cycles</code></td>
|
|
<td>the count of total CPU cycles not affected by CPU frequency scaling
|
|
by applications running on the platform</td>
|
|
<td><code>perf.ref_cpu_cycles</code></td>
|
|
</tr>
|
|
<tr>
|
|
<td><code>cpu_clock</code></td>
|
|
<td>the count of CPU clock time, as measured by a monotonic
|
|
high-resolution per-CPU timer, by applications running on
|
|
the platform</td>
|
|
<td><code>perf.cpu_clock</code></td>
|
|
</tr>
|
|
<tr>
|
|
<td><code>task_clock</code></td>
|
|
<td>the count of task clock time, as measured by a monotonic
|
|
high-resolution CPU timer, specific to the task that
|
|
is run by applications running on the platform</td>
|
|
<td><code>perf.task_clock</code></td>
|
|
</tr>
|
|
<tr>
|
|
<td><code>page_faults</code></td>
|
|
<td>the count of page faults by applications running on the
|
|
platform. This includes minor, major, invalid and other
|
|
types of page faults</td>
|
|
<td><code>perf.page_faults</code></td>
|
|
</tr>
|
|
<tr>
|
|
<td><code>context_switches</code></td>
|
|
<td>the count of context switches by applications running on
|
|
the platform</td>
|
|
<td><code>perf.context_switches</code></td>
|
|
</tr>
|
|
<tr>
|
|
<td><code>cpu_migrations</code></td>
|
|
<td>the count of CPU migrations, that is, where the process
|
|
moved from one logical processor to another, by
|
|
applications running on the platform</td>
|
|
<td><code>perf.cpu_migrations</code></td>
|
|
</tr>
|
|
<tr>
|
|
<td><code>page_faults_min</code></td>
|
|
<td>the count of minor page faults, that is, where the
|
|
page was present in the page cache, and therefore
|
|
the fault avoided loading it from storage, by
|
|
applications running on the platform</td>
|
|
<td><code>perf.page_faults_min</code></td>
|
|
</tr>
|
|
<tr>
|
|
<td><code>page_faults_maj</code></td>
|
|
<td>the count of major page faults, that is, where the
|
|
page was not present in the page cache, and
|
|
therefore had to be fetched from storage, by
|
|
applications running on the platform</td>
|
|
<td><code>perf.page_faults_maj</code></td>
|
|
</tr>
|
|
<tr>
|
|
<td><code>alignment_faults</code></td>
|
|
<td>the count of alignment faults, that is when
|
|
the load or store is not aligned properly, by
|
|
applications running on the platform</td>
|
|
<td><code>perf.alignment_faults</code></td>
|
|
</tr>
|
|
<tr>
|
|
<td><code>emulation_faults</code></td>
|
|
<td>the count of emulation faults, that is when
|
|
the kernel traps on unimplemented instructions
|
|
and emulates them for user space, by
|
|
applications running on the platform</td>
|
|
<td><code>perf.emulation_faults</code></td>
|
|
</tr>
|
|
</table>
|
|
|
|
<h3><a id="elementsDevices">Devices</a></h3>
|
|
|
|
<p>
|
|
The final set of XML elements are all used to describe devices
|
|
provided to the guest domain. All devices occur as children
|
|
of the main <code>devices</code> element.
|
|
<span class="since">Since 0.1.3</span>
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<emulator>/usr/lib/xen/bin/qemu-dm</emulator>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<dl>
|
|
<dt><a id="elementEmulator"><code>emulator</code></a></dt>
|
|
<dd>
|
|
The contents of the <code>emulator</code> element specify
|
|
the fully qualified path to the device model emulator binary.
|
|
The <a href="formatcaps.html">capabilities XML</a> specifies
|
|
the recommended default emulator to use for each particular
|
|
domain type / architecture combination.
|
|
</dd>
|
|
</dl>
|
|
|
|
<p>
|
|
To help users identify devices they care about, every
device can have a direct child <code>alias</code> element
with a <code>name</code> attribute in which users can
store an identifier for the device. The identifier must have the
"ua-" prefix and must be unique within the domain. Additionally, the
|
|
identifier must consist only of the following characters:
|
|
<code>[a-zA-Z0-9_-]</code>.
|
|
<span class="since">Since 3.9.0</span>
|
|
</p>
|
|
|
|
<pre>
|
|
<devices>
|
|
<disk type='file'>
|
|
<alias name='ua-myDisk'/>
|
|
</disk>
|
|
<interface type='network' trustGuestRxFilters='yes'>
|
|
<alias name='ua-myNIC'/>
|
|
</interface>
|
|
...
|
|
</devices>
|
|
</pre>
|
|
|
|
<h4><a id="elementsDisks">Hard drives, floppy disks, CDROMs</a></h4>
|
|
|
|
<p>
|
|
Any device that looks like a disk, be it a floppy, hard disk,
CD-ROM, or paravirtualized driver, is specified via the <code>disk</code>
|
|
element.
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<disk type='file' snapshot='external'>
|
|
<driver name="tap" type="aio" cache="default"/>
|
|
<source file='/var/lib/xen/images/fv0' startupPolicy='optional'>
|
|
<seclabel relabel='no'/>
|
|
</source>
|
|
<target dev='hda' bus='ide'/>
|
|
<iotune>
|
|
<total_bytes_sec>10000000</total_bytes_sec>
|
|
<read_iops_sec>400000</read_iops_sec>
|
|
<write_iops_sec>100000</write_iops_sec>
|
|
</iotune>
|
|
<boot order='2'/>
|
|
<encryption type='...'>
|
|
...
|
|
</encryption>
|
|
<shareable/>
|
|
<serial>
|
|
...
|
|
</serial>
|
|
</disk>
|
|
...
|
|
<disk type='network'>
|
|
<driver name="qemu" type="raw" io="threads" ioeventfd="on" event_idx="off"/>
|
|
<source protocol="sheepdog" name="image_name">
|
|
<host name="hostname" port="7000"/>
|
|
</source>
|
|
<target dev="hdb" bus="ide"/>
|
|
<boot order='1'/>
|
|
<transient/>
|
|
<address type='drive' controller='0' bus='1' unit='0'/>
|
|
</disk>
|
|
<disk type='network'>
|
|
<driver name="qemu" type="raw"/>
|
|
<source protocol="rbd" name="image_name2">
|
|
<host name="hostname" port="7000"/>
|
|
<snapshot name="snapname"/>
|
|
<config file="/path/to/file"/>
|
|
<auth username='myuser'>
|
|
<secret type='ceph' usage='mypassid'/>
|
|
</auth>
|
|
</source>
|
|
<target dev="hdc" bus="ide"/>
|
|
</disk>
|
|
<disk type='block' device='cdrom'>
|
|
<driver name='qemu' type='raw'/>
|
|
<target dev='hdd' bus='ide' tray='open'/>
|
|
<readonly/>
|
|
</disk>
|
|
<disk type='network' device='cdrom'>
|
|
<driver name='qemu' type='raw'/>
|
|
<source protocol="http" name="url_path" query="foo=bar&amp;baz=flurb">
|
|
<host name="hostname" port="80"/>
|
|
<cookies>
|
|
<cookie name="test">somevalue</cookie>
|
|
</cookies>
|
|
<readahead size='65536'/>
|
|
<timeout seconds='6'/>
|
|
</source>
|
|
<target dev='hde' bus='ide' tray='open'/>
|
|
<readonly/>
|
|
</disk>
|
|
<disk type='network' device='cdrom'>
|
|
<driver name='qemu' type='raw'/>
|
|
<source protocol="https" name="url_path">
|
|
<host name="hostname" port="443"/>
|
|
<ssl verify="no"/>
|
|
</source>
|
|
<target dev='hdf' bus='ide' tray='open'/>
|
|
<readonly/>
|
|
</disk>
|
|
<disk type='network' device='cdrom'>
|
|
<driver name='qemu' type='raw'/>
|
|
<source protocol="ftp" name="url_path">
|
|
<host name="hostname" port="21"/>
|
|
</source>
|
|
<target dev='hdg' bus='ide' tray='open'/>
|
|
<readonly/>
|
|
</disk>
|
|
<disk type='network' device='cdrom'>
|
|
<driver name='qemu' type='raw'/>
|
|
<source protocol="ftps" name="url_path">
|
|
<host name="hostname" port="990"/>
|
|
</source>
|
|
<target dev='hdh' bus='ide' tray='open'/>
|
|
<readonly/>
|
|
</disk>
|
|
<disk type='network' device='cdrom'>
|
|
<driver name='qemu' type='raw'/>
|
|
<source protocol="tftp" name="url_path">
|
|
<host name="hostname" port="69"/>
|
|
</source>
|
|
<target dev='hdi' bus='ide' tray='open'/>
|
|
<readonly/>
|
|
</disk>
|
|
<disk type='block' device='lun'>
|
|
<driver name='qemu' type='raw'/>
|
|
<source dev='/dev/sda'>
|
|
<slices>
|
|
<slice type='storage' offset='12345' size='123'/>
|
|
</slices>
|
|
<reservations managed='no'>
|
|
<source type='unix' path='/path/to/qemu-pr-helper' mode='client'/>
|
|
</reservations>
|
|
</source>
|
|
<target dev='sda' bus='scsi'/>
|
|
<address type='drive' controller='0' bus='0' target='3' unit='0'/>
|
|
</disk>
|
|
<disk type='block' device='disk'>
|
|
<driver name='qemu' type='raw'/>
|
|
<source dev='/dev/sda'/>
|
|
<geometry cyls='16383' heads='16' secs='63' trans='lba'/>
|
|
<blockio logical_block_size='512' physical_block_size='4096'/>
|
|
<target dev='hdj' bus='ide'/>
|
|
</disk>
|
|
<disk type='volume' device='disk'>
|
|
<driver name='qemu' type='raw'/>
|
|
<source pool='blk-pool0' volume='blk-pool0-vol0'/>
|
|
<target dev='hdk' bus='ide'/>
|
|
</disk>
|
|
<disk type='network' device='disk'>
|
|
<driver name='qemu' type='raw'/>
|
|
<source protocol='iscsi' name='iqn.2013-07.com.example:iscsi-nopool/2'>
|
|
<host name='example.com' port='3260'/>
|
|
<auth username='myuser'>
|
|
<secret type='iscsi' usage='libvirtiscsi'/>
|
|
</auth>
|
|
</source>
|
|
<target dev='vda' bus='virtio'/>
|
|
</disk>
|
|
<disk type='network' device='lun'>
|
|
<driver name='qemu' type='raw'/>
|
|
<source protocol='iscsi' name='iqn.2013-07.com.example:iscsi-nopool/1'>
|
|
<host name='example.com' port='3260'/>
|
|
<auth username='myuser'>
|
|
<secret type='iscsi' usage='libvirtiscsi'/>
|
|
</auth>
|
|
</source>
|
|
<target dev='sdb' bus='scsi'/>
|
|
</disk>
|
|
<disk type='network' device='lun'>
|
|
<driver name='qemu' type='raw'/>
|
|
<source protocol='iscsi' name='iqn.2013-07.com.example:iscsi-nopool/0'>
|
|
<host name='example.com' port='3260'/>
|
|
<initiator>
|
|
<iqn name='iqn.2013-07.com.example:client'/>
|
|
</initiator>
|
|
</source>
|
|
<target dev='sdc' bus='scsi'/>
|
|
</disk>
|
|
<disk type='volume' device='disk'>
|
|
<driver name='qemu' type='raw'/>
|
|
<source pool='iscsi-pool' volume='unit:0:0:1' mode='host'/>
|
|
<target dev='vdb' bus='virtio'/>
|
|
</disk>
|
|
<disk type='volume' device='disk'>
|
|
<driver name='qemu' type='raw'/>
|
|
<source pool='iscsi-pool' volume='unit:0:0:2' mode='direct'/>
|
|
<target dev='vdc' bus='virtio'/>
|
|
</disk>
|
|
<disk type='file' device='disk'>
|
|
<driver name='qemu' type='qcow2' queues='4'/>
|
|
<source file='/var/lib/libvirt/images/domain.qcow'/>
|
|
<backingStore type='file'>
|
|
<format type='qcow2'/>
|
|
<source file='/var/lib/libvirt/images/snapshot.qcow'/>
|
|
<backingStore type='block'>
|
|
<format type='raw'/>
|
|
<source dev='/dev/mapper/base'/>
|
|
<backingStore/>
|
|
</backingStore>
|
|
</backingStore>
|
|
<target dev='vdd' bus='virtio'/>
|
|
</disk>
|
|
<disk type='nvme' device='disk'>
|
|
<driver name='qemu' type='raw'/>
|
|
<source type='pci' managed='yes' namespace='1'>
|
|
<address domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
|
|
</source>
|
|
<target dev='vde' bus='virtio'/>
|
|
</disk>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<dl>
|
|
<dt><code>disk</code></dt>
|
|
<dd>The <code>disk</code> element is the main container for
|
|
describing disks and supports the following attributes:
|
|
<dl>
|
|
<dt><code>type</code></dt>
|
|
<dd>
|
|
Valid values are "file", "block",
"dir" (<span class="since">since 0.7.5</span>),
"network" (<span class="since">since 0.8.7</span>),
"volume" (<span class="since">since 1.0.5</span>), or
"nvme" (<span class="since">since 6.0.0</span>)
|
|
and refer to the underlying source for the disk.
|
|
<span class="since">Since 0.0.3</span>
|
|
</dd>
|
|
<dt><code>device</code></dt>
|
|
<dd>
|
|
Indicates how the disk is to be exposed to the guest OS. Possible
|
|
values for this attribute are "floppy", "disk", "cdrom", and "lun",
|
|
defaulting to "disk".
|
|
<p>
|
|
Using "lun" (<span class="since">since 0.9.10</span>) is only
|
|
valid when the <code>type</code> is "block" or "network" for
|
|
<code>protocol='iscsi'</code> or when the <code>type</code>
|
|
is "volume" when using an iSCSI source <code>pool</code>
|
|
for <code>mode</code> "host" or as an
|
|
<a href="http://wiki.libvirt.org/page/NPIV_in_libvirt">NPIV</a>
|
|
virtual Host Bus Adapter (vHBA) using a Fibre Channel storage pool.
|
|
Configured in this manner, the LUN behaves identically to "disk",
|
|
except that generic SCSI commands from the guest are accepted
|
|
and passed through to the physical device. Also note that
|
|
device='lun' will only be recognized for actual raw devices,
|
|
but never for individual partitions or LVM logical volumes (in those
|
|
cases, the kernel will reject the generic SCSI commands, making
|
|
it identical to device='disk').
|
|
<span class="since">Since 0.1.4</span>
|
|
</p>
|
|
</dd>
|
|
<dt><code>model</code></dt>
|
|
<dd>
|
|
Indicates the emulated device model of the disk. Typically
|
|
this is indicated solely by the <code>bus</code> property but
|
|
for <code>bus</code> "virtio" the model can be specified further
|
|
with "virtio-transitional", "virtio-non-transitional", or
|
|
"virtio". See
|
|
<a href="#elementsVirtioTransitional">Virtio transitional devices</a>
|
|
for more details.
|
|
<span class="since">Since 5.2.0</span>
|
|
</dd>
|
|
<dt><code>rawio</code></dt>
|
|
<dd>
|
|
Indicates whether the disk needs rawio capability. Valid
|
|
settings are "yes" or "no" (default is "no"). If any one disk
|
|
in a domain has rawio='yes', rawio capability will be enabled
|
|
for all disks in the domain (because, in the case of QEMU, this
|
|
capability can only be set on a per-process basis). This attribute
|
|
is only valid when device is "lun". NB, <code>rawio</code> is intended
to confine the capability to a single device; however, the current QEMU
implementation gives the domain process a broader capability
than that (it is set on a per-process basis and affects all the domain's disks).
To confine the capability as much as possible for the QEMU driver
at this stage, <code>sgio</code> is recommended, as it is more
secure than <code>rawio</code>.
|
|
<span class="since">Since 0.9.10</span>
|
|
</dd>
|
|
<dt><code>sgio</code></dt>
|
|
<dd>
|
|
If supported by the hypervisor and OS, indicates whether
|
|
unprivileged SG_IO commands are filtered for the disk. Valid
|
|
settings are "filtered" or "unfiltered" where the default is
|
|
"filtered". Only available when the <code>device</code> is 'lun'.
|
|
<span class="since">Since 1.0.2</span>
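<p>
A minimal sketch of how <code>sgio</code> might be set on a LUN
passthrough disk. The host device path and the target name are
placeholders chosen for illustration, not values from a real host:
</p>
<pre>
<disk type='block' device='lun' sgio='filtered'>
  <driver name='qemu' type='raw'/>
  <source dev='/dev/sdb'/>
  <target dev='sda' bus='scsi'/>
</disk></pre>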
|
|
</dd>
|
|
<dt><code>snapshot</code></dt>
|
|
<dd>
|
|
Indicates the default behavior of the disk during disk snapshots:
|
|
"<code>internal</code>" requires a file format such as qcow2 that
|
|
can store both the snapshot and the data changes since the snapshot;
|
|
"<code>external</code>" will separate the snapshot from the live
|
|
data; and "<code>no</code>" means the disk will not participate in
|
|
snapshots. Read-only disks default to "<code>no</code>", while the
|
|
default for other disks depends on the hypervisor's capabilities.
|
|
Some hypervisors allow a per-snapshot choice as well, during
|
|
<a href="formatsnapshot.html">domain snapshot creation</a>.
|
|
Not all snapshot modes are supported; for example, enabling
|
|
snapshots with a transient disk generally does not make sense.
|
|
<span class="since">Since 0.9.5</span>
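<p>
As an illustration only, a disk that should default to external
snapshots could be declared as sketched here. The image path and
target name are assumptions; whether "external" is accepted depends
on the hypervisor:
</p>
<pre>
<disk type='file' device='disk' snapshot='external'>
  <driver name='qemu' type='qcow2'/>
  <source file='/var/lib/libvirt/images/domain.qcow'/>
  <target dev='vda' bus='virtio'/>
</disk></pre>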
|
|
</dd>
|
|
</dl>
|
|
</dd>
|
|
<dt><code>source</code></dt>
|
|
<dd>Representation of the disk <code>source</code> depends on the
|
|
disk <code>type</code> attribute value as follows:
|
|
<dl>
|
|
<dt><code>file</code></dt>
|
|
<dd>
|
|
The <code>file</code> attribute specifies the fully-qualified
|
|
path to the file holding the disk.
|
|
<span class="since">Since 0.0.3</span>
|
|
</dd>
|
|
<dt><code>block</code></dt>
|
|
<dd>
|
|
The <code>dev</code> attribute specifies the fully-qualified path
|
|
to the host device to serve as the disk.
|
|
<span class="since">Since 0.0.3</span>
|
|
</dd>
|
|
<dt><code>dir</code></dt>
|
|
<dd>
|
|
The <code>dir</code> attribute specifies the fully-qualified path
|
|
to the directory to use as the disk.
|
|
<span class="since">Since 0.7.5</span>
|
|
</dd>
|
|
<dt><code>network</code></dt>
|
|
<dd>
|
|
The <code>protocol</code> attribute specifies the protocol used to
access the requested image. Possible values are "nbd",
"iscsi", "rbd", "sheepdog", "gluster", "vxhs", "http", "https",
"ftp", "ftps", or "tftp".
|
|
|
|
<p>For any <code>protocol</code> other than <code>nbd</code>
|
|
an additional attribute <code>name</code>
|
|
is mandatory to specify which volume/image will be used.
|
|
</p>
|
|
|
|
<p>For "nbd", the <code>name</code> attribute is optional. TLS
|
|
transport for NBD can be enabled by setting the <code>tls</code>
|
|
attribute to <code>yes</code>. For the QEMU hypervisor, usage of
|
|
a TLS environment can also be globally controlled on the host by
|
|
the <code>nbd_tls</code> and <code>nbd_tls_x509_cert_dir</code> settings in
|
|
/etc/libvirt/qemu.conf.
|
|
('tls' <span class="since">Since 4.5.0</span>)
|
|
</p>
|
|
|
|
<p>For protocols <code>http</code> and <code>https</code> an
|
|
optional attribute <code>query</code> specifies the query string.
|
|
(<span class="since">Since 6.2.0</span>)
|
|
</p>
|
|
|
|
<p>For "iscsi" (<span class="since">since 1.0.4</span>), the
|
|
<code>name</code> attribute may include a logical unit number,
|
|
separated from the target's name by a slash (e.g.,
|
|
<code>iqn.2013-07.com.example:iscsi-pool/1</code>). If not
|
|
specified, the default LUN is zero.
|
|
</p>
|
|
|
|
<p>For "vxhs" (<span class="since">since 3.8.0</span>), the
|
|
<code>name</code> is the UUID of the volume, assigned by the
|
|
HyperScale server. Additionally, an optional attribute
|
|
<code>tls</code> (QEMU only) can be used to control whether a
|
|
VxHS block device would utilize a hypervisor configured TLS
|
|
X.509 certificate environment in order to encrypt the data
|
|
channel. For the QEMU hypervisor, usage of a TLS environment can
|
|
also be globally controlled on the host by the
|
|
<code>vxhs_tls</code> and <code>vxhs_tls_x509_cert_dir</code> or
|
|
<code>default_tls_x509_cert_dir</code> settings in the file
|
|
/etc/libvirt/qemu.conf. If <code>vxhs_tls</code> is enabled,
|
|
then unless the domain <code>tls</code> attribute is set to "no",
|
|
libvirt will use the host configured TLS environment. If the
|
|
<code>tls</code> attribute is set to "yes", then regardless of
|
|
the qemu.conf setting, TLS authentication will be attempted.
|
|
</p>
|
|
<span class="since">Since 0.8.7</span>
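<p>
A rough sketch of an NBD source with TLS requested on the disk
itself. The <code>name</code> attribute is left out because it is
optional for "nbd". The host name is a placeholder, and a working
TLS environment on the host is assumed:
</p>
<pre>
<disk type='network' device='disk'>
  <driver name='qemu' type='raw'/>
  <source protocol='nbd' tls='yes'>
    <host name='example.com' port='10809'/>
  </source>
  <target dev='vda' bus='virtio'/>
</disk></pre>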
|
|
</dd>
|
|
<dt><code>volume</code></dt>
|
|
<dd>
|
|
The underlying disk source is represented by attributes
|
|
<code>pool</code> and <code>volume</code>. Attribute
|
|
<code>pool</code> specifies the name of the
|
|
<a href="formatstorage.html">storage pool</a> (managed
|
|
by libvirt) where the disk source resides. Attribute
|
|
<code>volume</code> specifies the name of storage volume (managed
|
|
by libvirt) used as the disk source. The value for the
|
|
<code>volume</code> attribute will be the output from the "Name"
|
|
column of a <code>virsh vol-list [pool-name]</code> command.
|
|
<p>
|
|
Use the attribute <code>mode</code>
|
|
(<span class="since">since 1.1.1</span>) to indicate how to
|
|
represent the LUN as the disk source. Valid values are
|
|
"direct" and "host". If <code>mode</code> is not specified,
|
|
the default is to use "host".
|
|
|
|
Using "direct" as the <code>mode</code> value indicates to use
|
|
the <a href="formatstorage.html">storage pool's</a>
|
|
<code>source</code> element <code>host</code> attribute as
|
|
the disk source to generate the libiscsi URI (e.g.
|
|
'file=iscsi://example.com:3260/iqn.2013-07.com.example:iscsi-pool/1').
|
|
|
|
Using "host" as the <code>mode</code> value indicates to use the
|
|
LUN's path as it shows up on host (e.g.
|
|
'file=/dev/disk/by-path/ip-example.com:3260-iscsi-iqn.2013-07.com.example:iscsi-pool-lun-1').
|
|
|
|
Using a LUN from an iSCSI source pool provides the same
|
|
features as a <code>disk</code> configured using
|
|
<code>type</code> 'block' or 'network' and <code>device</code>
|
|
of 'lun' with respect to how the LUN is presented to and
|
|
may be used by the guest.
|
|
|
|
<span class="since">Since 1.0.5</span>
|
|
</p>
|
|
</dd>
|
|
<dt><code>nvme</code></dt>
|
|
<dd>
|
|
To specify the disk source for an NVMe disk, the <code>source</code>
element has the following attributes:
|
|
<dl>
|
|
<dt><code>type</code></dt>
|
|
<dd>The type of address specified in the <code>address</code>
sub-element. Currently, only the value <code>pci</code> is
accepted.
|
|
</dd>
|
|
|
|
<dt><code>managed</code></dt>
|
|
<dd>This attribute instructs libvirt to detach the NVMe
controller automatically on domain startup (<code>yes</code>)
or to expect the controller to be detached by the system
administrator (<code>no</code>).
|
|
</dd>
|
|
|
|
<dt><code>namespace</code></dt>
|
|
<dd>The namespace ID which should be assigned to the domain.
According to the NVMe standard, namespace IDs start at 1
(inclusive).
|
|
</dd>
|
|
</dl>
|
|
|
|
The difference between <code><disk type='nvme'></code>
and <code><hostdev/></code> is that the latter is plain
host device assignment with all its limitations (e.g. no live
migration), while the former makes the hypervisor run the NVMe
disk through the hypervisor's block layer, thus enabling all
features provided by the layer (e.g. snapshots, domain
migration, etc.). Moreover, since the NVMe disk is unbound
from its PCI driver, the host kernel storage stack is not
involved (compared to passing, say, <code>/dev/nvme0n1</code> via
<code><disk type='block'></code>) and therefore lower
latencies can be achieved.
|
|
</dd>
|
|
</dl>
|
|
With "file", "block", and "volume", one or more optional
|
|
sub-elements <code>seclabel</code>, <a href="#seclabel">described
|
|
below</a> (and <span class="since">since 0.9.9</span>), can be
|
|
used to override the domain security labeling policy for just
|
|
that source file. (NB, for "volume" type disk, <code>seclabel</code>
|
|
is only valid when the specified storage volume is of 'file' or
|
|
'block' type).
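<p>
For illustration, a per-source override that asks libvirt not to
relabel one image might look like the sketch here. The
<code>relabel</code> attribute belongs to the <code>seclabel</code>
element described in its own section, and the file path is a
placeholder:
</p>
<pre>
<source file='/var/lib/libvirt/images/domain.qcow'>
  <seclabel relabel='no'/>
</source></pre>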
|
|
<p>
|
|
The <code>source</code> element may also have the <code>index</code>
attribute with the same semantics as the <a href='#elementsDiskBackingStoreIndex'>
<code>index</code></a> attribute of <code>backingStore</code>.
|
|
</p>
|
|
<p>
|
|
The <code>source</code> element may contain the following sub elements:
|
|
</p>
|
|
|
|
<dl>
|
|
<dt><code>host</code></dt>
|
|
<dd>
|
|
<p>
|
|
When the disk <code>type</code> is "network", the <code>source</code>
|
|
may have zero or more <code>host</code> sub-elements used to
|
|
specify the hosts to connect to.
|
|
|
|
The <code>host</code> element supports 4 attributes, viz. "name",
|
|
"port", "transport" and "socket", which specify the hostname,
|
|
the port number, transport type and path to socket, respectively.
|
|
The meaning of this element and the number of the elements depend
|
|
on the protocol attribute.
|
|
</p>
|
|
<table class="top_table">
|
|
<tr>
|
|
<th> Protocol </th>
|
|
<th> Meaning </th>
|
|
<th> Number of hosts </th>
|
|
<th> Default port </th>
|
|
</tr>
|
|
<tr>
|
|
<td> nbd </td>
|
|
<td> a server running nbd-server </td>
|
|
<td> only one </td>
|
|
<td> 10809 </td>
|
|
</tr>
|
|
<tr>
|
|
<td> iscsi </td>
|
|
<td> an iSCSI server </td>
|
|
<td> only one </td>
|
|
<td> 3260 </td>
|
|
</tr>
|
|
<tr>
|
|
<td> rbd </td>
|
|
<td> monitor servers of RBD </td>
|
|
<td> one or more </td>
|
|
<td> librados default </td>
|
|
</tr>
|
|
<tr>
|
|
<td> sheepdog </td>
|
|
<td> one of the sheepdog servers (default is localhost:7000) </td>
|
|
<td> zero or one </td>
|
|
<td> 7000 </td>
|
|
</tr>
|
|
<tr>
|
|
<td> gluster </td>
|
|
<td> a server running glusterd daemon </td>
|
|
<td> one or more (<span class="since">Since 2.1.0</span>), just one prior to that </td>
|
|
<td> 24007 </td>
|
|
</tr>
|
|
<tr>
|
|
<td> vxhs </td>
|
|
<td> a server running Veritas HyperScale daemon </td>
|
|
<td> only one </td>
|
|
<td> 9999 </td>
|
|
</tr>
|
|
</table>
|
|
<p>
|
|
gluster supports "tcp", "rdma", "unix" as valid values for the
|
|
transport attribute. nbd supports "tcp" and "unix". Others only
|
|
support "tcp". If nothing is specified, "tcp" is assumed. If the
|
|
transport is "unix", the socket attribute specifies the path to an
|
|
AF_UNIX socket.
|
|
</p>
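<p>
Two hedged sketches of the <code>host</code> rules in the table: an
NBD source reached over an AF_UNIX socket, and a gluster source
listing more than one server. The socket path, volume name and host
names are all placeholders:
</p>
<pre>
<source protocol='nbd'>
  <host transport='unix' socket='/var/run/nbd.sock'/>
</source>

<source protocol='gluster' name='Volume1/Image'>
  <host name='gluster1.example.com' port='24007'/>
  <host name='gluster2.example.com' port='24007'/>
</source></pre>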
|
|
</dd>
|
|
<dt><code>snapshot</code></dt>
|
|
<dd>
|
|
The <code>name</code> attribute of <code>snapshot</code> element can
|
|
optionally specify an internal snapshot name to be used as the
|
|
source for storage protocols.
|
|
Supported for 'rbd' <span class="since">since 1.2.11 (QEMU only).</span>
|
|
</dd>
|
|
<dt><code>config</code></dt>
|
|
<dd>
|
|
The <code>file</code> attribute for the <code>config</code> element
|
|
provides a fully qualified path to a configuration file to be
|
|
provided as a parameter to the client of a networked storage
|
|
protocol. Supported for 'rbd' <span class="since">since 1.2.11
|
|
(QEMU only).</span>
|
|
</dd>
|
|
<dt><code>auth</code></dt>
|
|
<dd><span class="since">Since libvirt 3.9.0</span>, the
|
|
<code>auth</code> element is supported for a disk
|
|
<code>type</code> "network" that is using a <code>source</code>
|
|
element with the <code>protocol</code> attribute "rbd" or "iscsi".
|
|
If present, the <code>auth</code> element provides the
|
|
authentication credentials needed to access the source. It
|
|
includes a mandatory attribute <code>username</code>, which
|
|
identifies the username to use during authentication, as well
|
|
as a sub-element <code>secret</code> with mandatory
|
|
attribute <code>type</code>, to tie back to
|
|
a <a href="formatsecret.html">libvirt secret object</a> that
|
|
holds the actual password or other credentials (the domain XML
|
|
intentionally does not expose the password, only the reference
|
|
to the object that does manage the password).
|
|
Known secret types are "ceph" for Ceph RBD network sources and
|
|
"iscsi" for CHAP authentication of iSCSI targets.
|
|
Both will require either a <code>uuid</code> attribute
|
|
with the UUID of the secret object or a <code>usage</code>
|
|
attribute matching the key that was specified in the
|
|
secret object.
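<p>
A sketch of an RBD source that combines the <code>snapshot</code>,
<code>config</code> and <code>auth</code> sub-elements. The image
name, monitor host, snapshot name, configuration path, user name and
secret usage key are illustrative assumptions, and the referenced
secret object is expected to exist already:
</p>
<pre>
<disk type='network' device='disk'>
  <driver name='qemu' type='raw'/>
  <source protocol='rbd' name='pool/image'>
    <host name='mon1.example.com'/>
    <snapshot name='snapname'/>
    <config file='/path/to/file'/>
    <auth username='myuser'>
      <secret type='ceph' usage='mypassid'/>
    </auth>
  </source>
  <target dev='vdf' bus='virtio'/>
</disk></pre>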
|
|
</dd>
|
|
<dt><code>encryption</code></dt>
|
|
<dd><span class="since">Since libvirt 3.9.0</span>, the
|
|
<code>encryption</code> can be a sub-element of the
|
|
<code>source</code> element for encrypted storage sources.
|
|
If present, it specifies how the storage source is encrypted.
|
|
See the
|
|
<a href="formatstorageencryption.html">Storage Encryption</a>
|
|
page for more information.
|
|
<p/>
|
|
Note that the 'qcow' format of encryption is broken and thus is no
|
|
longer supported for use with disk images.
|
|
(<span class="since">Since libvirt 4.5.0</span>)
|
|
</dd>
|
|
<dt><code>reservations</code></dt>
|
|
<dd><span class="since">Since libvirt 4.4.0</span>, the
|
|
<code>reservations</code> can be a sub-element of the
|
|
<code>source</code> element for storage sources (QEMU driver only).
|
|
If present it enables persistent reservations for SCSI
|
|
based disks. The element has one mandatory attribute
|
|
<code>managed</code> with accepted values <code>yes</code> and
|
|
<code>no</code>. If <code>managed</code> is enabled libvirt prepares
|
|
and manages any resources needed. When the persistent reservations
|
|
are unmanaged, then the hypervisor acts as a client and the path to
|
|
the server socket must be provided in the child element
|
|
<code>source</code>, which currently accepts only the following
|
|
attributes:
|
|
<code>type</code> with one value <code>unix</code>,
|
|
<code>path</code> path to the socket, and
|
|
finally <code>mode</code> which accepts one value
|
|
<code>client</code> specifying the role of hypervisor.
|
|
It is recommended to let libvirt manage the persistent
|
|
reservations.
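<p>
The managed variant needs no child <code>source</code> element. A
minimal sketch, with the device path and target name chosen only for
illustration:
</p>
<pre>
<disk type='block' device='lun'>
  <driver name='qemu' type='raw'/>
  <source dev='/dev/sda'>
    <reservations managed='yes'/>
  </source>
  <target dev='sda' bus='scsi'/>
</disk></pre>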
|
|
</dd>
|
|
<dt><code>initiator</code></dt>
|
|
<dd><span class="since">Since libvirt 4.7.0</span>, the
|
|
<code>initiator</code> element is supported for a disk
|
|
<code>type</code> "network" that is using a <code>source</code>
|
|
element with the <code>protocol</code> attribute "iscsi".
|
|
If present, the <code>initiator</code> element provides the
|
|
initiator IQN needed to access the source via mandatory
|
|
attribute <code>name</code>.
|
|
</dd>
|
|
<dt><code>address</code></dt>
|
|
<dd>For disk of type <code>nvme</code> this element
|
|
specifies the PCI address of the host NVMe
|
|
controller.
|
|
<span class="since">Since 6.0.0</span>
|
|
</dd>
|
|
<dt><code>slices</code></dt>
|
|
<dd>The <code>slices</code> element using its <code>slice</code>
|
|
sub-elements allows configuring offset and size of either the
|
|
location of the image format (<code>slice type='storage'</code>)
|
|
inside the storage source or the guest data inside the image format
|
|
container (future expansion).
|
|
|
|
The <code>offset</code> and <code>size</code> values are in bytes.
|
|
<span class="since">Since 6.1.0</span>
|
|
</dd>
|
|
<dt><code>ssl</code></dt>
|
|
<dd>
|
|
For <code>https</code> and <code>ftps</code> accessed storage it's
|
|
possible to tweak the SSL transport parameters with this element.
|
|
The <code>verify</code> attribute turns SSL
certificate validation on or off. Supported values are <code>yes</code> and
|
|
<code>no</code>. <span class="since">Since 6.2.0</span>
|
|
</dd>
|
|
<dt><code>cookies</code></dt>
|
|
<dd>
|
|
For <code>http</code> and <code>https</code> accessed storage it's
|
|
possible to pass one or more cookies. The cookie name and value
|
|
must conform to the HTTP specification.
|
|
<span class="since">Since 6.2.0</span>
|
|
</dd>
|
|
<dt><code>readahead</code></dt>
|
|
<dd>
|
|
Specifies the size of the readahead buffer for protocols
which support it (all 'curl' based drivers in QEMU). The size
is in bytes. Note that '0' is treated as if the value were not
provided.
|
|
<span class="since">Since 6.2.0</span>
|
|
</dd>
|
|
<dt><code>timeout</code></dt>
|
|
<dd>
|
|
Specifies the connection timeout in seconds for protocols which support it.
Note that '0' is treated as if the value were not provided.
|
|
<span class="since">Since 6.2.0</span>
|
|
</dd>
|
|
</dl>
|
|
|
|
<p>
|
|
For a "file" or "volume" disk type which represents a cdrom or floppy
|
|
(the <code>device</code> attribute), it is possible to define a
policy for what to do with the disk if the source file is not accessible.
|
|
(NB, <code>startupPolicy</code> is not valid for "volume" disk unless
|
|
the specified storage volume is of "file" type). This is done by the
|
|
<code>startupPolicy</code> attribute
|
|
(<span class="since">since 0.9.7</span>),
|
|
accepting these values:
|
|
</p>
|
|
<table class="top_table">
|
|
<tr>
|
|
<td> mandatory </td>
|
|
<td> fail if missing for any reason (the default) </td>
|
|
</tr>
|
|
<tr>
|
|
<td> requisite </td>
|
|
<td> fail if missing on boot up,
|
|
drop if missing on migrate/restore/revert </td>
|
|
</tr>
|
|
<tr>
|
|
<td> optional </td>
|
|
<td> drop if missing at any start attempt </td>
|
|
</tr>
|
|
</table>
|
|
<p>
|
|
<span class="since">Since 1.1.2</span> the <code>startupPolicy</code>
|
|
is extended to support hard disks besides cdrom and floppy. On guest
|
|
cold bootup, if a certain disk is not accessible or its disk chain is
|
|
broken, with startupPolicy 'optional' the guest will drop this disk.
|
|
This feature doesn't support migration currently.
|
|
</p>
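<p>
As a sketch, a CD-ROM whose image may be absent could be marked
optional so that the guest still starts without it. The ISO path and
target name are placeholders:
</p>
<pre>
<disk type='file' device='cdrom'>
  <driver name='qemu' type='raw'/>
  <source file='/var/lib/libvirt/images/install.iso' startupPolicy='optional'/>
  <target dev='hdc' bus='ide'/>
  <readonly/>
</disk></pre>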
|
|
</dd>
|
|
<dt><code>backingStore</code></dt>
|
|
<dd>
|
|
This element describes the backing store used by the disk
|
|
specified by sibling <code>source</code> element.
|
|
<span class="since">Since 1.2.4.</span>
|
|
|
|
If the hypervisor driver does not support the
|
|
<a href='formatdomaincaps.html#featureBackingStoreInput'>
|
|
<code>backingStoreInput</code></a>
|
|
(<span class='since'>Since 5.10.0</span>)
|
|
domain feature the <code>backingStore</code> is ignored on
|
|
input and only used for output to describe the detected
|
|
backing chains of running domains.
|
|
|
|
If <code>backingStoreInput</code> is supported
|
|
the <code>backingStore</code> is used as the backing image of
|
|
<code>source</code> or other <code>backingStore</code> overriding
|
|
any backing image information recorded in the image metadata.
|
|
|
|
An empty <code>backingStore</code> element means the sibling
|
|
source is self-contained and is not based on any backing store.
|
|
|
|
For the detected backing chain information to be accurate, the
|
|
backing format must be correctly specified in the metadata of
|
|
each file of the chain (files created by libvirt satisfy this
|
|
property, but using existing external files for snapshot or
|
|
block copy operations requires the end user to pre-create the
|
|
file correctly). The following attributes are
|
|
supported in <code>backingStore</code>:
|
|
<dl>
|
|
<dt><code>type</code></dt>
|
|
<dd>
|
|
The <code>type</code> attribute represents the type of disk used
|
|
by the backing store, see disk type attribute above for more
|
|
details and possible values.
|
|
</dd>
|
|
<dt><code><a id="elementsDiskBackingStoreIndex">index</a></code></dt>
|
|
<dd>
|
|
This attribute is only valid in output (and ignored on input) and
|
|
it can be used to refer to a specific part of the disk chain when
|
|
doing block operations (such as via the
|
|
<code>virDomainBlockRebase</code> API). For example,
|
|
<code>vda[2]</code> refers to the backing store with
|
|
<code>index='2'</code> of the disk with <code>vda</code> target.
|
|
</dd>
|
|
</dl>
|
|
Moreover, <code>backingStore</code> supports the following sub-elements:
|
|
<dl>
|
|
<dt><code>format</code></dt>
|
|
<dd>
|
|
The <code>format</code> element contains <code>type</code>
|
|
attribute which specifies the internal format of the backing
|
|
store, such as <code>raw</code> or <code>qcow2</code>.
|
|
</dd>
|
|
<dt><code>source</code></dt>
|
|
<dd>
|
|
This element has the same structure as the <code>source</code>
|
|
element in <code>disk</code>. It specifies which file, device,
|
|
or network location contains the data of the described backing
|
|
store.
|
|
</dd>
|
|
<dt><code>backingStore</code></dt>
|
|
<dd>
|
|
If the backing store is not self-contained, the next element
|
|
in the chain is described by nested <code>backingStore</code>
|
|
element.
|
|
</dd>
|
|
</dl>
|
|
</dd>
|
|
<dt><code>mirror</code></dt>
|
|
<dd>
|
|
This element is present if the hypervisor has started a
|
|
long-running block job operation, where the mirror location in
|
|
the <code>source</code> sub-element will eventually have the
|
|
same contents as the source, and with the file format in the
|
|
sub-element <code>format</code> (which might differ from the
|
|
format of the source). The details of the <code>source</code>
|
|
sub-element are determined by the <code>type</code> attribute
|
|
of the mirror, similar to what is done for the
|
|
overall <code>disk</code> device element. The <code>job</code>
|
|
attribute mentions which API started the operation ("copy" for
|
|
the <code>virDomainBlockRebase</code> API, or "active-commit"
|
|
for the <code>virDomainBlockCommit</code>
|
|
API), <span class="since">since 1.2.7</span>. The
|
|
attribute <code>ready</code>, if present, tracks progress of
|
|
the job: <code>yes</code> if the disk is known to be ready to
|
|
pivot, or, <span class="since">since
|
|
1.2.7</span>, <code>abort</code> or <code>pivot</code> if the
|
|
job is in the process of completing. If <code>ready</code> is
|
|
not present, the disk is probably still
|
|
copying. For now, this element is only valid in output; it is
|
|
ignored on input. The <code>source</code> sub-element exists
|
|
for all two-phase jobs <span class="since">since 1.2.6</span>.
|
|
Older libvirt supported only block copy to a
|
|
file, <span class="since">since 0.9.12</span>; for
|
|
compatibility with older clients, such jobs include redundant
|
|
information in the attributes <code>file</code>
|
|
and <code>format</code> in the <code>mirror</code> element.
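<p>
Since <code>mirror</code> is output only, the fragment here is a
sketch of what a domain dump might contain while a block copy is
ready to pivot, not something to feed back as input. The paths are
placeholders, and real output may carry further attributes (for
example the compatibility attributes mentioned above):
</p>
<pre>
<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2'/>
  <source file='/var/lib/libvirt/images/domain.qcow'/>
  <mirror type='file' job='copy' ready='yes'>
    <format type='qcow2'/>
    <source file='/var/lib/libvirt/images/copy.qcow'/>
  </mirror>
  <target dev='vda' bus='virtio'/>
</disk></pre>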
|
|
</dd>
|
|
<dt><code>target</code></dt>
|
|
<dd>The <code>target</code> element controls the bus / device
|
|
under which the disk is exposed to the guest
|
|
OS. The <code>dev</code> attribute indicates the "logical"
|
|
device name. The actual device name specified is not
|
|
guaranteed to map to the device name in the guest OS. Treat it
|
|
as a device ordering hint. The optional <code>bus</code>
|
|
attribute specifies the type of disk device to emulate;
|
|
possible values are driver specific, with typical values being
|
|
"ide", "scsi", "virtio", "xen", "usb", "sata", or
|
|
"sd" <span class="since">"sd" since 1.1.2</span>. If omitted, the bus
|
|
type is inferred from the style of the device name (e.g. a device named
|
|
'sda' will typically be exported using a SCSI bus). The optional
|
|
attribute <code>tray</code> indicates the tray status of
removable disks (i.e. CDROM or floppy disk); the value can be either
"open" or "closed", defaulting to "closed". NB, the value of
<code>tray</code> can be updated while the domain is running.
|
|
The optional attribute <code>removable</code> sets the
|
|
removable flag for USB disks, and its value can be either "on"
|
|
or "off", defaulting to "off". <span class="since">Since
|
|
0.0.3; <code>bus</code> attribute since 0.4.3;
|
|
<code>tray</code> attribute since 0.9.11; "usb" attribute value
after 0.4.4; "sata" attribute value since 0.9.7; "removable" attribute
|
|
value since 1.1.3</span>
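<p>
A short sketch of a USB disk flagged as removable. The image path
and device name are assumptions made for the example:
</p>
<pre>
<disk type='file' device='disk'>
  <driver name='qemu' type='raw'/>
  <source file='/var/lib/libvirt/images/usb.img'/>
  <target dev='sdc' bus='usb' removable='on'/>
</disk></pre>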
|
|
</dd>
|
|
<dt><code>iotune</code></dt>
|
|
<dd>The optional <code>iotune</code> element provides
additional per-device I/O tuning, with
|
|
values that can vary for each device (contrast this to
|
|
the <a href="#elementsBlockTuning"><code><blkiotune></code></a>
|
|
element, which applies globally to the domain). Currently,
|
|
the only tuning available is Block I/O throttling for qemu.
|
|
This element has optional sub-elements; any sub-element not
|
|
specified or given with a value of 0 implies no
|
|
limit. <span class="since">Since 0.9.8</span>
|
|
<dl>
|
|
<dt><code>total_bytes_sec</code></dt>
|
|
<dd>The optional <code>total_bytes_sec</code> element is the
|
|
total throughput limit in bytes per second. This cannot
|
|
appear with <code>read_bytes_sec</code>
|
|
or <code>write_bytes_sec</code>.</dd>
|
|
<dt><code>read_bytes_sec</code></dt>
|
|
<dd>The optional <code>read_bytes_sec</code> element is the
|
|
read throughput limit in bytes per second.</dd>
|
|
<dt><code>write_bytes_sec</code></dt>
|
|
<dd>The optional <code>write_bytes_sec</code> element is the
|
|
write throughput limit in bytes per second.</dd>
|
|
<dt><code>total_iops_sec</code></dt>
|
|
<dd>The optional <code>total_iops_sec</code> element is the
|
|
total I/O operations per second. This cannot
|
|
appear with <code>read_iops_sec</code>
|
|
or <code>write_iops_sec</code>.</dd>
|
|
<dt><code>read_iops_sec</code></dt>
|
|
<dd>The optional <code>read_iops_sec</code> element is the
|
|
read I/O operations per second.</dd>
|
|
<dt><code>write_iops_sec</code></dt>
|
|
<dd>The optional <code>write_iops_sec</code> element is the
|
|
write I/O operations per second.</dd>
|
|
<dt><code>total_bytes_sec_max</code></dt>
|
|
<dd>The optional <code>total_bytes_sec_max</code> element is the
|
|
maximum total throughput limit in bytes per second. This cannot
|
|
appear with <code>read_bytes_sec_max</code>
|
|
or <code>write_bytes_sec_max</code>.</dd>
|
|
<dt><code>read_bytes_sec_max</code></dt>
|
|
<dd>The optional <code>read_bytes_sec_max</code> element is the
|
|
maximum read throughput limit in bytes per second.</dd>
|
|
<dt><code>write_bytes_sec_max</code></dt>
|
|
<dd>The optional <code>write_bytes_sec_max</code> element is the
|
|
maximum write throughput limit in bytes per second.</dd>
|
|
<dt><code>total_iops_sec_max</code></dt>
|
|
<dd>The optional <code>total_iops_sec_max</code> element is the
|
|
maximum total I/O operations per second. This cannot
|
|
appear with <code>read_iops_sec_max</code>
|
|
or <code>write_iops_sec_max</code>.</dd>
|
|
<dt><code>read_iops_sec_max</code></dt>
|
|
<dd>The optional <code>read_iops_sec_max</code> element is the
|
|
maximum read I/O operations per second.</dd>
|
|
<dt><code>write_iops_sec_max</code></dt>
|
|
<dd>The optional <code>write_iops_sec_max</code> element is the
|
|
maximum write I/O operations per second.</dd>
|
|
<dt><code>size_iops_sec</code></dt>
|
|
<dd>The optional <code>size_iops_sec</code> element is the
|
|
size of I/O operations per second.
|
|
<p>
|
|
<span class="since">Throughput limits since 1.2.11 and QEMU 1.7</span>
|
|
</p>
|
|
</dd>
|
|
<dt><code>group_name</code></dt>
|
|
<dd>The optional <code>group_name</code> provides the capability
|
|
to share I/O throttling quota between multiple drives. This
|
|
prevents end-users from circumventing a hosting provider's
|
|
throttling policy by splitting one large drive into N small drives
|
|
and getting N times the normal throttling quota. Any name may
|
|
be used.
|
|
<p>
|
|
<span class="since">group_name since 3.0.0 and QEMU 2.4</span>
|
|
</p>
|
|
</dd>
|
|
<dt><code>total_bytes_sec_max_length</code></dt>
|
|
<dd>The optional <code>total_bytes_sec_max_length</code>
|
|
element is the maximum duration in seconds for the
|
|
<code>total_bytes_sec_max</code> burst period. Only valid
|
|
when the <code>total_bytes_sec_max</code> is set.</dd>
|
|
<dt><code>read_bytes_sec_max_length</code></dt>
|
|
<dd>The optional <code>read_bytes_sec_max_length</code>
|
|
element is the maximum duration in seconds for the
|
|
<code>read_bytes_sec_max</code> burst period. Only valid
|
|
when the <code>read_bytes_sec_max</code> is set.</dd>
|
|
<dt><code>write_bytes_sec_max_length</code></dt>
|
|
<dd>The optional <code>write_bytes_sec_max_length</code>
|
|
element is the maximum duration in seconds for the
|
|
<code>write_bytes_sec_max</code> burst period. Only valid
|
|
when the <code>write_bytes_sec_max</code> is set.</dd>
|
|
<dt><code>total_iops_sec_max_length</code></dt>
|
|
<dd>The optional <code>total_iops_sec_max_length</code>
|
|
element is the maximum duration in seconds for the
|
|
<code>total_iops_sec_max</code> burst period. Only valid
|
|
when the <code>total_iops_sec_max</code> is set.</dd>
|
|
<dt><code>read_iops_sec_max_length</code></dt>
|
|
<dd>The optional <code>read_iops_sec_max_length</code>
|
|
element is the maximum duration in seconds for the
|
|
<code>read_iops_sec_max</code> burst period. Only valid
|
|
when the <code>read_iops_sec_max</code> is set.</dd>
|
|
<dt><code>write_iops_sec_max_length</code></dt>
|
|
<dd>The optional <code>write_iops_sec_max_length</code>
|
|
element is the maximum duration in seconds for the
|
|
<code>write_iops_sec_max</code> burst period. Only valid
|
|
when the <code>write_iops_sec_max</code> is set.
|
|
<p>
|
|
<span class="since">Throughput length since 2.4.0 and QEMU 2.6</span>
|
|
</p>
|
|
</dd>
|
|
</dl>
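<p>
As an illustrative sketch only, the throttling elements described above
could be combined as follows. The image path, target name, group name and
all numeric limits are assumptions chosen for the example, not recommended
values.
</p>
<pre>
...
<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2'/>
  <source file='/var/lib/libvirt/images/guest.qcow2'/>
  <target dev='vda' bus='virtio'/>
  <iotune>
    <!-- sustained limit on total I/O operations per second -->
    <total_iops_sec>1000</total_iops_sec>
    <!-- burst ceiling, permitted for at most 10 seconds -->
    <total_iops_sec_max>2000</total_iops_sec_max>
    <total_iops_sec_max_length>10</total_iops_sec_max_length>
    <!-- separate read and write throughput limits, in bytes per second -->
    <read_bytes_sec>52428800</read_bytes_sec>
    <write_bytes_sec>26214400</write_bytes_sec>
    <!-- drives that name the same group share one quota -->
    <group_name>tenant-a</group_name>
  </iotune>
</disk>
...</pre>
<p>
Note that the example uses <code>total_iops_sec</code> on its own and
<code>read_bytes_sec</code> together with <code>write_bytes_sec</code>,
since a total limit cannot appear alongside the matching read or write limit.
</p>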
|
|
</dd>
|
|
<dt><code>driver</code></dt>
|
|
<dd>
|
|
The optional driver element allows specifying further details
|
|
related to the hypervisor driver used to provide the disk.
|
|
<span class="since">Since 0.1.8</span>
|
|
<ul>
|
|
<li>
|
|
If the hypervisor supports multiple backend drivers, then
|
|
the <code>name</code> attribute selects the primary
|
|
backend driver name, while the optional <code>type</code>
|
|
attribute provides the sub-type. For example, xen
|
|
supports a name of "tap", "tap2", "phy", or "file", with a
|
|
type of "aio", while qemu only supports a name of "qemu",
|
|
but multiple types including "raw", "bochs", "qcow2", and
|
|
"qed".
|
|
</li>
|
|
<li>
|
|
The optional <code>cache</code> attribute controls the
|
|
cache mechanism; possible values are "default", "none",
|
|
"writethrough", "writeback", "directsync" (like
|
|
"writethrough", but it bypasses the host page cache) and
|
|
"unsafe" (host may cache all disk I/O, and sync requests from
|
|
guest are ignored).
|
|
<span class="since">
|
|
Since 0.6.0,
|
|
"directsync" since 0.9.5,
|
|
"unsafe" since 0.9.7
|
|
</span>
|
|
</li>
|
|
<li>
|
|
The optional <code>error_policy</code> attribute controls
|
|
how the hypervisor will behave on a disk read or write
|
|
error; possible values are "stop", "report", "ignore", and
|
|
"enospace". <span class="since">Since 0.8.0, "report" since
|
|
0.9.7</span> The default is left to the discretion of the
|
|
hypervisor. There is also an
|
|
optional <code>rerror_policy</code> that controls behavior
|
|
for read errors only. <span class="since">Since
|
|
0.9.7</span> If no <code>rerror_policy</code> is given, <code>error_policy</code>
|
|
is used for both read and write errors. If rerror_policy
|
|
is given, it overrides the <code>error_policy</code> for
|
|
read errors. Also note that "enospace" is not a valid
|
|
policy for read errors, so if <code>error_policy</code> is
|
|
set to "enospace" and no <code>rerror_policy</code> is
|
|
given, the read error policy will be left at its default.
|
|
</li>
|
|
<li>
|
|
The optional <code>io</code> attribute controls specific
|
|
policies on I/O; qemu guests support "threads" and
|
|
"native" <span class="since">Since 0.8.8</span>, and "io_uring"
|
|
<span class="since">Since 6.3.0 (QEMU 5.0)</span>.
|
|
</li>
|
|
<li>
|
|
The optional <code>ioeventfd</code> attribute allows users to
|
|
set <a href='https://patchwork.kernel.org/patch/43390/'>
|
|
domain I/O asynchronous handling</a> for the disk device.
|
|
The default is left to the discretion of the hypervisor.
|
|
Accepted values are "on" and "off". Enabling this allows
|
|
QEMU to execute the VM while a separate thread handles I/O.
|
|
Typically guests experiencing high system CPU utilization
|
|
during I/O will benefit from this. On the other hand,
|
|
on an overloaded host it could increase guest I/O latency.
|
|
<span class="since">Since 0.9.3 (QEMU and KVM only)</span>
|
|
<b>In general you should leave this option alone, unless you
|
|
are very certain you know what you are doing.</b>
|
|
</li>
|
|
<li>
|
|
The optional <code>event_idx</code> attribute controls
|
|
some aspects of device event processing. The value can be
|
|
either 'on' or 'off' - if it is on, it will reduce the
|
|
number of interrupts and exits for the guest. The default
|
|
is determined by QEMU; usually if the feature is
|
|
supported, default is on. In case there is a situation
|
|
where this behavior is suboptimal, this attribute provides
|
|
a way to force the feature off.
|
|
<span class="since">Since 0.9.5 (QEMU and KVM only)</span>
|
|
<b>In general you should leave this option alone, unless you
|
|
are very certain you know what you are doing.</b>
|
|
</li>
|
|
<li>
|
|
The optional <code>copy_on_read</code> attribute controls
|
|
whether data read from the backing file is copied into the image file. The
|
|
value can be either "on" or "off".
|
|
Copy-on-read avoids accessing the same backing file sectors
|
|
repeatedly and is useful when the backing file is over a slow
|
|
network. By default copy-on-read is off.
|
|
<span class='since'>Since 0.9.10 (QEMU and KVM only)</span>
|
|
</li>
|
|
<li>
|
|
The optional <code>discard</code> attribute controls whether
|
|
discard requests (also known as "trim" or "unmap") are
|
|
ignored or passed to the filesystem. The value can be either
|
|
"unmap" (allow the discard request to be passed) or "ignore"
|
|
(ignore the discard request).
|
|
<span class='since'>Since 1.0.6 (QEMU and KVM only)</span>
|
|
</li>
|
|
<li>
|
|
The optional <code>detect_zeroes</code> attribute controls whether
|
|
to detect zero write requests. The value can be "off", "on" or
|
|
"unmap". The first two values turn the detection off and on,
|
|
respectively. The third value ("unmap") turns the detection on
|
|
and additionally tries to discard such areas from the image based
|
|
on the value of <code>discard</code> above (it will act as "on"
|
|
if <code>discard</code> is set to "ignore"). Note that enabling the
detection is a compute-intensive operation, but can save file
|
|
space and/or time on slow media.
|
|
<span class='since'>Since 2.0.0</span>
|
|
</li>
|
|
<li>
|
|
The optional <code>iothread</code> attribute assigns the
|
|
disk to an IOThread as defined by the range for the domain
|
|
<a href="#elementsIOThreadsAllocation"><code>iothreads</code></a>
|
|
value. Multiple disks may be assigned to the same IOThread. IOThreads
are numbered from 1 to the domain iothreads value. Available
|
|
for a disk device <code>target</code> configured to use "virtio"
|
|
<code>bus</code> and "pci" or "ccw" <code>address</code> types.
|
|
<span class='since'>Since 1.2.8 (QEMU 2.1)</span>
|
|
</li>
|
|
<li>
|
|
The optional <code>queues</code> attribute specifies the number of
|
|
virtqueues for virtio-blk. (<span class="since">Since 3.9.0</span>)
|
|
</li>
|
|
<li>
|
|
For virtio disks,
|
|
<a href="#elementsVirtio">Virtio-specific options</a> can also be
|
|
set. (<span class="since">Since 3.5.0</span>)
|
|
</li>
|
|
</ul>
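<p>
The following is a minimal sketch of a <code>driver</code> element that
sets several of the attributes listed above at once. The particular
combination is an assumption made for illustration; whether each attribute
is accepted depends on the hypervisor and its version.
</p>
<pre>
...
<disk type='file' device='disk'>
  <!-- bypass the host page cache and use native asynchronous I/O;
       pause the guest on write errors but report read errors to it;
       pass discard requests through and turn zero writes into discards -->
  <driver name='qemu' type='qcow2' cache='none' io='native'
          error_policy='stop' rerror_policy='report'
          discard='unmap' detect_zeroes='unmap'/>
  <source file='/var/lib/libvirt/images/guest.qcow2'/>
  <target dev='vda' bus='virtio'/>
</disk>
...</pre>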
|
|
</dd>
|
|
<dt><code>backenddomain</code></dt>
|
|
<dd>The optional <code>backenddomain</code> element allows specifying a
|
|
backend domain (aka driver domain) hosting the disk. Use the
|
|
<code>name</code> attribute to specify the backend domain name.
|
|
<span class="since">Since 1.2.13 (Xen only)</span>
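<p>
A hedged sketch of a Xen disk served by a driver domain is shown here. The
backend domain name <code>diskvm</code> and the device paths are
hypothetical.
</p>
<pre>
...
<disk type='block' device='disk'>
  <driver name='phy'/>
  <source dev='/dev/vg/guest-root'/>
  <!-- name of the domain that hosts the disk backend -->
  <backenddomain name='diskvm'/>
  <target dev='xvda' bus='xen'/>
</disk>
...</pre>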
|
|
</dd>
|
|
<dt><code>boot</code></dt>
|
|
<dd>Specifies that the disk is bootable. The <code>order</code>
|
|
attribute determines the order in which devices will be tried during
|
|
the boot sequence. On the S390 architecture only the first boot device is
|
|
used. The optional <code>loadparm</code> attribute is an 8 character
|
|
string which can be queried by guests on S390 via sclp or diag 308.
|
|
Linux guests on S390 can use <code>loadparm</code> to select a boot
|
|
entry. <span class="since">Since 3.5.0</span>
|
|
The per-device <code>boot</code> elements cannot be used together
|
|
with general boot elements in
|
|
the <a href="#elementsOSBIOS">BIOS bootloader</a> section.
|
|
<span class="since">Since 0.8.8</span>
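<p>
For illustration, per-device boot ordering might be written as below. The
file paths and target names are assumptions, and the
<code>loadparm</code> value is only meaningful to S390 guests.
</p>
<pre>
...
<disk type='file' device='disk'>
  <source file='/var/lib/libvirt/images/guest.qcow2'/>
  <target dev='vda' bus='virtio'/>
  <!-- tried first; loadparm selects a boot entry on S390 -->
  <boot order='1' loadparm='2'/>
</disk>
<disk type='file' device='cdrom'>
  <source file='/var/lib/libvirt/images/install.iso'/>
  <target dev='sda' bus='scsi'/>
  <!-- tried second -->
  <boot order='2'/>
  <readonly/>
</disk>
...</pre>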
|
|
</dd>
|
|
<dt><code>encryption</code></dt>
|
|
<dd>Starting with <span class="since">libvirt 3.9.0</span> the
|
|
<code>encryption</code> element is preferred to be a sub-element
|
|
of the <code>source</code> element. If present, specifies how the
|
|
volume is encrypted using "qcow". See the
|
|
<a href="formatstorageencryption.html">Storage Encryption</a> page
|
|
for more information.
|
|
</dd>
|
|
<dt><code>readonly</code></dt>
|
|
<dd>If present, this indicates the device cannot be modified by
|
|
the guest. For now, this is the default for disks with
|
|
attribute <code>device='cdrom'</code>.
|
|
</dd>
|
|
<dt><code>shareable</code></dt>
|
|
<dd>If present, this indicates the device is expected to be shared
|
|
between domains (assuming the hypervisor and OS support this),
|
|
which means that caching should be deactivated for that device.
|
|
</dd>
|
|
<dt><code>transient</code></dt>
|
|
<dd>If present, this indicates that changes to the device
|
|
contents should be reverted automatically when the guest
|
|
exits. With some hypervisors, marking a disk transient
|
|
prevents the domain from participating in migration or
|
|
snapshots. Only supported in the vmx hypervisor.
|
|
<span class="since">Since 0.9.5</span>
|
|
</dd>
|
|
<dt><code>serial</code></dt>
|
|
<dd>If present, this specifies the serial number of the virtual hard drive.
|
|
For example, it may look
|
|
like <code><serial>WD-WMAP9A966149</serial></code>.
|
|
Not supported for scsi-block devices, that is those using
|
|
disk <code>type</code> 'block' using <code>device</code> 'lun'
|
|
on <code>bus</code> 'scsi'.
|
|
<span class="since">Since 0.7.1</span>
|
|
</dd>
|
|
<dt><code>wwn</code></dt>
|
|
<dd>If present, this element specifies the WWN (World Wide Name)
|
|
of a virtual hard disk or CD-ROM drive. It must be composed
|
|
of 16 hexadecimal digits.
|
|
<span class='since'>Since 0.10.1</span>
|
|
</dd>
|
|
<dt><code>vendor</code></dt>
|
|
<dd>If present, this element specifies the vendor of a virtual hard
|
|
disk or CD-ROM device. It must not be longer than 8 printable
|
|
characters.
|
|
<span class='since'>Since 1.0.1</span>
|
|
</dd>
|
|
<dt><code>product</code></dt>
|
|
<dd>If present, this element specifies the product of a virtual hard
|
|
disk or CD-ROM device. It must not be longer than 16 printable
|
|
characters.
|
|
<span class='since'>Since 1.0.1</span>
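<p>
The identification elements <code>serial</code>, <code>wwn</code>,
<code>vendor</code> and <code>product</code> can appear together on one
disk. The sketch below uses made-up values that merely satisfy the stated
length rules (16 hexadecimal digits, at most 8 characters, at most 16
characters); the source device is likewise an assumption.
</p>
<pre>
...
<disk type='block' device='disk'>
  <source dev='/dev/sdb'/>
  <target dev='sda' bus='scsi'/>
  <serial>WD-WMAP9A966149</serial>
  <wwn>5000c50015ea71ad</wwn>
  <vendor>ACME</vendor>
  <product>VirtualDisk01</product>
</disk>
...</pre>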
|
|
</dd>
|
|
<dt><code>address</code></dt>
|
|
<dd>If present, the <code>address</code> element ties the disk
|
|
to a given slot of a controller (the
|
|
actual <code><controller></code> device can often be
|
|
inferred by libvirt, although it can
|
|
be <a href="#elementsControllers">explicitly specified</a>).
|
|
The <code>type</code> attribute is mandatory, and is typically
|
|
"pci" or "drive". For a "pci" controller, additional
|
|
attributes for <code>bus</code>, <code>slot</code>,
|
|
and <code>function</code> must be present, as well as
|
|
optional <code>domain</code> and <code>multifunction</code>.
|
|
Multifunction defaults to 'off'; any other value requires
|
|
QEMU 0.13 and <span class="since">libvirt 0.9.7</span>. For a
|
|
"drive" controller, additional attributes
|
|
<code>controller</code>, <code>bus</code>, <code>target</code>
|
|
(<span class="since">libvirt 0.9.11</span>), and <code>unit</code>
|
|
are available, each defaulting to 0.
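<p>
As a sketch, the two common address forms for disks could look like this.
The bus, slot and unit numbers are arbitrary examples; omitting the element
lets libvirt choose an address.
</p>
<pre>
...
<disk type='file' device='disk'>
  <source file='/var/lib/libvirt/images/guest.qcow2'/>
  <target dev='vda' bus='virtio'/>
  <!-- virtio disk placed directly on the PCI bus -->
  <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
</disk>
<disk type='file' device='cdrom'>
  <source file='/var/lib/libvirt/images/install.iso'/>
  <target dev='sda' bus='scsi'/>
  <!-- second unit on bus 0 of controller 0 -->
  <address type='drive' controller='0' bus='0' target='0' unit='1'/>
</disk>
...</pre>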
|
|
</dd>
|
|
<dt><code>auth</code></dt>
|
|
<dd>Starting with <span class="since">libvirt 3.9.0</span> the
|
|
<code>auth</code> element is preferred to be a sub-element of
|
|
the <code>source</code> element. The element is still read and
|
|
managed as a <code>disk</code> sub-element. It is invalid to use
|
|
<code>auth</code> as both a sub-element of <code>disk</code>
|
|
and <code>source</code>. The <code>auth</code> element was
|
|
introduced as a <code>disk</code> sub-element in
|
|
<span class="since">libvirt 0.9.7.</span>
|
|
</dd>
|
|
<dt><code>geometry</code></dt>
|
|
<dd>The optional <code>geometry</code> element provides the
|
|
ability to override geometry settings. This is mostly useful for
S390 DASD disks or older DOS disks. <span class="since">Since 0.10.0</span>
|
|
<dl>
|
|
<dt><code>cyls</code></dt>
|
|
<dd>The <code>cyls</code> attribute is the
|
|
number of cylinders. </dd>
|
|
<dt><code>heads</code></dt>
|
|
<dd>The <code>heads</code> attribute is the
|
|
number of heads. </dd>
|
|
<dt><code>secs</code></dt>
|
|
<dd>The <code>secs</code> attribute is the
|
|
number of sectors per track. </dd>
|
|
<dt><code>trans</code></dt>
|
|
<dd>The optional <code>trans</code> attribute is the
|
|
BIOS translation mode (none, lba or auto).</dd>
|
|
</dl>
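<p>
A minimal example of overriding the geometry is given below. The
cylinder, head and sector counts are illustrative values only.
</p>
<pre>
...
<disk type='file' device='disk'>
  <source file='/var/lib/libvirt/images/dos.img'/>
  <target dev='hda' bus='ide'/>
  <geometry cyls='16383' heads='16' secs='63' trans='lba'/>
</disk>
...</pre>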
|
|
</dd>
|
|
<dt><code>blockio</code></dt>
|
|
<dd>If present, the <code>blockio</code> element allows
overriding any of the block device properties listed below.
|
|
<span class="since">Since 0.10.2 (QEMU and KVM)</span>
|
|
<dl>
|
|
<dt><code>logical_block_size</code></dt>
|
|
<dd>The logical block size the disk will report to the guest
|
|
OS. For Linux this would be the value returned by the
|
|
BLKSSZGET ioctl and describes the smallest units for disk
|
|
I/O.
|
|
</dd>
|
|
<dt><code>physical_block_size</code></dt>
|
|
<dd>The physical block size the disk will report to the guest
|
|
OS. For Linux this would be the value returned by the
|
|
BLKPBSZGET ioctl and describes the disk's hardware sector
|
|
size which can be relevant for the alignment of disk data.
|
|
</dd>
|
|
</dl>
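<p>
For instance, a disk that reports 512-byte logical sectors on top of
4096-byte physical sectors to the guest could be sketched as follows. The
sizes and the source device are assumptions for illustration.
</p>
<pre>
...
<disk type='block' device='disk'>
  <source dev='/dev/sdc'/>
  <target dev='vdb' bus='virtio'/>
  <blockio logical_block_size='512' physical_block_size='4096'/>
</disk>
...</pre>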
|
|
</dd>
|
|
</dl>
|
|
|
|
<h4><a id="elementsFilesystems">Filesystems</a></h4>
|
|
|
|
<p>
|
|
A directory on the host that can be accessed directly from the guest.
|
|
<span class="since">since 0.3.3, since 0.8.5 for QEMU/KVM</span>
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<filesystem type='template'>
|
|
<source name='my-vm-template'/>
|
|
<target dir='/'/>
|
|
</filesystem>
|
|
<filesystem type='mount' accessmode='passthrough' multidevs='remap'>
|
|
<driver type='path' wrpolicy='immediate'/>
|
|
<source dir='/export/to/guest'/>
|
|
<target dir='/import/from/host'/>
|
|
<readonly/>
|
|
</filesystem>
|
|
<filesystem type='file' accessmode='passthrough'>
|
|
<driver type='loop' format='raw'/>
|
|
|
|
<source file='/export/to/guest.img'/>
|
|
<target dir='/import/from/host'/>
|
|
<readonly/>
|
|
</filesystem>
|
|
<filesystem type='mount' accessmode='passthrough'>
|
|
<driver type='virtiofs' queue='1024'/>
|
|
<binary path='/usr/libexec/virtiofsd' xattr='on'>
|
|
<cache mode='always'/>
|
|
<lock posix='on' flock='on'/>
|
|
</binary>
|
|
<source dir='/path'/>
|
|
<target dir='mount_tag'/>
|
|
</filesystem>
|
|
...
|
|
</devices>
|
|
...</pre>
|
|
|
|
<dl>
|
|
<dt><code>filesystem</code></dt>
|
|
<dd>
|
|
|
|
The filesystem attribute <code>type</code> specifies the type of the
|
|
<code>source</code>. The possible values are:
|
|
|
|
<dl>
|
|
<dt><code>mount</code></dt>
|
|
<dd>
|
|
A host directory to mount in the guest. Used by LXC,
|
|
OpenVZ <span class="since">(since 0.6.2)</span>
|
|
and QEMU/KVM <span class="since">(since 0.8.5)</span>.
|
|
This is the default <code>type</code> if one is not specified.
|
|
This mode also has an optional
|
|
sub-element <code>driver</code>, with an
|
|
attribute <code>type='path'</code>
|
|
or <code>type='handle'</code> <span class="since">(since
|
|
0.9.7)</span>. The driver block has an optional attribute
|
|
<code>wrpolicy</code> that further controls interaction with
|
|
the host page cache; omitting the attribute gives default behavior,
|
|
while the value <code>immediate</code> means that a host writeback
|
|
is immediately triggered for all pages touched during a guest file
|
|
write operation <span class="since">(since 0.9.10)</span>.
|
|
<span class="since">Since 6.2.0</span>, <code>type='virtiofs'</code>
|
|
is also supported. Using virtiofs requires setting up shared memory,
|
|
see the guide: <a href="kbase/virtiofs.html">Virtio-FS</a>.
|
|
</dd>
|
|
<dt><code>template</code></dt>
|
|
<dd>
|
|
OpenVZ filesystem template. Only used by OpenVZ driver.
|
|
</dd>
|
|
<dt><code>file</code></dt>
|
|
<dd>
|
|
A host file will be treated as an image and mounted in
|
|
the guest. The filesystem format will be autodetected.
|
|
Only used by LXC driver.
|
|
</dd>
|
|
<dt><code>block</code></dt>
|
|
<dd>
|
|
A host block device to mount in the guest. The filesystem
|
|
format will be autodetected. Only used by LXC driver
|
|
<span class="since">(since 0.9.5)</span>.
|
|
</dd>
|
|
<dt><code>ram</code></dt>
|
|
<dd>
|
|
An in-memory filesystem, using memory from the host OS.
|
|
The source element has a single attribute <code>usage</code>
|
|
which gives the memory usage limit in KiB, unless units
|
|
are specified by the <code>units</code> attribute. Only used
|
|
by LXC driver.
|
|
<span class="since"> (since 0.9.13)</span></dd>
|
|
<dt><code>bind</code></dt>
|
|
<dd>
|
|
A directory inside the guest will be bound to another
|
|
directory inside the guest. Only used by LXC driver
|
|
<span class="since"> (since 0.9.13)</span></dd>
|
|
</dl>
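<p>
The LXC-only <code>ram</code> and <code>bind</code> types might be
used as sketched below. The usage limit and the directory names are
hypothetical, and the <code>dir</code> source attribute for
<code>bind</code> is assumed to mirror the one used by
<code>mount</code>.
</p>
<pre>
...
<devices>
  <!-- in-memory filesystem limited to 512 MiB, expressed in KiB -->
  <filesystem type='ram'>
    <source usage='524288' units='KiB'/>
    <target dir='/tmp'/>
  </filesystem>
  <!-- bind one guest directory onto another guest directory -->
  <filesystem type='bind'>
    <source dir='/opt/data'/>
    <target dir='/srv/data'/>
  </filesystem>
</devices>
...</pre>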
|
|
|
|
The filesystem element has an optional attribute <code>accessmode</code>
|
|
which specifies the security mode for accessing the source
|
|
<span class="since">(since 0.8.5)</span>. Currently this only works
|
|
with <code>type='mount'</code> for the QEMU/KVM driver.
|
|
For driver type <code>virtiofs</code>, only <code>passthrough</code> is
|
|
supported. For other driver types, the possible
|
|
values are:
|
|
|
|
<dl>
|
|
<dt><code>passthrough</code></dt>
|
|
<dd>
|
|
The <code>source</code> is accessed with the permissions of the
|
|
user inside the guest. This is the default <code>accessmode</code> if
|
|
one is not specified.
|
|
<a href="http://lists.gnu.org/archive/html/qemu-devel/2010-05/msg02673.html">More info</a>
|
|
</dd>
|
|
<dt><code>mapped</code></dt>
|
|
<dd>
|
|
The <code>source</code> is accessed with the permissions of the
|
|
hypervisor (QEMU process).
|
|
<a href="http://lists.gnu.org/archive/html/qemu-devel/2010-05/msg02673.html">More info</a>
|
|
</dd>
|
|
<dt><code>squash</code></dt>
|
|
<dd>
|
|
Similar to 'passthrough', the exception is that failure of
|
|
privileged operations like 'chown' are ignored. This makes a
|
|
passthrough-like mode usable for people who run the hypervisor
|
|
as non-root.
|
|
<a href="http://lists.gnu.org/archive/html/qemu-devel/2010-09/msg00121.html">More info</a>
|
|
</dd>
|
|
</dl>
|
|
|
|
<p>
|
|
<span class="since">Since 5.2.0</span>, the filesystem element
|
|
has an optional attribute <code>model</code> with supported values
|
|
"virtio-transitional", "virtio-non-transitional", or "virtio".
|
|
See <a href="#elementsVirtioTransitional">Virtio transitional devices</a>
|
|
for more details.
|
|
</p>
|
|
|
|
<p>
|
|
The filesystem element has an optional attribute <code>multidevs</code>
|
|
which specifies how to deal with a filesystem export containing more than
|
|
one device, in order to avoid file ID collisions on guest when using 9pfs
|
|
(<span class="since">since 6.3.0, requires QEMU 4.2</span>).
|
|
This attribute is not available for virtiofs. The possible values are:
|
|
</p>
|
|
|
|
<dl>
|
|
<dt><code>default</code></dt>
|
|
<dd>
|
|
Use QEMU's default setting (which currently is <code>warn</code>).
|
|
</dd>
|
|
<dt><code>remap</code></dt>
|
|
<dd>
|
|
This setting allows the guest to access multiple devices per export without
|
|
encountering misbehaviours. Inode numbers from host are automatically
|
|
remapped on guest to actively prevent file ID collisions if guest
|
|
accesses one export containing multiple devices.
|
|
</dd>
|
|
<dt><code>forbid</code></dt>
|
|
<dd>
|
|
Only allow the guest to access one device per export. Attempts to access
additional devices on the same export will cause the individual
filesystem access by the guest to fail with an error, which is logged (once)
as an error on the host side.
|
|
</dd>
|
|
<dt><code>warn</code></dt>
|
|
<dd>
|
|
This setting resembles the behaviour of 9pfs prior to QEMU 4.2, that is
|
|
no action is performed to prevent any potential file ID collisions if an
|
|
export contains multiple devices, with the only exception: a warning is
|
|
logged (once) on host side now. This setting may lead to misbehaviours
|
|
on guest side if more than one device is exported per export, due to the
|
|
potential file ID collisions this may cause on guest side in that case.
|
|
</dd>
|
|
</dl>
|
|
|
|
</dd>
|
|
|
|
<p>
|
|
The <code>filesystem</code> element may contain the following subelements:
|
|
</p>
|
|
|
|
<dt><code>driver</code></dt>
|
|
<dd>
|
|
The optional driver element allows specifying further details
|
|
related to the hypervisor driver used to provide the filesystem.
|
|
<span class="since">Since 1.0.6</span>
|
|
<ul>
|
|
<li>
|
|
If the hypervisor supports multiple backend drivers, then
|
|
the <code>type</code> attribute selects the primary
|
|
backend driver name, while the <code>format</code>
|
|
attribute provides the format type. For example, LXC
|
|
supports a type of "loop", with a format of "raw" or
|
|
"nbd" with any format. QEMU supports a type of "path"
|
|
or "handle", but no formats. Virtuozzo driver supports
|
|
a type of "ploop" with a format of "ploop".
|
|
</li>
|
|
<li>
|
|
For virtio-backed devices,
|
|
<a href="#elementsVirtio">Virtio-specific options</a> can also be
|
|
set. (<span class="since">Since 3.5.0</span>)
|
|
</li>
|
|
<li>
|
|
For <code>virtiofs</code>, the <code>queue</code> attribute can be used
|
|
to specify the queue size (i.e. how many requests the queue can hold).
|
|
(<span class="since">Since 6.2.0</span>)
|
|
</li>
|
|
</ul>
|
|
</dd>
|
|
|
|
<dt><code>binary</code></dt>
|
|
<dd>
|
|
The optional <code>binary</code> element can tune the options for virtiofsd.
|
|
All of the following attributes and elements are optional.
|
|
The attribute <code>path</code> can be used to override the path to the daemon.
|
|
Attribute <code>xattr</code> enables the use of filesystem extended attributes.
|
|
Caching can be tuned via the <code>cache</code> element, possible <code>mode</code>
|
|
values being <code>none</code> and <code>always</code>.
|
|
Locking can be controlled via the <code>lock</code>
|
|
element - attributes <code>posix</code> and <code>flock</code> both accepting
|
|
values <code>on</code> or <code>off</code>.
|
|
(<span class="since">Since 6.2.0</span>)
|
|
</dd>
|
|
|
|
<dt><code>source</code></dt>
|
|
<dd>
|
|
The resource on the host that is being accessed in the guest. The
|
|
<code>name</code> attribute must be used with
|
|
<code>type='template'</code>, and the <code>dir</code> attribute must
|
|
be used with <code>type='mount'</code>. The <code>usage</code> attribute
|
|
is used with <code>type='ram'</code> to set the memory limit in KiB,
|
|
unless units are specified by the <code>units</code> attribute.
|
|
</dd>
|
|
|
|
<dt><code>target</code></dt>
|
|
<dd>
|
|
Where the <code>source</code> can be accessed in the guest. For
|
|
most drivers this is an automatic mount point, but for QEMU/KVM
|
|
this is merely an arbitrary string tag that is exported to the
|
|
guest as a hint for where to mount.
|
|
</dd>
|
|
|
|
<dt><code>readonly</code></dt>
|
|
<dd>
|
|
Exports the filesystem as a read-only mount for the guest. By
default, read-write access is given (currently only works for the
QEMU/KVM driver).
|
|
</dd>
|
|
|
|
<dt><code>space_hard_limit</code></dt>
|
|
<dd>
|
|
Maximum space available to this guest's filesystem.
|
|
<span class="since">Since 0.9.13</span>
|
|
</dd>
|
|
|
|
<dt><code>space_soft_limit</code></dt>
|
|
<dd>
|
|
Maximum space available to this guest's filesystem. The container is
|
|
permitted to exceed its soft limits for a grace period of time. Afterwards the
|
|
hard limit is enforced.
|
|
<span class="since">Since 0.9.13</span>
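<p>
A hedged sketch of both space limits on one container filesystem is shown
here. The directory names and sizes are made up, and the
<code>unit</code> attribute is assumed to accept <code>bytes</code>
as it does for other scaled values.
</p>
<pre>
...
<filesystem type='mount'>
  <source dir='/export/to/guest'/>
  <target dir='/'/>
  <!-- 10 GiB hard ceiling -->
  <space_hard_limit unit='bytes'>10737418240</space_hard_limit>
  <!-- 8 GiB soft limit that may be exceeded for a grace period -->
  <space_soft_limit unit='bytes'>8589934592</space_soft_limit>
</filesystem>
...</pre>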
|
|
</dd>
|
|
</dl>
|
|
|
|
<h4><a id="elementsAddress">Device Addresses</a></h4>
|
|
|
|
<p>
|
|
Many devices have an optional <code><address></code>
|
|
sub-element to describe where the device is placed on the
|
|
virtual bus presented to the guest. If an address (or any
|
|
optional attribute within an address) is omitted on
|
|
input, libvirt will generate an appropriate address; but an
|
|
explicit address is required if more control over layout is
|
|
needed. See below for device examples including an address
|
|
element.
|
|
</p>
|
|
|
|
<p>
|
|
Every address has a mandatory attribute <code>type</code> that
|
|
describes which bus the device is on. The choice of which
|
|
address to use for a given device is constrained in part by the
|
|
device and the architecture of the guest. For example,
|
|
a <code><disk></code> device
|
|
uses <code>type='drive'</code>, while
|
|
a <code><console></code> device would
|
|
use <code>type='pci'</code> on i686 or x86_64 guests,
|
|
or <code>type='spapr-vio'</code> on PowerPC64 pseries guests.
|
|
Each address type has further optional attributes that control
|
|
where on the bus the device will be placed:
|
|
</p>
|
|
|
|
<dl>
|
|
<dt><code>pci</code></dt>
|
|
<dd>PCI addresses have the following additional
|
|
attributes: <code>domain</code> (a 2-byte hex integer, not
|
|
currently used by qemu), <code>bus</code> (a hex value between
|
|
0 and 0xff, inclusive), <code>slot</code> (a hex value between
|
|
0x0 and 0x1f, inclusive), and <code>function</code> (a value
|
|
between 0 and 7, inclusive). Also available is
|
|
the <code>multifunction</code> attribute, which controls
|
|
turning on the multifunction bit for a particular
|
|
slot/function in the PCI control register
|
|
(<span class="since">since 0.9.7, requires QEMU
|
|
0.13</span>). <code>multifunction</code> defaults to 'off',
|
|
but should be set to 'on' for function 0 of a slot that will
|
|
have multiple functions used.
|
|
<span class="since">Since 4.10.0</span>, PCI address extensions
|
|
depending on the architecture are supported. For example, PCI
|
|
addresses for S390 guests will have a <code>zpci</code> child
|
|
element, with two attributes: <code>uid</code> (a hex value
|
|
between 0x0001 and 0xffff, inclusive), and <code>fid</code> (a
|
|
hex value between 0x00000000 and 0xffffffff, inclusive) used by
|
|
PCI devices on S390 for User-defined Identifiers and Function
|
|
Identifiers.<br/>
|
|
<span class="since">Since 1.3.5</span>, some hypervisor
|
|
drivers may accept an <code><address type='pci'/></code>
|
|
element with no other attributes as an explicit request to
|
|
assign a PCI address for the device rather than some other
|
|
type of address that may also be appropriate for that same
|
|
device (e.g. virtio-mmio).<br/>
|
|
The relationship between the PCI addresses configured in the domain
|
|
XML and those seen by the guest OS can sometimes seem confusing: a
|
|
separate document describes <a href="pci-addresses.html">how PCI
|
|
addresses work</a> in more detail.
|
|
</dd>
|
|
<dt><code>drive</code></dt>
|
|
<dd>Drive addresses have the following additional
|
|
attributes: <code>controller</code> (a 2-digit controller
|
|
number), <code>bus</code> (a 2-digit bus number),
|
|
<code>target</code> (a 2-digit target number),
|
|
and <code>unit</code> (a 2-digit unit number on the bus).
|
|
</dd>
|
|
<dt><code>virtio-serial</code></dt>
|
|
<dd>Each virtio-serial address has the following additional
|
|
attributes: <code>controller</code> (a 2-digit controller
|
|
number), <code>bus</code> (a 2-digit bus number),
|
|
and <code>slot</code> (a 2-digit slot within the bus).
|
|
</dd>
|
|
<dt><code>ccid</code></dt>
|
|
<dd>A CCID address, for smart-cards, has the following
|
|
additional attributes: <code>bus</code> (a 2-digit bus
|
|
number), and <code>slot</code> attribute (a 2-digit slot
|
|
within the bus). <span class="since">Since 0.8.8.</span>
|
|
</dd>
|
|
<dt><code>usb</code></dt>
|
|
<dd>USB addresses have the following additional
|
|
attributes: <code>bus</code> (a hex value between 0 and 0xfff,
|
|
inclusive), and <code>port</code> (a dotted notation of up to
|
|
four octets, such as 1.2 or 2.1.3.1).
|
|
</dd>
|
|
<dt><code>spapr-vio</code></dt>
|
|
<dd>On PowerPC pseries guests, devices can be assigned to the
|
|
SPAPR-VIO bus. It has a flat 32-bit address space; by
|
|
convention, devices are generally assigned at a non-zero
|
|
multiple of 0x00001000, but other addresses are valid and
|
|
permitted by libvirt. Each address has the following
|
|
additional attribute: <code>reg</code> (the hex value address
|
|
of the starting register). <span class="since">Since
|
|
0.9.9.</span>
|
|
</dd>
|
|
<dt><code>ccw</code></dt>
|
|
<dd>S390 guests with a <code>machine</code> value of
|
|
s390-ccw-virtio use the native CCW bus for I/O devices.
|
|
CCW bus addresses have the following additional attributes:
|
|
<code>cssid</code> (a hex value between 0 and 0xfe, inclusive),
|
|
<code>ssid</code> (a value between 0 and 3, inclusive) and
|
|
<code>devno</code> (a hex value between 0 and 0xffff, inclusive).
|
|
Partially specified bus addresses are not allowed.
|
|
If omitted, libvirt will assign a free bus address with
|
|
cssid=0xfe and ssid=0. Virtio-ccw devices must have their cssid
|
|
set to 0xfe.
|
|
<span class="since">Since 1.0.4</span>
|
|
</dd>
|
|
<dt><code>virtio-mmio</code></dt>
|
|
<dd>This places the device on the virtio-mmio transport, which is
|
|
currently only available for some <code>armv7l</code> and
|
|
<code>aarch64</code> virtual machines. virtio-mmio addresses
|
|
do not have any additional attributes.
|
|
<span class="since">Since 1.1.3</span><br/>
|
|
If the guest architecture is <code>aarch64</code> and the machine
|
|
type is <code>virt</code>, libvirt will automatically assign PCI
|
|
addresses to devices; however, the presence of a single device
|
|
with virtio-mmio address in the guest configuration will cause
|
|
libvirt to assign virtio-mmio addresses to all further devices.
|
|
<span class="since">Since 3.0.0</span>
|
|
</dd>
|
|
<dt><code>isa</code></dt>
|
|
<dd>ISA addresses have the following additional
|
|
attributes: <code>iobase</code> and <code>irq</code>.
|
|
<span class="since">Since 1.2.1</span>
|
|
</dd>
|
|
<dt><code>unassigned</code></dt>
|
|
<dd>For PCI hostdevs, <code><address type='unassigned'/></code>
|
|
allows the admin to include a PCI hostdev in the domain XML definition,
|
|
without making it available for the guest. This allows for configurations
|
|
in which libvirt manages the device as a regular PCI hostdev,
|
|
regardless of whether the guest will have access to it.
|
|
<code><address type='unassigned'/></code> is an invalid address
|
|
type for all other device types.
|
|
<span class="since">Since 6.0.0</span>
|
|
</dd>
|
|
</dl>
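<p>
  As an illustrative sketch of some of the less common address types
  described above, the fragments below show how such addresses might
  appear in a device definition. The device choices, the image path and
  the concrete values are assumptions made for the example, and each
  fragment belongs to a different kind of guest:
</p>
<pre>
  ...
  <devices>
    <!-- s390-ccw-virtio guest: cssid, ssid and devno are all given together -->
    <disk type='file' device='disk'>
      <source file='/var/lib/libvirt/images/guest.img'/>
      <target dev='vda' bus='virtio'/>
      <address type='ccw' cssid='0xfe' ssid='0x0' devno='0x0001'/>
    </disk>
    ...
    <!-- armv7l/aarch64 guest: virtio-mmio takes no additional attributes -->
    <disk type='file' device='disk'>
      <source file='/var/lib/libvirt/images/guest.img'/>
      <target dev='vda' bus='virtio'/>
      <address type='virtio-mmio'/>
    </disk>
    ...
    <!-- ISA address, here on a panic device -->
    <panic model='isa'>
      <address type='isa' iobase='0x505'/>
    </panic>
    ...
    <!-- PCI hostdev kept in the definition but not exposed to the guest -->
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x06' slot='0x02' function='0x0'/>
      </source>
      <address type='unassigned'/>
    </hostdev>
  </devices>
  ...</pre>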
|
|
|
|
<h4><a id="elementsVirtio">Virtio-related options</a></h4>
|
|
|
|
<p>
|
|
QEMU's virtio devices have some attributes related to the virtio transport under
|
|
the <code>driver</code> element:
|
|
The <code>iommu</code> attribute enables the use of emulated IOMMU
|
|
by the device. The attribute <code>ats</code> controls the Address
|
|
Translation Service support for PCIe devices. This is needed to make use
|
|
of IOTLB support (see <a href="#elementsIommu">IOMMU device</a>).
|
|
Possible values are <code>on</code> or <code>off</code>.
|
|
<span class="since">Since 3.5.0</span>
|
|
</p>
|
|
<p>
|
|
The attribute <code>packed</code> controls whether QEMU should try to use
|
|
packed virtqueues. Compared to regular split queues, packed queues
|
|
consist of only a single descriptor ring replacing available and used
|
|
ring, index and descriptor buffer. This can result in better cache
|
|
utilization and performance. Whether packed virtqueues are actually used
|
|
depends on the feature negotiation between QEMU, vhost backends and guest
|
|
drivers. Possible values are <code>on</code> or <code>off</code>.
|
|
<span class="since">Since 6.3.0 (QEMU and KVM only)</span>
|
|
</p>
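<p>
  A minimal sketch of these attributes on a virtio network interface
  follows. The network name <code>default</code> is an assumption for
  the example. As noted above, <code>ats</code> is only useful together
  with a suitably configured IOMMU device, and <code>packed</code> is
  only a request that remains subject to feature negotiation:
</p>
<pre>
  ...
  <devices>
    <interface type='network'>
      <source network='default'/>
      <model type='virtio'/>
      <!-- virtio transport options live on the driver element -->
      <driver iommu='on' ats='on' packed='on'/>
    </interface>
  </devices>
  ...</pre>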
|
|
|
|
<h4><a id="elementsVirtioTransitional">Virtio transitional devices</a></h4>
|
|
|
|
<p>
|
|
<span class="since">Since 5.2.0</span>, some of QEMU's virtio devices,
|
|
when used with PCI/PCIe machine types, accept the following
|
|
<code>model</code> values:
|
|
</p>
|
|
|
|
<dl>
|
|
<dt><code>virtio-transitional</code></dt>
|
|
<dd>This device can work both with virtio 0.9 and virtio 1.0 guest
|
|
drivers, so it's the best choice when compatibility with older
|
|
guest operating systems is desired. libvirt will plug the device
|
|
into a conventional PCI slot.
|
|
</dd>
|
|
<dt><code>virtio-non-transitional</code></dt>
|
|
<dd>This device can only work with virtio 1.0 guest drivers, and it's
|
|
the recommended option unless compatibility with older guest
|
|
operating systems is necessary. libvirt will plug the device into
|
|
either a PCI Express slot or a conventional PCI slot based on the
|
|
machine type, resulting in a more optimized PCI topology.
|
|
</dd>
|
|
<dt><code>virtio</code></dt>
|
|
<dd>This device will work like a <code>virtio-non-transitional</code>
|
|
device when plugged into a PCI Express slot, and like a
|
|
<code>virtio-transitional</code> device otherwise; libvirt will
|
|
pick one or the other based on the machine type. This is the best
|
|
choice when compatibility with libvirt versions older than 5.2.0
|
|
is necessary, but it's otherwise not recommended to use it.
|
|
</dd>
|
|
</dl>
|
|
|
|
<p>
|
|
While the information outlined above applies to most virtio devices,
|
|
there are a few exceptions:
|
|
</p>
|
|
|
|
<ul>
|
|
<li>
|
|
for SCSI controllers, <code>virtio-scsi</code> must be used instead
|
|
of <code>virtio</code> for backwards compatibility reasons;
|
|
</li>
|
|
<li>
|
|
some devices, such as GPUs and input devices (keyboard, tablet and
|
|
mouse), are only defined in the virtio 1.0 spec and as such don't
|
|
have a transitional variant: the only accepted model is
|
|
<code>virtio</code>, which will result in a non-transitional device.
|
|
</li>
|
|
</ul>
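<p>
  As a sketch of how these <code>model</code> values might be used on
  controllers (the index values are illustrative): the first controller
  stays usable by virtio 0.9 guest drivers, while the second requires
  virtio 1.0 drivers. Note that the SCSI controller uses
  <code>virtio-non-transitional</code> directly; the generic counterpart
  for SCSI would be <code>virtio-scsi</code> rather than
  <code>virtio</code>.
</p>
<pre>
  ...
  <devices>
    <!-- works with both virtio 0.9 and virtio 1.0 guest drivers -->
    <controller type='virtio-serial' index='0' model='virtio-transitional'/>
    <!-- virtio 1.0 guest drivers only -->
    <controller type='scsi' index='0' model='virtio-non-transitional'/>
  </devices>
  ...</pre>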
|
|
|
|
<p>
|
|
For more details see the
|
|
<a href="https://lists.gnu.org/archive/html/qemu-devel/2018-12/msg00923.html">qemu patch posting</a> and the
|
|
<a href="http://docs.oasis-open.org/virtio/virtio/v1.0/virtio-v1.0.html">virtio-1.0 spec</a>.
|
|
</p>
|
|
|
|
|
|
<h4><a id="elementsControllers">Controllers</a></h4>
|
|
|
|
<p>
|
|
Depending on the guest architecture, some device buses can
|
|
appear more than once, with a group of virtual devices tied to a
|
|
virtual controller. Normally, libvirt can automatically infer such
|
|
controllers without requiring explicit XML markup, but sometimes
|
|
it is necessary to provide an explicit controller element, notably
|
|
when planning the <a href="pci-hotplug.html">PCI topology</a>
|
|
for guests where device hotplug is expected.
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<controller type='ide' index='0'/>
|
|
<controller type='virtio-serial' index='0' ports='16' vectors='4'/>
|
|
<controller type='virtio-serial' index='1'>
|
|
<address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/>
|
|
</controller>
|
|
<controller type='scsi' index='0' model='virtio-scsi'>
|
|
<driver iothread='4'/>
|
|
<address type='pci' domain='0x0000' bus='0x00' slot='0x0b' function='0x0'/>
|
|
</controller>
|
|
<controller type='xenbus' maxGrantFrames='64' maxEventChannels='2047'/>
|
|
...
|
|
</devices>
|
|
...</pre>
|
|
|
|
<p>
|
|
Each controller has a mandatory attribute <code>type</code>,
|
|
which must be one of 'ide', 'fdc', 'scsi', 'sata', 'usb',
|
|
'ccid', 'virtio-serial', 'pci' or 'xenbus', and a mandatory
|
|
attribute <code>index</code> which is the decimal integer
|
|
describing in which order the bus controller is encountered (for
|
|
use in <code>controller</code> attributes of
|
|
<code><address></code> elements).
|
|
<span class="since">Since 1.3.5</span> the index is optional; if
|
|
not specified, it will be auto-assigned to be the lowest unused
|
|
index for the given controller type. Some controller types have
|
|
additional attributes that control specific features, such as:
|
|
</p>
|
|
|
|
<dl>
|
|
<dt><code>virtio-serial</code></dt>
|
|
<dd>The <code>virtio-serial</code> controller has two additional
|
|
optional attributes <code>ports</code> and <code>vectors</code>,
|
|
which control how many devices can be connected through the
|
|
controller. <span class="since">Since 5.2.0</span>, it
|
|
supports an optional attribute <code>model</code> which can
|
|
be 'virtio', 'virtio-transitional', or 'virtio-non-transitional'. See
|
|
<a href="#elementsVirtioTransitional">Virtio transitional devices</a>
|
|
for more details.
|
|
</dd>
|
|
<dt><code>scsi</code></dt>
|
|
<dd>A <code>scsi</code> controller has an optional attribute
|
|
<code>model</code>, which is one of 'auto', 'buslogic', 'ibmvscsi',
|
|
'lsilogic', 'lsisas1068', 'lsisas1078', 'virtio-scsi',
|
|
'vmpvscsi', 'virtio-transitional', 'virtio-non-transitional'. See
|
|
<a href="#elementsVirtioTransitional">Virtio transitional devices</a>
|
|
for more details.
|
|
</dd>
|
|
<dt><code>usb</code></dt>
|
|
<dd>A <code>usb</code> controller has an optional attribute
|
|
<code>model</code>, which is one of "piix3-uhci", "piix4-uhci",
|
|
"ehci", "ich9-ehci1", "ich9-uhci1", "ich9-uhci2", "ich9-uhci3",
|
|
"vt82c686b-uhci", "pci-ohci", "nec-xhci", "qusb1" (xen pvusb
|
|
with qemu backend, version 1.1), "qusb2" (xen pvusb with qemu
|
|
backend, version 2.0) or "qemu-xhci". Additionally,
|
|
<span class="since">since 0.10.0</span>, if the USB bus needs to
|
|
be explicitly disabled for the guest, <code>model='none'</code>
|
|
may be used. <span class="since">Since 1.0.5</span>, no default
|
|
USB controller will be built on s390.
|
|
<span class="since">Since 1.3.5</span>, USB controllers accept a
|
|
<code>ports</code> attribute to configure how many devices can be
|
|
connected to the controller.</dd>
|
|
<dt><code>ide</code></dt>
|
|
<dd><span class="since">Since 3.10.0</span> for the vbox driver, the
|
|
<code>ide</code> controller has an optional attribute
|
|
<code>model</code>, which is one of "piix3", "piix4" or "ich6".</dd>
|
|
<dt><code>xenbus</code></dt>
|
|
<dd><span class="since">Since 5.2.0</span>, the <code>xenbus</code>
|
|
controller has an optional attribute <code>maxGrantFrames</code>,
|
|
which specifies the maximum number of grant frames the controller
|
|
makes available for connected devices.
|
|
<span class="since">Since 6.3.0</span>, the xenbus controller
|
|
supports the optional <code>maxEventChannels</code> attribute,
|
|
which specifies maximum number of event channels (PV interrupts)
|
|
that can be used by the guest.</dd>
|
|
</dl>
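<p>
  The type-specific attributes above can be sketched as follows. The
  port counts and index values are illustrative assumptions, and the
  fragments are alternatives for different drivers rather than a single
  coherent guest:
</p>
<pre>
  ...
  <devices>
    <!-- QEMU: an xHCI USB controller limited to 8 ports -->
    <controller type='usb' index='0' model='qemu-xhci' ports='8'/>
    ...
    <!-- alternatively, explicitly disable the USB bus for the guest -->
    <controller type='usb' index='0' model='none'/>
    ...
    <!-- vbox driver: choose the emulated IDE chipset -->
    <controller type='ide' index='0' model='piix4'/>
    ...
    <!-- virtio-serial: ports and vectors bound the number of attached devices -->
    <controller type='virtio-serial' index='0' ports='16' vectors='4'/>
  </devices>
  ...</pre>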
|
|
|
|
<p>
|
|
Note: The PowerPC64 "spapr-vio" addresses do not have an
|
|
associated controller.
|
|
</p>
|
|
|
|
<p>
|
|
For controllers that are themselves devices on a PCI or USB bus,
|
|
an optional sub-element <code><address></code> can specify
|
|
the exact relationship of the controller to its master bus, with
|
|
semantics <a href="#elementsAddress">given above</a>.
|
|
</p>
|
|
|
|
<p>
|
|
An optional sub-element <code>driver</code> can specify the driver
|
|
specific options:
|
|
</p>
|
|
<dl>
|
|
<dt><code>queues</code></dt>
|
|
<dd>
|
|
The optional <code>queues</code> attribute specifies the number of
|
|
queues for the controller. For best performance, it's recommended to
|
|
specify a value matching the number of vCPUs.
|
|
<span class="since">Since 1.0.5 (QEMU and KVM only)</span>
|
|
</dd>
|
|
<dt><code>cmd_per_lun</code></dt>
|
|
<dd>
|
|
The optional <code>cmd_per_lun</code> attribute specifies the maximum
|
|
number of commands that can be queued on devices controlled by the
|
|
host.
|
|
<span class="since">Since 1.2.7 (QEMU and KVM only)</span>
|
|
</dd>
|
|
<dt><code>max_sectors</code></dt>
|
|
<dd>
|
|
The optional <code>max_sectors</code> attribute specifies the maximum
|
|
amount of data that will be transferred to or from the device
|
|
in a single command. The transfer length is measured in sectors, where
|
|
a sector is 512 bytes.
|
|
<span class="since">Since 1.2.7 (QEMU and KVM only)</span>
|
|
</dd>
|
|
<dt><code>ioeventfd</code></dt>
|
|
<dd>
|
|
The optional <code>ioeventfd</code> attribute specifies
|
|
whether the controller should use
|
|
<a href='https://patchwork.kernel.org/patch/43390/'>
|
|
I/O asynchronous handling</a> or not. Accepted values are
|
|
"on" and "off". <span class="since">Since 1.2.18</span>
|
|
</dd>
|
|
<dt><code>iothread</code></dt>
|
|
<dd>
|
|
Supported for controller type <code>scsi</code> using model
|
|
<code>virtio-scsi</code> for <code>address</code> types
|
|
<code>pci</code> and <code>ccw</code>
|
|
<span class="since">since 1.3.5 (QEMU 2.4)</span>.
|
|
|
|
The optional <code>iothread</code> attribute assigns the controller
|
|
to an IOThread as defined by the range for the domain
|
|
<a href="#elementsIOThreadsAllocation"><code>iothreads</code></a>
|
|
value. Each SCSI <code>disk</code> assigned to use the specified
|
|
<code>controller</code> will utilize the same IOThread. If a specific
|
|
IOThread is desired for a specific SCSI <code>disk</code>, then
|
|
multiple controllers must be defined each having a specific
|
|
<code>iothread</code> value. The <code>iothread</code> value
|
|
must be within the range 1 to the domain iothreads value.
|
|
</dd>
|
|
<dt>virtio options</dt>
|
|
<dd>
|
|
For virtio controllers,
|
|
<a href="#elementsVirtio">Virtio-related options</a> can also be
|
|
set. (<span class="since">Since 3.5.0</span>)
|
|
</dd>
|
|
</dl>
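<p>
  A sketch combining these driver options on a
  <code>virtio-scsi</code> controller is shown below. All numeric
  values are assumptions chosen for illustration: the guest is assumed
  to have 4 vCPUs (hence <code>queues='4'</code>) and 4 IOThreads, and
  <code>max_sectors='1024'</code> corresponds to 1024 * 512 bytes,
  that is 512 KiB per command.
</p>
<pre>
<domain type='kvm'>
  ...
  <iothreads>4</iothreads>
  ...
  <devices>
    <controller type='scsi' index='0' model='virtio-scsi'>
      <!-- iothread must lie in the range 1 to the iothreads value above -->
      <driver queues='4' cmd_per_lun='128' max_sectors='1024'
              ioeventfd='on' iothread='2'/>
    </controller>
  </devices>
  ...
</domain></pre>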
|
|
<p>
|
|
USB companion controllers have an optional
|
|
sub-element <code><master></code> to specify the exact
|
|
relationship of the companion to its master controller.
|
|
A companion controller is on the same bus as its master, so
|
|
the companion's <code>index</code> value should equal that of its master.
|
|
Not all controller models can be used as companion controllers
|
|
and libvirt might provide some sensible defaults (settings
|
|
of <code>master startport</code> and <code>function</code> of an
|
|
address) for some particular models.
|
|
Preferred companion controllers are <code>ich9-uhci[123]</code>.
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<controller type='usb' index='0' model='ich9-ehci1'>
|
|
<address type='pci' domain='0' bus='0' slot='4' function='7'/>
|
|
</controller>
|
|
<controller type='usb' index='0' model='ich9-uhci1'>
|
|
<master startport='0'/>
|
|
<address type='pci' domain='0' bus='0' slot='4' function='0' multifunction='on'/>
|
|
</controller>
|
|
...
|
|
</devices>
|
|
...</pre>
|
|
|
|
<p>
|
|
PCI controllers have an optional <code>model</code> attribute; possible
|
|
values for this attribute are
|
|
</p>
|
|
<ul>
|
|
<li>
|
|
<code>pci-root</code>, <code>pci-bridge</code>
|
|
(<span class="since">since 1.0.5</span>)
|
|
</li>
|
|
<li>
|
|
<code>pcie-root</code>, <code>dmi-to-pci-bridge</code>
|
|
(<span class="since">since 1.1.2</span>)
|
|
</li>
|
|
<li>
|
|
<code>pcie-root-port</code>, <code>pcie-switch-upstream-port</code>,
|
|
<code>pcie-switch-downstream-port</code>
|
|
(<span class="since">since 1.2.19</span>)
|
|
</li>
|
|
<li>
|
|
<code>pci-expander-bus</code>, <code>pcie-expander-bus</code>
|
|
(<span class="since">since 1.3.4</span>)
|
|
</li>
|
|
<li>
|
|
<code>pcie-to-pci-bridge</code>
|
|
(<span class="since">since 4.3.0</span>)
|
|
</li>
|
|
</ul>
|
|
<p>
|
|
The root controllers (<code>pci-root</code>
|
|
and <code>pcie-root</code>) have an
|
|
optional <code>pcihole64</code> element specifying how big (in
|
|
kilobytes, or in the unit specified by <code>pcihole64</code>'s
|
|
<code>unit</code> attribute) the 64-bit PCI hole should be. Some guests (like
|
|
Windows XP or Windows Server 2003) might crash when QEMU and Seabios
|
|
are recent enough to support 64-bit PCI holes, unless this is disabled
|
|
(set to 0). <span class="since">Since 1.1.2 (QEMU only)</span>
|
|
</p>
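<p>
  For instance, a guest that cannot cope with a 64-bit PCI hole might
  be configured as sketched below. The value 0 disables the hole, as
  described above; a non-zero size such as 1048576 KiB (1 GiB) would
  be written the same way.
</p>
<pre>
  ...
  <devices>
    <controller type='pci' index='0' model='pci-root'>
      <!-- 0 disables the 64-bit PCI hole -->
      <pcihole64 unit='KiB'>0</pcihole64>
    </controller>
  </devices>
  ...</pre>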
|
|
<p>
|
|
PCI controllers also have an optional
|
|
subelement <code><model></code> with an attribute
|
|
<code>name</code>. The name attribute holds the name of the
|
|
specific device that qemu is emulating (e.g. "i82801b11-bridge")
|
|
rather than simply the class of device ("pcie-to-pci-bridge",
|
|
"pci-bridge"), which is set in the controller element's
|
|
model <b>attribute</b>. In almost all cases, you should not
|
|
manually add a <code><model></code> subelement to a
|
|
controller, nor should you modify one that is automatically
|
|
generated by libvirt. <span class="since">Since 1.2.19 (QEMU
|
|
only).</span>
|
|
</p>
|
|
<p>
|
|
PCI controllers also have an optional
|
|
subelement <code><target></code> with the attributes and
|
|
subelements listed below. These are configurable items that 1)
|
|
are visible to the guest OS so must be preserved for guest ABI
|
|
compatibility, and 2) are usually left to default values or
|
|
derived automatically by libvirt. In almost all cases, you
|
|
should not manually add a <code><target></code> subelement
|
|
to a controller, nor should you modify the values in those
|
|
that are automatically generated by
|
|
libvirt. <span class="since">Since 1.2.19 (QEMU only).</span>
|
|
</p>
|
|
<dl>
|
|
<dt><code>chassisNr</code></dt>
|
|
<dd>
|
|
PCI controllers that have attribute model="pci-bridge" can
|
|
also have a <code>chassisNr</code> attribute in
|
|
the <code><target></code> subelement, which is used to
|
|
control QEMU's "chassis_nr" option for the pci-bridge device
|
|
(normally libvirt automatically sets this to the same value as
|
|
the index attribute of the pci controller). If set, chassisNr
|
|
must be between 1 and 255.
|
|
</dd>
|
|
<dt><code>chassis</code></dt>
|
|
<dd>
|
|
pcie-root-port and pcie-switch-downstream-port controllers can
|
|
also have a <code>chassis</code> attribute in
|
|
the <code><target></code> subelement, which is used to
|
|
set the controller's "chassis" configuration value, which is
|
|
visible to the virtual machine. If set, chassis must be
|
|
between 0 and 255.
|
|
</dd>
|
|
<dt><code>port</code></dt>
|
|
<dd>
|
|
pcie-root-port and pcie-switch-downstream-port controllers can
|
|
also have a <code>port</code> attribute in
|
|
the <code><target></code> subelement, which
|
|
is used to set the controller's "port" configuration value,
|
|
which is visible to the virtual machine. If set, port must be
|
|
between 0 and 255.
|
|
</dd>
|
|
<dt><code>hotplug</code></dt>
|
|
<dd>
|
|
pcie-root-port and pcie-switch-downstream-port controllers can
|
|
also have a <code>hotplug</code> attribute in
|
|
the <code><target></code> subelement, which is used to
|
|
disable hotplug/unplug of devices on a particular
|
|
controller. The default setting of <code>hotplug</code>
|
|
is <code>on</code>; it should be set to <code>off</code> to
|
|
disable hotplug/unplug of devices on a particular controller.
|
|
<span class="since">Since 6.3.0</span>
|
|
</dd>
|
|
<dt><code>busNr</code></dt>
|
|
<dd>
|
|
pci-expander-bus and pcie-expander-bus controllers can have an
|
|
optional <code>busNr</code> attribute (1-254). This will be
|
|
the bus number of the new bus; all bus numbers between that
|
|
specified and 255 will be available only for assignment to
|
|
PCI/PCIe controllers plugged into the hierarchy starting with
|
|
this expander bus, and bus numbers less than the specified
|
|
value will be available to the next lower expander-bus (or the
|
|
root-bus if there are no lower expander buses). If you do not
|
|
specify a busNr, libvirt will find the lowest existing
|
|
busNr in all other expander buses (or use 256 if there are
|
|
no others) and auto-assign the busNr of that found bus - 2,
|
|
which provides one bus number for the pci-expander-bus and one
|
|
for the pci-bridge that is automatically attached to it (if
|
|
you plan on adding more pci-bridges to the hierarchy of the
|
|
bus, you should manually set busNr to a lower value).
|
|
<p>
|
|
A similar algorithm is used for automatically determining
|
|
the busNr attribute for pcie-expander-bus, but since the
|
|
pcie-expander-bus doesn't have any built-in pci-bridge, the
|
|
2nd bus-number is just being reserved for the pcie-root-port
|
|
that must necessarily be connected to the bus in order to
|
|
actually plug in an endpoint device. If you intend to plug
|
|
multiple devices into a pcie-expander-bus, you must connect
|
|
a pcie-switch-upstream-port to the pcie-root-port that is
|
|
plugged into the pcie-expander-bus, and multiple
|
|
pcie-switch-downstream-ports to the
|
|
pcie-switch-upstream-port, and of course for this to work
|
|
properly, you will need to decrease the pcie-expander-bus'
|
|
busNr accordingly so that there are enough unused bus
|
|
numbers above it to accommodate giving out one bus number for
|
|
the upstream-port and one for each downstream-port (in
|
|
addition to the pcie-root-port and the pcie-expander-bus
|
|
itself).
|
|
</p>
|
|
</dd>
|
|
<dt><code>node</code></dt>
|
|
<dd>
|
|
Some PCI controllers (<code>pci-expander-bus</code> for the pc
|
|
machine type, <code>pcie-expander-bus</code> for the q35 machine
|
|
type and, <span class="since">since 3.6.0</span>,
|
|
<code>pci-root</code> for the pseries machine type) can have an
|
|
optional <code><node></code> subelement within
|
|
the <code><target></code> subelement, which is used to
|
|
set the NUMA node reported to the guest OS for that bus - the
|
|
guest OS will then know that all devices on that bus are a
|
|
part of the specified NUMA node (it is up to the user of the
|
|
libvirt API to attach host devices to the correct
|
|
pci-expander-bus when assigning them to the domain).
|
|
</dd>
|
|
<dt><code>index</code></dt>
|
|
<dd>
|
|
pci-root controllers for pSeries guests use this attribute to
|
|
record the order they will show up in the guest.
|
|
<span class="since">Since 3.6.0</span>
|
|
</dd>
|
|
</dl>
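<p>
  Although <code><target></code> is normally best left to
  libvirt, the following sketch shows roughly what a q35 guest with one
  expander bus might contain. The slot, chassis, port and NUMA node
  values are assumptions for the example, and a guest NUMA topology
  containing node 1 is assumed to be defined elsewhere in the domain.
  With no other expander buses present, the automatically chosen busNr
  would be 256 - 2 = 254, which is the value written out here.
</p>
<pre>
  ...
  <devices>
    <controller type='pci' index='0' model='pcie-root'/>
    <controller type='pci' index='1' model='pcie-expander-bus'>
      <!-- bus numbers 254 and above belong to this expander bus hierarchy -->
      <target busNr='254'>
        <node>1</node>
      </target>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </controller>
    <controller type='pci' index='2' model='pcie-root-port'>
      <!-- hotplug='off' disables hotplug/unplug on this port only -->
      <target chassis='2' port='0x0' hotplug='off'/>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
    </controller>
  </devices>
  ...</pre>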
|
|
<p>
|
|
For machine types which provide an implicit PCI bus, the pci-root
|
|
controller with index=0 is auto-added and required to use PCI devices.
|
|
pci-root has no address.
|
|
PCI bridges are auto-added if there are too many devices to fit on
|
|
the one bus provided by pci-root, or a PCI bus number greater than zero
|
|
was specified.
|
|
PCI bridges can also be specified manually, but their addresses should
|
|
only refer to PCI buses provided by already specified PCI controllers.
|
|
Leaving gaps in the PCI controller indexes might lead to an invalid
|
|
configuration.
|
|
</p>
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<controller type='pci' index='0' model='pci-root'/>
|
|
<controller type='pci' index='1' model='pci-bridge'>
|
|
<address type='pci' domain='0' bus='0' slot='5' function='0' multifunction='off'/>
|
|
</controller>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<p>
|
|
For machine types which provide an implicit PCI Express (PCIe)
|
|
bus (for example, the machine types based on the Q35 chipset),
|
|
the pcie-root controller with index=0 is auto-added to the
|
|
domain's configuration. pcie-root also has no address, and provides
|
|
31 slots (numbered 1-31) that can be used to attach PCIe or PCI
|
|
devices (although libvirt will never auto-assign a PCI device to
|
|
a PCIe slot, it will allow manual specification of such an
|
|
assignment). Devices connected to pcie-root cannot be
|
|
hotplugged. If traditional PCI devices are present in the guest
|
|
configuration, a <code>pcie-to-pci-bridge</code> controller will
|
|
automatically be added: this controller, which plugs into a
|
|
<code>pcie-root-port</code>, provides 31 usable PCI slots (1-31) with
|
|
hotplug support (<span class="since">since 4.3.0</span>). If the QEMU
|
|
binary doesn't support the corresponding device, then a
|
|
<code>dmi-to-pci-bridge</code> controller will be added instead,
|
|
usually at the de facto standard location of slot=0x1e. A
|
|
dmi-to-pci-bridge controller plugs into a PCIe slot (as provided
|
|
by pcie-root), and itself provides 31 standard PCI slots (which
|
|
also do not support device hotplug). In order to have
|
|
hot-pluggable PCI slots in the guest system, a pci-bridge
|
|
controller will also be automatically created and connected to
|
|
one of the slots of the auto-created dmi-to-pci-bridge
|
|
controller; all guest PCI devices with addresses that are
|
|
auto-determined by libvirt will be placed on this pci-bridge
|
|
device. (<span class="since">since 1.1.2</span>).
|
|
</p>
|
|
<p>
|
|
Domains with an implicit pcie-root can also add controllers
|
|
with <code>model='pcie-root-port'</code>,
|
|
<code>model='pcie-switch-upstream-port'</code>,
|
|
and <code>model='pcie-switch-downstream-port'</code>. pcie-root-port
|
|
is a simple type of bridge device that can connect only to one
|
|
of the 31 slots on the pcie-root bus on its upstream side, and
|
|
makes a single (PCIe, hotpluggable) port available on the
|
|
downstream side (at slot='0'). pcie-root-port can be used to
|
|
provide a single slot to later hotplug a PCIe device (but is not
|
|
itself hotpluggable - it must be in the configuration when the
|
|
domain is started).
|
|
(<span class="since">since 1.2.19</span>)
|
|
</p>
|
|
<p>
|
|
pcie-switch-upstream-port is a more flexible (but also more
|
|
complex) device that can only plug into a pcie-root-port or
|
|
pcie-switch-downstream-port on the upstream side (and only
|
|
before the domain is started - it is not hot-pluggable), and
|
|
provides 32 ports on the downstream side (slot='0' - slot='31')
|
|
that accept only pcie-switch-downstream-port devices; each
|
|
pcie-switch-downstream-port device can only plug into a
|
|
pcie-switch-upstream-port on its upstream side (again, not
|
|
hot-pluggable), and on its downstream side provides a single
|
|
hotpluggable pcie port that can accept any standard pci or pcie
|
|
device (or another pcie-switch-upstream-port), i.e. identical in
|
|
function to a pcie-root-port. (<span class="since">since
|
|
1.2.19</span>)
|
|
</p>
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<controller type='pci' index='0' model='pcie-root'/>
|
|
<controller type='pci' index='1' model='pcie-root-port'>
|
|
<address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'/>
|
|
</controller>
|
|
<controller type='pci' index='2' model='pcie-to-pci-bridge'>
|
|
<address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
|
|
</controller>
|
|
</devices>
|
|
...</pre>
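<p>
  The switch topology described above can be sketched in the same way.
  The index and slot values below are illustrative; the
  <code>bus</code> attribute of each address refers to the
  <code>index</code> of the controller the device is plugged into. Here
  one upstream port hangs off a root port and fans out into two
  hotpluggable downstream ports:
</p>
<pre>
  ...
  <devices>
    <controller type='pci' index='0' model='pcie-root'/>
    <controller type='pci' index='1' model='pcie-root-port'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </controller>
    <!-- plugs into the single slot (0) offered by the root port -->
    <controller type='pci' index='2' model='pcie-switch-upstream-port'>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
    </controller>
    <!-- each downstream port occupies one of the 32 upstream-port slots -->
    <controller type='pci' index='3' model='pcie-switch-downstream-port'>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
    </controller>
    <controller type='pci' index='4' model='pcie-switch-downstream-port'>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x01' function='0x0'/>
    </controller>
  </devices>
  ...</pre>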
|
|
|
|
<h4><a id="elementsLease">Device leases</a></h4>
|
|
|
|
<p>
|
|
When using a lock manager, it may be desirable to record device leases
|
|
against a VM. The lock manager will ensure the VM won't start unless
|
|
the leases can be acquired.
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
...
|
|
<lease>
|
|
<lockspace>somearea</lockspace>
|
|
<key>somekey</key>
|
|
<target path='/some/lease/path' offset='1024'/>
|
|
</lease>
|
|
...
|
|
</devices>
|
|
...</pre>
|
|
|
|
<dl>
|
|
<dt><code>lockspace</code></dt>
|
|
<dd>This is an arbitrary string, identifying the lockspace
|
|
within which the key is held. Lock managers may impose
|
|
extra restrictions on the format, or length of the lockspace
|
|
name.</dd>
|
|
<dt><code>key</code></dt>
|
|
<dd>This is an arbitrary string, uniquely identifying the
|
|
lease to be acquired. Lock managers may impose extra
|
|
restrictions on the format, or length of the key.
|
|
</dd>
|
|
<dt><code>target</code></dt>
|
|
<dd>This is the fully qualified path of the file associated
|
|
with the lockspace. The offset specifies where the lease
|
|
is stored within the file. If the lock manager does not
|
|
require an offset, just pass 0.
|
|
</dd>
|
|
</dl>
|
|
|
|
<h4><a id="elementsHostDev">Host device assignment</a></h4>
|
|
|
|
<h5><a id="elementsHostDevSubsys">USB / PCI / SCSI devices</a></h5>
|
|
|
|
<p>
|
|
USB, PCI and SCSI devices attached to the host can be passed through
|
|
to the guest using the <code>hostdev</code> element.
|
|
<span class="since">since after 0.4.4 for USB, 0.6.0 for PCI (KVM only)
|
|
and 1.0.6 for SCSI (KVM only)</span>:
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<hostdev mode='subsystem' type='usb'>
|
|
<source startupPolicy='optional'>
|
|
<vendor id='0x1234'/>
|
|
<product id='0xbeef'/>
|
|
</source>
|
|
<boot order='2'/>
|
|
</hostdev>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<p>or:</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<hostdev mode='subsystem' type='pci' managed='yes'>
|
|
<source>
|
|
<address domain='0x0000' bus='0x06' slot='0x02' function='0x0'/>
|
|
</source>
|
|
<boot order='1'/>
|
|
<rom bar='on' file='/etc/fake/boot.bin'/>
|
|
</hostdev>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<p>or:</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<hostdev mode='subsystem' type='scsi' sgio='filtered' rawio='yes'>
|
|
<source>
|
|
<adapter name='scsi_host0'/>
|
|
<address bus='0' target='0' unit='0'/>
|
|
</source>
|
|
<readonly/>
|
|
<address type='drive' controller='0' bus='0' target='0' unit='0'/>
|
|
</hostdev>
|
|
</devices>
|
|
...</pre>
|
|
|
|
|
|
<p>or:</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<hostdev mode='subsystem' type='scsi'>
|
|
<source protocol='iscsi' name='iqn.2014-08.com.example:iscsi-nopool/1'>
|
|
<host name='example.com' port='3260'/>
|
|
<auth username='myuser'>
|
|
<secret type='iscsi' usage='libvirtiscsi'/>
|
|
</auth>
|
|
</source>
|
|
<address type='drive' controller='0' bus='0' target='0' unit='0'/>
|
|
</hostdev>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<p>or:</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<hostdev mode='subsystem' type='scsi_host'>
|
|
<source protocol='vhost' wwpn='naa.50014057667280d8'/>
|
|
</hostdev>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<p>or:</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<hostdev mode='subsystem' type='mdev' model='vfio-pci'>
|
|
<source>
|
|
<address uuid='c2177883-f1bb-47f0-914d-32a22e3a8804'/>
|
|
</source>
|
|
</hostdev>
|
|
<hostdev mode='subsystem' type='mdev' model='vfio-ccw'>
|
|
<source>
|
|
<address uuid='9063cba3-ecef-47b6-abcf-3fef4fdcad85'/>
|
|
</source>
|
|
<address type='ccw' cssid='0xfe' ssid='0x0' devno='0x0001'/>
|
|
</hostdev>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<dl>
|
|
<dt><code>hostdev</code></dt>
|
|
<dd>The <code>hostdev</code> element is the main container for describing
|
|
host devices. For each device, the <code>mode</code> is always
|
|
"subsystem" and the <code>type</code> is one of the following values
|
|
with additional attributes noted.
|
|
<dl>
|
|
<dt><code>usb</code></dt>
|
|
<dd>USB devices are detached from the host on guest startup
|
|
and reattached after the guest exits or the device is
|
|
hot-unplugged.
|
|
</dd>
|
|
<dt><code>pci</code></dt>
|
|
<dd>For PCI devices, when <code>managed</code> is "yes" it is
|
|
detached from the host before being passed on to the guest
|
|
and reattached to the host after the guest exits. If
|
|
<code>managed</code> is omitted or "no", the user is
|
|
responsible for calling <code>virNodeDeviceDetachFlags</code>
|
|
(or <code>virsh nodedev-detach</code>) before starting the guest
|
|
or hot-plugging the device and <code>virNodeDeviceReAttach</code>
|
|
(or <code>virsh nodedev-reattach</code>) after hot-unplug or
|
|
stopping the guest.
|
|
</dd>
|
|
<dt><code>scsi</code></dt>
|
|
<dd>For SCSI devices, the user is responsible for making sure the device
|
|
is not used by the host. If supported by the hypervisor and OS, the
|
|
optional <code>sgio</code> (<span class="since">since 1.0.6</span>)
|
|
attribute indicates whether unprivileged SG_IO commands are
|
|
filtered for the disk. Valid settings are "filtered" or
|
|
"unfiltered", where the default is "filtered".
|
|
The optional <code>rawio</code>
|
|
(<span class="since">since 1.2.9</span>) attribute indicates
|
|
whether the lun needs the rawio capability. Valid settings are
|
|
"yes" or "no". See the rawio description within the
|
|
<a href="#elementsDisks">disk</a> section.
|
|
If a disk lun in the domain already has the rawio capability,
|
|
then this setting is not required.
|
|
</dd>
|
|
<dt><code>scsi_host</code></dt>
|
|
<dd><span class="since">Since 2.5.0</span>. For SCSI devices, the user
|
|
is responsible for making sure the device is not used by the host. This
|
|
<code>type</code> passes all LUNs presented by a single HBA to
|
|
the guest. <span class="since">Since 5.2.0,</span> the
|
|
<code>model</code> attribute can be specified further
|
|
with "virtio-transitional", "virtio-non-transitional", or
|
|
"virtio". See
|
|
<a href="#elementsVirtioTransitional">Virtio transitional devices</a>
|
|
for more details.
|
|
</dd>
|
|
<dt><code>mdev</code></dt>
|
|
<dd>For mediated devices (<span class="since">Since 3.2.0</span>)
|
|
the <code>model</code> attribute specifies the device API which
|
|
determines how the host's vfio driver will expose the device to the
|
|
guest. Currently, <code>model='vfio-pci'</code>,
|
|
<code>model='vfio-ccw'</code> (<span class="since">Since 4.4.0</span>)
|
|
and <code>model='vfio-ap'</code> (<span class="since">Since 4.9.0</span>)
|
|
are supported. The <a href="drvnodedev.html#MDEV">MDEV</a> section
|
|
provides more information about mediated devices as well as how to
|
|
create mediated devices on the host.
|
|
<span class="since">Since 4.6.0 (QEMU 2.12)</span> an optional
|
|
<code>display</code> attribute may be used to enable or disable
|
|
support for an accelerated remote desktop backed by a mediated
|
|
device (such as NVIDIA vGPU or Intel GVT-g) as an alternative to
|
|
emulated <a href="#elementsVideo">video devices</a>. This attribute
|
|
is limited to <code>model='vfio-pci'</code> only. Supported values
|
|
are either <code>on</code> or <code>off</code> (default is 'off').
|
|
It is required to use a
|
|
<a href="#elementsGraphics">graphical framebuffer</a> in order to
|
|
use this attribute, currently only supported with VNC, Spice and
|
|
egl-headless graphics devices.
|
|
|
|
<span class="since">Since version 5.10.0</span>, there is an optional
|
|
<code>ramfb</code> attribute for devices with
|
|
<code>model='vfio-pci'</code>. Supported values are either
|
|
<code>on</code> or <code>off</code> (default is 'off'). When
|
|
enabled, this attribute provides a memory framebuffer device to the
|
|
guest. This framebuffer will be used as a boot display when a vgpu
|
|
device is the primary display.
|
|
<p>
|
|
Note: There are also some implications on the usage of guest's
|
|
address type depending on the <code>model</code> attribute,
|
|
see the <code>address</code> element below.
|
|
</p>
|
|
</dd>
|
|
</dl>
|
|
<p>
|
|
Note: The <code>managed</code> attribute is only used with
|
|
<code>type='pci'</code> and is ignored by all the other device types,
|
|
thus setting <code>managed</code> explicitly with other than a PCI
|
|
device has the same effect as omitting it. Similarly,
|
|
<code>model</code> attribute is only supported by mediated devices and
|
|
ignored by all other device types.
|
|
</p>
|
|
</dd>
|
|
<dt><code>source</code></dt>
|
|
<dd>The source element describes the device as seen from the host using
|
|
one of the following mechanisms, depending on the device type:
|
|
<dl>
|
|
<dt><code>usb</code></dt>
|
|
<dd>The USB device can either be addressed by vendor / product id
|
|
using the <code>vendor</code> and <code>product</code> elements
|
|
or by the device's address on the host using the
|
|
<code>address</code> element.
|
|
<p>
|
|
<span class="since">Since 1.0.0</span>, the <code>source</code>
|
|
element of USB devices may contain <code>startupPolicy</code>
|
|
attribute, which defines what to do if the
|
|
specified host USB device is not found. The attribute accepts
|
|
the following values:
|
|
</p>
|
|
<table class="top_table">
|
|
<tr>
|
|
<td> mandatory </td>
|
|
<td> fail if missing for any reason (the default) </td>
|
|
</tr>
|
|
<tr>
|
|
<td> requisite </td>
|
|
<td> fail if missing on boot up,
|
|
drop if missing on migrate/restore/revert </td>
|
|
</tr>
|
|
<tr>
|
|
<td> optional </td>
|
|
<td> drop if missing at any start attempt </td>
|
|
</tr>
|
|
</table>
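<p>
For illustration, a USB device matched by vendor and product id that
is dropped from the guest when it is absent at startup could be
described roughly as follows (the ids are placeholders, not a real
device):
</p>
<pre>
...
<hostdev mode='subsystem' type='usb'>
  <source startupPolicy='optional'>
    <vendor id='0x1234'/>
    <product id='0xbeef'/>
  </source>
</hostdev>
...</pre>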
|
|
</dd>
|
|
<dt><code>pci</code></dt>
|
|
<dd>PCI devices can only be described by their <code>address</code>.
|
|
</dd>
|
|
<dt><code>scsi</code></dt>
|
|
<dd>SCSI devices are described by both the <code>adapter</code>
|
|
and <code>address</code> elements. The <code>address</code>
|
|
element includes a <code>bus</code> attribute (a 2-digit bus
|
|
number), a <code>target</code> attribute (a 10-digit target
|
|
number), and a <code>unit</code> attribute (a 20-digit unit
|
|
number on the bus). Not all hypervisors support larger
|
|
<code>target</code> and <code>unit</code> values. It is up
|
|
to each hypervisor to determine the maximum value supported
|
|
for the adapter.
|
|
<p>
|
|
<span class="since">Since 1.2.8</span>, the <code>source</code>
|
|
element of a SCSI device may contain the <code>protocol</code>
|
|
attribute. When the attribute is set to "iscsi", the host
|
|
device XML follows the network <a href="#elementsDisks">disk</a>
|
|
device using the same <code>name</code> attribute and optionally
|
|
using the <code>auth</code> element to provide the authentication
|
|
credentials to the iSCSI server.
|
|
</p>
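<p>
A minimal sketch of the iSCSI variant, assuming a target reachable at
<code>example.com</code> and a libvirt secret with the usage name
<code>libvirtiscsi</code> (the IQN, host, user name and secret are all
placeholders):
</p>
<pre>
...
<hostdev mode='subsystem' type='scsi'>
  <source protocol='iscsi' name='iqn.2014-08.com.example:iscsi-nopool/1'>
    <host name='example.com' port='3260'/>
    <auth username='myuser'>
      <secret type='iscsi' usage='libvirtiscsi'/>
    </auth>
  </source>
  <address type='drive' controller='0' bus='0' target='0' unit='0'/>
</hostdev>
...</pre>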
|
|
</dd>
|
|
<dt><code>scsi_host</code></dt>
|
|
<dd><span class="since">Since 2.5.0</span>, multiple LUNs behind a
|
|
single SCSI HBA are described by a <code>protocol</code>
|
|
attribute set to "vhost" and a <code>wwpn</code> attribute that
|
|
is the vhost_scsi wwpn (16 hexadecimal digits with a prefix of
|
|
"naa.") established in the host configfs.
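<p>
As an illustration, a vhost-scsi host device might look like this;
the <code>wwpn</code> value is a placeholder and has to match a target
that already exists in the host configfs:
</p>
<pre>
...
<hostdev mode='subsystem' type='scsi_host'>
  <source protocol='vhost' wwpn='naa.50014057667280d8'/>
</hostdev>
...</pre>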
|
|
</dd>
|
|
<dt><code>mdev</code></dt>
|
|
<dd>Mediated devices (<span class="since">Since 3.2.0</span>) are
|
|
described by the <code>address</code> element. The
|
|
<code>address</code> element contains a single mandatory attribute
|
|
<code>uuid</code>.
|
|
</dd>
|
|
</dl>
|
|
</dd>
|
|
<dt><code>vendor</code>, <code>product</code></dt>
|
|
<dd>The <code>vendor</code> and <code>product</code> elements each have an
|
|
<code>id</code> attribute that specifies the USB vendor and product id.
|
|
The ids can be given in decimal, hexadecimal (starting with 0x) or
|
|
octal (starting with 0) form.</dd>
|
|
<dt><code>boot</code></dt>
|
|
<dd>Specifies that the device is bootable. The <code>order</code>
|
|
attribute determines the order in which devices will be tried during
|
|
the boot sequence. The per-device <code>boot</code> elements cannot be
|
|
used together with the general boot elements in the
|
|
<a href="#elementsOSBIOS">BIOS bootloader</a> section.
|
|
<span class="since">Since 0.8.8</span> for PCI devices,
|
|
<span class="since">Since 1.0.1</span> for USB devices.
|
|
</dd>
|
|
<dt><code>rom</code></dt>
|
|
<dd>The <code>rom</code> element is used to change how a PCI
|
|
device's ROM is presented to the guest. The optional <code>bar</code>
|
|
attribute can be set to "on" or "off", and determines whether
|
|
or not the device's ROM will be visible in the guest's memory
|
|
map. (In PCI documentation, the "rombar" setting controls the
|
|
presence of the Base Address Register for the ROM). If no rom
|
|
bar is specified, the QEMU default will be used (older
|
|
versions of QEMU used a default of "off", while newer versions
|
|
have a default of "on"). <span class="since">Since
|
|
0.9.7 (QEMU and KVM only)</span>. The optional
|
|
<code>file</code> attribute contains an absolute path to a binary file
|
|
to be presented to the guest as the device's ROM BIOS. This
|
|
can be useful, for example, to provide a PXE boot ROM for a
|
|
virtual function of an SR-IOV capable Ethernet device (which
|
|
has no boot ROMs for the VFs).
|
|
<span class="since">Since 0.9.10 (QEMU and KVM only)</span>.
|
|
The optional <code>enabled</code> attribute can be set to
|
|
<code>no</code> to disable PCI ROM loading completely for the device;
|
|
if PCI ROM loading is disabled through this attribute, attempts to
|
|
tweak the loading process further using the <code>bar</code> or
|
|
<code>file</code> attributes will be rejected.
|
|
<span class="since">Since 4.3.0 (QEMU and KVM only)</span>.
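<p>
The attributes above can be combined roughly as in the following
sketch for an assigned PCI device. The PCI address and the ROM file
path are placeholders; to disable ROM loading entirely, one would
instead use <code>enabled='no'</code> with no other attributes:
</p>
<pre>
...
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0x06' slot='0x02' function='0x0'/>
  </source>
  <rom bar='on' file='/etc/fake/boot.bin'/>
</hostdev>
...</pre>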
|
|
</dd>
|
|
<dt><code>address</code></dt>
|
|
<dd>The <code>address</code> element for USB devices has a
|
|
<code>bus</code> and <code>device</code> attribute to specify the
|
|
USB bus and device number the device appears at on the host.
|
|
The values of these attributes can be given in decimal, hexadecimal
|
|
(starting with 0x) or octal (starting with 0) form.
|
|
For PCI devices the element carries 4 attributes that identify
the device as reported by <code>lspci</code> or
by <code>virsh nodedev-list</code>. For SCSI devices a 'drive'
|
|
address type must be used. For mediated devices, which are software-only
|
|
devices defining an allocation of resources on the physical parent device,
|
|
the address type used must conform to the <code>model</code> attribute
|
|
of element <code>hostdev</code>, e.g. any address type other than PCI for
|
|
<code>vfio-pci</code> device API or any address type other than CCW for
|
|
<code>vfio-ccw</code> device API will result in an error.
|
|
<a href="#elementsAddress">See above</a> for more details on the address
|
|
element.</dd>
|
|
<dt><code>driver</code></dt>
|
|
<dd>
|
|
PCI devices can have an optional <code>driver</code>
|
|
subelement that specifies which backend driver to use for PCI
|
|
device assignment. Use the <code>name</code> attribute to
|
|
select either "vfio" (for the new VFIO device assignment
|
|
backend, which is compatible with UEFI SecureBoot) or "kvm"
|
|
(the legacy device assignment handled directly by the KVM
|
|
kernel module). <span class="since">Since 1.0.5 (QEMU and KVM
|
|
only, requires kernel 3.6 or newer)</span>. When specified,
|
|
device assignment will fail if the requested method of device
|
|
assignment isn't available on the host. When not specified,
|
|
the default is "vfio" on systems where the VFIO driver is
|
|
available and loaded, and "kvm" on older systems, or those
|
|
where the VFIO driver hasn't been
|
|
loaded. <span class="since">Since 1.1.3</span> (prior to that
|
|
the default was always "kvm").
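<p>
For example, explicitly requesting the VFIO backend for an assigned
PCI device could be sketched as follows (the PCI address is a
placeholder); with this setting, assignment is expected to fail
rather than fall back if VFIO is unavailable on the host:
</p>
<pre>
...
<hostdev mode='subsystem' type='pci' managed='yes'>
  <driver name='vfio'/>
  <source>
    <address domain='0x0000' bus='0x06' slot='0x02' function='0x0'/>
  </source>
</hostdev>
...</pre>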
|
|
</dd>
|
|
<dt><code>readonly</code></dt>
|
|
<dd>Indicates that the device is read-only. Currently this is supported only by SCSI host
devices. <span class="since">Since 1.0.6 (QEMU and KVM only)</span>
|
|
</dd>
|
|
<dt><code>shareable</code></dt>
|
|
<dd>If present, this indicates the device is expected to be shared
|
|
between domains (assuming the hypervisor and OS support this).
|
|
Only supported by SCSI host device.
|
|
<span class="since">Since 1.0.6</span>
|
|
<p>
|
|
Note: Although <code>shareable</code> was introduced
|
|
<span class="since">in 1.0.6</span>, it did not work as
|
|
expected until <span class="since">1.2.2</span>.
|
|
</p>
|
|
</dd>
|
|
</dl>
|
|
|
|
|
|
<h5><a id="elementsHostDevCaps">Block / character devices</a></h5>
|
|
|
|
<p>
|
|
Block / character devices from the host can be passed through
|
|
to the guest using the <code>hostdev</code> element. This is
|
|
only possible with container based virtualization. Devices are specified
|
|
by a fully qualified path.
|
|
<span class="since">since after 1.0.1 for LXC</span>:
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<hostdev mode='capabilities' type='storage'>
|
|
<source>
|
|
<block>/dev/sdf1</block>
|
|
</source>
|
|
</hostdev>
|
|
...
|
|
</pre>
|
|
|
|
<pre>
|
|
...
|
|
<hostdev mode='capabilities' type='misc'>
|
|
<source>
|
|
<char>/dev/input/event3</char>
|
|
</source>
|
|
</hostdev>
|
|
...
|
|
</pre>
|
|
|
|
<pre>
|
|
...
|
|
<hostdev mode='capabilities' type='net'>
|
|
<source>
|
|
<interface>eth0</interface>
|
|
</source>
|
|
</hostdev>
|
|
...
|
|
</pre>
|
|
|
|
<dl>
|
|
<dt><code>hostdev</code></dt>
|
|
<dd>The <code>hostdev</code> element is the main container for describing
|
|
host devices. For block/character device passthrough <code>mode</code> is
|
|
always "capabilities" and <code>type</code> is "storage" for a block
|
|
device, "misc" for a character device and "net" for a host network
|
|
interface.
|
|
</dd>
|
|
<dt><code>source</code></dt>
|
|
<dd>The source element describes the device as seen from the host.
|
|
For block devices, the path to the block device in the host
|
|
OS is provided in the nested "block" element, while for character
|
|
devices the "char" element is used. For network interfaces, the
|
|
name of the interface is provided in the "interface" element.
|
|
</dd>
|
|
</dl>
|
|
|
|
<h4><a id="elementsRedir">Redirected devices</a></h4>
|
|
|
|
<p>
|
|
USB device redirection through a character device is
|
|
supported <span class="since">since after 0.9.5 (KVM
|
|
only)</span>:
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<redirdev bus='usb' type='tcp'>
|
|
<source mode='connect' host='localhost' service='4000'/>
|
|
<boot order='1'/>
|
|
</redirdev>
|
|
<redirfilter>
|
|
<usbdev class='0x08' vendor='0x1234' product='0xbeef' version='2.56' allow='yes'/>
|
|
<usbdev allow='no'/>
|
|
</redirfilter>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<dl>
|
|
<dt><code>redirdev</code></dt>
|
|
<dd>The <code>redirdev</code> element is the main container for
|
|
describing redirected devices. <code>bus</code> must be "usb"
|
|
for a USB device.
|
|
|
|
An additional attribute <code>type</code> is required,
|
|
matching one of the
|
|
supported <a href="#elementsConsole">serial device</a> types,
|
|
to describe the host side of the
|
|
tunnel; <code>type='tcp'</code>
|
|
or <code>type='spicevmc'</code> (which uses the usbredir
|
|
channel of a <a href="#elementsGraphics">SPICE graphics
|
|
device</a>) are typical. The redirdev element has an optional
|
|
sub-element <code><address></code> which can tie the
|
|
device to a particular controller. Further sub-elements,
|
|
such as <code><source></code>, may be required according
|
|
to the given type, although a <code><target></code> sub-element
|
|
is not required (since the consumer of the character device is
|
|
the hypervisor itself, rather than a device visible in the guest).
|
|
</dd>
|
|
<dt><code>boot</code></dt>
|
|
|
|
<dd>Specifies that the device is bootable.
|
|
The <code>order</code> attribute determines the order in which
|
|
devices will be tried during the boot sequence. The per-device
|
|
<code>boot</code> elements cannot be used together with general
|
|
boot elements in the <a href="#elementsOSBIOS">BIOS bootloader</a> section.
|
|
(<span class="since">Since 1.0.1</span>)
|
|
</dd>
|
|
<dt><code>redirfilter</code></dt>
|
|
<dd>The <code>redirfilter</code> element is used for creating the
|
|
filter rule to filter out certain devices from redirection.
|
|
It uses sub-element <code><usbdev></code> to define each filter rule.
|
|
The <code>class</code> attribute is the USB Class code, for example,
|
|
0x08 represents mass storage devices. The USB device can be addressed by
|
|
vendor / product id using the <code>vendor</code> and <code>product</code> attributes.
|
|
<code>version</code> is the device revision from the bcdDevice field (not
|
|
the version of the USB protocol).
|
|
These four attributes are optional and <code>-1</code> can be used to allow
|
|
any value for them. The <code>allow</code> attribute is mandatory:
|
|
'yes' means allow and 'no' means deny.
|
|
</dd>
|
|
</dl>
|
|
|
|
<h4><a id="elementsSmartcard">Smartcard devices</a></h4>
|
|
|
|
<p>
|
|
A virtual smartcard device can be supplied to the guest via the
|
|
<code>smartcard</code> element. A USB smartcard reader device on
|
|
the host cannot be used on a guest with simple device
|
|
passthrough, since it will then not be available on the host,
|
|
possibly locking the host computer when it is "removed".
|
|
Therefore, some hypervisors provide a specialized virtual device
|
|
that can present a smartcard interface to the guest, with
|
|
several modes for describing how credentials are obtained from
|
|
the host or even from a channel created to a third-party
|
|
smartcard provider. <span class="since">Since 0.8.8</span>
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<smartcard mode='host'/>
|
|
<smartcard mode='host-certificates'>
|
|
<certificate>cert1</certificate>
|
|
<certificate>cert2</certificate>
|
|
<certificate>cert3</certificate>
|
|
<database>/etc/pki/nssdb/</database>
|
|
</smartcard>
|
|
<smartcard mode='passthrough' type='tcp'>
|
|
<source mode='bind' host='127.0.0.1' service='2001'/>
|
|
<protocol type='raw'/>
|
|
<address type='ccid' controller='0' slot='0'/>
|
|
</smartcard>
|
|
<smartcard mode='passthrough' type='spicevmc'/>
|
|
</devices>
|
|
...
|
|
</pre>
|
|
|
|
<p>
|
|
The <code><smartcard></code> element has a mandatory
|
|
attribute <code>mode</code>. The following modes are supported;
|
|
in each mode, the guest sees a device on its USB bus that
|
|
behaves like a physical USB CCID (Chip/Smart Card Interface
|
|
Device) card.
|
|
</p>
|
|
|
|
<dl>
|
|
<dt><code>host</code></dt>
|
|
<dd>The simplest operation, where the hypervisor relays all
|
|
requests from the guest into direct access to the host's
|
|
smartcard via NSS. No other attributes or sub-elements are
|
|
required. See below about the use of an
|
|
optional <code><address></code> sub-element.</dd>
|
|
|
|
<dt><code>host-certificates</code></dt>
|
|
<dd>Rather than requiring a smartcard to be plugged into the
|
|
host, it is possible to provide three NSS certificate names
|
|
residing in a database on the host. These certificates can be
|
|
generated via the command <code>certutil -d /etc/pki/nssdb -x -t
|
|
CT,CT,CT -S -s CN=cert1 -n cert1</code>, and the resulting three
|
|
certificate names must be supplied as the content of each of
|
|
three <code><certificate></code> sub-elements. An
|
|
additional sub-element <code><database></code> can specify
|
|
the absolute path to an alternate directory (matching
|
|
the <code>-d</code> option of the <code>certutil</code> command
|
|
when creating the certificates); if not present, it defaults to
|
|
/etc/pki/nssdb.</dd>
|
|
|
|
<dt><code>passthrough</code></dt>
|
|
<dd>Rather than having the hypervisor directly communicate with
|
|
the host, it is possible to tunnel all requests through a
|
|
secondary character device to a third-party provider (which may
|
|
in turn be talking to a smartcard or using three certificate
|
|
files). In this mode of operation, an additional
|
|
attribute <code>type</code> is required, matching one of the
|
|
supported <a href="#elementsConsole">serial device</a> types, to
|
|
describe the host side of the tunnel; <code>type='tcp'</code>
|
|
or <code>type='spicevmc'</code> (which uses the smartcard
|
|
channel of a <a href="#elementsGraphics">SPICE graphics
|
|
device</a>) are typical. Further sub-elements, such
|
|
as <code><source></code>, may be required according to the
|
|
given type, although a <code><target></code> sub-element
|
|
is not required (since the consumer of the character device is
|
|
the hypervisor itself, rather than a device visible in the
|
|
guest).</dd>
|
|
</dl>
|
|
|
|
<p>
|
|
Each mode supports an optional
|
|
sub-element <code><address></code>, which fine-tunes the
|
|
correlation between the smartcard and a ccid bus
|
|
controller, <a href="#elementsAddress">documented above</a>.
|
|
For now, QEMU only supports at most one
|
|
smartcard, with an address of bus=0 slot=0.
|
|
</p>
|
|
|
|
<h4><a id="elementsNICS">Network interfaces</a></h4>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<interface type='direct' trustGuestRxFilters='yes'>
|
|
<source dev='eth0'/>
|
|
<mac address='52:54:00:5d:c7:9e'/>
|
|
<boot order='1'/>
|
|
<rom bar='off'/>
|
|
</interface>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<p>
|
|
There are several possibilities for specifying a network
|
|
interface visible to the guest. Each subsection below provides
|
|
more details about common setup options.
|
|
</p>
|
|
<p>
|
|
<span class="since">Since 1.2.10</span>,
|
|
the <code>interface</code> element
|
|
property <code>trustGuestRxFilters</code> provides the
|
|
capability for the host to detect and trust reports from the
|
|
guest regarding changes to the interface mac address and receive
|
|
filters by setting the attribute to <code>yes</code>. The default
|
|
setting for the attribute is <code>no</code> for security
|
|
reasons and support depends on the guest network device model as
|
|
well as the type of connection on the host - currently it is
|
|
only supported for the virtio device model and for macvtap
|
|
connections on the host.
|
|
</p>
|
|
<p>
|
|
Each <code><interface></code> element has an
|
|
optional <code><address></code> sub-element that can tie
|
|
the interface to a particular pci slot, with
|
|
attribute <code>type='pci'</code>
|
|
as <a href="#elementsAddress">documented above</a>.
|
|
</p>
|
|
<p>
|
|
<span class="since">Since 6.6.0</span>, one can force libvirt to keep the
|
|
provided MAC address when it's in the reserved VMware range by adding a
|
|
<code>type="static"</code> attribute to the <code><mac/></code> element.
|
|
Note that this attribute has no effect if the provided MAC address is outside of
|
|
the reserved VMware ranges.
|
|
</p>
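<p>
As a hedged illustration, the following interface keeps a manually
chosen MAC address. The address shown is an assumed example from the
<code>00:50:56</code> prefix reserved by VMware, and the bridge name
is a placeholder:
</p>
<pre>
...
<devices>
  <interface type='bridge'>
    <mac address='00:50:56:12:34:56' type='static'/>
    <source bridge='br0'/>
  </interface>
</devices>
...</pre>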
|
|
|
|
<h5><a id="elementsNICSVirtual">Virtual network</a></h5>
|
|
|
|
<p>
|
|
<strong><em>
|
|
This is the recommended config for general guest connectivity on
|
|
hosts with dynamic / wireless networking configs (or multi-host
|
|
environments where the host hardware details are described
|
|
separately in a <code><network></code>
|
|
definition <span class="since">Since 0.9.4</span>).
|
|
</em></strong>
|
|
</p>
|
|
|
|
<p>
|
|
|
|
Provides a connection whose details are described by the named
|
|
network definition. Depending on the virtual network's "forward
|
|
mode" configuration, the network may be totally isolated
|
|
(no <code><forward></code> element given), NAT'ing to an
|
|
explicit network device or to the default route
|
|
(<code><forward mode='nat'></code>), routed with no NAT
|
|
(<code><forward mode='route'/></code>), or connected
|
|
directly to one of the host's network interfaces (via macvtap)
|
|
or bridge devices (<code><forward
|
|
mode='bridge|private|vepa|passthrough'/></code> <span class="since">Since
|
|
0.9.4</span>).
|
|
</p>
|
|
<p>
|
|
For networks with a forward mode of bridge, private, vepa, and
|
|
passthrough, it is assumed that the host has any necessary DNS
|
|
and DHCP services already set up outside the scope of libvirt. In
|
|
the case of isolated, nat, and routed networks, DHCP and DNS are
|
|
provided on the virtual network by libvirt, and the IP range can
|
|
be determined by examining the virtual network config with
|
|
'<code>virsh net-dumpxml [networkname]</code>'. There is one
|
|
virtual network called 'default' set up out of the box which does
|
|
NAT'ing to the default route and has an IP range
|
|
of <code>192.168.122.0/255.255.255.0</code>. Each guest will
|
|
have an associated tun device created with a name of vnetN,
|
|
which can also be overridden with the <code><target></code> element
|
|
(see
|
|
<a href="#elementsNICSTargetOverride">overriding the target element</a>).
|
|
</p>
|
|
<p>
|
|
When the source of an interface is a network,
|
|
a <code>portgroup</code> can be specified along with the name of
|
|
the network; one network may have multiple portgroups defined,
|
|
with each portgroup containing slightly different configuration
|
|
information for different classes of network
|
|
connections. <span class="since">Since 0.9.4</span>.
|
|
</p>
|
|
<p>
|
|
When a guest is running, an interface of type <code>network</code>
|
|
may include a <code>portid</code> attribute. This provides the UUID
|
|
of an associated virNetworkPortPtr object that records the association
|
|
between the domain interface and the network. This attribute is
|
|
read-only since port objects are created and deleted automatically
|
|
during startup and shutdown. <span class="since">Since 5.1.0</span>
|
|
</p>
|
|
<p>
|
|
Also, similar to <code>direct</code> network connections
|
|
(described below), a connection of type <code>network</code> may
|
|
specify a <code>virtualport</code> element, with configuration
|
|
data to be forwarded to a vepa (802.1Qbg) or 802.1Qbh compliant
|
|
switch (<span class="since">Since 0.8.2</span>), or to an
|
|
Open vSwitch virtual switch (<span class="since">Since
|
|
0.9.11</span>).
|
|
</p>
|
|
<p>
|
|
Since the actual type of switch may vary depending on the
|
|
configuration in the <code><network></code> on the host,
|
|
it is acceptable to omit the virtualport <code>type</code>
|
|
attribute, and specify attributes from multiple different
|
|
virtualport types (and also to leave out certain attributes); at
|
|
domain startup time, a complete <code><virtualport></code>
|
|
element will be constructed by merging together the type and
|
|
attributes defined in the network and the portgroup referenced
|
|
by the interface. The newly constructed virtualport is a combination
of them. Attributes from a lower-priority virtualport cannot override
those defined in a higher-priority virtualport: the interface has the
highest priority and the portgroup the lowest
|
|
(<span class="since">Since 0.10.0</span>). For example, in order
|
|
to work properly with both an 802.1Qbh switch and an Open vSwitch
|
|
switch, you may choose to specify no type, but both
|
|
a <code>profileid</code> (in case the switch is 802.1Qbh) and
|
|
an <code>interfaceid</code> (in case the switch is Open vSwitch)
|
|
(you may also omit the other attributes, such as managerid,
|
|
typeid, or profileid, to be filled in from the
|
|
network's <code><virtualport></code>). If you want to
|
|
limit a guest to connecting only to certain types of switches,
|
|
you can specify the virtualport type, but still omit some/all of
|
|
the parameters - in this case if the host's network has a
|
|
different type of virtualport, connection of the interface will
|
|
fail.
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<interface type='network'>
|
|
<source network='default'/>
|
|
</interface>
|
|
...
|
|
<interface type='network'>
|
|
<source network='default' portgroup='engineering'/>
|
|
<target dev='vnet7'/>
|
|
<mac address="00:11:22:33:44:55"/>
|
|
<virtualport>
|
|
<parameters instanceid='09b11c53-8b5c-4eeb-8f00-d84eaa0aaa4f'/>
|
|
</virtualport>
|
|
</interface>
|
|
</devices>
|
|
...</pre>
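
<p>
The case described earlier, where the interface should work with
either an 802.1Qbh switch or an Open vSwitch switch, can be sketched
by omitting the virtualport <code>type</code> and supplying both ids.
The network name, profile name and UUID below are placeholders:
</p>
<pre>
...
<devices>
  <interface type='network'>
    <source network='engnet' portgroup='engineering'/>
    <virtualport>
      <parameters profileid='finance' interfaceid='09b11c53-8b5c-4eeb-8f00-d84eaa0aaa4f'/>
    </virtualport>
  </interface>
</devices>
...</pre>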
|
|
|
|
<h5><a id="elementsNICSBridge">Bridge to LAN</a></h5>
|
|
|
|
<p>
|
|
<strong><em>
|
|
This is the recommended config for general guest connectivity on
|
|
hosts with static wired networking configs.
|
|
</em></strong>
|
|
</p>
|
|
|
|
<p>
|
|
Provides a bridge from the VM directly to the LAN. This assumes
|
|
there is a bridge device on the host which has one or more of the host's
|
|
physical NICs attached. The guest VM will have an associated tun device
|
|
created with a name of vnetN, which can also be overridden with the
|
|
<code><target></code> element (see
|
|
<a href="#elementsNICSTargetOverride">overriding the target element</a>).
|
|
The tun device will be attached to the bridge. The IP range / network
|
|
configuration is whatever is used on the LAN. This provides the guest VM
|
|
full incoming & outgoing net access just like a physical machine.
|
|
</p>
|
|
<p>
|
|
On Linux systems, the bridge device is normally a standard Linux
|
|
host bridge. On hosts that support Open vSwitch, it is also
|
|
possible to connect to an Open vSwitch bridge device by adding
|
|
a <code><virtualport type='openvswitch'/></code> to the
|
|
interface definition. (<span class="since">Since
|
|
0.9.11</span>). The Open vSwitch type virtualport accepts two
|
|
parameters in its <code><parameters></code> element -
|
|
an <code>interfaceid</code> which is a standard uuid used to
|
|
uniquely identify this particular interface to Open vSwitch (if
|
|
you do not specify one, a random interfaceid will be generated
|
|
for you when you first define the interface), and an
|
|
optional <code>profileid</code> which is sent to Open vSwitch as
|
|
the interface's "port-profile".
|
|
</p>
|
|
<pre>
|
|
...
|
|
<devices>
|
|
...
|
|
<interface type='bridge'>
|
|
<source bridge='br0'/>
|
|
</interface>
|
|
<interface type='bridge'>
|
|
<source bridge='br1'/>
|
|
<target dev='vnet7'/>
|
|
<mac address="00:11:22:33:44:55"/>
|
|
</interface>
|
|
<interface type='bridge'>
|
|
<source bridge='ovsbr'/>
|
|
<virtualport type='openvswitch'>
|
|
<parameters profileid='menial' interfaceid='09b11c53-8b5c-4eeb-8f00-d84eaa0aaa4f'/>
|
|
</virtualport>
|
|
</interface>
|
|
...
|
|
</devices>
|
|
...</pre>
|
|
|
|
<p>
|
|
On hosts that support Open vSwitch on the kernel side and have the
|
|
Midonet Host Agent configured, it is also possible to connect to the
|
|
'midonet' bridge device by adding a
|
|
<code><virtualport type='midonet'/></code> to the
|
|
interface definition. (<span class="since">Since
|
|
1.2.13</span>). The Midonet virtualport type requires an
|
|
<code>interfaceid</code> attribute in its
|
|
<code><parameters></code> element. This interface id is the UUID
|
|
that specifies which port in the virtual network topology will be bound
|
|
to the interface.
|
|
</p>
|
|
<pre>
|
|
...
|
|
<devices>
|
|
...
|
|
<interface type='bridge'>
|
|
<source bridge='br0'/>
|
|
</interface>
|
|
<interface type='bridge'>
|
|
<source bridge='br1'/>
|
|
<target dev='vnet7'/>
|
|
<mac address="00:11:22:33:44:55"/>
|
|
</interface>
|
|
<interface type='bridge'>
|
|
<source bridge='midonet'/>
|
|
<virtualport type='midonet'>
|
|
<parameters interfaceid='0b2d64da-3d0e-431e-afdd-804415d6ebbb'/>
|
|
</virtualport>
|
|
</interface>
|
|
...
|
|
</devices>
|
|
...</pre>
|
|
|
|
<h5><a id="elementsNICSSlirp">Userspace SLIRP stack</a></h5>
|
|
|
|
<p>
|
|
Provides a virtual LAN with NAT to the outside world. The virtual
|
|
network has DHCP & DNS services and will give the guest VM addresses
|
|
starting from <code>10.0.2.15</code>. The default router will be
|
|
<code>10.0.2.2</code> and the DNS server will be <code>10.0.2.3</code>.
|
|
This networking is the only option for unprivileged users who need their
|
|
VMs to have outgoing access. <span class="since">Since 3.8.0</span>
|
|
it is possible to override the default network address by
|
|
including an <code>ip</code> element specifying an IPv4
|
|
address in its one mandatory attribute, <code>address</code>.
|
|
Optionally, a second <code>ip</code> element with a
|
|
<code>family</code> attribute set to "ipv6" can be
|
|
specified to add an IPv6 address to the interface.
|
|
Optionally, an address
|
|
<code>prefix</code> can be specified.
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<interface type='user'/>
|
|
...
|
|
<interface type='user'>
|
|
<mac address="00:11:22:33:44:55"/>
|
|
<ip family='ipv4' address='172.17.2.0' prefix='24'/>
|
|
<ip family='ipv6' address='2001:db8:ac10:fd01::' prefix='64'/>
|
|
</interface>
|
|
</devices>
|
|
...</pre>
|
|
|
|
|
|
<h5><a id="elementsNICSEthernet">Generic ethernet connection</a></h5>
|
|
|
|
<p>
|
|
Provides a means to use a new or existing tap device (or veth
|
|
device pair, depending on the needs of the hypervisor driver)
|
|
that is partially or wholly set up external to libvirt (either
|
|
prior to the guest starting, or while the guest is being started
|
|
via an optional script specified in the config).
|
|
</p>
|
|
<p>
|
|
The name of the tap device can optionally be specified with
|
|
the <code>dev</code> attribute of the
|
|
<code><target></code> element. If no target dev is
|
|
specified, libvirt will create a new standard tap device with a
|
|
name of the pattern "vnetN", where "N" is replaced with a
|
|
number. If a target dev is specified and that device doesn't
|
|
exist, then a new standard tap device will be created with the
|
|
exact dev name given. If the specified target dev does exist,
|
|
then that existing device will be used. Usually some basic setup
|
|
of the device is done by libvirt, including setting a MAC
|
|
address, and the IFF_UP flag, but if the <code>dev</code> is a
|
|
pre-existing device, and the <code>managed</code> attribute of
|
|
the <code>target</code> element is also set to "no" (the default
|
|
value is "yes"), even this basic setup will not be performed -
|
|
libvirt will simply pass the device on to the hypervisor with no
|
|
setup at all. <span class="since">Since 5.7.0</span> Using
|
|
managed='no' with a pre-created tap device is useful because
|
|
it permits a virtual machine managed by an unprivileged libvirtd
|
|
to have emulated network devices based on tap devices.
|
|
</p>
|
|
<p>
|
|
After creating/opening the tap device, an optional shell script
|
|
(given in the <code>path</code> attribute of
|
|
the <code><script></code> element) will be run.
|
|
<span class="since">Since 0.2.1</span>
|
|
Also, after detaching/closing the tap device, an optional shell
|
|
script (given in the <code>path</code> attribute of
|
|
the <code><downscript></code> element) will be run.
|
|
<span class="since">Since 5.1.0</span>
|
|
These can be used to do whatever extra host network integration is
|
|
required.
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<interface type='ethernet'>
|
|
<script path='/etc/qemu-ifup-mynet'/>
|
|
<downscript path='/etc/qemu-ifdown-mynet'/>
|
|
</interface>
|
|
...
|
|
<interface type='ethernet'>
|
|
<target dev='mytap1' managed='no'/>
|
|
<model type='virtio'/>
|
|
</interface>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<h5><a id="elementsNICSDirect">Direct attachment to physical interface</a></h5>
|
|
|
|
<p>
|
|
Provides direct attachment of the virtual machine's NIC to the given
|
|
physical interface of the host.
|
|
<span class="since">Since 0.7.7 (QEMU and KVM only)</span><br/>
|
|
This setup requires the Linux macvtap
|
|
driver to be available. <span class="since">(Since Linux 2.6.34.)</span>
|
|
One of the modes 'vepa'
|
|
( <a href="http://www.ieee802.org/1/files/public/docs2009/new-evb-congdon-vepa-modular-0709-v01.pdf">
|
|
'Virtual Ethernet Port Aggregator'</a>), 'bridge' or 'private'
|
|
can be chosen for the operation mode of the macvtap device, 'vepa'
|
|
being the default mode. The individual modes cause the delivery of
|
|
packets to behave as follows:
|
|
</p>
|
|
<p>
|
|
If the model type is set to <code>virtio</code> and
|
|
interface's <code>trustGuestRxFilters</code> attribute is set
|
|
to <code>yes</code>, changes made to the interface mac address,
|
|
unicast/multicast receive filters, and vlan settings in the
|
|
guest will be monitored and propagated to the associated macvtap
|
|
device on the host (<span class="since">Since
|
|
1.2.10</span>). If <code>trustGuestRxFilters</code> is not set,
|
|
or is not supported for the device model in use, an attempted
|
|
change to the mac address originating from the guest side will
|
|
result in a non-working network connection.
|
|
</p>
|
|
|
|
<dl>
|
|
<dt><code>vepa</code></dt>
|
|
<dd>All VMs' packets are sent to the external bridge. Packets
whose destination is a VM on the same host as the one they
originate from are sent back to the host by the VEPA
capable bridge (today's bridges are typically not VEPA capable).</dd>
|
|
<dt><code>bridge</code></dt>
|
|
<dd>Packets whose destination is on the same host as the one they
originate from are delivered directly to the target macvtap device.
|
|
Both origin and destination devices need to be in bridge mode
|
|
for direct delivery. If either one of them is in <code>vepa</code> mode,
|
|
a VEPA capable bridge is required.</dd>
|
|
<dt><code>private</code></dt>
|
|
<dd>All packets are sent to the external bridge and will only be
|
|
delivered to a target VM on the same host if they are sent through an
|
|
external router or gateway and that device sends them back to the
|
|
host. This procedure is followed if either the source or destination
|
|
device is in <code>private</code> mode.</dd>
|
|
<dt><code>passthrough</code></dt>
|
|
<dd>This feature attaches a virtual function of an SR-IOV capable
|
|
NIC directly to a VM without losing the migration capability.
|
|
All packets are sent to the VF/IF of the configured network device.
|
|
Depending on the capabilities of the device additional prerequisites or
|
|
limitations may apply; for example, on Linux this requires
|
|
kernel 2.6.38 or newer. <span class="since">Since 0.9.2</span></dd>
|
|
</dl>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
...
|
|
<interface type='direct' trustGuestRxFilters='no'>
|
|
<source dev='eth0' mode='vepa'/>
|
|
</interface>
|
|
</devices>
|
|
...</pre>
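<p>
As a further illustration of the modes and of the
<code>trustGuestRxFilters</code> attribute described above, the
following minimal sketch attaches a virtio NIC in 'passthrough'
mode and lets guest-side MAC address and filter changes propagate
to the macvtap device. The host device name <code>enp5s0f1</code>
is a hypothetical placeholder for an SR-IOV virtual function's
network device and is not taken from this document.
</p>

<pre>
...
<devices>
  ...
  <!-- 'enp5s0f1' is a hypothetical host device name -->
  <interface type='direct' trustGuestRxFilters='yes'>
    <source dev='enp5s0f1' mode='passthrough'/>
    <model type='virtio'/>
  </interface>
</devices>
...</pre>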
|
|
|
|
<p>
|
|
The network access of directly attached virtual machines can be
managed by the hardware switch to which the physical interface
of the host machine is connected.
|
|
</p>
|
|
<p>
|
|
The interface can have additional parameters as shown below
if the switch conforms to the IEEE 802.1Qbg standard.
|
|
The parameters of the virtualport element are documented in more detail
|
|
in the IEEE 802.1Qbg standard. The values are network specific and
|
|
should be provided by the network administrator. In 802.1Qbg terms,
|
|
the Virtual Station Interface (VSI) represents the virtual interface
|
|
of a virtual machine. <span class="since">Since 0.8.2</span>
|
|
</p>
|
|
<p>
|
|
Please note that IEEE 802.1Qbg requires a non-zero value for the
|
|
VLAN ID.
|
|
</p>
|
|
<dl>
|
|
<dt><code>managerid</code></dt>
|
|
<dd>The VSI Manager ID identifies the database containing the VSI type
|
|
and instance definitions. This is an integer value and the
|
|
value 0 is reserved.</dd>
|
|
<dt><code>typeid</code></dt>
|
|
<dd>The VSI Type ID identifies a VSI type characterizing the network
|
|
access. VSI types are typically managed by the network administrator.
|
|
This is an integer value.
|
|
</dd>
|
|
<dt><code>typeidversion</code></dt>
|
|
<dd>The VSI Type Version allows multiple versions of a VSI Type.
|
|
This is an integer value.
|
|
</dd>
|
|
<dt><code>instanceid</code></dt>
|
|
<dd>The VSI Instance ID is generated when a VSI instance
|
|
(i.e. a virtual interface of a virtual machine) is created.
|
|
This is a globally unique identifier.
|
|
</dd>
|
|
</dl>
|
|
<pre>
|
|
...
|
|
<devices>
|
|
...
|
|
<interface type='direct'>
|
|
<source dev='eth0.2' mode='vepa'/>
|
|
<virtualport type="802.1Qbg">
|
|
<parameters managerid="11" typeid="1193047" typeidversion="2" instanceid="09b11c53-8b5c-4eeb-8f00-d84eaa0aaa4f"/>
|
|
</virtualport>
|
|
</interface>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<p>
|
|
The interface can have additional parameters as shown below
if the switch conforms to the IEEE 802.1Qbh standard.
|
|
The values are network specific and should be provided by the
|
|
network administrator. <span class="since">Since 0.8.2</span>
|
|
</p>
|
|
<dl>
|
|
<dt><code>profileid</code></dt>
|
|
<dd>The profile ID contains the name of the port profile that is to
|
|
be applied to this interface. This name is resolved by the port
|
|
profile database into the network parameters from the port profile,
|
|
and those network parameters will be applied to this interface.
|
|
</dd>
|
|
</dl>
|
|
<pre>
|
|
...
|
|
<devices>
|
|
...
|
|
<interface type='direct'>
|
|
<source dev='eth0' mode='private'/>
|
|
<virtualport type='802.1Qbh'>
|
|
<parameters profileid='finance'/>
|
|
</virtualport>
|
|
</interface>
|
|
</devices>
|
|
...
|
|
</pre>
|
|
|
|
|
|
<h5><a id="elementsNICSHostdev">PCI Passthrough</a></h5>
|
|
|
|
<p>
|
|
A PCI network device (specified by the <source> element)
|
|
is directly assigned to the guest using generic device
|
|
passthrough, after first optionally setting the device's MAC
|
|
address to the configured value, and associating the device with
|
|
an 802.1Qbh capable switch using an optionally specified
|
|
<virtualport> element (see the examples of virtualport
|
|
given above for type='direct' network devices). Note that - due
|
|
to limitations in standard single-port PCI ethernet card driver
|
|
design - only SR-IOV (Single Root I/O Virtualization) virtual
|
|
function (VF) devices can be assigned in this manner; to assign
|
|
a standard single-port PCI or PCIe ethernet card to a guest, use
|
|
the traditional <hostdev> device definition.
|
|
<span class="since">Since 0.9.11</span>
|
|
</p>
|
|
|
|
<p>
|
|
To use VFIO device assignment rather than traditional/legacy KVM
|
|
device assignment (VFIO is a new method of device assignment
|
|
that is compatible with UEFI Secure Boot), a type='hostdev'
|
|
interface can have an optional <code>driver</code> sub-element
|
|
with a <code>name</code> attribute set to "vfio". To use legacy
|
|
KVM device assignment you can set <code>name</code> to "kvm" (or
|
|
simply omit the <code><driver></code> element, since "kvm"
|
|
is currently the default).
|
|
<span class="since">Since 1.0.5 (QEMU and KVM only, requires kernel 3.6 or newer)</span>
|
|
</p>
|
|
|
|
<p>
|
|
Note that this "intelligent passthrough" of network devices is
|
|
very similar to the functionality of a standard <hostdev>
|
|
device, the difference being that this method allows specifying
|
|
a MAC address and <virtualport> for the passed-through
|
|
device. If these capabilities are not required, if you have a
|
|
standard single-port PCI, PCIe, or USB network card that doesn't
|
|
support SR-IOV (and hence would anyway lose the configured MAC
|
|
address during reset after being assigned to the guest domain),
|
|
or if you are using a version of libvirt older than 0.9.11, you
|
|
should use standard <hostdev> to assign the device to the
|
|
guest instead of <interface type='hostdev'/>.
|
|
</p>
|
|
|
|
<p>
|
|
Similar to the functionality of a standard <hostdev> device,
when <code>managed</code> is "yes", the device is detached from the host
before being passed on to the guest, and reattached to the host
after the guest exits. If <code>managed</code> is omitted or "no",
the user is responsible for calling <code>virNodeDeviceDettach</code>
|
|
(or <code>virsh nodedev-detach</code>) before starting the guest
|
|
or hot-plugging the device, and <code>virNodeDeviceReAttach</code>
|
|
(or <code>virsh nodedev-reattach</code>) after hot-unplug or
|
|
stopping the guest.
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<interface type='hostdev' managed='yes'>
|
|
<driver name='vfio'/>
|
|
<source>
|
|
<address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
|
|
</source>
|
|
<mac address='52:54:00:6d:90:02'/>
|
|
<virtualport type='802.1Qbh'>
|
|
<parameters profileid='finance'/>
|
|
</virtualport>
|
|
</interface>
|
|
</devices>
|
|
...</pre>
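<p>
For comparison, a minimal sketch of an unmanaged assignment is shown
here. Because <code>managed</code> is omitted, the device is assumed
to have been detached from the host beforehand (for example
with <code>virsh nodedev-detach</code>). The PCI address is reused
from the example above purely for illustration.
</p>

<pre>
...
<devices>
  <!-- 'managed' omitted: the user detaches and reattaches the device -->
  <interface type='hostdev'>
    <source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </source>
    <mac address='52:54:00:6d:90:02'/>
  </interface>
</devices>
...</pre>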
|
|
|
|
<h5><a id="elementsTeaming">Teaming a virtio/hostdev NIC pair</a></h5>
|
|
|
|
<p>
|
|
<span class="since">Since 6.1.0 (QEMU and KVM only, requires
|
|
QEMU 4.2.0 or newer and a guest virtio-net driver supporting
|
|
the "failover" feature, such as the one included in Linux
|
|
kernel 4.18 and newer)
|
|
</span>
|
|
The <code><teaming></code> element of two interfaces can
|
|
be used to connect them as a team/bond device in the guest
|
|
(assuming proper support in the hypervisor and the guest
|
|
network driver).
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<interface type='network'>
|
|
<source network='mybridge'/>
|
|
<mac address='00:11:22:33:44:55'/>
|
|
<model type='virtio'/>
|
|
<teaming type='persistent'/>
|
|
<alias name='ua-backup0'/>
|
|
</interface>
|
|
<interface type='network'>
|
|
<source network='hostdev-pool'/>
|
|
<mac address='00:11:22:33:44:55'/>
|
|
<model type='virtio'/>
|
|
<teaming type='transient' persistent='ua-backup0'/>
|
|
</interface>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<p>
|
|
The required <code>type</code> attribute of
the <code><teaming></code> element must be set to
either <code>"persistent"</code>, to indicate a device that
should always be present in the domain,
or <code>"transient"</code>, to indicate a device that may
periodically be removed, then later re-added to the domain. When
type="transient", <code><teaming></code> should have a second
attribute called <code>"persistent"</code>, set to the alias
name of the other device in the pair (the one that
has <code><teaming type="persistent"/></code>).
|
|
</p>
|
|
<p>
|
|
In the particular case of QEMU,
|
|
libvirt's <code><teaming></code> element is used to set up
|
|
a virtio-net "failover" device pair. For this setup, the
|
|
persistent device must be an interface with <code><model
|
|
type="virtio"/></code>, and the transient device must
|
|
be <code><interface type='hostdev'/></code>
|
|
(or <code><interface type='network'/></code> where the
|
|
referenced network defines a pool of SRIOV VFs). The guest will
|
|
then have a simple network team/bond device made of the virtio
|
|
NIC + hostdev NIC pair. In this configuration, the
|
|
higher-performing hostdev NIC will normally be preferred for all
|
|
network traffic, but when the domain is migrated, QEMU will
|
|
automatically unplug the VF from the guest, and then hotplug a
|
|
similar device once migration is completed; while migration is
|
|
taking place, network traffic will use the virtio NIC. (Of
|
|
course the emulated virtio NIC and the hostdev NIC must be
|
|
connected to the same subnet for bonding to work properly).
|
|
</p>
|
|
<p>
|
|
NB1: Since you must know the alias name of the virtio NIC when
|
|
configuring the hostdev NIC, it will need to be manually set in
|
|
the virtio NIC's configuration (as with all other manually set
|
|
alias names, this means it must start with "ua-").
|
|
</p>
|
|
<p>
|
|
NB2: Currently the only implementation of the guest OS
|
|
virtio-net driver supporting virtio-net failover requires that
|
|
the MAC addresses of the virtio and hostdev NIC must
|
|
match. Since that may not always be a requirement in the future,
|
|
libvirt doesn't enforce this limitation - it is up to the
|
|
person/management application that is creating the configuration
|
|
to ensure that the MAC addresses of the two devices match.
|
|
</p>
|
|
<p>
|
|
NB3: Since the PCI addresses of the SRIOV VFs on the hosts that
|
|
are the source and destination of the migration will almost
|
|
certainly be different, either higher level management software
|
|
will need to modify the <code><source></code> of the
|
|
hostdev NIC (<code><interface type='hostdev'></code>) at
|
|
the start of migration, or (a simpler solution) the
|
|
configuration will need to use a libvirt "hostdev" virtual
|
|
network that maintains a pool of such devices, as is implied in
|
|
the example's use of the libvirt network named "hostdev-pool" -
|
|
as long as the hostdev network pools on both hosts have the same
|
|
name, libvirt itself will take care of allocating an appropriate
|
|
device on both ends of the migration. Similarly the XML for the
|
|
virtio interface must also either work correctly unmodified on
|
|
both the source and destination of the migration (e.g. by
|
|
connecting to the same bridge device on both hosts, or by using
|
|
the same virtual network), or the management software must
|
|
properly modify the interface XML during migration so that the
|
|
virtio device remains connected to the same network segment
|
|
before and after migration.
|
|
</p>
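<p>
The notes above also allow the transient device to be given directly
as an <code><interface type='hostdev'></code> rather than
through a network pool. The following is a sketch of that variant
under stated assumptions: the bridge name <code>br0</code> and the
PCI address are hypothetical placeholders, and management software
would have to adjust the <code><source></code> address for the
destination host during migration, as described in NB3.
</p>

<pre>
...
<devices>
  <interface type='bridge'>
    <source bridge='br0'/>
    <mac address='00:11:22:33:44:55'/>
    <model type='virtio'/>
    <teaming type='persistent'/>
    <alias name='ua-backup0'/>
  </interface>
  <interface type='hostdev' managed='yes'>
    <!-- hypothetical VF address; differs from host to host -->
    <source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </source>
    <mac address='00:11:22:33:44:55'/>
    <teaming type='transient' persistent='ua-backup0'/>
  </interface>
</devices>
...</pre>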
|
|
|
|
<h5><a id="elementsNICSMulticast">Multicast tunnel</a></h5>
|
|
|
|
<p>
|
|
A multicast group is set up to represent a virtual network. Any VMs
|
|
whose network devices are in the same multicast group can talk to each
|
|
other even across hosts. This mode is also available to unprivileged
|
|
users. There is no default DNS or DHCP support and no outgoing network
|
|
access. To provide outgoing network access, one of the VMs should have a
|
|
2nd NIC which is connected to one of the first 4 network types and do the
|
|
appropriate routing. The multicast protocol is compatible with that used
|
|
by user mode linux guests too. The source address used must be from the
|
|
multicast address block.
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<interface type='mcast'>
|
|
<mac address='52:54:00:6d:90:01'/>
|
|
<source address='230.0.0.1' port='5558'/>
|
|
</interface>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<h5><a id="elementsNICSTCP">TCP tunnel</a></h5>
|
|
|
|
<p>
|
|
A TCP client/server architecture provides a virtual network. One VM
|
|
provides the server end of the network, all other VMs are configured as
|
|
clients. All network traffic is routed between the VMs via the server.
|
|
This mode is also available to unprivileged users. There is no default
|
|
DNS or DHCP support and no outgoing network access. To provide outgoing
|
|
network access, one of the VMs should have a 2nd NIC which is connected
|
|
to one of the first 4 network types and do the appropriate routing.</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<interface type='server'>
|
|
<mac address='52:54:00:22:c9:42'/>
|
|
<source address='192.168.0.1' port='5558'/>
|
|
</interface>
|
|
...
|
|
<interface type='client'>
|
|
<mac address='52:54:00:8b:c9:51'/>
|
|
<source address='192.168.0.1' port='5558'/>
|
|
</interface>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<h5><a id="elementsNICSUDP">UDP unicast tunnel</a></h5>
|
|
|
|
<p>
|
|
A UDP unicast architecture provides a virtual network which enables
|
|
connections between QEMU instances using QEMU's UDP infrastructure.
|
|
|
|
The XML "source" address is the endpoint address to which the UDP socket
packets will be sent from the host running QEMU.
The XML "local" address is the address of the interface on the QEMU host
from which the UDP socket packets will originate.
|
|
<span class="since">Since 1.2.20</span></p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<interface type='udp'>
|
|
<mac address='52:54:00:22:c9:42'/>
|
|
<source address='127.0.0.1' port='11115'>
|
|
<local address='127.0.0.1' port='11116'/>
|
|
</source>
|
|
</interface>
|
|
</devices>
|
|
...</pre>
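<p>
Since the "source" of one endpoint is the "local" address of the
other, the peer QEMU instance would use the mirrored configuration.
The sketch below assumes the peer runs on the same host as in the
example above; its MAC address is an arbitrary placeholder.
</p>

<pre>
...
<devices>
  <interface type='udp'>
    <!-- placeholder MAC address for the peer guest -->
    <mac address='52:54:00:8b:c9:51'/>
    <!-- ports swapped relative to the first endpoint -->
    <source address='127.0.0.1' port='11116'>
      <local address='127.0.0.1' port='11115'/>
    </source>
  </interface>
</devices>
...</pre>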
|
|
|
|
<h5><a id="elementsNICSModel">Setting the NIC model</a></h5>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<interface type='network'>
|
|
<source network='default'/>
|
|
<target dev='vnet1'/>
|
|
<b><model type='ne2k_pci'/></b>
|
|
</interface>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<p>
|
|
For hypervisors which support this, you can set the model of
|
|
the emulated network interface card.
|
|
</p>
|
|
|
|
<p>
|
|
The values for <code>type</code> aren't defined specifically by
|
|
libvirt, but by what the underlying hypervisor supports (if
|
|
any). For QEMU and KVM you can get a list of supported models
|
|
with these commands:
|
|
</p>
|
|
|
|
<pre>
|
|
qemu -net nic,model=? /dev/null
|
|
qemu-kvm -net nic,model=? /dev/null
|
|
</pre>
|
|
|
|
<p>
|
|
Typical values for QEMU and KVM include:
|
|
ne2k_isa i82551 i82557b i82559er ne2k_pci pcnet rtl8139 e1000 virtio.
|
|
<span class="since">Since 5.2.0</span>, <code>virtio-transitional</code>
|
|
and <code>virtio-non-transitional</code> values are supported.
|
|
See <a href="#elementsVirtioTransitional">Virtio transitional devices</a>
|
|
for more details.
|
|
</p>
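<p>
As an illustration of the newer values, an interface requesting a
transitional virtio device could be written as follows. This is only
a sketch; whether the model is accepted depends on the hypervisor
version in use.
</p>

<pre>
...
<devices>
  <interface type='network'>
    <source network='default'/>
    <!-- requires libvirt 5.2.0 or newer -->
    <model type='virtio-transitional'/>
  </interface>
</devices>
...</pre>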
|
|
|
|
<h5><a id="elementsDriverBackendOptions">Setting NIC driver-specific options</a></h5>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<interface type='network'>
|
|
<source network='default'/>
|
|
<target dev='vnet1'/>
|
|
<model type='virtio'/>
|
|
<b><driver name='vhost' txmode='iothread' ioeventfd='on' event_idx='off' queues='5' rx_queue_size='256' tx_queue_size='256'>
|
|
<host csum='off' gso='off' tso4='off' tso6='off' ecn='off' ufo='off' mrg_rxbuf='off'/>
|
|
<guest csum='off' tso4='off' tso6='off' ecn='off' ufo='off'/>
|
|
</driver>
|
|
</b></interface>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<p>
|
|
Some NICs may have tunable driver-specific options. These are
|
|
set as attributes of the <code>driver</code> sub-element of the
|
|
interface definition. Currently the following attributes are
|
|
available for the <code>"virtio"</code> NIC driver:
|
|
</p>
|
|
|
|
<dl>
|
|
<dt><code>name</code></dt>
|
|
<dd>
|
|
The optional <code>name</code> attribute forces which type of
|
|
backend driver to use. The value can be either 'qemu' (a
|
|
user-space backend) or 'vhost' (a kernel backend, which
|
|
requires the vhost module to be provided by the kernel); an
|
|
attempt to require the vhost driver without kernel support
|
|
will be rejected. If this attribute is not present, then the
domain defaults to 'vhost' if it is available, but silently falls back
to 'qemu' otherwise.
|
|
<span class="since">Since 0.8.8 (QEMU and KVM only)</span>
|
|
</dd>
|
|
<dd>
|
|
For interfaces of type='hostdev' (PCI passthrough devices)
|
|
the <code>name</code> attribute can optionally be set to
|
|
"vfio" or "kvm". "vfio" tells libvirt to use VFIO device
|
|
assignment rather than traditional KVM device assignment (VFIO
|
|
is a new method of device assignment that is compatible with
|
|
UEFI Secure Boot), and "kvm" tells libvirt to use the legacy
|
|
device assignment performed directly by the kvm kernel module
|
|
(the default is currently "kvm", but is subject to change).
|
|
<span class="since">Since 1.0.5 (QEMU and KVM only, requires
|
|
kernel 3.6 or newer)</span>
|
|
</dd>
|
|
<dd>
|
|
For interfaces of type='vhostuser', the <code>name</code>
|
|
attribute is ignored. The backend driver used is always
|
|
vhost-user.
|
|
</dd>
|
|
|
|
<dt><code>txmode</code></dt>
|
|
<dd>
|
|
The <code>txmode</code> attribute specifies how to handle
|
|
transmission of packets when the transmit buffer is full. The
|
|
value can be either 'iothread' or 'timer'.
|
|
<span class="since">Since 0.8.8 (QEMU and KVM only)</span><br/><br/>
|
|
|
|
If set to 'iothread', packet tx is all done in an iothread in
|
|
the bottom half of the driver (this option translates into
|
|
adding "tx=bh" to the qemu commandline -device virtio-net-pci
|
|
option).<br/><br/>
|
|
|
|
If set to 'timer', tx work is done in qemu, and if there is
|
|
more tx data than can be sent at the present time, a timer is
|
|
set before qemu moves on to do other things; when the timer
|
|
fires, another attempt is made to send more data.<br/><br/>
|
|
|
|
The resulting difference, according to the qemu developer who
|
|
added the option is: "bh makes tx more asynchronous and reduces
|
|
latency, but potentially causes more processor bandwidth
|
|
contention since the CPU doing the tx isn't necessarily the
|
|
CPU where the guest generated the packets."<br/><br/>
|
|
|
|
<b>In general you should leave this option alone, unless you
|
|
are very certain you know what you are doing.</b>
|
|
</dd>
|
|
<dt><code>ioeventfd</code></dt>
|
|
<dd>
|
|
This optional attribute allows users to set
|
|
<a href='https://patchwork.kernel.org/patch/43390/'>
|
|
domain I/O asynchronous handling</a> for the interface device.
The default is left to the discretion of the hypervisor.
Accepted values are "on" and "off". Enabling this allows
qemu to execute the VM while a separate thread handles I/O.
Typically guests experiencing high system CPU utilization
during I/O will benefit from this. On the other hand,
on an overloaded host it could increase guest I/O latency.
|
|
<span class="since">Since 0.9.3 (QEMU and KVM only)</span><br/><br/>
|
|
|
|
<b>In general you should leave this option alone, unless you
|
|
are very certain you know what you are doing.</b>
|
|
</dd>
|
|
<dt><code>event_idx</code></dt>
|
|
<dd>
|
|
The <code>event_idx</code> attribute controls some aspects of
|
|
device event processing. The value can be either 'on' or 'off'
|
|
- if it is on, it will reduce the number of interrupts and
|
|
exits for the guest. The default is determined by QEMU;
|
|
usually if the feature is supported, the default is on. In case
|
|
there is a situation where this behavior is suboptimal, this
|
|
attribute provides a way to force the feature off.
|
|
<span class="since">Since 0.9.5 (QEMU and KVM only)</span><br/><br/>
|
|
|
|
<b>In general you should leave this option alone, unless you
|
|
are very certain you know what you are doing.</b>
|
|
</dd>
|
|
<dt><code>queues</code></dt>
|
|
<dd>
|
|
The optional <code>queues</code> attribute controls the number
|
|
of queues to be used for either
|
|
<a href="https://www.linux-kvm.org/page/Multiqueue"> Multiqueue
|
|
virtio-net</a> or <a href="#elementVhostuser">vhost-user</a> network
|
|
interfaces. Use of multiple packet processing queues requires the
|
|
interface to have the <code><model type='virtio'/></code>
|
|
element. Each queue will potentially be handled by a different
|
|
processor, resulting in much higher throughput.
|
|
<span class="since">virtio-net since 1.0.6 (QEMU and KVM only)</span>
|
|
<span class="since">vhost-user since 1.2.17 (QEMU and KVM only)</span>
|
|
</dd>
|
|
<dt><code>rx_queue_size</code></dt>
|
|
<dd>
|
|
The optional <code>rx_queue_size</code> attribute controls
the size of the virtio ring for each queue as described above.
The default value is hypervisor dependent and may change
across its releases. Moreover, some hypervisors may impose
some restrictions on the actual value. For instance, QEMU
(as of 2016-09-01) requires the value to be a power of two
in the range [256, 1024].
|
|
<span class="since">Since 2.3.0 (QEMU and KVM only)</span><br/><br/>
|
|
|
|
<b>In general you should leave this option alone, unless you
|
|
are very certain you know what you are doing.</b>
|
|
</dd>
|
|
<dt><code>tx_queue_size</code></dt>
|
|
<dd>
|
|
The optional <code>tx_queue_size</code> attribute controls
the size of the virtio ring for each queue as described above.
The default value is hypervisor dependent and may change
across its releases. Moreover, some hypervisors may impose
some restrictions on the actual value. For instance, QEMU
v2.9 requires the value to be a power of two in the range
[256, 1024]. In addition, this may work only for a subset of
interface types; for example, the aforementioned QEMU enables this option
only for the <code>vhostuser</code> type.
|
|
<span class="since">Since 3.7.0 (QEMU and KVM only)</span><br/><br/>
|
|
|
|
<b>In general you should leave this option alone, unless you
|
|
are very certain you know what you are doing.</b>
|
|
</dd>
|
|
<dt>virtio options</dt>
|
|
<dd>
|
|
For virtio interfaces,
|
|
<a href="#elementsVirtio">Virtio-specific options</a> can also be
|
|
set. (<span class="since">Since 3.5.0</span>)
|
|
</dd>
|
|
</dl>
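<p>
Most of these attributes are best left unset. As a minimal sketch,
an interface that only enables multiqueue virtio-net could look like
the following; the queue count of 4 is an arbitrary illustrative
value, not a recommendation.
</p>

<pre>
...
<devices>
  <interface type='network'>
    <source network='default'/>
    <model type='virtio'/>
    <!-- '4' is an arbitrary example value -->
    <driver name='vhost' queues='4'/>
  </interface>
</devices>
...</pre>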
|
|
<p>
|
|
Offloading options for the host and guest can be configured using
|
|
the following sub-elements:
|
|
</p>
|
|
<dl>
|
|
<dt><code>host</code></dt>
|
|
<dd>
|
|
The <code>csum</code>, <code>gso</code>, <code>tso4</code>,
|
|
<code>tso6</code>, <code>ecn</code> and <code>ufo</code>
|
|
attributes with possible values <code>on</code>
|
|
and <code>off</code> can be used to turn off host offloading options.
|
|
By default, the supported offloads are enabled by QEMU.
|
|
<span class="since">Since 1.2.9 (QEMU only)</span>
|
|
The <code>mrg_rxbuf</code> attribute can be used to control
|
|
mergeable rx buffers on the host side. Possible values are
|
|
<code>on</code> (default) and <code>off</code>.
|
|
<span class="since">Since 1.2.13 (QEMU only)</span>
|
|
</dd>
|
|
<dt><code>guest</code></dt>
|
|
<dd>
|
|
The <code>csum</code>, <code>tso4</code>,
|
|
<code>tso6</code>, <code>ecn</code> and <code>ufo</code>
|
|
attributes with possible values <code>on</code>
|
|
and <code>off</code> can be used to turn off guest offloading options.
|
|
By default, the supported offloads are enabled by QEMU.
|
|
<span class="since">Since 1.2.9 (QEMU only)</span>
|
|
</dd>
|
|
</dl>
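<p>
Offloads that are not mentioned keep the hypervisor default, so only
the ones to be changed need to be listed. The sketch below, given
purely as an illustration, disables mergeable rx buffers on the host
side and TCP segmentation offload on the guest side.
</p>

<pre>
...
<devices>
  <interface type='network'>
    <source network='default'/>
    <model type='virtio'/>
    <driver>
      <!-- unlisted offloads keep the QEMU default -->
      <host mrg_rxbuf='off'/>
      <guest tso4='off' tso6='off'/>
    </driver>
  </interface>
</devices>
...</pre>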
|
|
|
|
<h5><a id="elementsBackendOptions">Setting network backend-specific options</a></h5>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<interface type='network'>
|
|
<source network='default'/>
|
|
<target dev='vnet1'/>
|
|
<model type='virtio'/>
|
|
<b><backend tap='/dev/net/tun' vhost='/dev/vhost-net'/></b>
|
|
<driver name='vhost' txmode='iothread' ioeventfd='on' event_idx='off' queues='5'/>
|
|
<b><tune>
|
|
<sndbuf>1600</sndbuf>
|
|
</tune></b>
|
|
</interface>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<p>
|
|
For tuning the backend of the network, the <code>backend</code> element
|
|
can be used. The <code>vhost</code> attribute can override the default vhost
|
|
device path (<code>/dev/vhost-net</code>) for devices with <code>virtio</code> model.
|
|
The <code>tap</code> attribute overrides the tun/tap device path (default:
|
|
<code>/dev/net/tun</code>) for network and bridge interfaces. This does not work
|
|
in session mode. <span class="since">Since 1.2.9</span>
|
|
</p>
|
|
<p>
|
|
For tap devices there is also the <code>sndbuf</code> element, which can
adjust the size of the send buffer in the host. <span class="since">Since
|
|
0.8.8</span>
|
|
</p>
|
|
<h5><a id="elementsNICSTargetOverride">Overriding the target element</a></h5>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<interface type='network'>
|
|
<source network='default'/>
|
|
<b><target dev='vnet1'/></b>
|
|
</interface>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<p>
|
|
If no target is specified, certain hypervisors will
|
|
automatically generate a name for the created tun device. This
name can be manually specified; however, the name <i>should not
|
|
start with either 'vnet', 'vif', 'macvtap', or 'macvlan'</i>,
|
|
which are prefixes reserved by libvirt and certain hypervisors.
|
|
Manually specified targets using these prefixes may be ignored.
|
|
</p>
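<p>
A manually chosen name that avoids the reserved prefixes might look
like this. The name <code>mynet0</code> is a hypothetical example.
</p>

<pre>
...
<devices>
  <interface type='network'>
    <source network='default'/>
    <!-- 'mynet0' is hypothetical; it avoids the reserved prefixes -->
    <target dev='mynet0'/>
  </interface>
</devices>
...</pre>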
|
|
|
|
<p>
|
|
Note that for LXC containers, this defines the name of the interface
|
|
on the host side. <span class="since">Since 1.2.7</span>, to define
|
|
the name of the device on the guest side, the <code>guest</code>
|
|
element should be used, as in the following snippet:
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<interface type='network'>
|
|
<source network='default'/>
|
|
<b><guest dev='myeth'/></b>
|
|
</interface>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<h5><a id="elementsNICSBoot">Specifying boot order</a></h5>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<interface type='network'>
|
|
<source network='default'/>
|
|
<target dev='vnet1'/>
|
|
<b><boot order='1'/></b>
|
|
</interface>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<p>
|
|
For hypervisors which support this, you can set a specific NIC to
|
|
be used for network boot. The <code>order</code> attribute determines
|
|
the order in which devices will be tried during boot sequence. The
|
|
per-device <code>boot</code> elements cannot be used together with
|
|
general boot elements in
|
|
<a href="#elementsOSBIOS">BIOS bootloader</a> section.
|
|
<span class="since">Since 0.8.8</span>
|
|
</p>
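<p>
To illustrate how <code>order</code> ranks devices against each
other, the sketch below tries a disk first and falls back to network
boot. The disk definition and its image path are hypothetical and
are included only to give the NIC something to be ordered against.
</p>

<pre>
...
<devices>
  <disk type='file' device='disk'>
    <!-- hypothetical image path -->
    <source file='/var/lib/libvirt/images/guest.img'/>
    <target dev='vda' bus='virtio'/>
    <boot order='1'/>
  </disk>
  <interface type='network'>
    <source network='default'/>
    <boot order='2'/>
  </interface>
</devices>
...</pre>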
|
|
|
|
<h5><a id="elementsNICSROM">Interface ROM BIOS configuration</a></h5>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<interface type='network'>
|
|
<source network='default'/>
|
|
<target dev='vnet1'/>
|
|
<b><rom bar='on' file='/etc/fake/boot.bin'/></b>
|
|
</interface>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<p>
|
|
For hypervisors which support this, you can change how a PCI network
|
|
device's ROM is presented to the guest. The <code>bar</code>
|
|
attribute can be set to "on" or "off", and determines whether
|
|
or not the device's ROM will be visible in the guest's memory
|
|
map. (In PCI documentation, the "rombar" setting controls the
|
|
presence of the Base Address Register for the ROM). If no rom
|
|
bar is specified, the qemu default will be used (older
|
|
versions of qemu used a default of "off", while newer qemus
|
|
have a default of "on").
|
|
The optional <code>file</code> attribute is used to point to a
|
|
binary file to be presented to the guest as the device's ROM
|
|
BIOS. This can be useful to provide an alternative boot ROM for a
|
|
network device.
|
|
<span class="since">Since 0.9.10 (QEMU and KVM only)</span>.
|
|
</p>
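<p>
Conversely, the ROM can be hidden from the guest's memory map by
setting only the <code>bar</code> attribute, as in this minimal
sketch.
</p>

<pre>
...
<devices>
  <interface type='network'>
    <source network='default'/>
    <!-- ROM not visible in the guest's memory map -->
    <rom bar='off'/>
  </interface>
</devices>
...</pre>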
|
|
<h5><a id="elementDomain">Setting up a network backend in a driver domain</a></h5>
|
|
<pre>
|
|
...
|
|
<devices>
|
|
...
|
|
<interface type='bridge'>
|
|
<source bridge='br0'/>
|
|
<b><backenddomain name='netvm'/></b>
|
|
</interface>
|
|
...
|
|
</devices>
|
|
...</pre>
|
|
|
|
<p>
|
|
The optional <code>backenddomain</code> element allows specifying a
|
|
backend domain (aka driver domain) for the interface. Use the
|
|
<code>name</code> attribute to specify the backend domain name. You
|
|
can use it to create a direct network link between domains (so data
will not go through the host system). Use with type 'ethernet' to create
a plain network link, or with type 'bridge' to connect to a bridge inside
|
|
the backend domain.
|
|
<span class="since">Since 1.2.13 (Xen only)</span>
|
|
</p>
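<p>
The plain-link variant mentioned above could be sketched as follows,
reusing the backend domain name <code>netvm</code> from the example
for illustration.
</p>

<pre>
...
<devices>
  ...
  <!-- plain link to the backend domain, no bridge involved -->
  <interface type='ethernet'>
    <backenddomain name='netvm'/>
  </interface>
  ...
</devices>
...</pre>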
|
|
|
|
<h5><a id="elementQoS">Quality of service</a></h5>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<interface type='network'>
|
|
<source network='default'/>
|
|
<target dev='vnet0'/>
|
|
<b><bandwidth>
|
|
<inbound average='1000' peak='5000' floor='200' burst='1024'/>
|
|
<outbound average='128' peak='256' burst='256'/>
|
|
</bandwidth></b>
|
|
</interface>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<p>
|
|
This part of the interface XML configures quality of service. Incoming
|
|
and outgoing traffic can be shaped independently.
|
|
The <code>bandwidth</code> element and its child elements are described
|
|
in the <a href="formatnetwork.html#elementQoS">QoS</a> section of
|
|
the Network XML.
|
|
</p>
|
|
|
|
<h5><a id="elementVlanTag">Setting VLAN tag (on supported network types only)</a></h5>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<interface type='bridge'>
|
|
<b><vlan></b>
|
|
<b><tag id='42'/></b>
|
|
<b></vlan></b>
|
|
<source bridge='ovsbr0'/>
|
|
<virtualport type='openvswitch'>
|
|
<parameters interfaceid='09b11c53-8b5c-4eeb-8f00-d84eaa0aaa4f'/>
|
|
</virtualport>
|
|
</interface>
|
|
<interface type='bridge'>
|
|
<b><vlan trunk='yes'></b>
|
|
<b><tag id='42'/></b>
|
|
<b><tag id='123' nativeMode='untagged'/></b>
|
|
<b></vlan></b>
|
|
...
|
|
</interface>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<p>
|
|
If (and only if) the network connection used by the guest
|
|
supports VLAN tagging transparent to the guest, an
|
|
optional <code><vlan></code> element can specify one or
|
|
more VLAN tags to apply to the guest's network
|
|
traffic <span class="since">Since 0.10.0</span>. Network
|
|
connections that support guest-transparent VLAN tagging include
|
|
1) type='bridge' interfaces connected to an Open vSwitch bridge
|
|
<span class="since">Since 0.10.0</span>, 2) SRIOV Virtual
|
|
Functions (VF) used via type='hostdev' (direct device
|
|
assignment) <span class="since">Since 0.10.0</span>, and 3)
|
|
SRIOV VFs used via type='direct' with mode='passthrough'
|
|
(macvtap "passthru" mode) <span class="since">Since
|
|
1.3.5</span>. All other connection types, including standard
|
|
linux bridges and libvirt's own virtual networks, <b>do not</b>
|
|
support it. 802.1Qbh (vn-link) and 802.1Qbg (VEPA) switches
|
|
provide their own way (outside of libvirt) to tag guest traffic
|
|
onto a specific VLAN. Each tag is given in a
|
|
separate <code><tag></code> subelement
|
|
of <code><vlan></code> (for example: <code><tag
|
|
id='42'/></code>). For VLAN trunking of multiple tags (which
|
|
is supported only on Open vSwitch connections),
|
|
multiple <code><tag></code> subelements can be specified,
|
|
which implies that the user wants to do VLAN trunking on the
|
|
interface for all the specified tags. In the case that VLAN
|
|
trunking of a single tag is desired, the optional
|
|
attribute <code>trunk='yes'</code> can be added to the toplevel
|
|
<code><vlan></code> element to differentiate trunking of a
|
|
single tag from normal tagging.
|
|
</p>
|
|
<p>
|
|
For network connections using Open vSwitch it is also possible
|
|
to configure 'native-tagged' and 'native-untagged' VLAN modes
|
|
<span class="since">Since 1.1.0.</span> This is done with the
|
|
optional <code>nativeMode</code> attribute on
|
|
the <code><tag></code> subelement: <code>nativeMode</code>
|
|
may be set to 'tagged' or 'untagged'. The <code>id</code>
|
|
attribute of the <code><tag></code> subelement
|
|
containing <code>nativeMode</code> sets which VLAN is considered
|
|
to be the "native" VLAN for this interface, and
|
|
the <code>nativeMode</code> attribute determines whether or not
|
|
traffic for that VLAN will be tagged.
|
|
</p>
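<p>
As an illustrative sketch of the two cases described above (the bridge
name <code>ovsbr0</code> is reused from the earlier example and the tag
values are arbitrary), the first interface trunks a single tag, and the
second declares VLAN 123 as the native VLAN whose traffic stays tagged:
</p>
<pre>
...
<devices>
  <interface type='bridge'>
    <source bridge='ovsbr0'/>
    <virtualport type='openvswitch'/>
    <vlan trunk='yes'>
      <tag id='42'/>
    </vlan>
  </interface>
  <interface type='bridge'>
    <source bridge='ovsbr0'/>
    <virtualport type='openvswitch'/>
    <vlan trunk='yes'>
      <tag id='42'/>
      <tag id='123' nativeMode='tagged'/>
    </vlan>
  </interface>
</devices>
...</pre>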
|
|
|
|
<h5><a id="elementPort">Isolating guests' network traffic from each other</a></h5>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<interface type='network'>
|
|
<source network='default'/>
|
|
<b><port isolated='yes'/></b>
|
|
</interface>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<p>
|
|
<span class="since">Since 6.1.0.</span> The <code>port</code>
|
|
element property <code>isolated</code>, when set
to <code>yes</code> (default setting is <code>no</code>), is used
|
|
to isolate this interface's network traffic from that of other
|
|
guest interfaces connected to the same network that also
|
|
have <code><port isolated='yes'/></code>. This setting is
|
|
only supported for emulated interface devices that use a
|
|
standard tap device to connect to the network via a Linux host
|
|
bridge. This property can be inherited from a libvirt network,
|
|
so if all guests that will be connected to the network should be
|
|
isolated, it is better to put the setting in the network
|
|
configuration. (NB: this only prevents guests that
|
|
have <code>isolated='yes'</code> from communicating with each
|
|
other; if there is a guest on the same bridge that doesn't
|
|
have <code>isolated='yes'</code>, even the isolated guests will
|
|
be able to communicate with it.)
|
|
</p>
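<p>
A possible sketch of the network-level variant mentioned above follows.
It assumes that the Network XML accepts the same <code><port></code>
element at the top level of <code><network></code>; the network name
<code>isolated-net</code> and bridge name <code>br0</code> are
hypothetical. Consult the Network XML documentation before relying on it.
</p>
<pre>
<network>
  <name>isolated-net</name>
  <forward mode='bridge'/>
  <bridge name='br0'/>
  <port isolated='yes'/>
</network></pre>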
|
|
|
|
<h5><a id="elementLink">Modifying virtual link state</a></h5>
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<interface type='network'>
|
|
<source network='default'/>
|
|
<target dev='vnet0'/>
|
|
<b><link state='down'/></b>
|
|
</interface>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<p>
|
|
This element provides a means of setting the state of the virtual network link.
|
|
Possible values for attribute <code>state</code> are <code>up</code> and
|
|
<code>down</code>. If <code>down</code> is specified as the value, the interface
|
|
behaves as if it had the network cable disconnected. Default behavior if this
|
|
element is unspecified is to have the link state <code>up</code>.
|
|
<span class="since">Since 0.9.5</span>
|
|
</p>
|
|
|
|
<h5><a id="mtu">MTU configuration</a></h5>
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<interface type='network'>
|
|
<source network='default'/>
|
|
<target dev='vnet0'/>
|
|
<b><mtu size='1500'/></b>
|
|
</interface>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<p>
|
|
This element provides a means of setting the MTU of the virtual network link.
Currently there is just one attribute, <code>size</code>, which accepts a
non-negative integer specifying the MTU size for the interface.
|
|
<span class="since">Since 3.1.0</span>
|
|
</p>
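<p>
For instance, a sketch requesting a larger MTU on a bridged interface
might look as follows. The bridge name <code>br0</code> and the value
9000 are only illustrative; whether such a size is usable depends on
the hypervisor and on the rest of the network path.
</p>
<pre>
...
<devices>
  <interface type='bridge'>
    <source bridge='br0'/>
    <mtu size='9000'/>
  </interface>
</devices>
...</pre>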
|
|
|
|
<h5><a id="coalesce">Coalesce settings</a></h5>
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<interface type='network'>
|
|
<source network='default'/>
|
|
<target dev='vnet0'/>
|
|
<b><coalesce>
|
|
<rx>
|
|
<frames max='7'/>
|
|
</rx>
|
|
</coalesce></b>
|
|
</interface>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<p>
|
|
This element provides a means of configuring coalesce settings for
some interface devices (currently only types <code>network</code>
and <code>bridge</code>). Currently there is just one attribute,
<code>max</code>, to tweak, in element <code>frames</code> for
the <code>rx</code> group, which accepts a non-negative integer
that specifies the maximum number of packets that will be
received before an interrupt.
|
|
<span class="since">Since 3.3.0</span>
|
|
</p>
|
|
|
|
<h5><a id="ipconfig">IP configuration</a></h5>
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<interface type='network'>
|
|
<source network='default'/>
|
|
<target dev='vnet0'/>
|
|
<b><ip address='192.168.122.5' prefix='24'/></b>
|
|
<b><ip address='192.168.122.5' prefix='24' peer='10.0.0.10'/></b>
|
|
<b><route family='ipv4' address='192.168.122.0' prefix='24' gateway='192.168.122.1'/></b>
|
|
<b><route family='ipv4' address='192.168.122.8' gateway='192.168.122.1'/></b>
|
|
</interface>
|
|
...
|
|
<hostdev mode='capabilities' type='net'>
|
|
<source>
|
|
<interface>eth0</interface>
|
|
</source>
|
|
<b><ip address='192.168.122.6' prefix='24'/></b>
|
|
<b><route family='ipv4' address='192.168.122.0' prefix='24' gateway='192.168.122.1'/></b>
|
|
<b><route family='ipv4' address='192.168.122.8' gateway='192.168.122.1'/></b>
|
|
</hostdev>
|
|
...
|
|
</devices>
|
|
...
|
|
</pre>
|
|
|
|
<p>
|
|
<span class="since">Since 1.2.12</span> network devices and
|
|
hostdev devices with network capabilities can optionally be provided
|
|
one or more IP addresses to set on the network device in the
|
|
guest. Note that some hypervisors or network device types will
|
|
simply ignore them or only use the first one.
|
|
The <code>family</code> attribute can be set to
|
|
either <code>ipv4</code> or <code>ipv6</code>, and the
|
|
<code>address</code> attribute contains the IP address. The
|
|
optional <code>prefix</code> is the number of 1 bits in the
|
|
netmask, and will be automatically set if not specified - for
|
|
IPv4 the default prefix is determined according to the network
|
|
"class" (A, B, or C - see RFC870), and for IPv6 the default
|
|
prefix is 64. The optional <code>peer</code> attribute holds the
|
|
IP address of the other end of a point-to-point network
|
|
device <span class="since">(since 2.1.0)</span>.
|
|
</p>
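<p>
The attributes described above can be combined as in the following
sketch (the addresses are placeholders). The first address omits
<code>prefix</code>, so by the classful rule above a prefix of 24 would
be assumed; the second sets <code>family</code> and <code>prefix</code>
explicitly for IPv6:
</p>
<pre>
...
<devices>
  <interface type='network'>
    <source network='default'/>
    <ip family='ipv4' address='192.168.122.5'/>
    <ip family='ipv6' address='2001:db8::5' prefix='64'/>
  </interface>
</devices>
...</pre>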
|
|
|
|
<p>
|
|
<span class="since">Since 1.2.12</span> route elements can also be
|
|
added to define IP routes to add in the guest. The attributes of
|
|
this element are described in the documentation for
|
|
the <code>route</code> element
|
|
in <a href="formatnetwork.html#elementsStaticroute">network
|
|
definitions</a>. This is used by the LXC driver.
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<interface type='ethernet'>
|
|
<b><source></b>
<b><ip address='192.168.123.1' prefix='24'/></b>
<b><ip address='10.0.0.10' prefix='24' peer='192.168.122.5'/></b>
<b><route family='ipv4' address='192.168.42.0' prefix='24' gateway='192.168.123.4'/></b>
<b></source></b>
|
|
...
|
|
</interface>
|
|
...
|
|
</devices>
|
|
...
|
|
</pre>
|
|
|
|
<p>
|
|
<span class="since">Since 2.1.0</span> network devices of type
|
|
"ethernet" can optionally be provided one or more IP addresses
|
|
and one or more routes to set on the <b>host</b> side of the
|
|
network device. These are configured as subelements of
|
|
the <code><source></code> element of the interface, and
|
|
have the same attributes as the similarly named elements used to
|
|
configure the guest side of the interface (described above).
|
|
</p>
|
|
|
|
<h5><a id="elementVhostuser">vhost-user interface</a></h5>
|
|
|
|
<p>
|
|
<span class="since">Since 1.2.7</span>, vhost-user enables
communication between a QEMU virtual machine and another userspace process
|
|
using the Virtio transport protocol. A char dev (e.g. Unix socket) is used
|
|
for the control plane, while the data plane is based on shared memory.
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<interface type='vhostuser'>
|
|
<mac address='52:54:00:3b:83:1a'/>
|
|
<source type='unix' path='/tmp/vhost1.sock' mode='server'/>
|
|
<model type='virtio'/>
|
|
</interface>
|
|
<interface type='vhostuser'>
|
|
<mac address='52:54:00:3b:83:1b'/>
|
|
<source type='unix' path='/tmp/vhost2.sock' mode='client'>
|
|
<reconnect enabled='yes' timeout='10'/>
|
|
</source>
|
|
<model type='virtio'/>
|
|
<driver queues='5'/>
|
|
</interface>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<p>
|
|
The <code><source></code> element has to be specified
|
|
along with the type of char device.
|
|
Currently, only type='unix' is supported, where the path (the
|
|
directory path of the socket) and mode attributes are required.
|
|
Both <code>mode='server'</code> and <code>mode='client'</code>
|
|
are supported.
|
|
vhost-user requires the virtio model type, thus the
|
|
<code><model></code> element is mandatory.
|
|
<span class="since">Since 4.1.0</span> the element has an
optional child element <code>reconnect</code> which
configures the reconnect timeout if the connection is lost. It
has two attributes: <code>enabled</code> (which accepts
<code>yes</code> and <code>no</code>) and
<code>timeout</code>, which specifies the number of seconds
after which the hypervisor tries to reconnect.
|
|
</p>
|
|
|
|
<h5><a id="elementNwfilter">Traffic filtering with NWFilter</a></h5>
|
|
|
|
<p>
|
|
<span class="since">Since 0.8.0</span> an <code>nwfilter</code> profile
|
|
can be assigned to a domain interface, which allows configuring
|
|
traffic filter rules for the virtual machine.
|
|
|
|
See the <a href="formatnwfilter.html">nwfilter</a> documentation for more
|
|
complete details.
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<interface ...>
|
|
...
|
|
<filterref filter='clean-traffic'/>
|
|
</interface>
|
|
<interface ...>
|
|
...
|
|
<filterref filter='myfilter'>
|
|
<parameter name='IP' value='104.207.129.11'/>
|
|
<parameter name='IP6_ADDR' value='2001:19f0:300:2102::'/>
|
|
<parameter name='IP6_MASK' value='64'/>
|
|
...
|
|
</filterref>
|
|
</interface>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<p>
|
|
The <code>filter</code> attribute specifies the name of the nwfilter
|
|
to use. Optional <code><parameter></code> elements may be
|
|
specified for passing additional info to the nwfilter via the
|
|
<code>name</code> and <code>value</code> attributes. See
|
|
the <a href="formatnwfilter.html#nwfconceptsvars">nwfilter</a>
|
|
docs for info on parameters.
|
|
</p>
|
|
|
|
|
|
<h4><a id="elementsInput">Input devices</a></h4>
|
|
|
|
<p>
|
|
Input devices allow interaction with the graphical framebuffer
|
|
in the guest virtual machine. When enabling the framebuffer, an
|
|
input device is automatically provided. It may be possible to
|
|
add additional devices explicitly, for example,
|
|
to provide a graphics tablet for absolute cursor movement.
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<input type='mouse' bus='usb'/>
|
|
<input type='keyboard' bus='usb'/>
|
|
<input type='mouse' bus='virtio'/>
|
|
<input type='keyboard' bus='virtio'/>
|
|
<input type='tablet' bus='virtio'/>
|
|
<input type='passthrough' bus='virtio'>
|
|
<source evdev='/dev/input/event1'/>
|
|
</input>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<dl>
|
|
<dt><code>input</code></dt>
|
|
<dd>The <code>input</code> element has one mandatory attribute,
|
|
the <code>type</code> whose value can be 'mouse', 'tablet',
|
|
(<span class="since">since 1.2.2</span>) 'keyboard' or
|
|
(<span class="since">since 1.3.0</span>) 'passthrough'.
|
|
The tablet provides absolute cursor movement,
|
|
while the mouse uses relative movement. The optional
|
|
<code>bus</code> attribute can be used to refine the exact device type.
|
|
It takes values "xen" (paravirtualized), "ps2" and "usb" or
|
|
(<span class="since">since 1.3.0</span>) "virtio".</dd>
|
|
</dl>
|
|
|
|
<p>
|
|
The <code>input</code> element has an optional
|
|
sub-element <code><address></code> which can tie the
|
|
device to a particular PCI
|
|
slot, <a href="#elementsAddress">documented above</a>.
|
|
On S390, <code>address</code> can be used to provide a CCW address for
|
|
an input device (<span class="since">since 4.2.0</span>).
|
|
|
|
For type <code>passthrough</code>, the mandatory sub-element <code>source</code>
|
|
must have an <code>evdev</code> attribute containing the absolute path to the
|
|
event device passed through to guests. (KVM only)
|
|
|
|
<span class="since">Since 5.2.0</span>, the <code>input</code> element
|
|
accepts a <code>model</code> attribute which has the values 'virtio',
|
|
'virtio-transitional' and 'virtio-non-transitional'. See
|
|
<a href="#elementsVirtioTransitional">Virtio transitional devices</a>
|
|
for more details.
|
|
</p>
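<p>
As a sketch of the <code>model</code> attribute on a
<code>passthrough</code> device (the event device path is the one from
the example above; whether a given hypervisor accepts the transitional
models for other input types is not covered here and should be checked):
</p>
<pre>
...
<devices>
  <input type='passthrough' bus='virtio' model='virtio-non-transitional'>
    <source evdev='/dev/input/event1'/>
  </input>
</devices>
...</pre>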
|
|
|
|
<p>
|
|
The subelement <code>driver</code> can be used to tune the virtio
|
|
options of the device:
|
|
<a href="#elementsVirtio">Virtio-specific options</a> can also be
|
|
set. (<span class="since">Since 3.5.0</span>)
|
|
</p>
|
|
|
|
<h4><a id="elementsHub">Hub devices</a></h4>
|
|
|
|
<p>
|
|
A hub is a device that expands a single port into several so
|
|
that there are more ports available to connect devices to a host
|
|
system.
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<hub type='usb'/>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<dl>
|
|
<dt><code>hub</code></dt>
|
|
<dd>The <code>hub</code> element has one mandatory attribute,
|
|
the <code>type</code> whose value can only be 'usb'.</dd>
|
|
</dl>
|
|
|
|
<p>
|
|
The <code>hub</code> element has an optional
|
|
sub-element <code><address></code>
|
|
with <code>type='usb'</code> which can tie the device to a
|
|
particular controller, <a href="#elementsAddress">documented
|
|
above</a>.
|
|
</p>
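<p>
A minimal sketch of a hub tied to a specific controller follows. The
<code>bus</code> and <code>port</code> values are illustrative
assumptions; see the address documentation referenced above for the
exact attribute semantics.
</p>
<pre>
...
<devices>
  <hub type='usb'>
    <address type='usb' bus='0' port='1'/>
  </hub>
</devices>
...</pre>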
|
|
|
|
<h4><a id="elementsGraphics">Graphical framebuffers</a></h4>
|
|
|
|
<p>
|
|
A graphics device allows for graphical interaction with the
|
|
guest OS. A guest will typically have either a framebuffer
|
|
or a text console configured to allow interaction with the
|
|
admin.
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<graphics type='sdl' display=':0.0'/>
|
|
<graphics type='vnc' port='5904' sharePolicy='allow-exclusive'>
|
|
<listen type='address' address='1.2.3.4'/>
|
|
</graphics>
|
|
<graphics type='rdp' autoport='yes' multiUser='yes' />
|
|
<graphics type='desktop' fullscreen='yes'/>
|
|
<graphics type='spice'>
|
|
<listen type='network' network='rednet'/>
|
|
</graphics>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<dl>
|
|
<dt><code>graphics</code></dt>
|
|
<dd>
|
|
<p>
|
|
The <code>graphics</code> element has a mandatory <code>type</code>
|
|
attribute which takes the value <code>sdl</code>, <code>vnc</code>,
|
|
<code>spice</code>, <code>rdp</code>, <code>desktop</code> or
|
|
<code>egl-headless</code>:
|
|
</p>
|
|
<dl>
|
|
<dt><code>sdl</code></dt>
|
|
<dd>
|
|
<p>
|
|
This displays a window on the host desktop. It can take 3 optional
|
|
arguments: a <code>display</code> attribute for the display to use,
|
|
an <code>xauth</code> attribute for the authentication identifier,
|
|
and an optional <code>fullscreen</code> attribute accepting values
|
|
<code>yes</code> or <code>no</code>.
|
|
</p>
|
|
|
|
<p>
|
|
You can use a <code>gl</code> sub-element with the <code>enable="yes"</code>
|
|
property to enable OpenGL support in SDL. Likewise you can
|
|
explicitly disable OpenGL support with <code>enable="no"</code>.
|
|
</p>
|
|
</dd>
|
|
<dt><code>vnc</code></dt>
|
|
<dd>
|
|
<p>
|
|
Starts a VNC server. The <code>port</code> attribute specifies
|
|
the TCP port number (with -1 as legacy syntax indicating that it
|
|
should be auto-allocated). The <code>autoport</code> attribute is
|
|
the new preferred syntax for indicating auto-allocation of the TCP
|
|
port to use. The <code>passwd</code> attribute provides a VNC
|
|
password in clear text. If the <code>passwd</code> attribute is
|
|
set to an empty string, then VNC access is disabled. The
|
|
<code>keymap</code> attribute specifies the keymap to use. It is
|
|
possible to set a limit on the validity of the password by giving
|
|
a timestamp <code>passwdValidTo='2010-04-09T15:51:00'</code>
|
|
assumed to be in UTC. The <code>connected</code> attribute allows
control of the connected client during password changes. VNC accepts
only the <code>keep</code> value <span class="since">since 0.9.3</span>.
|
|
NB, this may not be supported by all hypervisors.
|
|
</p>
|
|
<p>
|
|
The optional <code>sharePolicy</code> attribute specifies vnc
|
|
server display sharing policy. <code>allow-exclusive</code> allows
|
|
clients to ask for exclusive access by dropping other connections.
|
|
Connecting multiple clients in parallel requires all clients asking
|
|
for a shared session (vncviewer: -Shared switch). This is
|
|
the default value. <code>force-shared</code> disables exclusive
|
|
client access, every connection has to specify -Shared switch for
|
|
vncviewer. <code>ignore</code> welcomes every connection
|
|
unconditionally <span class="since">since 1.0.6</span>.
|
|
</p>
|
|
<p>
|
|
Rather than using listen/port, QEMU supports a <code>socket</code>
|
|
attribute for listening on a unix domain socket path
|
|
<span class="since">Since 0.8.8</span>.
|
|
</p>
|
|
<p>
|
|
For VNC WebSocket functionality, the <code>websocket</code> attribute
may be used to specify the port to listen on (with -1 meaning
|
|
auto-allocation and <code>autoport</code> having no effect due to
|
|
security reasons) <span class="since">Since 1.0.6</span>.
|
|
</p>
|
|
<p>
|
|
Although VNC doesn't support OpenGL natively, it can be paired
|
|
with graphics type <code>egl-headless</code> (see below) which
|
|
will instruct QEMU to open and use drm nodes for OpenGL rendering.
|
|
</p>
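<p>
Several of the VNC attributes described above can be combined, as in
this sketch. The password and keymap values are placeholders; note
that <code>passwd</code> is stored in clear text, as stated above.
</p>
<pre>
<graphics type='vnc' autoport='yes' websocket='-1' keymap='en-us'
          sharePolicy='force-shared'
          passwd='example-secret' passwdValidTo='2010-04-09T15:51:00'/></pre>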
|
|
</dd>
|
|
<dt><code>spice</code> <span class="since">Since 0.8.6</span></dt>
|
|
<dd>
|
|
<p>
|
|
Starts a SPICE server. The <code>port</code> attribute specifies
|
|
the TCP port number (with -1 as legacy syntax indicating that it
|
|
should be auto-allocated), while <code>tlsPort</code> gives
|
|
an alternative secure port number. The <code>autoport</code>
|
|
attribute is the new preferred syntax for indicating
|
|
auto-allocation of needed port numbers. The <code>passwd</code>
|
|
attribute provides a SPICE password in clear text. If the
|
|
<code>passwd</code> attribute is set to an empty string, then
|
|
SPICE access is disabled. The <code>keymap</code> attribute
|
|
specifies the keymap to use. It is possible to set a limit on
|
|
the validity of the password by giving a timestamp
|
|
<code>passwdValidTo='2010-04-09T15:51:00'</code> assumed to be
|
|
in UTC.
|
|
</p>
|
|
<p>
|
|
The <code>connected</code> attribute allows control of connected
|
|
client during password changes. SPICE accepts <code>keep</code> to
|
|
keep client connected, <code>disconnect</code> to disconnect client
|
|
and <code>fail</code> to fail changing the password. NB, this may not
|
|
be supported by all hypervisors.
|
|
<span class="since">Since 0.9.3</span>
|
|
</p>
|
|
<p>
|
|
The <code>defaultMode</code> attribute sets the default channel
|
|
security policy, valid values are <code>secure</code>,
|
|
<code>insecure</code> and the default <code>any</code> (which is
|
|
secure if possible, but falls back to insecure rather than erroring
|
|
out if no secure path is available).
|
|
<span class="since">Since 0.9.12</span>
|
|
</p>
|
|
<p>
|
|
When SPICE has both a normal and TLS secured TCP port configured,
|
|
it can be desirable to restrict what channels can be run on each
|
|
port. This is achieved by adding one or more <code><channel>
|
|
</code> elements inside the main <code><graphics></code>
|
|
element and setting the <code>mode</code> attribute to either
|
|
<code>secure</code> or <code>insecure</code>. Setting the mode
|
|
attribute overrides the default value as set by
|
|
the <code>defaultMode</code> attribute. (Note that specifying
|
|
<code>any</code> as mode discards the entry as the channel would
|
|
inherit the default mode anyways.) Valid channel names include
|
|
<code>main</code>, <code>display</code>, <code>inputs</code>,
|
|
<code>cursor</code>, <code>playback</code>, <code>record</code>
|
|
(all <span class="since">since 0.8.6</span>);
|
|
<code>smartcard</code> (<span class="since">since 0.8.8</span>);
|
|
and <code>usbredir</code> (<span class="since">since 0.9.12</span>).
|
|
</p>
|
|
<pre>
|
|
<graphics type='spice' port='-1' tlsPort='-1' autoport='yes'>
|
|
<channel name='main' mode='secure'/>
|
|
<channel name='record' mode='insecure'/>
|
|
<image compression='auto_glz'/>
|
|
<streaming mode='filter'/>
|
|
<clipboard copypaste='no'/>
|
|
<mouse mode='client'/>
|
|
<filetransfer enable='no'/>
|
|
<gl enable='yes' rendernode='/dev/dri/by-path/pci-0000:00:02.0-render'/>
|
|
</graphics></pre>
|
|
<p>
|
|
Spice supports variable compression settings for audio, images and
|
|
streaming. These settings are accessible via the <code>compression
|
|
</code> attribute in all following elements: <code>image</code> to
|
|
set image compression (accepts <code>auto_glz</code>,
|
|
<code>auto_lz</code>, <code>quic</code>, <code>glz</code>,
|
|
<code>lz</code>, <code>off</code>), <code>jpeg</code> for JPEG
|
|
compression for images over wan (accepts <code>auto</code>,
|
|
<code>never</code>, <code>always</code>), <code>zlib</code> for
|
|
configuring wan image compression (accepts <code>auto</code>,
|
|
<code>never</code>, <code>always</code>) and <code>playback</code>
|
|
for enabling audio stream compression (accepts <code>on</code> or
|
|
<code>off</code>). <span class="since">Since 0.9.1</span>
|
|
</p>
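<p>
For illustration, the four compression-related elements named above
could be set together as follows (the chosen values are arbitrary picks
from the accepted lists):
</p>
<pre>
<graphics type='spice' autoport='yes'>
  <image compression='auto_glz'/>
  <jpeg compression='auto'/>
  <zlib compression='auto'/>
  <playback compression='on'/>
</graphics></pre>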
|
|
<p>
|
|
Streaming mode is set by the <code>streaming</code> element,
|
|
setting its <code>mode</code> attribute to one of
|
|
<code>filter</code>, <code>all</code> or <code>off</code>.
|
|
<span class="since">Since 0.9.2</span>
|
|
</p>
|
|
<p>
|
|
Copy & Paste functionality (via Spice agent) is set by the
|
|
<code>clipboard</code> element. It is enabled by default, and can
|
|
be disabled by setting the <code>copypaste</code> property to
|
|
<code>no</code>. <span class="since">Since 0.9.3</span>
|
|
</p>
|
|
<p>
|
|
Mouse mode is set by the <code>mouse</code> element, setting its
|
|
<code>mode</code> attribute to one of <code>server</code> or
|
|
<code>client</code>. If no mode is specified, the qemu default will
|
|
be used (client mode). <span class="since">Since 0.9.11</span>
|
|
</p>
|
|
<p>
|
|
File transfer functionality (via Spice agent) is set using the
|
|
<code>filetransfer</code> element. It is enabled by default, and
|
|
can be disabled by setting the <code>enable</code> property to
|
|
<code>no</code>. <span class="since">Since 1.2.2</span>
|
|
</p>
|
|
<p>
|
|
Spice may provide accelerated server-side rendering with OpenGL.
|
|
You can enable or disable OpenGL support explicitly with
|
|
the <code>gl</code> element, by setting the <code>enable</code>
|
|
property. (QEMU only, <span class="since">since 1.3.3</span>).
|
|
Note that this only works locally, since this requires usage of
|
|
UNIX sockets, i.e. using <code>listen</code> types 'socket' or
|
|
'none'. For accelerated OpenGL with remote support, consider
|
|
pairing this element with type <code>egl-headless</code>
|
|
(see below). However, this will deliver weaker performance
|
|
compared to native Spice OpenGL support.
|
|
</p>
|
|
<p>
|
|
By default, QEMU will pick the first available GPU DRM render node.
|
|
You may specify a DRM render node path to use instead. (QEMU only,
|
|
<span class="since">since 3.1.0</span>).
|
|
</p>
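<p>
A sketch of a local-only Spice display with OpenGL enabled follows. It
uses listen type 'none', one of the two listen types named above as
compatible with <code>gl</code>; the render node path is an example and
will differ between hosts.
</p>
<pre>
<graphics type='spice'>
  <listen type='none'/>
  <gl enable='yes' rendernode='/dev/dri/renderD128'/>
</graphics></pre>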
|
|
</dd>
|
|
<dt><code>rdp</code></dt>
|
|
<dd>
|
|
<p>
|
|
Starts an RDP server. The <code>port</code> attribute specifies the
|
|
TCP port number (with -1 as legacy syntax indicating that it should
|
|
be auto-allocated). The <code>autoport</code> attribute is the new
|
|
preferred syntax for indicating auto-allocation of the TCP port to
|
|
use. In the VirtualBox driver, <code>autoport</code> will make
the hypervisor pick an available port from the 3389-3689 range when the VM
|
|
is started. The chosen port will be reflected in the <code>port</code>
|
|
attribute. The <code>multiUser</code> attribute is a boolean deciding
|
|
whether multiple simultaneous connections to the VM are permitted.
|
|
The <code>replaceUser</code> attribute is a boolean deciding whether
|
|
the existing connection must be dropped and a new connection must
|
|
be established by the VRDP server, when a new client connects in
|
|
single connection mode.
|
|
</p>
|
|
</dd>
|
|
<dt><code>desktop</code></dt>
|
|
<dd>
|
|
<p>
|
|
This value is reserved for VirtualBox domains for the moment. It
|
|
displays a window on the host desktop, similarly to "sdl", but
|
|
using the VirtualBox viewer. Just like "sdl", it accepts
|
|
the optional attributes <code>display</code> and
|
|
<code>fullscreen</code>.
|
|
</p>
|
|
</dd>
|
|
<dt><code>egl-headless</code> <span class="since">Since 4.6.0</span></dt>
|
|
<dd>
|
|
<p>
|
|
This display type provides support for an OpenGL accelerated
|
|
display accessible both locally and remotely (for comparison,
|
|
Spice's native OpenGL support only works locally using UNIX
|
|
sockets at the moment, but has better performance). Since this
|
|
display type doesn't provide any window or graphical console like
|
|
the other types, for practical reasons it should be paired with
|
|
either <code>vnc</code> or <code>spice</code> graphics types.
|
|
This display type is only supported by QEMU domains
|
|
(needs QEMU <span class="since">2.10</span> or newer).
|
|
<span class="since">Since 5.0.0</span> this element accepts a
|
|
<code><gl/></code> sub-element with an optional attribute
|
|
<code>rendernode</code> which can be used to specify an absolute
|
|
path to a host's DRI device to be used for OpenGL rendering.
|
|
</p>
|
|
<pre>
|
|
<graphics type='spice' autoport='yes'/>
|
|
<graphics type='egl-headless'>
|
|
<gl rendernode='/dev/dri/renderD128'/>
|
|
</graphics>
|
|
</pre>
|
|
</dd>
|
|
</dl>
|
|
</dd>
|
|
</dl>
|
|
|
|
<p>
|
|
A graphics device uses a <code><listen></code> element to set up where
the device should listen for clients. It has a mandatory attribute
<code>type</code> which specifies the listen type. Only <code>vnc</code>,
<code>spice</code> and <code>rdp</code> support the <code><listen></code>
element. <span class="since">Since 0.9.4</span>.
|
|
Available types are:
|
|
</p>
|
|
<dl>
|
|
<dt><code>address</code></dt>
|
|
<dd>
|
|
<p>
|
|
Tells a graphics device to use an address specified in the
|
|
<code>address</code> attribute, which will contain either an IP address
|
|
or hostname (which will be resolved to an IP address via a DNS query)
|
|
to listen on.
|
|
</p>
|
|
<p>
|
|
It is possible to omit the <code>address</code> attribute in order to
|
|
use an address from config files <span class="since">Since 1.3.5</span>.
|
|
</p>
|
|
<p>
|
|
The <code>address</code> attribute is duplicated as <code>listen</code>
|
|
attribute in <code>graphics</code> element for backward compatibility.
|
|
If both are provided they must be equal.
|
|
</p>
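<p>
For example, the following sketch carries the same address in both the
legacy <code>listen</code> attribute and the <code>listen</code>
element, which satisfies the requirement that the two be equal (the
address and port are taken from the example at the start of this
section):
</p>
<pre>
<graphics type='vnc' port='5904' listen='1.2.3.4'>
  <listen type='address' address='1.2.3.4'/>
</graphics></pre>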
|
|
</dd>
|
|
<dt><code>network</code></dt>
|
|
<dd>
|
|
<p>
|
|
This is used to specify an existing network in the <code>network</code>
|
|
attribute from libvirt's list of configured networks. The named network
|
|
configuration will be examined to determine an appropriate listen
|
|
address and the address will be stored in live XML in <code>address
|
|
</code> attribute. For example, if the network has an IPv4 address in
|
|
its configuration (e.g. if it has a forward type of <code>route</code>,
|
|
<code>nat</code>, or no forward type (isolated)), the first IPv4
|
|
address listed in the network's configuration will be used.
|
|
If the network is describing a host bridge, the first IPv4 address
|
|
associated with that bridge device will be used, and if the network is
|
|
describing one of the 'direct' (macvtap) modes, the first IPv4 address
|
|
of the first forward dev will be used.
|
|
</p>
|
|
</dd>
|
|
<dt><code>socket</code> <span class="since">since 2.0.0 (QEMU only)</span></dt>
|
|
<dd>
|
|
<p>
|
|
This listen type tells a graphics server to listen on a unix socket.
The <code>socket</code> attribute contains the path to the unix socket. If this
attribute is omitted, libvirt will generate the path for you.
|
|
Supported by graphics type <code>vnc</code> and <code>spice</code>.
|
|
</p>
|
|
<p>
|
|
For <code>vnc</code> graphics, to be backward compatible,
the <code>socket</code> attribute of the first <code>listen</code> element
is duplicated as the <code>socket</code> attribute in the <code>graphics</code>
element. If the <code>graphics</code> element contains a <code>socket</code>
attribute, all <code>listen</code> elements are ignored.
|
|
</p>
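<p>
A short sketch of both forms follows; the socket path is a placeholder.
The first device leaves the path for libvirt to generate, while the
second shows the duplicated <code>socket</code> attribute as it would
appear for <code>vnc</code>:
</p>
<pre>
<graphics type='spice'>
  <listen type='socket'/>
</graphics>
<graphics type='vnc' socket='/tmp/vnc.sock'>
  <listen type='socket' socket='/tmp/vnc.sock'/>
</graphics></pre>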
|
|
</dd>
|
|
<dt><code>none</code> <span class="since">since 2.0.0 (QEMU only)</span></dt>
|
|
<dd>
|
|
<p>
|
|
This listen type doesn't have any other attribute. Libvirt supports
|
|
passing a file descriptor through our APIs virDomainOpenGraphics() and
|
|
virDomainOpenGraphicsFD(). No other listen types are allowed if this
|
|
one is used and the graphics device doesn't listen anywhere. You need
|
|
to use one of the two APIs to pass a FD to QEMU in order to connect to
|
|
this graphics device. Supported by graphics type <code>vnc</code> and
|
|
<code>spice</code>.
|
|
</p>
|
|
</dd>
|
|
</dl>
|
|
|
|
<h4><a id="elementsVideo">Video devices</a></h4>
|
|
<p>
|
|
A video device.
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<video>
|
|
<model type='vga' vram='16384' heads='1'>
|
|
<acceleration accel3d='yes' accel2d='yes'/>
|
|
</model>
|
|
<driver name='qemu'/>
|
|
</video>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<dl>
|
|
<dt><code>video</code></dt>
|
|
<dd>
|
|
<p>
|
|
The <code>video</code> element is the container for describing
|
|
video devices. For backwards compatibility, if no <code>video</code>
|
|
is set but there is a <code>graphics</code> in domain xml, then
|
|
libvirt will add a default <code>video</code> according to the guest
|
|
type.
|
|
</p>
|
|
<p>
|
|
For a guest of type "kvm", the default <code>video</code> is:
|
|
<code>type</code> with value "cirrus", <code>vram</code> with value
|
|
"16384" and <code>heads</code> with value "1". By default, the first
|
|
video device in domain xml is the primary one, but the optional
|
|
attribute <code>primary</code> (<span class="since">since 1.0.2</span>)
|
|
with value 'yes' can be used to mark the primary in cases of multiple
video devices. A non-primary device must be of type "qxl" or
(<span class="since">since 2.4.0</span>) "virtio".
|
|
</p>
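<p>
As a sketch of marking the primary device explicitly, the example below
lists the secondary device first. It assumes that the
<code>primary</code> attribute is written on the <code>model</code>
element; the model choices are illustrative only.
</p>
<pre>
...
<devices>
  <video>
    <model type='qxl' heads='1'/>
  </video>
  <video>
    <model type='virtio' heads='1' primary='yes'/>
  </video>
</devices>
...</pre>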
|
|
</dd>
|
|
|
|
<dt><code>model</code></dt>
|
|
<dd>
|
|
<p>
|
|
The <code>model</code> element has a mandatory <code>type</code>
|
|
attribute which takes the value "vga", "cirrus", "vmvga", "xen",
|
|
"vbox", "qxl" (<span class="since">since 0.8.6</span>),
|
|
"virtio" (<span class="since">since 1.3.0</span>),
|
|
"gop" (<span class="since">since 3.2.0</span>),
|
|
"bochs" (<span class="since">since 5.6.0</span>), "ramfb"
|
|
(<span class="since">since 5.9.0</span>), or "none"
|
|
(<span class="since">since 4.6.0</span>), depending on the hypervisor
|
|
features available.
|
|
The purpose of the type <code>none</code> is to instruct libvirt not
|
|
to add a default video device in the guest (see the paragraph above).
|
|
This legacy behaviour can be inconvenient in cases where GPU mediated
devices are meant to be the only rendering device within a guest;
in such cases, specify a <code>video</code> device with model type
<code>none</code>.
Refer to <a href="#elementsHostDev">Host device assignment</a> to see
|
|
how to add a mediated device into a guest.
|
|
</p>
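<p>
A minimal sketch of suppressing the default video device while keeping
a <code>graphics</code> element (the mediated device itself is omitted
here):
</p>
<pre>
...
<devices>
  <graphics type='spice' autoport='yes'/>
  <video>
    <model type='none'/>
  </video>
</devices>
...</pre>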
|
|
<p>
|
|
You can provide the amount of video memory in kibibytes (blocks of
|
|
1024 bytes) using <code>vram</code>. This is supported only for guests
of type "vz", "qemu", "vbox", "vmx" and "xen". If no
value is provided the default is used. If the size is not a power of
two it will be rounded to the closest one.
|
|
</p>
|
|
<p>
|
|
The number of screens can be set using <code>heads</code>. This is
supported only for guests of type "vz", "kvm", "vbox" and "vmx".
|
|
</p>
|
|
<p>
|
|
For guest type of "kvm" or "qemu" and model type "qxl" there are
|
|
optional attributes. Attribute <code>ram</code> (<span class="since">
|
|
since 1.0.2</span>) specifies the size of the primary bar, while the
|
|
attribute <code>vram</code> specifies the secondary bar size.
|
|
If <code>ram</code> or <code>vram</code> are not supplied a default
|
|
value is used. The <code>ram</code> value is also rounded to a power of
two, as with <code>vram</code>. There is also an optional attribute
|
|
<code>vgamem</code> (<span class="since">since 1.2.11</span>) to set
|
|
the size of VGA framebuffer for fallback mode of QXL device.
|
|
Attribute <code>vram64</code> (<span class="since">since 1.3.3</span>)
|
|
extends the secondary bar and makes it addressable as 64-bit memory.
|
|
</p>
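<p>
For illustration, a "qxl" device that sets both bars and the VGA
framebuffer explicitly could be sketched as follows. The sizes are
example values in kibibytes, not recommendations:
</p>
<pre>
...
<devices>
  <video>
    <model type='qxl' ram='65536' vram='65536' vgamem='16384' heads='1'/>
  </video>
</devices>
...</pre>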
|
|
<p><span class="since">Since 5.9.0</span>, the <code>model</code>
|
|
element may also have an optional <code>resolution</code> sub-element.
|
|
The <code>resolution</code> element has attributes <code>x</code> and
|
|
<code>y</code> to set the minimum resolution for the video device. This
|
|
sub-element is valid for model types "vga", "qxl", "bochs", and
|
|
"virtio".
|
|
</p>
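<p>
As a sketch, assuming a "vga" model and an arbitrary example
resolution, the <code>resolution</code> sub-element is nested inside
<code>model</code>:
</p>
<pre>
...
<devices>
  <video>
    <model type='vga'>
      <resolution x='1280' y='1024'/>
    </model>
  </video>
</devices>
...</pre>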
|
|
</dd>
|
|
|
|
<dt><code>acceleration</code></dt>
|
|
<dd>
|
|
Configure if video acceleration should be enabled.
|
|
<dl>
|
|
<dt><code>accel2d</code></dt>
|
|
<dd>Enable 2D acceleration (for vbox driver only,
|
|
<span class="since">since 0.7.1</span>)</dd>
|
|
|
|
<dt><code>accel3d</code></dt>
|
|
<dd>Enable 3D acceleration (for vbox driver
|
|
<span class="since">since 0.7.1</span>, qemu driver
|
|
<span class="since">since 1.3.0</span>)</dd>
|
|
|
|
<dt><code>rendernode</code></dt>
|
|
<dd>Absolute path to a host's DRI device to be used for
|
|
rendering (for 'vhostuser' driver only, <span
|
|
class="since">since 5.8.0</span>). If none is specified,
|
|
libvirt will pick one available.</dd>
|
|
</dl>
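<p>
A minimal sketch of enabling 3D acceleration, assuming a QEMU guest
with a "virtio" video model (the <code>acceleration</code> element is
a child of <code>model</code>):
</p>
<pre>
...
<devices>
  <video>
    <model type='virtio' heads='1'>
      <acceleration accel3d='yes'/>
    </model>
  </video>
</devices>
...</pre>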
|
|
</dd>
|
|
|
|
<dt><code>address</code></dt>
|
|
<dd>
|
|
The optional <code>address</code> sub-element can be used to
|
|
tie the video device to a particular PCI slot.
|
|
On S390, <code>address</code> can be used to provide the
|
|
CCW address for the video device (<span class="since">
|
|
since 4.2.0</span>).
|
|
</dd>
|
|
|
|
<dt><code>driver</code></dt>
|
|
<dd>
|
|
The subelement <code>driver</code> can be used to tune the device:
|
|
<dl>
|
|
<dt><code>name</code></dt>
|
|
<dd>
|
|
Specify the backend driver to use, either "qemu" or
|
|
"vhostuser" depending on the hypervisor features available
|
|
(<span class="since">since 5.8.0</span>). "qemu" is the
|
|
default QEMU backend. "vhostuser" will use a separate
|
|
vhost-user process backend (for <code>virtio</code>
|
|
device).
|
|
</dd>
|
|
<dt>virtio options</dt>
|
|
<dd>
|
|
<a href="#elementsVirtio">Virtio-specific options</a> can also be
|
|
set (<span class="since">Since 3.5.0</span>)
|
|
</dd>
|
|
<dt>VGA configuration</dt>
|
|
<dd>
|
|
Control how the video device is exposed to the guest using the
|
|
<code>vgaconf</code> attribute which takes the value "io", "on" or "off".
|
|
At present, it's only applicable to the bhyve's "gop" video model type
|
|
(<span class="since">Since 3.5.0</span>)
|
|
</dd>
|
|
</dl>
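<p>
As an illustrative sketch, a "virtio" video device served by a
separate vhost-user backend could be written as below. The render node
path is an assumption made for the example; it can be omitted to let
libvirt pick an available one:
</p>
<pre>
...
<devices>
  <video>
    <driver name='vhostuser'/>
    <model type='virtio' heads='1'>
      <acceleration rendernode='/dev/dri/renderD128'/>
    </model>
  </video>
</devices>
...</pre>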
|
|
</dd>
|
|
</dl>
|
|
|
|
<h4><a id="elementsConsole">Consoles, serial, parallel & channel devices</a></h4>
|
|
|
|
<p>
|
|
A character device provides a way to interact with the virtual machine.
|
|
Paravirtualized consoles, serial ports, parallel ports and channels are
|
|
all classed as character devices and so represented using the same syntax.
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<parallel type='pty'>
|
|
<source path='/dev/pts/2'/>
|
|
<target port='0'/>
|
|
</parallel>
|
|
<serial type='pty'>
|
|
<source path='/dev/pts/3'/>
|
|
<target port='0'/>
|
|
</serial>
|
|
<serial type='file'>
|
|
<source path='/tmp/file' append='on'>
|
|
<seclabel model='dac' relabel='no'/>
|
|
</source>
|
|
<target port='0'/>
|
|
</serial>
|
|
<console type='pty'>
|
|
<source path='/dev/pts/4'/>
|
|
<target port='0'/>
|
|
</console>
|
|
<channel type='unix'>
|
|
<source mode='bind' path='/tmp/guestfwd'/>
|
|
<target type='guestfwd' address='10.0.2.1' port='4600'/>
|
|
</channel>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<p>
|
|
In each of these directives, the top-level element name (parallel, serial,
|
|
console, channel) describes how the device is presented to the guest. The
|
|
guest interface is configured by the <code>target</code> element.
|
|
</p>
|
|
|
|
<p>
|
|
The interface presented to the host is given in the <code>type</code>
|
|
attribute of the top-level element. The host interface is
|
|
configured by the <code>source</code> element.
|
|
</p>
|
|
|
|
<p>
|
|
The <code>source</code> element may contain an optional
|
|
<code>seclabel</code> to override the way that labelling
|
|
is done on the socket path. If this element is not present,
|
|
the <a href="#seclabel">security label is inherited from
|
|
the per-domain setting</a>.
|
|
</p>
|
|
|
|
<p>
|
|
If the interface <code>type</code> presented to the host is "file",
|
|
then the <code>source</code> element may contain an optional attribute
|
|
<code>append</code> that specifies whether or not the information in
|
|
the file should be preserved on domain restart. Allowed values are
|
|
"on" and "off" (default). <span class="since">Since 1.3.1</span>.
|
|
</p>
|
|
|
|
<p>
|
|
Regardless of the <code>type</code>, character devices can
|
|
have an optional log file associated with them. This is
|
|
expressed via a <code>log</code> sub-element, with a
|
|
<code>file</code> attribute. There can also be an <code>append</code>
|
|
attribute which takes the same values described above.
|
|
<span class="since">Since 1.3.3</span>.
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<log file="/var/log/libvirt/qemu/guestname-serial0.log" append="off"/>
|
|
...</pre>
|
|
|
|
<p>
|
|
Each character device element has an optional
|
|
sub-element <code><address></code> which can tie the
|
|
device to a
|
|
particular <a href="#elementsControllers">controller</a> or PCI
|
|
slot.
|
|
</p>
|
|
|
|
<p>
|
|
For character devices with type <code>unix</code> or <code>tcp</code>
the <code>source</code> has an optional element <code>reconnect</code>
which configures the reconnect timeout if the connection is lost.
There are two attributes, <code>enabled</code> where possible
values are "yes" and "no" and <code>timeout</code> which is in
seconds. The <code>reconnect</code> element is valid only
for <code>connect</code> mode.
|
|
<span class="since">Since 3.7.0 (QEMU driver only)</span>.
|
|
</p>
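<p>
As a sketch, a TCP client character device that retries a lost
connection could look like this. The host, service and the 10 second
timeout are example values only:
</p>
<pre>
...
<devices>
  <serial type="tcp">
    <source mode="connect" host="127.0.0.1" service="2445">
      <reconnect enabled="yes" timeout="10"/>
    </source>
    <protocol type="raw"/>
    <target port="1"/>
  </serial>
</devices>
...</pre>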
|
|
|
|
<h5><a id="elementsCharGuestInterface">Guest interface</a></h5>
|
|
|
|
<p>
|
|
A character device presents itself to the guest as one of the following
|
|
types.
|
|
</p>
|
|
|
|
<h6><a id="elementCharParallel">Parallel port</a></h6>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<parallel type='pty'>
|
|
<source path='/dev/pts/2'/>
|
|
<target port='0'/>
|
|
</parallel>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<p>
|
|
<code>target</code> can have a <code>port</code> attribute, which
|
|
specifies the port number. Ports are numbered starting from 0. There are
|
|
usually 0, 1 or 2 parallel ports.
|
|
</p>
|
|
|
|
<h6><a id="elementCharSerial">Serial port</a></h6>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<!-- Serial port -->
|
|
<serial type='pty'>
|
|
<source path='/dev/pts/3'/>
|
|
<target port='0'/>
|
|
</serial>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<!-- USB serial port -->
|
|
<serial type='pty'>
|
|
<target type='usb-serial' port='0'>
|
|
<model name='usb-serial'/>
|
|
</target>
|
|
<address type='usb' bus='0' port='1'/>
|
|
</serial>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<p>
|
|
The <code>target</code> element can have an optional <code>port</code>
|
|
attribute, which specifies the port number (starting from 0), and an
|
|
optional <code>type</code> attribute: valid values are,
|
|
<span class="since">since 1.0.2</span>, <code>isa-serial</code> (usable
|
|
with x86 guests), <code>usb-serial</code> (usable whenever USB support
|
|
is available) and <code>pci-serial</code> (usable whenever PCI support
|
|
is available); <span class="since">since 3.10.0</span>,
|
|
<code>spapr-vio-serial</code> (usable with ppc64/pseries guests),
|
|
<code>system-serial</code> (usable with aarch64/virt and,
|
|
<span class="since">since 4.7.0</span>, riscv/virt guests) and
|
|
<code>sclp-serial</code> (usable with s390 and s390x guests) are
|
|
available as well.
|
|
</p>
|
|
|
|
<p>
|
|
<span class="since">Since 3.10.0</span>, the <code>target</code>
|
|
element can have an optional <code>model</code> subelement;
|
|
valid values for its <code>name</code> attribute are:
|
|
<code>isa-serial</code> (usable with the <code>isa-serial</code> target
|
|
type); <code>usb-serial</code> (usable with the <code>usb-serial</code>
|
|
target type); <code>pci-serial</code>
|
|
(usable with the <code>pci-serial</code> target type);
|
|
<code>spapr-vty</code> (usable with the <code>spapr-vio-serial</code>
|
|
target type); <code>pl011</code> and,
|
|
<span class="since">since 4.7.0</span>, <code>16550a</code> (usable
|
|
with the <code>system-serial</code> target type);
|
|
<code>sclpconsole</code> and <code>sclplmconsole</code> (usable with
|
|
the <code>sclp-serial</code> target type). Providing a target model is
|
|
usually unnecessary: libvirt will automatically pick one that's suitable
|
|
for the chosen target type, and overriding that value is generally not
|
|
recommended.
|
|
</p>
|
|
|
|
<p>
|
|
If any of the attributes is not specified by the user, libvirt will
|
|
choose a value suitable for most users.
|
|
</p>
|
|
|
|
<p>
|
|
Most target types support configuring the guest-visible device
|
|
address as <a href="#elementsAddress">documented above</a>; more
|
|
specifically, acceptable address types are <code>isa</code> (for
|
|
<code>isa-serial</code>), <code>usb</code> (for <code>usb-serial</code>),
|
|
<code>pci</code> (for <code>pci-serial</code>) and <code>spapr-vio</code>
|
|
(for <code>spapr-vio-serial</code>). The <code>system-serial</code>
|
|
and <code>sclp-serial</code> target types don't support specifying an
|
|
address.
|
|
</p>
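<p>
For example, a PCI serial port with an explicit guest-visible address
might be sketched as follows. The slot number is an arbitrary value
chosen for illustration:
</p>
<pre>
...
<devices>
  <!-- PCI serial port -->
  <serial type='pty'>
    <target type='pci-serial' port='0'>
      <model name='pci-serial'/>
    </target>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
  </serial>
</devices>
...</pre>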
|
|
|
|
<p>
|
|
For the relationship between serial ports and consoles,
|
|
<a href="#elementCharSerialAndConsole">see below</a>.
|
|
</p>
|
|
|
|
<h6><a id="elementCharConsole">Console</a></h6>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<!-- Serial console -->
|
|
<console type='pty'>
|
|
<source path='/dev/pts/2'/>
|
|
<target type='serial' port='0'/>
|
|
</console>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<!-- KVM virtio console -->
|
|
<console type='pty'>
|
|
<source path='/dev/pts/5'/>
|
|
<target type='virtio' port='0'/>
|
|
</console>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<p>
|
|
The <code>console</code> element is used to represent interactive
|
|
serial consoles. Depending on the type of guest in use and the specifics
|
|
of the configuration, the <code>console</code> element might represent
|
|
the same device as an existing <code>serial</code> element or a separate
|
|
device.
|
|
</p>
|
|
|
|
<p>
|
|
A <code>target</code> subelement is supported and works the same
|
|
way as with the <code>serial</code> element
|
|
(<a href="#elementCharSerial">see above</a> for details).
|
|
Valid values for the <code>type</code> attribute are:
|
|
<code>serial</code> (described below);
|
|
<code>virtio</code> (usable whenever VirtIO support is available);
|
|
<code>xen</code>, <code>lxc</code> and <code>openvz</code>
|
|
(available when the corresponding hypervisor is in use).
|
|
<code>sclp</code> and <code>sclplm</code> (usable for s390 and
|
|
s390x QEMU guests) are supported for compatibility reasons but should
|
|
not be used for new guests: use the <code>sclpconsole</code> and
|
|
<code>sclplmconsole</code> target models, respectively, with the
|
|
<code>serial</code> element instead.
|
|
</p>
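<p>
A minimal sketch of the recommended form for a new s390x guest, using
the <code>serial</code> element with the <code>sclpconsole</code>
target model instead of the legacy <code>sclp</code> console target
type:
</p>
<pre>
...
<devices>
  <serial type='pty'>
    <target type='sclp-serial' port='0'>
      <model name='sclpconsole'/>
    </target>
  </serial>
</devices>
...</pre>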
|
|
|
|
<p>
|
|
Of the target types listed above, <code>serial</code> is special in
|
|
that it doesn't represent a separate device, but rather the same
|
|
device as the first <code>serial</code> element. Due to this, there can
|
|
only be a single <code>console</code> element with target type
|
|
<code>serial</code> per guest.
|
|
</p>
|
|
|
|
<p>
|
|
Virtio consoles are usually accessible as <code>/dev/hvc[0-7]</code>
|
|
from inside the guest; for more information, see
|
|
<a href="http://fedoraproject.org/wiki/Features/VirtioSerial">http://fedoraproject.org/wiki/Features/VirtioSerial</a>.
|
|
<span class="since">Since 0.8.3</span>
|
|
</p>
|
|
|
|
<p>
|
|
For the relationship between serial ports and consoles,
|
|
<a href="#elementCharSerialAndConsole">see below</a>.
|
|
</p>
|
|
|
|
<h6><a id="elementCharSerialAndConsole">Relationship between serial ports and consoles</a></h6>
|
|
|
|
<p>
|
|
Due to historical reasons, the <code>serial</code> and
|
|
<code>console</code> elements have partially overlapping scopes.
|
|
</p>
|
|
|
|
<p>
|
|
In general, both elements are used to configure one or more serial
|
|
consoles to be used for interacting with the guest. The main difference
|
|
between the two is that <code>serial</code> is used for emulated,
|
|
usually native, serial consoles, whereas <code>console</code> is used
|
|
for paravirtualized ones.
|
|
</p>
|
|
|
|
<p>
|
|
Both emulated and paravirtualized serial consoles have advantages and
|
|
disadvantages:
|
|
</p>
|
|
|
|
<ul>
|
|
<li>
|
|
emulated serial consoles are usually initialized much earlier than
|
|
paravirtualized ones, so they can be used to control the bootloader
|
|
and display both firmware and early boot messages;
|
|
</li>
|
|
<li>
|
|
on several platforms, there can only be a single emulated serial
|
|
console per guest but paravirtualized consoles don't suffer from the
|
|
same limitation.
|
|
</li>
|
|
</ul>
|
|
|
|
<p>
|
|
A configuration such as:
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<console type='pty'>
|
|
<target type='serial'/>
|
|
</console>
|
|
<console type='pty'>
|
|
<target type='virtio'/>
|
|
</console>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<p>
|
|
will work on any platform and will result in one emulated serial console
|
|
for early boot logging / interactive / recovery use, and one
|
|
paravirtualized serial console to be used e.g. as a side channel. Most
|
|
people will be fine with having just the first <code>console</code>
|
|
element in their configuration, but if a specific configuration is
|
|
desired then both elements should be specified.
|
|
</p>
|
|
|
|
<p>
|
|
Note that, due to the compatibility concerns mentioned earlier, all the
|
|
following configurations:
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<serial type='pty'/>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<console type='pty'/>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<serial type='pty'/>
|
|
<console type='pty'/>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<p>
|
|
will be treated the same and will result in a single emulated serial
|
|
console being available to the guest.
|
|
</p>
|
|
|
|
<h6><a id="elementCharChannel">Channel</a></h6>
|
|
|
|
<p>
|
|
This represents a private communication channel between the host and the
|
|
guest.
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<channel type='unix'>
|
|
<source mode='bind' path='/tmp/guestfwd'/>
|
|
<target type='guestfwd' address='10.0.2.1' port='4600'/>
|
|
</channel>
|
|
|
|
<!-- KVM virtio channel -->
|
|
<channel type='pty'>
|
|
<target type='virtio' name='arbitrary.virtio.serial.port.name'/>
|
|
</channel>
|
|
<channel type='unix'>
|
|
<source mode='bind' path='/var/lib/libvirt/qemu/f16x86_64.agent'/>
|
|
<target type='virtio' name='org.qemu.guest_agent.0' state='connected'/>
|
|
</channel>
|
|
<channel type='spicevmc'>
|
|
<target type='virtio' name='com.redhat.spice.0'/>
|
|
</channel>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<p>
|
|
This can be implemented in a variety of ways. The specific type of
|
|
channel is given in the <code>type</code> attribute of the
|
|
<code>target</code> element. Different channel types have different
|
|
<code>target</code> attributes.
|
|
</p>
|
|
|
|
<dl>
|
|
<dt><code>guestfwd</code></dt>
|
|
<dd>TCP traffic sent by the guest to a given IP address and port is
|
|
forwarded to the channel device on the host. The <code>target</code>
|
|
element must have <code>address</code> and <code>port</code> attributes.
|
|
<span class="since">Since 0.7.3</span></dd>
|
|
|
|
<dt><code>virtio</code></dt>
|
|
<dd>Paravirtualized virtio channel. Channel is exposed in the guest under
|
|
/dev/vport*, and if the optional attribute <code>name</code> is specified,
|
|
/dev/virtio-ports/$name (for more info, please see
|
|
<a href="http://fedoraproject.org/wiki/Features/VirtioSerial">http://fedoraproject.org/wiki/Features/VirtioSerial</a>). The
|
|
optional element <code>address</code> can tie the channel to a
|
|
particular <code>type='virtio-serial'</code>
|
|
controller, <a href="#elementsAddress">documented above</a>.
|
|
With qemu, if <code>name</code> is "org.qemu.guest_agent.0",
|
|
then libvirt can interact with a guest agent installed in the
|
|
guest, for actions such as guest shutdown or file system quiescing.
|
|
<span class="since">Since 0.7.7, guest agent interaction
|
|
since 0.9.10</span>. Moreover, <span class="since">since 1.0.6</span>
it is possible to have the source path auto-generated for virtio unix channels.
This is very useful in the case of a qemu guest agent, where users don't
usually care about the source path since it is libvirt that talks to
the guest agent. Users who want to utilize this feature should
leave the <code><source></code> element out. <span class="since">Since
|
|
1.2.11</span> the active XML for a virtio channel may contain an optional
|
|
<code>state</code> attribute that reflects whether a process in the
|
|
guest is active on the channel. This is an output-only attribute.
|
|
Possible values for the <code>state</code> attribute are
|
|
<code>connected</code> and <code>disconnected</code>.
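<p>
As a sketch of the auto-generated source path described above, a
guest agent channel can be declared without a <code>source</code>
element:
</p>
<pre>
...
<devices>
  <channel type='unix'>
    <target type='virtio' name='org.qemu.guest_agent.0'/>
  </channel>
</devices>
...</pre>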
|
|
</dd>
|
|
<dt><code>xen</code></dt>
|
|
<dd> Paravirtualized Xen channel. Channel is exposed in the guest as a
|
|
Xen console but identified with a name. Setup and consumption of a Xen
|
|
channel depends on software and configuration in the guest
|
|
(for more info, please see <a href="http://xenbits.xen.org/docs/unstable/misc/channel.txt">http://xenbits.xen.org/docs/unstable/misc/channel.txt</a>).
|
|
Channel source path semantics are the same as the virtio target type.
|
|
The <code>state</code> attribute is not supported since Xen channels
|
|
lack the necessary probing mechanism.
|
|
<span class="since">Since 2.3.0</span>
|
|
</dd>
|
|
<dt><code>spicevmc</code></dt>
|
|
<dd>Paravirtualized SPICE channel. The domain must also have a
|
|
SPICE server as a <a href="#elementsGraphics">graphics
|
|
device</a>, at which point the host piggy-backs messages
|
|
across the <code>main</code> channel. The <code>target</code>
|
|
element must be present, with
|
|
attribute <code>type='virtio'</code>; an optional
|
|
attribute <code>name</code> controls how the guest will have
|
|
access to the channel, and defaults
|
|
to <code>name='com.redhat.spice.0'</code>. The
|
|
optional <code>address</code> element can tie the channel to a
|
|
particular <code>type='virtio-serial'</code> controller.
|
|
<span class="since">Since 0.8.8</span></dd>
|
|
</dl>
|
|
|
|
<h5><a id="elementsCharHostInterface">Host interface</a></h5>
|
|
|
|
<p>
|
|
A character device presents itself to the host as one of the following
|
|
types.
|
|
</p>
|
|
|
|
<h6><a id="elementsCharSTDIO">Domain logfile</a></h6>
|
|
|
|
<p>
|
|
This disables all input on the character device, and sends output
|
|
into the virtual machine's logfile.
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<console type='stdio'>
|
|
<target port='1'/>
|
|
</console>
|
|
</devices>
|
|
...</pre>
|
|
|
|
|
|
<h6><a id="elementsCharFle">Device logfile</a></h6>
|
|
|
|
<p>
|
|
A file is opened and all data sent to the character
|
|
device is written to the file.
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<serial type="file">
|
|
<source path="/var/log/vm/vm-serial.log"/>
|
|
<target port="1"/>
|
|
</serial>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<h6><a id="elementsCharVC">Virtual console</a></h6>
|
|
|
|
<p>
|
|
Connects the character device to the graphical framebuffer in
|
|
a virtual console. This is typically accessed via a special
|
|
hotkey sequence such as "ctrl+alt+3".
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<serial type='vc'>
|
|
<target port="1"/>
|
|
</serial>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<h6><a id="elementsCharNull">Null device</a></h6>
|
|
|
|
<p>
|
|
Connects the character device to the void. No data is ever
|
|
provided to the input. All data written is discarded.
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<serial type='null'>
|
|
<target port="1"/>
|
|
</serial>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<h6><a id="elementsCharPTY">Pseudo TTY</a></h6>
|
|
|
|
<p>
|
|
A Pseudo TTY is allocated using /dev/ptmx. A suitable client
|
|
such as 'virsh console' can connect to interact with the
|
|
serial port locally.
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<serial type="pty">
|
|
<source path="/dev/pts/3"/>
|
|
<target port="1"/>
|
|
</serial>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<p>
|
|
Note the special case of <console type='pty'>: the TTY
path is also duplicated as an attribute tty='/dev/pts/3'
on the top level <console> tag. This provides compatibility
with the existing syntax for <console> tags.
|
|
</p>
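<p>
For illustration, the XML reported for such a console could look
roughly like this, with the same path appearing both in the
<code>tty</code> attribute and in the <code>source</code> element
(the path is an example value):
</p>
<pre>
...
<devices>
  <console type='pty' tty='/dev/pts/3'>
    <source path='/dev/pts/3'/>
    <target port='1'/>
  </console>
</devices>
...</pre>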
|
|
|
|
<h6><a id="elementsCharHost">Host device proxy</a></h6>
|
|
|
|
<p>
|
|
The character device is passed through to the underlying
|
|
physical character device. The device types must match,
|
|
e.g. the emulated serial port should only be connected to
|
|
a host serial port - don't connect a serial port to a parallel
|
|
port.
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<serial type="dev">
|
|
<source path="/dev/ttyS0"/>
|
|
<target port="1"/>
|
|
</serial>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<h6><a id="elementsCharPipe">Named pipe</a></h6>
|
|
|
|
<p>
|
|
The character device writes output to a named pipe. See pipe(7) for
|
|
more info.
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<serial type="pipe">
|
|
<source path="/tmp/mypipe"/>
|
|
<target port="1"/>
|
|
</serial>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<h6><a id="elementsCharTCP">TCP client/server</a></h6>
|
|
|
|
<p>
|
|
The character device acts as a TCP client connecting to a
|
|
remote server.
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<serial type="tcp">
|
|
<source mode="connect" host="0.0.0.0" service="2445"/>
|
|
<protocol type="raw"/>
|
|
<target port="1"/>
|
|
</serial>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<p>
|
|
Or as a TCP server waiting for a client connection.
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<serial type="tcp">
|
|
<source mode="bind" host="127.0.0.1" service="2445"/>
|
|
<protocol type="raw"/>
|
|
<target port="1"/>
|
|
</serial>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<p>
|
|
Alternatively you can use <code>telnet</code> instead
|
|
of <code>raw</code> TCP in order to utilize the telnet protocol
|
|
for the connection.
|
|
</p>
|
|
<p>
|
|
<span class="since">Since 0.8.5,</span> some hypervisors support
|
|
use of either <code>telnets</code> (secure telnet) or <code>tls</code>
|
|
(via secure sockets layer) as the transport protocol for connections.
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<serial type="tcp">
|
|
<source mode="connect" host="0.0.0.0" service="2445"/>
|
|
<protocol type="telnet"/>
|
|
<target port="1"/>
|
|
</serial>
|
|
...
|
|
<serial type="tcp">
|
|
<source mode="bind" host="127.0.0.1" service="2445"/>
|
|
<protocol type="telnet"/>
|
|
<target port="1"/>
|
|
</serial>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<p>
|
|
<span class="since">Since 2.4.0,</span> the optional attribute
|
|
<code>tls</code> can be used to control whether a chardev
|
|
TCP communication channel would utilize a hypervisor configured
|
|
TLS X.509 certificate environment in order to encrypt the data
|
|
channel. For the QEMU hypervisor, usage of a TLS environment can
|
|
be controlled on the host by the <code>chardev_tls</code> and
|
|
<code>chardev_tls_x509_cert_dir</code> or
|
|
<code>default_tls_x509_cert_dir</code> settings in the file
|
|
/etc/libvirt/qemu.conf. If <code>chardev_tls</code> is enabled,
|
|
then unless the <code>tls</code> attribute is set to "no", libvirt
|
|
will use the host configured TLS environment.
|
|
If <code>chardev_tls</code> is disabled, but the <code>tls</code>
|
|
attribute is set to "yes", then libvirt will attempt to use the
|
|
host TLS environment if either the <code>chardev_tls_x509_cert_dir</code>
|
|
or <code>default_tls_x509_cert_dir</code> TLS directory structure exists.
|
|
</p>
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<serial type="tcp">
|
|
<source mode='connect' host="127.0.0.1" service="5555" tls="yes"/>
|
|
<protocol type="raw"/>
|
|
<target port="0"/>
|
|
</serial>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<h6><a id="elementsCharUDP">UDP network console</a></h6>
|
|
|
|
<p>
|
|
The character device acts as a UDP netconsole service,
|
|
sending and receiving packets. This is a lossy service.
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<serial type="udp">
|
|
<source mode="bind" host="0.0.0.0" service="2445"/>
|
|
<source mode="connect" host="0.0.0.0" service="2445"/>
|
|
<target port="1"/>
|
|
</serial>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<h6><a id="elementsCharUNIX">UNIX domain socket client/server</a></h6>
|
|
|
|
<p>
|
|
The character device acts as a UNIX domain socket server,
|
|
accepting connections from local clients.
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<serial type="unix">
|
|
<source mode="bind" path="/tmp/foo"/>
|
|
<target port="1"/>
|
|
</serial>
|
|
</devices>
|
|
...</pre>
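<p>
Or, as a sketch that assumes a server is already listening on the
example path, the character device can act as a UNIX domain socket
client by using <code>connect</code> mode:
</p>
<pre>
...
<devices>
  <serial type="unix">
    <source mode="connect" path="/tmp/foo"/>
    <target port="1"/>
  </serial>
</devices>
...</pre>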
|
|
|
|
<h6><a id="elementsCharSpiceport">Spice channel</a></h6>
|
|
|
|
<p>
|
|
The character device is accessible through spice connection
|
|
under a channel name specified in the <code>channel</code>
|
|
attribute. <span class="since">Since 1.2.2</span>
|
|
</p>
|
|
<p>
|
|
Note: depending on the hypervisor, spiceports might (or might not)
|
|
be enabled on domains with or without <a href="#elementsGraphics">spice
|
|
graphics</a>.
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<serial type="spiceport">
|
|
<source channel="org.qemu.console.serial.0"/>
|
|
<target port="1"/>
|
|
</serial>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<h6><a id="elementsNmdm">Nmdm device</a></h6>
|
|
|
|
<p>
|
|
The nmdm device driver, available on FreeBSD, provides two
|
|
tty devices connected together by a virtual null modem cable.
|
|
<span class="since">Since 1.2.4</span>
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<serial type="nmdm">
|
|
<source master="/dev/nmdm0A" slave="/dev/nmdm0B"/>
|
|
</serial>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<p>
|
|
The <code>source</code> element has these attributes:
|
|
</p>
|
|
|
|
<dl>
|
|
<dt><code>master</code></dt>
|
|
<dd>Master device of the pair, that is passed to the hypervisor.
|
|
Device is specified by a fully qualified path.</dd>
|
|
|
|
<dt><code>slave</code></dt>
|
|
<dd>Slave device of the pair, that is passed to the clients for connection
|
|
to the guest console. Device is specified by a fully qualified path.</dd>
|
|
</dl>
|
|
|
|
<h4><a id="elementsSound">Sound devices</a></h4>
|
|
|
|
<p>
|
|
A virtual sound card can be attached to the guest via the
|
|
<code>sound</code> element. <span class="since">Since 0.4.3</span>
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<sound model='es1370'/>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<dl>
|
|
<dt><code>sound</code></dt>
|
|
<dd>
|
|
The <code>sound</code> element has one mandatory attribute,
|
|
<code>model</code>, which specifies what real sound device is emulated.
|
|
Valid values are specific to the underlying hypervisor, though typical
|
|
choices are 'es1370', 'sb16', 'ac97', 'ich6' and 'usb'.
|
|
(<span class="since">
|
|
'ac97' only since 0.6.0, 'ich6' only since 0.8.8,
|
|
'usb' only since 1.2.7</span>)
|
|
</dd>
|
|
</dl>
|
|
|
|
<p>
|
|
<span class="since">Since 0.9.13</span>, a sound element
|
|
with <code>ich6</code> model can have optional
|
|
sub-elements <code><codec></code> to attach various audio
|
|
codecs to the audio device. If not specified, a default codec
|
|
will be attached to allow playback and recording.
|
|
</p>
|
|
<p>
|
|
Valid values for the <code>type</code> attribute of <code>codec</code> are:
|
|
</p>
|
|
<p>
|
|
<ul>
|
|
<li>'duplex' - advertise a line-in and a line-out </li>
|
|
<li>'micro' - advertise a speaker and a microphone </li>
|
|
<li>'output' - advertise a line-out
|
|
<span class="since">Since 4.4.0</span></li>
|
|
</ul>
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<sound model='ich6'>
|
|
<codec type='micro'/>
|
|
</sound>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<p>
|
|
Each <code>sound</code> element has an optional
|
|
sub-element <code><address></code> which can tie the
|
|
device to a particular PCI
|
|
slot, <a href="#elementsAddress">documented above</a>.
|
|
</p>
|
|
|
|
<h4><a id="elementsWatchdog">Watchdog device</a></h4>
|
|
|
|
<p>
|
|
A virtual hardware watchdog device can be added to the guest via
|
|
the <code>watchdog</code> element.
|
|
<span class="since">Since 0.7.3, QEMU and KVM only</span>
|
|
</p>
|
|
|
|
<p>
|
|
The watchdog device requires an additional driver and management
|
|
daemon in the guest. Just enabling the watchdog in the libvirt
|
|
configuration does not do anything useful on its own.
|
|
</p>
|
|
|
|
<p>
|
|
Currently libvirt does not support notification when the
|
|
watchdog fires. This feature is planned for a future version of
|
|
libvirt.
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<watchdog model='i6300esb'/>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<watchdog model='i6300esb' action='poweroff'/>
|
|
</devices>
|
|
</domain></pre>
|
|
|
|
<dl>
|
|
<dt><code>model</code></dt>
|
|
<dd>
|
|
<p>
|
|
The required <code>model</code> attribute specifies what real
|
|
watchdog device is emulated. Valid values are specific to the
|
|
underlying hypervisor.
|
|
</p>
|
|
<p>
|
|
QEMU and KVM support:
|
|
</p>
|
|
<ul>
|
|
<li>'i6300esb' - the recommended device,
|
|
emulating a PCI Intel 6300ESB </li>
|
|
<li>'ib700' - emulating an ISA iBase IB700 </li>
|
|
<li>'diag288' - emulating an S390 DIAG288 device
|
|
<span class="since">Since 1.2.17</span></li>
|
|
</ul>
|
|
</dd>
|
|
<dt><code>action</code></dt>
|
|
<dd>
|
|
<p>
|
|
The optional <code>action</code> attribute describes what
|
|
action to take when the watchdog expires. Valid values are
|
|
specific to the underlying hypervisor.
|
|
</p>
|
|
<p>
|
|
QEMU and KVM support:
|
|
</p>
|
|
<ul>
|
|
<li>'reset' - default, forcefully reset the guest</li>
|
|
<li>'shutdown' - gracefully shut down the guest
|
|
(not recommended) </li>
|
|
<li>'poweroff' - forcefully power off the guest</li>
|
|
<li>'pause' - pause the guest</li>
|
|
<li>'none' - do nothing</li>
|
|
<li>'dump' - automatically dump the guest
|
|
<span class="since">Since 0.8.7</span></li>
|
|
<li>'inject-nmi' - inject a non-maskable interrupt
|
|
into the guest
|
|
<span class="since">Since 1.2.17</span></li>
|
|
</ul>
|
|
<p>
|
|
Note 1: the 'shutdown' action requires that the guest
|
|
is responsive to ACPI signals. In the sort of situations
|
|
where the watchdog has expired, guests are usually unable
|
|
to respond to ACPI signals. Therefore using 'shutdown'
|
|
is not recommended.
|
|
</p>
|
|
<p>
|
|
Note 2: the directory to save dump files can be configured
|
|
by <code>auto_dump_path</code> in file /etc/libvirt/qemu.conf.
|
|
</p>
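<p>
As a minimal sketch, a watchdog that dumps the guest when it expires
could be configured as follows; the dump is then written to the
directory given by <code>auto_dump_path</code>:
</p>
<pre>
...
<devices>
  <watchdog model='i6300esb' action='dump'/>
</devices>
...</pre>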
|
|
</dd>
|
|
</dl>
|
|
|
|
<h4><a id="elementsMemBalloon">Memory balloon device</a></h4>
|
|
|
|
<p>
|
|
A virtual memory balloon device is added to all Xen and KVM/QEMU
guests. It appears in the domain XML as the <code>memballoon</code>
element. Libvirt adds it automatically when appropriate, so there is no
need to add this element to the guest XML explicitly unless a
specific PCI slot needs to be assigned.
<span class="since">Since 0.8.3, Xen, QEMU and KVM only</span>
Additionally, <span class="since">since 0.8.4</span>, the
memballoon device can be explicitly disabled by using
<code>model='none'</code>.
|
|
</p>
|
|
|
|
<p>
|
|
Example: automatically added device with KVM
|
|
</p>
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<memballoon model='virtio'/>
|
|
</devices>
|
|
...</pre>
|
|
|
|
<p>
|
|
Example: manually added device with static PCI slot 2 requested
|
|
</p>
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<memballoon model='virtio'>
|
|
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
|
|
<stats period='10'/>
|
|
<driver iommu='on' ats='on'/>
|
|
</memballoon>
|
|
</devices>
|
|
</domain></pre>
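<p>
Example: explicitly disabling the memory balloon. This minimal sketch
shows the <code>model='none'</code> case mentioned above; no other
attributes or subelements are assumed to be needed.
</p>
<pre>
...
<devices>
  <memballoon model='none'/>
</devices>
...</pre>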
|
|
|
|
<dl>
|
|
<dt><code>model</code></dt>
|
|
<dd>
|
|
<p>
|
|
The required <code>model</code> attribute specifies what type
|
|
of balloon device is provided. Valid values are specific to
|
|
the virtualization platform
|
|
</p>
|
|
<ul>
|
|
<li>'virtio' - default with QEMU/KVM</li>
|
|
<li>'virtio-transitional' <span class="since">Since 5.2.0</span></li>
|
|
<li>'virtio-non-transitional' <span class="since">Since 5.2.0</span></li>
|
|
<li>'xen' - default with Xen</li>
|
|
</ul>
|
|
See <a href="#elementsVirtioTransitional">Virtio transitional devices</a>
|
|
for more details.
|
|
</dd>
|
|
<dt><code>autodeflate</code></dt>
|
|
<dd>
|
|
<p>
|
|
The optional <code>autodeflate</code> attribute enables or disables
(values "on"/"off", respectively) the ability of the
QEMU virtio memory balloon to release some memory at the last moment
before a guest process is killed by the Out of Memory killer.
|
|
<span class="since">Since 1.3.1, QEMU and KVM only</span>
|
|
</p>
|
|
</dd>
|
|
<dt><code>period</code></dt>
|
|
<dd>
|
|
<p>
|
|
The optional <code>period</code> attribute of the <code>stats</code>
subelement sets the interval, in seconds, at which the QEMU virtio
memory balloon driver collects the statistics reported by the
<code>virsh dommemstat [domain]</code> command. By default, collection
is not enabled. To enable it, use the
<code>virsh dommemstat [domain] --period [number]</code> command or use
<code>virsh edit</code> to add the option to the XML definition.
<code>virsh dommemstat</code> accepts the options <code>--live</code>,
<code>--current</code>, or <code>--config</code>. If none of these
options is given, the change for a running domain is made only to the
active guest. If the QEMU driver is not at the right revision, the
attempt to set the period will fail. Large values (e.g. many years)
might be ignored.
|
|
<span class='since'>Since 1.1.1, requires QEMU 1.5</span>
|
|
</p>
|
|
</dd>
|
|
<dt><code>driver</code></dt>
|
|
<dd>
|
|
For model <code>virtio</code> memballoon,
|
|
<a href="#elementsVirtio">Virtio-specific options</a> can also be
|
|
set. (<span class="since">Since 3.5.0</span>)
|
|
</dd>
|
|
</dl>
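<p>
Example: a balloon with automatic deflation and statistics collection
enabled. This is a hedged sketch that combines the
<code>autodeflate</code> attribute and the <code>period</code> setting
described above; the period value of 5 is arbitrary and chosen only for
illustration.
</p>
<pre>
...
<devices>
  <memballoon model='virtio' autodeflate='on'>
    <stats period='5'/>
  </memballoon>
</devices>
...</pre>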
|
|
<h4><a id="elementsRng">Random number generator device</a></h4>
|
|
|
|
<p>
|
|
The virtual random number generator device allows the host to pass
|
|
through entropy to guest operating systems.
|
|
<span class="since">Since 1.0.3</span>
|
|
</p>
|
|
|
|
<p>
|
|
Example: usage of the RNG device:
|
|
</p>
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<rng model='virtio'>
|
|
<rate period="2000" bytes="1234"/>
|
|
<backend model='random'>/dev/random</backend>
|
|
<!-- OR -->
|
|
<backend model='egd' type='udp'>
|
|
<source mode='bind' service='1234'/>
|
|
<source mode='connect' host='1.2.3.4' service='1234'/>
|
|
</backend>
|
|
<!-- OR -->
|
|
<backend model='builtin'/>
|
|
</rng>
|
|
</devices>
|
|
...
|
|
</pre>
|
|
<dl>
|
|
<dt><code>model</code></dt>
|
|
<dd>
|
|
<p>
|
|
The required <code>model</code> attribute specifies what type
|
|
of RNG device is provided. Valid values are specific to
|
|
the virtualization platform:
|
|
</p>
|
|
<ul>
|
|
<li>'virtio' - supported by qemu and virtio-rng kernel module</li>
|
|
<li>'virtio-transitional' <span class='since'>Since 5.2.0</span></li>
|
|
<li>'virtio-non-transitional' <span class='since'>Since 5.2.0</span></li>
|
|
</ul>
|
|
See <a href="#elementsVirtioTransitional">Virtio transitional devices</a>
|
|
for more details.
|
|
</dd>
|
|
<dt><code>rate</code></dt>
|
|
<dd>
|
|
<p>
|
|
The optional <code>rate</code> element allows limiting the rate at
|
|
which entropy can be consumed from the source. The mandatory
|
|
attribute <code>bytes</code> specifies how many bytes are permitted
|
|
to be consumed per period. An optional <code>period</code> attribute
|
|
specifies the duration of a period in milliseconds; if omitted, the
|
|
period is taken as 1000 milliseconds (1 second).
|
|
<span class='since'>Since 1.0.4</span>
|
|
</p>
|
|
</dd>
|
|
<dt><code>backend</code></dt>
|
|
<dd>
|
|
<p>
|
|
The <code>backend</code> element specifies the source of entropy
|
|
to be used for the domain. The source model is configured using the
|
|
<code>model</code> attribute. Supported source models are:
|
|
</p>
|
|
<dl>
|
|
<dt><code>random</code></dt>
|
|
<dd>
|
|
<p>
|
|
This backend type expects a non-blocking character device
|
|
as input. The file name is specified as contents of the
|
|
<code>backend</code> element. <span class='since'>Since
|
|
1.3.4</span> any path is accepted. Before that
|
|
<code>/dev/random</code> and <code>/dev/hwrng</code> were
|
|
the only accepted paths. When no file name is specified,
|
|
the hypervisor default is used. For QEMU, the default is
|
|
<code>/dev/random</code>. However, the recommended source
|
|
of entropy is <code>/dev/urandom</code> (as it doesn't
|
|
have the limitations of <code>/dev/random</code>).
|
|
</p>
|
|
</dd>
|
|
<dt><code>egd</code></dt>
|
|
<dd>
|
|
<p>
|
|
This backend connects to a source using the EGD protocol.
|
|
The source is specified as a character device. Refer to
|
|
<a href='#elementsCharHostInterface'>character device host interface</a>
|
|
for more information.
|
|
</p>
|
|
</dd>
|
|
<dt><code>builtin</code></dt>
|
|
<dd>
|
|
<p>
|
|
This backend uses QEMU's built-in random generator, which uses the
<code>getrandom()</code> syscall as the source of entropy.
|
|
(<span class="since">Since 6.1.0 and QEMU 4.2</span>)
|
|
</p>
|
|
</dd>
|
|
</dl>
|
|
</dd>
|
|
<dt><code>driver</code></dt>
|
|
<dd>
|
|
The subelement <code>driver</code> can be used to tune the device:
|
|
<dl>
|
|
<dt>virtio options</dt>
|
|
<dd>
|
|
<a href="#elementsVirtio">Virtio-specific options</a> can also be
|
|
set. (<span class="since">Since 3.5.0</span>)
|
|
</dd>
|
|
</dl>
|
|
</dd>
|
|
|
|
</dl>
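<p>
Example: an RNG device backed by <code>/dev/urandom</code>, the source
recommended in the description of the <code>random</code> backend. The
rate limit is an arbitrary illustrative value; because
<code>period</code> is omitted, it applies per 1000 milliseconds.
</p>
<pre>
...
<devices>
  <rng model='virtio'>
    <rate bytes='1024'/>
    <backend model='random'>/dev/urandom</backend>
  </rng>
</devices>
...</pre>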
|
|
|
|
<h4><a id="elementsTpm">TPM device</a></h4>
|
|
|
|
<p>
|
|
The TPM device enables a QEMU guest to have access to TPM
|
|
functionality. The TPM device may either be a TPM 1.2 or
|
|
a TPM 2.0.
|
|
</p>
|
|
<p>
|
|
The TPM passthrough device type provides access to the host's TPM
|
|
for one QEMU guest. No other software may be using the TPM device,
|
|
typically /dev/tpm0, at the time the QEMU guest is started.
|
|
<span class="since">'passthrough' since 1.0.5</span>
|
|
</p>
|
|
|
|
<p>
|
|
Example: usage of the TPM passthrough device
|
|
</p>
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<tpm model='tpm-tis'>
|
|
<backend type='passthrough'>
|
|
<device path='/dev/tpm0'/>
|
|
</backend>
|
|
</tpm>
|
|
</devices>
|
|
...
|
|
</pre>
|
|
|
|
<p>
|
|
The emulator device type gives access to a TPM emulator providing
|
|
TPM functionality for each VM. QEMU talks to it over a Unix socket. With
|
|
the emulator device type each guest gets its own private TPM.
|
|
<span class="since">'emulator' since 4.5.0</span>
|
|
The state of the TPM emulator can be encrypted by providing an
|
|
<code>encryption</code> element.
|
|
<span class="since">'encryption' since 5.6.0</span>
|
|
</p>
|
|
<p>
|
|
Example: usage of the TPM Emulator
|
|
</p>
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<tpm model='tpm-tis'>
|
|
<backend type='emulator' version='2.0'>
|
|
<encryption secret='6dd3e4a5-1d76-44ce-961f-f119f5aad935'/>
|
|
</backend>
|
|
</tpm>
|
|
</devices>
|
|
...
|
|
</pre>
|
|
<dl>
|
|
<dt><code>model</code></dt>
|
|
<dd>
|
|
<p>
|
|
The <code>model</code> attribute specifies what device
|
|
model QEMU provides to the guest. If no model name is provided,
|
|
<code>tpm-tis</code> will automatically be chosen for non-PPC64
|
|
architectures.
|
|
<span class="since">Since 4.4.0</span>, another available choice
|
|
is the <code>tpm-crb</code>, which should only be used when the
|
|
backend device is a TPM 2.0. <span class="since">Since 6.1.0</span>,
|
|
pSeries guests on PPC64 are supported and the default is
|
|
<code>tpm-spapr</code>.
|
|
|
|
<span class="since">Since 6.5.0</span>, a new model called
|
|
<code>spapr-tpm-proxy</code> was added for pSeries guests. This model
|
|
only works with the <code>passthrough</code> backend. It creates a
|
|
TPM Proxy device that communicates with an existing TPM Resource Manager
|
|
in the host, for example <code>/dev/tpmrm0</code>, enabling the guest to
|
|
run in secure virtual machine mode with the help of an Ultravisor. Adding
|
|
a TPM Proxy to a pSeries guest brings no security benefits unless the guest
|
|
is running on a PPC64 host that has an Ultravisor and a TPM Resource Manager.
|
|
Only one TPM Proxy device is allowed per guest, but a TPM Proxy device can
|
|
be added together with
|
|
other TPM devices.
|
|
</p>
|
|
</dd>
|
|
<dt><code>backend</code></dt>
|
|
<dd>
|
|
<p>
|
|
The <code>backend</code> element specifies the type of
|
|
TPM device. The following types are supported:
|
|
</p>
|
|
<dl>
|
|
<dt><code>passthrough</code></dt>
|
|
<dd>
|
|
<p>
|
|
Use the host's TPM or TPM Resource Manager device.
|
|
</p>
|
|
<p>
|
|
This backend type requires exclusive access to a TPM device on
the host. An example of such a device is /dev/tpm0. The fully
qualified file name is specified by the <code>path</code> attribute
of the <code>device</code> element. If no file name is specified,
/dev/tpm0 is used automatically.
|
|
|
|
<span class="since">Since 6.5.0</span>, when choosing the
|
|
<code>spapr-tpm-proxy</code> model, the file name specified is
|
|
expected to be a TPM Resource Manager device, e.g.
|
|
<code>/dev/tpmrm0</code>.
|
|
</p>
|
|
</dd>
|
|
</dl>
|
|
<dl>
|
|
<dt><code>emulator</code></dt>
|
|
<dd>
|
|
<p>
|
|
For this backend type the 'swtpm' TPM Emulator must be installed on the
|
|
host. Libvirt will automatically start an independent TPM emulator
|
|
for each QEMU guest requesting access to it.
|
|
</p>
|
|
</dd>
|
|
</dl>
|
|
</dd>
|
|
<dt><code>version</code></dt>
|
|
<dd>
|
|
<p>
|
|
The <code>version</code> attribute indicates the version
|
|
of the TPM. By default a TPM 1.2 is created. This attribute
|
|
only works with the <code>emulator</code> backend. The following
|
|
versions are supported:
|
|
</p>
|
|
<ul>
|
|
<li>'1.2' : creates a TPM 1.2</li>
|
|
<li>'2.0' : creates a TPM 2.0</li>
|
|
</ul>
|
|
</dd>
|
|
<dt><code>encryption</code></dt>
|
|
<dd>
|
|
<p>
|
|
The <code>encryption</code> element allows the state of a TPM emulator
|
|
to be encrypted. The <code>secret</code> must reference a secret object
|
|
that holds the passphrase from which the encryption key will be derived.
|
|
</p>
|
|
</dd>
|
|
</dl>
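<p>
Example: a pSeries guest with an emulated TPM 2.0 and a TPM Proxy
device. This is a sketch based on the model descriptions above, which
state that a TPM Proxy works only with the <code>passthrough</code>
backend and may be added together with other TPM devices. It assumes
the host provides a TPM Resource Manager at <code>/dev/tpmrm0</code>.
</p>
<pre>
...
<devices>
  <tpm model='tpm-spapr'>
    <backend type='emulator' version='2.0'/>
  </tpm>
  <tpm model='spapr-tpm-proxy'>
    <backend type='passthrough'>
      <device path='/dev/tpmrm0'/>
    </backend>
  </tpm>
</devices>
...</pre>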
|
|
|
|
<h4><a id="elementsNVRAM">NVRAM device</a></h4>
|
|
<p>
|
|
An nvram device is always added to pSeries guests on PPC64, and its
address can be changed. The <code>nvram</code> element (only valid for
pSeries guests, <span class="since">since 1.0.5</span>) is provided to
set that address.
|
|
</p>
|
|
<p>
|
|
Example: usage of NVRAM configuration
|
|
</p>
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<nvram>
|
|
<address type='spapr-vio' reg='0x00003000'/>
|
|
</nvram>
|
|
</devices>
|
|
...
|
|
</pre>
|
|
<dl>
|
|
<dt><code>spapr-vio</code></dt>
|
|
<dd>
|
|
<p>
|
|
VIO device address type, only valid for PPC64.
|
|
</p>
|
|
</dd>
|
|
<dt><code>reg</code></dt>
|
|
<dd>
|
|
<p>
|
|
Device address
|
|
</p>
|
|
</dd>
|
|
</dl>
|
|
|
|
<h4><a id="elementsPanic">Panic device</a></h4>
|
|
<p>
|
|
The panic device enables libvirt to receive panic notifications from a
QEMU guest.
|
|
<span class="since">Since 1.2.1, QEMU and KVM only</span>
|
|
</p>
|
|
<p>
|
|
This feature is always enabled for:
|
|
</p>
|
|
<ul>
|
|
<li>pSeries guests, since it's implemented by the guest firmware</li>
|
|
<li>S390 guests, since it's an integral part of the S390 architecture</li>
|
|
</ul>
|
|
<p>
|
|
For the guest types listed above, libvirt automatically adds a
|
|
<code>panic</code> element to the domain XML.
|
|
</p>
|
|
<p>
|
|
Example: usage of panic configuration
|
|
</p>
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<panic model='hyperv'/>
|
|
<panic model='isa'>
|
|
<address type='isa' iobase='0x505'/>
|
|
</panic>
|
|
</devices>
|
|
...
|
|
</pre>
|
|
<dl>
|
|
<dt><code>model</code></dt>
|
|
<dd>
|
|
<p>
|
|
The optional <code>model</code> attribute specifies what type
|
|
of panic device is provided. The panic model used when this attribute
|
|
is missing depends on the hypervisor and guest arch.
|
|
</p>
|
|
<ul>
|
|
<li>'isa' - for ISA pvpanic device</li>
|
|
<li>'pseries' - default and valid only for pSeries guests.</li>
|
|
<li>'hyperv' - for Hyper-V crash CPU feature.
|
|
<span class="since">Since 1.3.0, QEMU and KVM only</span></li>
|
|
<li>'s390' - default for S390 guests.
|
|
<span class="since">Since 1.3.5</span></li>
|
|
</ul>
|
|
</dd>
|
|
<dt><code>address</code></dt>
|
|
<dd>
|
|
<p>
|
|
The address of the panic device. The default ioport is 0x505. Most users
don't need to specify an address, and doing so is forbidden
altogether for the s390, pseries and hyperv models.
|
|
</p>
|
|
</dd>
|
|
</dl>
|
|
|
|
<h4><a id="elementsShmem">Shared memory device</a></h4>
|
|
|
|
<p>
|
|
A shared memory device allows a memory region to be shared between
different virtual machines and the host.
|
|
<span class="since">Since 1.2.10, QEMU and KVM only</span>
|
|
</p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<shmem name='my_shmem0'>
|
|
<model type='ivshmem-plain'/>
|
|
<size unit='M'>4</size>
|
|
</shmem>
|
|
<shmem name='shmem_server'>
|
|
<model type='ivshmem-doorbell'/>
|
|
<size unit='M'>2</size>
|
|
<server path='/tmp/socket-shmem'/>
|
|
<msi vectors='32' ioeventfd='on'/>
|
|
</shmem>
|
|
</devices>
|
|
...
|
|
</pre>
|
|
|
|
<dl>
|
|
<dt><code>shmem</code></dt>
|
|
<dd>
|
|
The <code>shmem</code> element has one mandatory attribute,
<code>name</code>, which identifies the shared memory. The name must
not be <code>.</code> or <code>..</code> and must not contain the
path separator <code>/</code>.
|
|
</dd>
|
|
<dt><code>model</code></dt>
|
|
<dd>
|
|
Attribute <code>type</code> of the optional element <code>model</code>
|
|
specifies the model of the underlying device providing the
|
|
<code>shmem</code> device. The models currently supported are
|
|
<code>ivshmem</code> (supports both server and server-less shmem, but is
|
|
deprecated by newer QEMU in favour of the -plain and -doorbell variants),
|
|
<code>ivshmem-plain</code> (only for server-less shmem) and
|
|
<code>ivshmem-doorbell</code> (only for shmem with the server).
|
|
</dd>
|
|
<dt><code>size</code></dt>
|
|
<dd>
|
|
The optional <code>size</code> element specifies the size of the shared
|
|
memory. This must be power of 2 and greater than or equal to 1 MiB.
|
|
</dd>
|
|
<dt><code>server</code></dt>
|
|
<dd>
|
|
The optional <code>server</code> element can be used to configure a server
|
|
socket the device is supposed to connect to. The optional
|
|
<code>path</code> attribute specifies the absolute path to the unix socket
|
|
and defaults to <code>/var/lib/libvirt/shmem/$shmem-$name-sock</code>.
|
|
</dd>
|
|
<dt><code>msi</code></dt>
|
|
<dd>
|
|
The optional <code>msi</code> element enables/disables (values "on"/"off",
|
|
respectively) MSI interrupts. This option can currently be used only
|
|
together with the <code>server</code> element. The <code>vectors</code>
|
|
attribute can be used to specify the number of interrupt
|
|
vectors. The <code>ioeventfd</code> attribute enables/disables (values
|
|
"on"/"off", respectively) ioeventfd.
|
|
</dd>
|
|
</dl>
|
|
|
|
<h4><a id="elementsMemory">Memory devices</a></h4>
|
|
|
|
<p>
|
|
In addition to the initial memory assigned to the guest, memory devices
|
|
allow additional memory to be assigned to the guest in the form of
|
|
memory modules.
|
|
|
|
A memory device can be hot-plugged or hot-unplugged depending on the
guest's memory resource needs.

Some hypervisors may require NUMA to be configured for the guest.
|
|
</p>
|
|
|
|
<p>
|
|
Example: usage of the memory devices
|
|
</p>
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<memory model='dimm' access='private' discard='yes'>
|
|
<target>
|
|
<size unit='KiB'>524287</size>
|
|
<node>0</node>
|
|
</target>
|
|
</memory>
|
|
<memory model='dimm'>
|
|
<source>
|
|
<pagesize unit='KiB'>4096</pagesize>
|
|
<nodemask>1-3</nodemask>
|
|
</source>
|
|
<target>
|
|
<size unit='KiB'>524287</size>
|
|
<node>1</node>
|
|
</target>
|
|
</memory>
|
|
<memory model='nvdimm'>
|
|
<uuid>...</uuid>
|
|
<source>
|
|
<path>/tmp/nvdimm</path>
|
|
</source>
|
|
<target>
|
|
<size unit='KiB'>524288</size>
|
|
<node>1</node>
|
|
<label>
|
|
<size unit='KiB'>128</size>
|
|
</label>
|
|
<readonly/>
|
|
</target>
|
|
</memory>
|
|
<memory model='nvdimm' access='shared'>
|
|
<uuid>...</uuid>
|
|
<source>
|
|
<path>/dev/dax0.0</path>
|
|
<alignsize unit='KiB'>2048</alignsize>
|
|
<pmem/>
|
|
</source>
|
|
<target>
|
|
<size unit='KiB'>524288</size>
|
|
<node>1</node>
|
|
<label>
|
|
<size unit='KiB'>128</size>
|
|
</label>
|
|
</target>
|
|
</memory>
|
|
</devices>
|
|
...
|
|
</pre>
|
|
<dl>
|
|
<dt><code>model</code></dt>
|
|
<dd>
|
|
<p>
|
|
Provide <code>dimm</code> to add a virtual DIMM module to the guest.
|
|
<span class="since">Since 1.2.14</span>
|
|
Provide <code>nvdimm</code> to add a Non-Volatile DIMM
|
|
module. <span class="since">Since 3.2.0</span>
|
|
</p>
|
|
</dd>
|
|
|
|
<dt><code>access</code></dt>
|
|
<dd>
|
|
<p>
|
|
The optional <code>access</code> attribute
(<span class="since">since 3.2.0</span>) allows fine-tuning
how the memory is mapped on a per-module basis. Values are
the same as for
<a href="#elementsMemoryBacking">Memory Backing</a>:
<code>shared</code> and <code>private</code>.
For the <code>nvdimm</code> model, <code>shared</code> is required
when a real NVDIMM DAX device is used as the backend.
|
|
</p>
|
|
</dd>
|
|
|
|
<dt><code>discard</code></dt>
|
|
<dd>
|
|
<p>
|
|
The optional <code>discard</code> attribute
(<span class="since">since 4.4.0</span>) allows fine-tuning
the discarding of data on a per-module basis. Accepted
values are <code>yes</code> and <code>no</code>. The feature
is described in
<a href="#elementsMemoryBacking">Memory Backing</a>.
This attribute is allowed only for
<code>model='dimm'</code>.
|
|
</p>
|
|
</dd>
|
|
|
|
<dt><code>uuid</code></dt>
|
|
<dd>
|
|
<p>
|
|
For pSeries guests, a UUID can be set to identify the
nvdimm module. If absent, libvirt will generate a UUID
automatically. This element is allowed only for
<code>model='nvdimm'</code> on pSeries guests.
|
|
<span class="since">Since 6.2.0</span>
|
|
</p>
|
|
</dd>
|
|
|
|
<dt><code>source</code></dt>
|
|
<dd>
|
|
<p>
|
|
For model <code>dimm</code> this element is optional and allows
fine-tuning the source of the memory used for the given memory device.
If the element is not provided, the defaults configured via
<code>numatune</code> are used. If the element is provided for model
<code>dimm</code>, the following optional elements can be provided as well:
|
|
</p>
|
|
|
|
<dl>
|
|
<dt><code>pagesize</code></dt>
|
|
<dd>
|
|
<p>
|
|
This element can be used to override the default
|
|
host page size used for backing the memory device.
|
|
The configured value must correspond to a page size
|
|
supported by the host.
|
|
</p>
|
|
</dd>
|
|
|
|
<dt><code>nodemask</code></dt>
|
|
<dd>
|
|
<p>
|
|
This element can be used to override the default
|
|
set of NUMA nodes where the memory would be
|
|
allocated.
|
|
</p>
|
|
</dd>
|
|
</dl>
|
|
|
|
<p>
|
|
For model <code>nvdimm</code> this element is mandatory. The
|
|
mandatory child element <code>path</code> represents a path in
|
|
the host that backs the nvdimm module in the guest. The following
|
|
optional elements may be used:
|
|
</p>
|
|
|
|
<dl>
|
|
<dt><code>alignsize</code></dt>
|
|
<dd>
|
|
<p>
|
|
The <code>alignsize</code> element defines the page size
alignment used to mmap the address range for the backend
<code>path</code>. If not supplied, the host page size is used.
For example, mapping a real NVDIMM device may require 2MiB
alignment; if the host page size is 4KiB, this element must then
be set to 2MiB.
|
|
<span class="since">Since 5.0.0</span>
|
|
</p>
|
|
</dd>
|
|
|
|
<dt><code>pmem</code></dt>
|
|
<dd>
|
|
<p>
|
|
If persistent memory is supported and enabled by the hypervisor,
use the <code>pmem</code> element to guarantee the persistence
of writes to the vNVDIMM backend.
|
|
<span class="since">Since 5.0.0</span>
|
|
</p>
|
|
</dd>
|
|
</dl>
|
|
</dd>
|
|
|
|
<dt><code>target</code></dt>
|
|
<dd>
|
|
<p>
|
|
The mandatory <code>target</code> element configures the placement and
|
|
sizing of the added memory from the perspective of the guest.
|
|
</p>
|
|
<p>
|
|
The mandatory <code>size</code> subelement configures the size of the
|
|
added memory as a scaled integer.
|
|
</p>
|
|
<p>
|
|
The <code>node</code> subelement configures the guest NUMA node to
|
|
attach the memory to. The element shall be used only if the guest has
|
|
NUMA nodes configured.
|
|
</p>
|
|
<p>
|
|
The following optional elements may be used:
|
|
</p>
|
|
|
|
<dl>
|
|
<dt><code>label</code></dt>
|
|
<dd>
|
|
<p>
|
|
For NVDIMM type devices one can use <code>label</code> and its
|
|
subelement <code>size</code> to configure the size of
|
|
namespaces label storage within the NVDIMM module. The
|
|
<code>size</code> element has usual meaning described
|
|
<a href="#elementsMemoryAllocation">here</a>.
|
|
<code>label</code> is mandatory for pSeries guests and optional
|
|
for all other architectures.
|
|
For QEMU domains the following restrictions apply:
|
|
</p>
|
|
<ol>
|
|
<li>the minimum label size is 128KiB,</li>
|
|
<li>the remaining size (total-size - label-size), also called guest
area, will be aligned to 4KiB by default. For pSeries guests, the
guest area will be aligned down to 256MiB, and the guest area
must be at least 256MiB.</li>
|
|
</ol>
|
|
</dd>
|
|
|
|
<dt><code>readonly</code></dt>
|
|
<dd>
|
|
<p>
|
|
The <code>readonly</code> element is used to mark the vNVDIMM
|
|
as read-only. Only the real NVDIMM device backend can guarantee
|
|
the guest write persistence, so other backend types should use
|
|
the <code>readonly</code> element.
|
|
<span class="since">Since 5.0.0</span>
|
|
</p>
|
|
</dd>
|
|
</dl>
|
|
</dd>
|
|
</dl>
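<p>
Example: sizing an NVDIMM for a pSeries guest. This sketch applies the
label and alignment restrictions listed above: the total size of
524416 KiB minus the 128 KiB label leaves a guest area of 524288 KiB,
which is exactly 2 x 256 MiB, so no memory is lost when the guest area
is aligned down. The UUID and the backing path are illustrative values
only.
</p>
<pre>
...
<devices>
  <memory model='nvdimm'>
    <!-- illustrative UUID; libvirt generates one if omitted -->
    <uuid>49545eb3-75e1-2d0a-acdd-f0294406c99e</uuid>
    <source>
      <path>/tmp/nvdimm</path>
    </source>
    <target>
      <!-- 524416 KiB total - 128 KiB label = 524288 KiB guest area -->
      <size unit='KiB'>524416</size>
      <node>0</node>
      <label>
        <size unit='KiB'>128</size>
      </label>
    </target>
  </memory>
</devices>
...</pre>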
|
|
|
|
<h4><a id="elementsIommu">IOMMU devices</a></h4>
|
|
|
|
<p>
|
|
The <code>iommu</code> element can be used to add an IOMMU device.
|
|
<span class="since">Since 2.1.0</span>
|
|
</p>
|
|
|
|
<p>
|
|
Example:
|
|
</p>
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<iommu model='intel'>
|
|
<driver intremap='on'/>
|
|
</iommu>
|
|
</devices>
|
|
...
|
|
</pre>
|
|
<dl>
|
|
<dt><code>model</code></dt>
|
|
<dd>
|
|
<p>
|
|
Supported values are <code>intel</code> (for Q35 guests) and,
|
|
<span class="since">since 5.5.0</span>, <code>smmuv3</code> (for
|
|
ARM virt guests).
|
|
</p>
|
|
</dd>
|
|
<dt><code>driver</code></dt>
|
|
<dd>
|
|
<p>
|
|
The <code>driver</code> subelement can be used to configure
|
|
additional options, some of which might only be available for
|
|
certain IOMMU models:
|
|
</p>
|
|
<dl>
|
|
<dt><code>intremap</code></dt>
|
|
<dd>
|
|
<p>
|
|
The <code>intremap</code> attribute with possible values
|
|
<code>on</code> and <code>off</code> can be used to
|
|
turn on interrupt remapping, a part of the VT-d functionality.
|
|
Currently this requires split I/O APIC
|
|
(<code><ioapic driver='qemu'/></code>).
|
|
<span class="since">Since 3.4.0</span> (QEMU/KVM only)
|
|
</p>
|
|
</dd>
|
|
<dt><code>caching_mode</code></dt>
|
|
<dd>
|
|
<p>
|
|
The <code>caching_mode</code> attribute with possible values
|
|
<code>on</code> and <code>off</code> can be used to
|
|
turn on the VT-d caching mode (useful for assigned devices).
|
|
<span class="since">Since 3.4.0</span> (QEMU/KVM only)
|
|
</p>
|
|
</dd>
|
|
<dt><code>eim</code></dt>
|
|
<dd>
|
|
<p>
|
|
The <code>eim</code> attribute (with possible values
|
|
<code>on</code> and <code>off</code>) can be used to
|
|
configure Extended Interrupt Mode. A q35 domain with
|
|
split I/O APIC (as described in
|
|
<a href="#elementsFeatures">hypervisor features</a>),
|
|
and both interrupt remapping and EIM turned on for
|
|
the IOMMU, will be able to use more than 255 vCPUs.
|
|
<span class="since">Since 3.4.0</span> (QEMU/KVM only)
|
|
</p>
|
|
</dd>
|
|
<dt><code>iotlb</code></dt>
|
|
<dd>
|
|
<p>
|
|
The <code>iotlb</code> attribute with possible values
|
|
<code>on</code> and <code>off</code> can be used to
|
|
turn on the IOTLB used to cache address translation
|
|
requests from devices.
|
|
<span class="since">Since 3.5.0</span> (QEMU/KVM only)
|
|
</p>
|
|
</dd>
|
|
<dt><code>aw_bits</code></dt>
|
|
<dd>
|
|
<p>
|
|
The <code>aw_bits</code> attribute can be used to set
|
|
the address width to allow mapping larger iova addresses
|
|
in the guest.
|
|
<span class="since">Since 6.5.0</span> (QEMU/KVM only)
|
|
</p>
|
|
</dd>
|
|
</dl>
|
|
</dd>
|
|
</dl>
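<p>
Example: a Q35 guest prepared for more than 255 vCPUs. This sketch
follows the <code>eim</code> description above by combining a split
I/O APIC with interrupt remapping and EIM on the IOMMU. Other
elements the domain needs, such as the vCPU count, are left out.
</p>
<pre>
<domain>
  ...
  <features>
    <ioapic driver='qemu'/>
  </features>
  ...
  <devices>
    <iommu model='intel'>
      <driver intremap='on' eim='on'/>
    </iommu>
  </devices>
  ...
</domain></pre>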
|
|
|
|
<h3><a id="vsock">Vsock</a></h3>
|
|
|
|
<p>A vsock host/guest interface. The <code>model</code> attribute
|
|
defaults to <code>virtio</code>. <span class="since">Since 5.2.0</span>
|
|
<code>model</code> can also be 'virtio-transitional' and
|
|
'virtio-non-transitional', see
|
|
<a href="#elementsVirtioTransitional">Virtio transitional devices</a>
|
|
for more details.
|
|
The optional attribute <code>address</code> of the <code>cid</code>
|
|
element specifies the CID assigned to the guest. If the attribute
|
|
<code>auto</code> is set to <code>yes</code>, libvirt
|
|
will assign a free CID automatically on domain startup.
|
|
<span class="since">Since 4.4.0</span></p>
|
|
|
|
<pre>
|
|
...
|
|
<devices>
|
|
<vsock model='virtio'>
|
|
<cid auto='no' address='3'/>
|
|
</vsock>
|
|
</devices>
|
|
...</pre>
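<p>
Example: letting libvirt choose the CID. This is a minimal sketch of
the <code>auto='yes'</code> case described above, in which no
<code>address</code> is given and a free CID is assigned on domain
startup.
</p>
<pre>
...
<devices>
  <vsock model='virtio'>
    <cid auto='yes'/>
  </vsock>
</devices>
...</pre>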
|
|
|
|
|
|
<h3><a id="seclabel">Security label</a></h3>
|
|
|
|
<p>
|
|
The <code>seclabel</code> element allows control over the
|
|
operation of the security drivers. There are three basic
|
|
modes of operation, 'dynamic' where libvirt automatically
|
|
generates a unique security label, 'static' where the
|
|
application/administrator chooses the labels, or 'none'
|
|
where confinement is disabled. With dynamic
|
|
label generation, libvirt will always automatically
|
|
relabel any resources associated with the virtual machine.
|
|
With static label assignment, by default, the administrator
|
|
or application must ensure labels are set correctly on any
|
|
resources, however, automatic relabeling can be enabled
|
|
if desired. <span class="since">'dynamic' since 0.6.1, 'static'
|
|
since 0.6.2, and 'none' since 0.9.10.</span>
|
|
</p>
|
|
|
|
<p>
|
|
If more than one security driver is used by libvirt, multiple
|
|
<code>seclabel</code> tags can be used, one for each driver and
|
|
the security driver referenced by each tag can be defined using
|
|
the attribute <code>model</code>.
|
|
</p>
|
|
|
|
<p>
|
|
Valid input XML configurations for the top-level security label
|
|
are:
|
|
</p>
|
|
|
|
<pre>
|
|
<seclabel type='dynamic' model='selinux'/>
|
|
|
|
<seclabel type='dynamic' model='selinux'>
|
|
<baselabel>system_u:system_r:my_svirt_t:s0</baselabel>
|
|
</seclabel>
|
|
|
|
<seclabel type='static' model='selinux' relabel='no'>
|
|
<label>system_u:system_r:svirt_t:s0:c392,c662</label>
|
|
</seclabel>
|
|
|
|
<seclabel type='static' model='selinux' relabel='yes'>
|
|
<label>system_u:system_r:svirt_t:s0:c392,c662</label>
|
|
</seclabel>
|
|
|
|
<seclabel type='none'/>
|
|
</pre>
|
|
|
|
<p>
|
|
If no 'type' attribute is provided in the input XML, then
|
|
the security driver default setting will be used, which
|
|
may be either 'none' or 'dynamic'. If a 'baselabel' is set
|
|
but no 'type' is set, then the type is presumed to be 'dynamic'.
|
|
</p>
|
|
|
|
<p>
|
|
When viewing the XML for a running guest with automatic
|
|
resource relabeling active, an additional XML element,
|
|
<code>imagelabel</code>, will be included. This is an
|
|
output-only element, so will be ignored in user supplied
|
|
XML documents.
|
|
</p>
|
|
<dl>
|
|
<dt><code>type</code></dt>
|
|
<dd>Either <code>static</code>, <code>dynamic</code> or <code>none</code>
|
|
to determine whether libvirt automatically generates a unique security
|
|
label or not.
|
|
</dd>
|
|
<dt><code>model</code></dt>
|
|
<dd>A valid security model name, matching the currently
|
|
activated security model. Model <code>dac</code> is not available
|
|
when guest is run by unprivileged user.
|
|
</dd>
|
|
<dt><code>relabel</code></dt>
|
|
<dd>Either <code>yes</code> or <code>no</code>. This must always
|
|
be <code>yes</code> if dynamic label assignment is used. With
|
|
static label assignment it will default to <code>no</code>.
|
|
</dd>
|
|
<dt><code>label</code></dt>
|
|
<dd>If static labelling is used, this must specify the full
|
|
security label to assign to the virtual domain. The format
|
|
of the content depends on the security driver in use:
|
|
<ul>
|
|
<li>SELinux: a SELinux context.</li>
|
|
<li>AppArmor: an AppArmor profile.</li>
|
|
<li>
|
|
DAC: owner and group separated by a colon. Each can be given
either as a user/group name or as a uid/gid. The driver will first
try to parse these values as names, but a leading plus sign can be
used to force the driver to parse them as uid or gid.
|
|
</li>
|
|
</ul>
|
|
</dd>
|
|
<dt><code>baselabel</code></dt>
|
|
<dd>If dynamic labelling is used, this can optionally be
|
|
used to specify the base security label that will be used to generate
|
|
the actual label. The format of the content depends on the security
|
|
driver in use.
|
|
|
|
The SELinux driver uses only the <code>type</code> field of the
|
|
baselabel in the generated label. Other fields are inherited from
|
|
the parent process when using SELinux baselabels.
|
|
|
|
(The example above demonstrates the use of <code>my_svirt_t</code>
|
|
as the value for the <code>type</code> field.)
|
|
</dd>
|
|
<dt><code>imagelabel</code></dt>
|
|
<dd>This is an output only element, which shows the
|
|
security label used on resources associated with the virtual domain.
|
|
The format of the content depends on the security driver in use.
|
|
</dd>
|
|
</dl>
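<p>
Example: a static DAC label given as numeric IDs. This sketch relies
on the leading plus sign described for the DAC driver to force the
values to be parsed as uid and gid; the ID 107 is a hypothetical value
chosen for illustration.
</p>
<pre>
<seclabel type='static' model='dac' relabel='yes'>
  <label>+107:+107</label>
</seclabel>
</pre>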
|
|
|
|
<p>When relabeling is in effect, it is also possible to fine-tune
the labeling done for specific source file names, by either
disabling the labeling (useful if the file lives on NFS or another
file system that lacks security labeling) or requesting an
alternate label (useful when a management application creates a
special label to allow sharing of some, but not all, resources
between domains), <span class="since">since 0.9.9</span>. When
a <code>seclabel</code> element is attached to a specific path
rather than the top-level domain assignment, only the
attribute <code>relabel</code> and the
sub-element <code>label</code> are supported. Additionally,
<span class="since">since 1.1.2</span>, an output-only
element <code>labelskip</code> will be present for active
domains on disks where labeling was skipped due to the image
being on a file system that lacks security labeling.
</p>
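
<p>
As an illustrative sketch of the per-path form described above, a
<code>seclabel</code> element can be nested inside a disk
<code>source</code> either to opt out of relabeling or to request an
alternate label. The image paths and the label value shown here are
placeholders chosen for the example, not defaults:
</p>

<pre>
<disk type='file' device='disk'>
  <source file='/path/to/nfs/guest.img'>
    <!-- assumption: this image lives on a file system without labeling support -->
    <seclabel relabel='no'/>
  </source>
  <target dev='vda' bus='virtio'/>
</disk>
<disk type='file' device='disk'>
  <source file='/path/to/shared.img'>
    <!-- assumption: hypothetical label created by a management application -->
    <seclabel relabel='yes'>
      <label>system_u:object_r:shared_content_t:s0</label>
    </seclabel>
  </source>
  <target dev='vdb' bus='virtio'/>
</disk>
</pre>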
<h3><a id="keywrap">Key Wrap</a></h3>

<p>The content of the optional <code>keywrap</code> element specifies
whether the guest will be allowed to perform the S390 cryptographic key
management operations. A clear key can be protected by encrypting it
under a unique wrapping key that is generated for each guest VM running
on the host. Two variations of wrapping keys are generated: one version
for encrypting protected keys using the DEA/TDEA algorithm, and another
version for keys encrypted using the AES algorithm. If a
<code>keywrap</code> element is not included, the guest will be granted
access to both AES and DEA/TDEA key wrapping by default.</p>

<pre>
<domain>
  ...
  <keywrap>
    <cipher name='aes' state='off'/>
  </keywrap>
  ...
</domain>
</pre>
<p>
At least one <code>cipher</code> element must be nested within the
<code>keywrap</code> element.
</p>
<dl>
<dt><code>cipher</code></dt>
<dd>The <code>name</code> attribute identifies the algorithm
for encrypting a protected key. The values supported for this attribute
are <code>aes</code> for encryption under the AES wrapping key, or
<code>dea</code> for encryption under the DEA/TDEA wrapping key. The
<code>state</code> attribute indicates whether the cryptographic key
management operations should be turned on for the specified encryption
algorithm. The value can be set to <code>on</code> or <code>off</code>.
</dd>
</dl>

<p>Note: DEA/TDEA is synonymous with DES/TDES.</p>
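
<p>
For comparison, the following sketch names both ciphers explicitly,
leaving AES key wrapping enabled and turning DEA/TDEA key wrapping off.
Which cipher to disable is chosen here only for illustration:
</p>

<pre>
<domain>
  ...
  <keywrap>
    <!-- AES key wrapping stays available to the guest -->
    <cipher name='aes' state='on'/>
    <!-- DEA/TDEA key wrapping is turned off -->
    <cipher name='dea' state='off'/>
  </keywrap>
  ...
</domain>
</pre>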
<h3><a id="launchSecurity">Launch Security</a></h3>

<p>
The contents of the <code><launchSecurity type='sev'></code> element
are used to provide the guest owner's input for creating an encrypted
VM using the AMD SEV feature (Secure Encrypted Virtualization).

SEV is an extension to the AMD-V architecture which supports running
encrypted virtual machines (VMs) under the control of KVM. Encrypted
VMs have their pages (code and data) secured such that only the guest
itself has access to the unencrypted version. Each encrypted VM is
associated with a unique encryption key; if its data is accessed by a
different entity using a different key, the encrypted guest's data will
be incorrectly decrypted, leading to unintelligible data.

For more information on the various input parameters and their format, see the
<a href="https://support.amd.com/TechDocs/55766_SEV-KM_API_Specification.pdf">SEV API spec</a>.
<span class="since">Since 4.4.0</span>
</p>
<pre>
<domain>
  ...
  <launchSecurity type='sev'>
    <policy>0x0001</policy>
    <cbitpos>47</cbitpos>
    <reducedPhysBits>1</reducedPhysBits>
    <dhCert>RBBBSDDD=FDDCCCDDDG</dhCert>
    <session>AAACCCDD=FFFCCCDSDS</session>
  </launchSecurity>
  ...
</domain>
</pre>
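
<p>
The <code>cbitpos</code> and <code>reducedPhysBits</code> values are host
specific. As a rough sketch, on a host where SEV is available the domain
capabilities XML is expected to carry a fragment along the following
lines. The numbers shown simply mirror the example above and will differ
between platforms:
</p>

<pre>
<domainCapabilities>
  ...
  <features>
    ...
    <sev supported='yes'>
      <!-- copy these two values into launchSecurity -->
      <cbitpos>47</cbitpos>
      <reducedPhysBits>1</reducedPhysBits>
    </sev>
  </features>
</domainCapabilities>
</pre>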

<dl>
<dt><code>cbitpos</code></dt>
<dd>The required <code>cbitpos</code> element provides the C-bit (aka encryption bit)
location in the guest page table entry. The value of <code>cbitpos</code> is
hypervisor dependent and can be obtained through the <code>sev</code> element
from the domain capabilities.
</dd>
<dt><code>reducedPhysBits</code></dt>
<dd>The required <code>reducedPhysBits</code> element provides the physical
address bit reduction. Similar to <code>cbitpos</code>, the value of
<code>reducedPhysBits</code> is hypervisor dependent and can be obtained
through the <code>sev</code> element from the domain capabilities.
</dd>
<dt><code>policy</code></dt>
<dd>The required <code>policy</code> element provides the guest policy
which must be maintained by the SEV firmware. This policy is enforced by
the firmware and restricts what configuration and operational commands
can be performed on this guest by the hypervisor. The guest policy
provided during guest launch is bound to the guest and cannot be changed
throughout the lifetime of the guest. The policy is also transmitted
during snapshot and migration flows and enforced on the destination platform.

The guest policy is a 4-byte unsigned value with the fields shown in the table below:

<table class="top_table">
<tr>
<th> Bit(s) </th>
<th> Description </th>
</tr>
<tr>
<td> 0 </td>
<td> Debugging of the guest is disallowed when set </td>
</tr>
<tr>
<td> 1 </td>
<td> Sharing keys with other guests is disallowed when set </td>
</tr>
<tr>
<td> 2 </td>
<td> SEV-ES is required when set </td>
</tr>
<tr>
<td> 3 </td>
<td> Sending the guest to another platform is disallowed when set </td>
</tr>
<tr>
<td> 4 </td>
<td> The guest must not be transmitted to another platform that is
not in the domain when set. </td>
</tr>
<tr>
<td> 5 </td>
<td> The guest must not be transmitted to another platform that is
not SEV capable when set. </td>
</tr>
<tr>
<td> 6:15 </td>
<td> Reserved </td>
</tr>
<tr>
<td> 16:31 </td>
<td> The guest must not be transmitted to another platform with a
lower firmware version. </td>
</tr>
</table>

</dd>
<dt><code>dhCert</code></dt>
<dd>The optional <code>dhCert</code> element provides the guest owner's
base64 encoded Diffie-Hellman (DH) key. The key is used to negotiate a
master secret key between the SEV firmware and the guest owner. This master
secret key is then used to establish a trusted channel between the SEV
firmware and the guest owner.
</dd>
<dt><code>session</code></dt>
<dd>The optional <code>session</code> element provides the guest owner's
base64 encoded session blob defined in the SEV API spec.

See the LAUNCH_START section of the SEV spec for the session blob format.
</dd>
</dl>
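
<p>
As a worked example of the <code>policy</code> bit field, setting bit 0
and bit 1 gives 0x0001 | 0x0002 = 0x0003, which disallows both debugging
of the guest and sharing keys with other guests. This is only a sketch:
the combination of bits is chosen for illustration, and the optional
<code>dhCert</code> and <code>session</code> elements are left out:
</p>

<pre>
<launchSecurity type='sev'>
  <!-- bit 0 (0x0001): debugging disallowed; bit 1 (0x0002): key sharing disallowed -->
  <policy>0x0003</policy>
  <cbitpos>47</cbitpos>
  <reducedPhysBits>1</reducedPhysBits>
</launchSecurity>
</pre>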
<h2><a id="examples">Example configs</a></h2>

<p>
Example configurations for each driver are provided on the
driver-specific pages listed below:
</p>

<ul>
<li><a href="drvxen.html#xmlconfig">Xen examples</a></li>
<li><a href="drvqemu.html#xmlconfig">QEMU/KVM examples</a></li>
</ul>
</body>
</html>