With current perf framework, this patch adds support and documentation
for more perf events, including cache misses, cache references, cpu cycles,
and instructions.
Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
Add support for using the new approach to hotplug vcpus using device_add
during startup of qemu to allow sparse vcpu topologies.
There are a few limitations imposed by qemu on the supported
configuration:
- vcpu0 needs to be always present and not hotpluggable
- non-hotpluggable cpus need to be ordered at the beginning
- order of the vcpus needs to be unique for every single hotpluggable
entity
Qemu also doesn't really allow to query the information necessary to
start a VM with the vcpus directly on the commandline. Fortunately they
can be hotplugged during startup.
The new hotplug code uses the following approach:
- non-hotpluggable vcpus are counted and put to the -smp option
- qemu is started
- qemu is queried for the necessary information
- the configuration is checked
- the hotpluggable vcpus are hotplugged
- vcpus are started
This patch adds a lot of checking code and enables the support to
specify the individual vcpu element with qemu.
Individual vCPU hotplug requires us to track the state of any vCPU. To
allow this add the following XML:
<domain>
...
<vcpu current='2'>3</vcpu>
<vcpus>
<vcpu id='0' enabled='yes' hotpluggable='no' order='1'/>
<vcpu id='1' enabled='yes' hotpluggable='yes' order='2'/>
<vcpu id='1' enabled='no' hotpluggable='yes'/>
</vcpus>
...
The 'enabled' attribute allows to control the state of the vcpu.
'hotpluggable' controls whether given vcpu can be hotplugged and 'order'
allows to specify the order to add the vcpus.
I apparently misunderstood Marcel's description of what could and
couldn't be plugged into qemu's pxb-pcie controller (known as
pcie-expander-bus in libvirt) - I specifically allowed directly
connecting a pcie-switch-upstream-port, and it turns out that causes
the guest kernel to crash.
This patch forbids such a connection, and updates the xml docs
appropriately.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1361172
This element will control secure boot implemented by some
firmwares. If the firmware used in <loader/> does support the
feature we must tell it to the underlying hypervisor. However, we
can't know whether loader does support it or not just by looking
at the file. Therefore we have to have an attribute to the
element where users can tell us whether the firmware is secure
boot enabled or not.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Since its release of 2.4.0 qemu is able to enable System
Management Module in the firmware, or disable it. We should
expose this capability in the XML. Unfortunately, there's no good
way to determine whether the binary we are talking to supports
it. I mean, if qemu's run with real machine type, the smm
attribute can be seen in 'qom-list /machine' output. But it's not
there when qemu's run with -M none. Therefore we're stuck with
version based check.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1356937
Add the definitions to allow for viewing/setting cgroup period and quota
limits for IOThreads.
This is similar to the work done for emulator quota and period by
commit ids 'b65dafa' and 'e051c482'.
Being able to view/set the IOThread specific values is related to more
recent changes adding global period (commmit id '4d92d58f') and global
quota (commit id '55ecdae') definitions and qemu support (commit id
'4e17ff79' and 'fbcbd1b2'). With a global setting though, if somehow
the IOThread value in the cgroup hierarchy was set "outside of libvirt"
to a value that is incompatible with the global value.
Allowing control over IOThread specific values provides the capability
to alter the IOThread values as necessary.
According to libxl implementation, it supports pvusb
controller of version 1.1 and version 2.0, and it
supports two types of backend, 'pvusb' (dom0 backend)
and 'qusb' (qemu backend). But currently pvusb backend
is not checked in yet.
To match libxl support, extend usb controller schema
to support two more models: qusb1 (qusb, version 1.1)
and 'qusb2' (qusb version 2.0).
Signed-off-by: Chunyan Liu <cyliu@suse.com>
To allow using failover with gluster it's necessary to specify multiple
volume hosts. Add support for starting qemu with such configurations.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
This is place as a sub-element of <source>, where other aspects of the
host-side connection to the network device are located (network or
bridge name, udp listen port, etc). It's a bit odd that the interface
we're configuring with this info is itself named in <target dev='x'/>,
but that ship sailed long ago:
<interface type='ethernet'>
<mac address='00:16:3e:0f:ef:8a'/>
<source>
<ip address='192.168.122.12' family='ipv4'
prefix='24' peer='192.168.122.1'/>
<ip address='192.168.122.13' family='ipv4' prefix='24'/>
<route family='ipv4' address='0.0.0.0'
gateway='192.168.122.1'/>
<route family='ipv4' address='192.168.124.0' prefix='24'
gateway='192.168.124.1'/>
</source>
</interface>
In practice, this will likely only be useful for type='ethernet', so
its presence in any other type of interface is currently forbidden in
the generic device Validate function (but it's been put into the
general population of virDomainNetDef rather than the
ethernet-specific union member so that 1) we can more easily add the
capability to other types if needed, and 2) we can retain the info
when set to an invalid interface type all the way through to
validation and report a proper error, rather than just ignoring it
(which is currently what happens for many other type-specific
settings).
(NB: The already-existing configuration of IP info for the guest-side
of interfaces is in subelements directly under <interface>, and the
name of the guest-side interface (when configurable) is in <guest
dev='x'/>).
(This patch had been pushed earlier in
commit fe6a77898a, but was reverted in
commit d658456530 because it had been
accidentally pushed during the freeze for release 2.0.0)
The peer attribute is used to set the property of the same name in the
interface IP info:
<interface type='ethernet'>
...
<ip family='ipv4' address='192.168.122.5'
prefix='32' peer='192.168.122.6'/>
...
</interface>
Note that this element is used to set the IP information on the
*guest* side interface, not the host side interface - that will be
supported in an upcoming patch.
(This patch now has quite a history: it was originally pushed in
commit 690969af, which was subsequently reverted in commit 1d14b13f,
then reworked and pushed (along with a lot of other related/supporting
patches) in commit 93135abf1; however *that* commit had been
accidentally pushed during dev. freeze for release 2.0.0, so it was
again reverted in commit f6acf039f0).
Signed-off-by: Vasiliy Tolstov <v.tolstov@selfip.ru>
Signed-off-by: Laine Stump <laine@laine.org>
This is place as a sub-element of <source>, where other aspects of the
host-side connection to the network device are located (network or
bridge name, udp listen port, etc). It's a bit odd that the interface
we're configuring with this info is itself named in <target dev='x'/>,
but that ship sailed long ago:
<interface type='ethernet'>
<mac address='00:16:3e:0f:ef:8a'/>
<source>
<ip address='192.168.122.12' family='ipv4'
prefix='24' peer='192.168.122.1'/>
<ip address='192.168.122.13' family='ipv4' prefix='24'/>
<route family='ipv4' address='0.0.0.0'
gateway='192.168.122.1'/>
<route family='ipv4' address='192.168.124.0' prefix='24'
gateway='192.168.124.1'/>
</source>
</interface>
In practice, this will likely only be useful for type='ethernet', so
its presence in any other type of interface is currently forbidden in
the generic device Validate function (but it's been put into the
general population of virDomainNetDef rather than the
ethernet-specific union member so that 1) we can more easily add the
capability to other types, and 2) we can retain the info when set to
an invalid interface type all the way through to validation and report
a proper error, rather than just ignoring it (which is currently what
happens for many other type-specific settings).
(NB: The already-existing configuration of IP info for the guest-side
of interfaces is in subelements directly under <interface>, and the
name of the guest-side interface (when configurable) is in <guest
dev='x'/>).
The peer attribute is used to set the property of the same name in the
interface IP info:
<interface type='ethernet'>
...
<ip family='ipv4' address='192.168.122.5'
prefix='32' peer='192.168.122.6'/>
...
</interface>
Note that this element is used to set the IP information on the
*guest* side interface, not the host side interface - that will be
supported in an upcoming patch.
(This is an updated *re*-commit of commit 690969af, which was
subsequently reverted in commit 1d14b13f).
Signed-off-by: Vasiliy Tolstov <v.tolstov@selfip.ru>
Signed-off-by: Laine Stump <laine@laine.org>
In the case of chassisNr (used to set chassis_nr of a pci-bridge
controller), 0 is reserved for / used by the pci[e]-root bus. In the
base of busNr, a value of 0 would mean that the root bus had no places
available to plug in new buses, including the pxb itself (the
documentation I wrote for pxb even noted the limit of busNr as 1.254).
NB: oddly, the "chassis" attribute, which is used for pcie-root-port
and pcie-switch-downstream-port *can* be set to 0, since it's the
combination of {chassis, slot} that needs to be unique, not chassis by
itself (and slot 0 of pcie-root is reserved, while pcie-*-port can use
*only* slot 0).
This resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1342962
There was no documentation at all for the XML part. I added at least
some. The 2.0.0 introduction date is deliberate as the parser for the
XML is broken.
The schema file was missing entries for 'mbml' and 'mbmt'.
This option allows or disallows detection of zero-writes if it is set to
"on" or "off", respectively. It can be also set to "unmap" in which
case it will try discarding that part of image based on the value of the
"discard" option.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This new listen type is currently supported only by spice graphics.
It's introduced to make it easier and clearer specify to not listen
anywhere in order to start a guest with OpenGL support.
The old way to do this was set spice graphics autoport='no' and don't
specify any ports. The new way is to use <listen type='none'/>. In
order to be able to migrate to old libvirt the migratable XML will be
generated without the listen element and with autoport='no'. Also the
old configuration will be automatically converted to the this listen
type.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
VNC graphics already supports sockets but only via 'socket' attribute.
This patch coverts that attribute into listen type 'socket'.
For backward compatibility we need to handle listen type 'socket' and 'socket'
attribute properly to support old XMLs and new XMLs. If both are provided they
have to match, if only one of them is provided we need to be able to parse that
configuration too.
To not break migration back to old libvirt if the socket is provided by user we
need to generate migratable XML without the listen element and use only 'socket'
attribute.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Hand-entering indexes for 20 PCI controllers is not as tedious as
manually determining and entering their PCI addresses, but it's still
annoying, and the algorithm for determining the proper index is
incredibly simple (in all cases except one) - just pick the lowest
unused index.
The one exception is USB2 controllers because multiple controllers in
the same group have the same index. For these we look to see if 1) the
most recently added USB controller is also a USB2 controller, and 2)
the group *that* controller belongs to doesn't yet have a controller
of the exact model we're just now adding - if both are true, the new
controller gets the same index, but in all other cases we just assign
the lowest unused index.
With this patch in place and combined with the automatic PCI address
assignment, we can define a PCIe switch with several ports like this:
<controller type='pci' model='pcie-root-port'/>
<controller type='pci' model='pcie-switch-upstream-port'/>
<controller type='pci' model='pcie-switch-downstream-port'/>
<controller type='pci' model='pcie-switch-downstream-port'/>
<controller type='pci' model='pcie-switch-downstream-port'/>
<controller type='pci' model='pcie-switch-downstream-port'/>
<controller type='pci' model='pcie-switch-downstream-port'/>
...
These will each get a unique index, and PCI addresses that connect
them together appropriately with no pesky numbers required.
Add a new element to <domain> XML:
<os>
<acpi>
<table type="slic">/path/to/acpi/table/file</table>
</acpi>
</os>
To supply a path to a SLIC (Software Licensing) ACPI
table blob.
https://bugzilla.redhat.com/show_bug.cgi?id=1327537
Prior to this, <address type='pci'/> wasn't allowed when parsing
(domain+bus+slot+function needed to be a "valid" PCI address, meaning
that at least one of domain/bus/slot had to be non-0), the RNG
required bus to be specified, and if type was set to PCI when
formatting, domain+bus+slot+function would always be output.
This makes all the address attributes optional during parse and RNG
validation, and suppresses domain+bus+slot+function if domain+bus+slot
are all 0 (NB: if d+b+s are all 0, any value for function is
nonsensical as that will never happen in the real world, and after
the next patch we will always assign a real working address to any
empty PCI address before it is ever output to anywhere).
Note that explicitly setting all attributes to 0 is equivalent to
setting none of them, which is okay, since 0000:00:00 is reserved in
any PCI bus setup, and can't be used anyway.
We support omitting listen attribute of graphics element so we should
also support omitting address attribute of listen element. This patch
also updates libvirt to always add a listen element into domain XML
except for VNC graphics if socket attribute is specified.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
SRIOV VFs used in macvtap passthrough mode can take advantage of the
SRIOV card's transparent vlan tagging. All the code was there to set
the vlan tag, and it has been used for SRIOV VFs used for hostdev
interfaces for several years, but for some reason, the vlan tag for
macvtap passthrough devices was stubbed out with a -1.
This patch moves a bit of common validation down to a lower level
(virNetDevReplaceNetConfig()) so it is shared by hostdev and macvtap
modes, and updates the macvtap caller to actually send the vlan config
instead of -1.
Add the ability to add an 'iothread' to the controller which will be how
virtio-scsi-pci and virtio-scsi-ccw iothreads have been implemented in qemu.
Describe the new functionality and add tests to parse/validate that the
new attribute can be added.
Rather than be specific about which devices in the <iothreads> description,
let's leave that for the <disk> description for it's <iothread> value.
Signed-off-by: John Ferlan <jferlan@redhat.com>
This adds a ports= attribute to usb controller XML, like
<controller type='usb' model='nec-xhci' ports='8'/>
This maps to:
qemu -device nec-usb-xhci,p2=8,p3=8
Meaning, 8 ports that support both usb2 and usb3 devices. Gerd
suggested to just expose them as one knob.
https://bugzilla.redhat.com/show_bug.cgi?id=1271408
If a panic device is being defined without a model in a domain
the default value is always overwritten with model ISA. An ISA
bus does not exist on S390 and therefore specifying a panic device
results in an unsupported configuration.
Since the S390 architecture inherently provides a crash detection
capability the panic device should be defined in the domain xml.
This patch adds an s390 panic device model and prevents setting a
device address on it.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
This reverts commit 690969af9c, which
added the domain config parts to support a "peer" attribute in domain
interface <ip> elements.
It's being removed temporarily for the release of libvirt 1.3.4
because the feature doesn't work, and there are concerns that it may
need to be modified in an externally visible manner which could create
backward compatibility problems.
Currently we only allow /dev/random and /dev/hwrng as host input
for <rng><backend model='random'/> device. This was added after
various upstream discussions in commit 4932ef45
However this restriction has generated quite a few complaints over
the years, so a new discussion was initiated:
http://www.redhat.com/archives/libvir-list/2016-April/msg00987.html
Several people suggested removing the restriction, and nobody really
spoke up to defend it. So this patch drops the path restriction
entirely
https://bugzilla.redhat.com/show_bug.cgi?id=1074464
When describing attributes and elements, we mostly stick to
a certain pattern; however, there are a few cases when the
information is not presented in the usual way.
Since there doesn't seem to be any reason not to follow the
tried and true formula, rework those bits to fit the rest of
the documentation.
This controller provides a single PCIe port on a new root. It is
similar to pci-expander-bus, intended to provide a bus that can be
associated with a guest-identifiable NUMA node, but is for
machinetypes with PCIe rather than PCI (e.g. q35-based machinetypes).
Aside from PCIe vs. PCI, the other main difference is that a
pci-expander-bus has a companion pci-bridge that is automatically
attached along with it, but pcie-expander-bus has only a single port,
and that port will only connect to a pcie-root-port, or to a
pcie-switch-upstream-port. In order for the bus to be of any use in
the guest, it must have either a pcie-root-port or a
pcie-switch-upstream-port attached (and one or more
pcie-switch-downstream-ports attached to the
pcie-switch-upstream-port).
This is a standard PCI root bus (not a bridge) that can be added to a
440fx-based domain. Although it uses a PCI slot, this is *not* how it
is connected into the PCI bus hierarchy, but is only used for
control. Each pci-expander-bus provides 32 slots (0-31) that can
accept hotplug of standard PCI devices.
The usefulness of pci-expander-bus relative to a pci-bridge is that
the NUMA node of the bus can be specified with the <node> subelement
of <target>. This gives guest-side visibility to the NUMA node of
attached devices (presuming that management apps only assign a device
to a bus that has a NUMA node number matching the node number of the
device on the host).
Each pci-expander-bus also has a "busNr" attribute. The expander-bus
itself will take the busNr specified, and all buses that are connected
to this bus (including the pci-bridge that is automatically added to
any expander bus of model "pxb" (see the next commit)) will use
busNr+1, busNr+2, etc, and the pci-root (or the expander-bus with next
lower busNr) will use bus numbers lower than busNr.
This cleanups the documentation, reformat some of the paragraphs to use
<p> instead of </br> and rewrites the listen part to be more extendable.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
In order to follow recent comments which indicate support for specific
feature bits are supported by a specific QEMU version add the version
from whence the relaxed, vapic, and spinlocks support was added.
This patch adds support for "vpindex", "runtime", "synic",
"stimer", and "vendor_id" features available in qemu 2.5+.
- When Hyper-V "vpindex" is on, guest can use MSR HV_X64_MSR_VP_INDEX
to get virtual processor ID.
- Hyper-V "runtime" enlightement feature allows to use MSR
HV_X64_MSR_VP_RUNTIME to get the time the virtual processor consumes
running guest code, as well as the time the hypervisor spends running
code on behalf of that guest.
- Hyper-V "synic" stands for Synthetic Interrupt Controller, which is
lapic extension controlled via MSRs.
- Hyper-V "stimer" switches on Hyper-V SynIC timers MSR's support.
Guest can setup and use fired by host events (SynIC interrupt and
appropriate timer expiration message) as guest clock events
- Hyper-V "reset" allows guest to reset VM.
- Hyper-V "vendor_id" exposes hypervisor vendor id to guest.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Signed-off-by: John Ferlan <jferlan@redhat.com>
Most hypervisors use Hardware Assisted Paging by default and don't
require specifying the feature in domain conf. But some hypervisors
support disabling HAP on a per-domain basis. To enable HAP by default
yet provide a knob to disable it, extend the <hap> feature with a
'state=on|off' attribute, similar to <pvspinlock> and <vmport> features.
In the absence of <hap>, the hypervisor default (on) is used. <hap>
without the state attribute would be the same as <hap state='on'/> for
backwards compatibility. And of course <hap state='off'/> disables hap.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Extend the chardev source XML so that there is a new optional
<log/> element, which is applicable to all character device
backend types. For example, to log output of a TCP backed
serial port
<serial type='tcp'>
<source mode='connect' host='127.0.0.1' service='9999'/>
<protocol type='raw'/>
<log file='/var/log/libvirt/qemu/demo-serial0.log' append='on'/>
<target port='0'/>
</serial>
Not all hypervisors will support use of logfiles.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This attribute is used to extend secondary PCI bar and expose it to the
guest as 64bit memory. It works like this: attribute vram is there to
set size of secondary PCI bar and guest sees it as 32bit memory,
attribute vram64 can extend this secondary PCI bar. If both attributes
are used, guest sees two memory bars, both address the same memory, with
the difference that the 32bit bar can address only the first part of the
whole memory.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1260749
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Add Spice graphics gl attribute. qemu 2.6 should have -spice gl=on argument to
enable opengl rendering context (patches on the ML). This is necessary to
actually enable virgl rendering.
Add a qemuxml2argv test for virtio-gpu + spice with virgl.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Recent changes to the handling of GIC version, specifically commit
2a7b11eafb, have clearly defined what values are acceptable for the
version attribute of the <gic> element. Update the documentation
accordingly.
Excessive memory balloon inflation can cause invocation of OOM-killer,
when Linux is under severe memory pressure. QEMU memballoon device
has a feature to release some memory at the last moment before some
process will be get killed by OOM-killer.
Introduce a new optional balloon device attribute 'autodeflate' to
enable or disable this feature.
If no port number was provided for a storage pool libvirt defaults to
port 6789; however, librbd/librados already default to 6789 when no port
number is provided.
In the future Ceph will switch to a new port for the Ceph monitors since
port 6789 is already assigned to a different application by IANA.
Port 6789 is assigned to SMC-HTTPS and Ceph now has port 3300 assigned as
the 'Ceph monitor' port.
In this case it is the best solution to not hardcode any port number into
libvirt and let librados handle the connection.
Only if a user specifies a different port number we pass it down to librados,
otherwise we leave it blank.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
merge
To be used by the family of virtio input devices:
<input type='mouse' bus='virtio'/>
<input type='tablet' bus='virtio'/>
<input type='keyboard' bus='virtio'/>
https://bugzilla.redhat.com/show_bug.cgi?id=1231114
qemu 2.5 provides virtio video device. It can be used with -device
virtio-vga for primary devices, or -device virtio-gpu for non-vga
devices. However, only the primary device (VGA) is supported with this
patch.
Reference:
https://bugzilla.redhat.com/show_bug.cgi?id=1195176
Signed-off-by: Marc-André Lureau <marcandre.lureau@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
'model' attribute was added to a panic device but only one panic
device is allowed. This patch changes panic device presence
from 'optional' to 'zeroOrMore'.
Libvirt already has two types of panic devices - pvpanic and pSeries firmware.
This patch introduces the 'model' attribute and a new type of panic device.
'isa' model is for ISA pvpanic device.
'pseries' model is a default value for pSeries guests.
'hyperv' model is the new type. It's used for Hyper-V crash.
Schema and docs are updated for the new attribute.
Adjust the config code so that it does not enforce that target memory
node is specified. To avoid breakage, adjust the qemu memory hotplug
config checker to disallow such config for now.
The example pvspinlock XML is:
<pvspinlock/>
While this is accepted by libvirt and works correctly, it's currently
always output as a tristate like
<pvspinlock state='on'/>
So document that format instead
Adds a new interface type using UDP sockets, this seems only applicable
to QEMU but have edited tree-wide to support the new interface type.
The interface type required the addition of a "localaddr" (local
address), this then maps into the following xml and qemu call.
<interface type='udp'>
<mac address='52:54:00:5c:67:56'/>
<source address='127.0.0.1' port='11112'>
<local address='127.0.0.1' port='22222'/>
</source>
<model type='virtio'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
</interface>
QEMU call:
-net socket,udp=127.0.0.1:11112,localaddr=127.0.0.1:22222
Notice the xml "local" entry becomes the "localaddr" for the qemu call.
reference:
http://lists.gnu.org/archive/html/qemu-devel/2011-11/msg00629.html
Signed-off-by: Jonathan Toppins <jtoppins@cumulusnetworks.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
This controller can be connected only to a port on a
pcie-switch-upstream-port. It provides a single hotpluggable port that
will accept any PCI or PCIe device, as well as any device requiring a
pcie-*-port (the only current example of such a device is the
pcie-switch-upstream-port).
This controller can be connected only to a pcie-root-port or a
pcie-switch-downstream-port (which will be added in a later patch),
which is the reason for the new connect type
VIR_PCI_CONNECT_TYPE_PCIE_PORT. A pcie-switch-upstream-port provides
32 ports (slot=0 to slot=31) on the downstream side, which can only
have pci controllers of model "pcie-switch-downstream-port" plugged
into them, which is the reason for the other new connect type
VIR_PCI_CONNECT_TYPE_PCIE_SWITCH.
This controller can be connected (at domain startup time only - not
hotpluggable) only to a port on the pcie root complex ("pcie-root" in
libvirt config), hence the new connect type
VIR_PCI_CONNECT_TYPE_PCIE_ROOT. It provides a hotpluggable port that
will accept any PCI or PCIe device.
New attributes must be added to the controller <target> subelement for
this - chassis and port are guest-visible option values that will be
set by libvirt with values derived from the controller's index and pci
address information.
There are some configuration options to some types of pci controllers
that are currently automatically derived from other parts of the
controller's configuration. For example, in qemu a pci-bridge
controller has an option that is called "chassis_nr"; up until now
libvirt has always set chassis_nr to the index of the pci-bridge. So
this:
<controller type='pci' model='pci-bridge' index='2'/>
will always result in:
-device pci-bridge,chassis_nr=2,...
on the qemu commandline. In the future we may decide there is a better
way to derive that option, but even in that case we will need for
existing domains to retain the same chassis_nr they were using in the
past - that is something that is visible to the guest so it is part of
the guest ABI and changing it would lead to problems for migrating
guests (or just guests with very picky OSes).
The <target> subelement has been added as a place to put the new
"chassisNr" attribute that will be filled in by libvirt when it
auto-generates the chassisNr; it will be saved in the config, then
reused any time the domain is started:
<controller type='pci' model='pci-bridge' index='2'>
<model type='pci-bridge'/>
<target chassisNr='2'/>
</controller>
The one oddity of all this is that if the controller configuration
is changed (for example to change the index or the pci address
where the controller is plugged in), the items in <target> will
*not* be re-generated, which might lead to conflict. I can't
really see any way around this, but fortunately if there is a
material conflict qemu will let us know and we will pass that on
to the user.
This new subelement is used in PCI controllers: the toplevel
*attribute* "model" of a controller denotes what kind of PCI
controller is being described, e.g. a "dmi-to-pci-bridge",
"pci-bridge", or "pci-root". But in the future there will be different
implementations of some of those types of PCI controllers, which
behave similarly from libvirt's point of view (and so should have the
same model), but use a different device in qemu (and present
themselves as a different piece of hardware in the guest). In an ideal
world we (i.e. "I") would have thought of that back when the pci
controllers were added, and used some sort of type/class/model
notation (where class was used in the way we are now using model, and
model was used for the actual manufacturer's model number of a
particular family of PCI controller), but that opportunity is long
past, so as an alternative, this patch allows selecting a particular
implementation of a pci controller with the "name" attribute of the
<model> subelement, e.g.:
<controller type='pci' model='dmi-to-pci-bridge' index='1'>
<model name='i82801b11-bridge'/>
</controller>
In this case, "dmi-to-pci-bridge" is the kind of controller (one that
has a single PCIe port upstream, and 32 standard PCI ports downstream,
which are not hotpluggable), and the qemu device to be used to
implement this kind of controller is named "i82801b11-bridge".
Implementing the above now will allow us in the future to add a new
kind of dmi-to-pci-bridge that doesn't use qemu's i82801b11-bridge
device, but instead uses something else (which doesn't yet exist, but
qemu people have been discussing it), all without breaking existing
configs.
(note that for the existing "pci-bridge" type of PCI controller, both
the model attribute and <model> name are 'pci-bridge'. This is just a
coincidence, since it turns out that in this case the device name in
qemu really is a generic 'pci-bridge' rather than being the name of
some real-world chip)
"Further" clarification (and testing) shows that using a SCSI Fibre
Channel NPIV device/lun from a storage pool as a <disk type='volume'
device'lun'> will work. So just add that to the allowable options
Related to: https://bugzilla.redhat.com/show_bug.cgi?id=1230179
Rather than calling virDomainHostdevAssignAddress during the parsing
of the XML, move the setting of a default hostdev address to domain/
device post processing.
Since the parse code no longer generates an address, we can remove
the virDomainDefMaybeAddHostdevSCSIcontroller since the call to
virDomainHostdevAssignAddress will attempt to add the controllers
that were not already defined in the XML.
This patch will also enforce that the address type is type 'drive'
when a SCSI subsystem <hostdev> element is provided with an <address>.
Signed-off-by: John Ferlan <jferlan@redhat.com>
The information on companion controllers we give in our documentation is
rather sparse. For example, it looks like any controller can be used as
a companion one. Also, when using ich9-uhci2, for example, we are able
to set some sensible defaults, but it might get confusing for the user
as we don't do that for all controller models.
https://bugzilla.redhat.com/show_bug.cgi?id=1069590
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Update the descriptions for disk and hostdev sgio in order to indicate
not all hypervisors and OS's support this feature
Signed-off-by: John Ferlan <jferlan@redhat.com>
While re-reading what I wrote for commit id '785a8940e', I realized
I needed to clarify that being able to present as a 'lun', the mode
property for the pool source element needed to be "host" (or empty)
and not "direct".
It was described correctly later in the mode host description, but
this just ensures it's not missed here as well.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Also move the mention of version numbers for the various PCI
controller models up to the end of the sentence where they are first
given, to avoid confusion.
When support for the pcie-root and dmi-to-pci-bridge buses on a Q35
machinetype was added, I was concerned that even though qemu at the
time allowed plugging a PCI device into a PCIe port, that it might not
be supported in the future. To prevent painful backtracking in the
possible future where this happened, I disallowed such connections
except in a few specific cases requested by qemu developers (indicated
in the code with the flag VIR_PCI_CONNECT_TYPE_EITHER_IF_CONFIG).
Now that a couple years have passed, there is a clear message from
qemu that there is no danger in allowing PCI devices to be plugged
into PCIe ports. This patch eliminates
VIR_PCI_CONNECT_TYPE_EITHER_IF_CONFIG and changes the code to always
allow PCI->PCIe or PCIe->PCI connection *when the PCI address is
specified in the config. (For newly added devices that haven't yet
been given a PCI address, the auto-placement still prefers using the
correct type of bus).
This patch provides support for the new watchdog model "diag288".
Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Daniel Hansel <daniel.hansel@linux.vnet.ibm.com>
Reviewed-by: Stefan Zimmermann <stzi@linux.vnet.ibm.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.vnet.ibm.com>
This patch provides support for a new watchdog action "inject-nmi" which
allows to define an inject of a non-maskable interrupt into a guest.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Daniel Hansel <daniel.hansel@linux.vnet.ibm.com>
Reviewed-by: Stefan Zimmermann <stzi@linux.vnet.ibm.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.vnet.ibm.com>
The type='scsi' parameter of an address element is ignored
if placed within a hostdev section, and rejected by the XML
schema used by virt-xml-validate. Remove it from the doc,
and correct a typo in the remaining address arguments.
Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
Reviewed-by: Stefan Zimmermann <stzi@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Defining a domain with a SCSI disk attached via a hostdev
tag and a source address unit value longer than two digits
causes an error when editing the domain with virsh edit,
even if no changes are made to the domain definition.
The error suggests invalid XML, somewhere:
# virsh edit lmb_guest
error: XML document failed to validate against schema:
Unable to validate doc against /usr/local/share/libvirt/schemas/domain.rng
Extra element devices in interleave
Element domain failed to validate content
The virt-xml-validate tool fails with a similar error:
# virt-xml-validate lmb_guest.xml
Relax-NG validity error : Extra element devices in interleave
lmb_guest.xml:17: element devices: Relax-NG validity error :
Element domain failed to validate content
lmb_guest.xml fails to validate
The hostdev tag requires a source address to be specified,
which includes bus, target, and unit address attributes.
According to the SCSI Architecture Model spec (section
4.9 of SAM-2), a LUN address is 64 bits and thus could be
up to 20 decimal digits long. Unfortunately, the XML
schema limits this string to just two digits. Similarly,
the target field can be up to 32 bits in length, which
would be 10 decimal digits.
# lsscsi -xx
[0:0:19:0x4022401100000000] disk IBM 2107900 3.44 /dev/sda
# lsscsi
[0:0:19:1074872354]disk IBM 2107900 3.44 /dev/sda
# cat lmb_guest.xml
<domain type='kvm'>
<name>lmb_guest</name>
<memory unit='MiB'>1024</memory>
...trimmed...
<devices>
<controller type='scsi' model='virtio-scsi' index='0'/>
<hostdev mode='subsystem' type='scsi'>
<source>
<adapter name='scsi_host0'/>
<address bus='0' target='19' unit='1074872354'/>
</source>
</hostdev>
...trimmed...
Since the reference unit and target fields are used in
several places in the XML schema, create a separate one
specific for SCSI Logical Units that will permit the
greater length. This permits both the validation utility
and the virsh edit command to succeed when a hostdev
tag is included.
Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
Reviewed-by: Stefan Zimmermann <stzi@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1220527
This type of information defines attributes of a system
baseboard. With one exception: board type is yet not implemented
in qemu so it's not introduced here either.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Since the background for Admin API is merged upstream, we are bumping
the minor release version as discussed previously
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1021480
Seems the property has been deprecated for qemu, although seemingly ignored.
This patch enforces from a libvirt perspective that a scsi-block 'lun'
device should not provide the 'serial' property.
https://bugzilla.redhat.com/show_bug.cgi?id=1228007
When attaching a scsi volume lun via the attach-device --config or
--persistent options, there was no translation of the source pool
like there was for the live path, thus the attempt to modify the config
would fail since not enough was known about the disk.
This patch adds the support of queues attribute of the driver element
for vhost-user interface type. Example:
<interface type='vhostuser'>
<mac address='52:54:00:ee:96:6d'/>
<source type='unix' path='/tmp/vhost2.sock' mode='client'/>
<model type='virtio'/>
<driver queues='4'/>
</interface>
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1207692
Signed-off-by: Maxime Leroy <maxime.leroy@6wind.com>
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
The guest firmware provides the same functionality as the pvpanic
device, and the relevant element should always be present in the
domain XML to reflect this fact, so add it after parsing the
definition if it wasn't there already.
The guest firmware provides the same functionality as the pvpanic
device, which is not available in QEMU on pSeries, so the domain
XML should be allowed to contain the <panic> element.
On the other hand, unlike the pvpanic device, the guest firmware
can't be configured, so report an error if an address has been
provided in the XML.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1182388
https://bugzilla.redhat.com/show_bug.cgi?id=998813
Like usb-serial, the pci-serial device allows a serial device to be
attached to PCI bus. An example XML looks like this:
<serial type='dev'>
<source path='/dev/ttyS2'/>
<target type='pci-serial' port='0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
</serial>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Two new domain configuration XML elements are added to enable/disable
the protected key management operations for a guest:
<domain>
...
<keywrap>
<cipher name='aes|dea' state='on|off'/>
</keywrap>
...
</domain>
Signed-off-by: Tony Krowiak <akrowiak@linux.vnet.ibm.com>
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Signed-off-by: Daniel Hansel <daniel.hansel@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Some platforms, like aarch64, don't have APIC but GIC. So there's
no reason to have <apic/> feature turned on. However, we are
still missing <gic/> feature. This commit introduces the feature
to XML parser and formatter, adds documentation and updates RNG
schema.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
A new feature that can be turned on or off.
The QEMU machine vmport option allows to set the VMWare IO port
emulation. This emulation is useful for absolute pointer input when the
guest has vmware input drivers, and is enabled by default for kvm.
However it is unnecessary for Spice-enabled VM, since the agent already
handles absolute pointer and multi-monitors. Furthermore, it prevents
Spice from switching to relative input since the regular ps/2 pointer
driver is replaced by the vmware driver. It is thus advised to disable
vmport when using a Spice VM. This will permit the Spice client to
switch from absolute to relative pointer, as it may be required for
certain games or applications.
With iothreadid's allowing any 'id' value for an iothread_id, the
iothreadsched code needs a slight adjustment to allow for "any"
unsigned int value in order to create the bitmap of ids that will
have scheduler adjustments. Adjusted the doc description as well.
Remove the iothreadspin array from cputune and replace with a cpumask
to be stored in the iothreadids list.
Adjust the test output because our printing goes in order of the iothreadids
list now.
Adding a new XML element 'iothreadids' in order to allow defining
specific IOThread ID's rather than relying on the algorithm to assign
IOThread ID's starting at 1 and incrementing to iothreads count.
This will allow future patches to be able to add new IOThreads by
a specific iothread_id and of course delete any exisiting IOThread.
Each iothreadids element will have 'n' <iothread> children elements
which will have attribute "id". The "id" will allow for definition
of any "valid" (eg > 0) iothread_id value.
On input, if any <iothreadids> <iothread>'s are provided, they will
be marked so that we only print out what we read in.
On input, if no <iothreadids> are provided, the PostParse code will
self generate a list of ID's starting at 1 and going to the number
of iothreads defined for the domain (just like the current algorithm
numbering scheme). A future patch will rework the existing algorithm
to make use of the iothreadids list.
On output, only print out the <iothreadids> if they were read in.
because network address is required by route, so
here we should add one avoid user misunderstand.
Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
The virNodeDeviceDettach API only works on PCI devices.
Originally added by commit 10d3272e, but the API never
supported USB devices.
Reported by: Martin Polednik <mpolednik@redhat.com>
This patch adds code that parses and formats configuration for memory
devices.
A simple configuration would be:
<memory model='dimm'>
<target>
<size unit='KiB'>524287</size>
<node>0</node>
</target>
</memory>
A complete configuration of a memory device:
<memory model='dimm'>
<source>
<pagesize unit='KiB'>4096</pagesize>
<nodemask>1-3</nodemask>
</source>
<target>
<size unit='KiB'>524287</size>
<node>1</node>
</target>
</memory>
This patch preemptively forbids use of the <memory> device in individual
drivers so the users are warned right away that the device is not
supported.
To enable memory hotplug the maximum memory size and slot count need to
be specified. As qemu supports now other units than mebibytes when
specifying memory, use the new interface in this case.
Add a XML element that will allow to specify maximum supportable memory
and the count of memory slots to use with memory hotplug.
To avoid possible confusion and misuse of the new element this patch
also explicitly forbids the use of the maxMemory setting in individual
drivers's post parse callbacks. This limitation will be lifted when the
support is implemented.
I spent quite some time figuring that backingStore info
isn't included in the dom xml, unless guest is up and
running. Hopefully putting that in the doc should help.
Also, several people have complained that libvirt reports
a backing file as raw, even though they expected it to be
qcow2; where the culprit is usually the user forgetting to
create the file with qemu-img create -o backing_fmt=qcow2.
This patch adds that info to the doc.
Signed-off-by: Deepak C Shetty <deepakcs@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Midonet is an opensource virtual networking that over lays the IP
network between hypervisors. Currently, such networks can be made
with the openvswitch virtualport type.
This patch, defines the schema and documentation that will serve
as basis for the follow up patches that will add support to libvirt
for using Midonet virtual ports for its interfaces. The schema
definition requires that the port profile expresses its interfaceid
as part of the port profile. For that reason, this is part of the
patch too.
Signed-off-by: Antoni Segura Puimedon <toni+libvirt@midokura.com>
We're parsing memballoon status period as unsigned int, but when we're
trying to set it, both we and qemu use signed int. That means large
values will get wrapped around to negative one resulting in error.
Basically the same problem as commit e3a7b874 was dealing with when
updating live domain.
QEMU changed the accepted value to int64 in commit 1f9296b5, but even
values as INT_MAX don't make sense since the value passed means seconds.
Hence adding capability flag for this change isn't worth it.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1140958
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
The version attribute in redirdev filters refers to the revision
of the device, not the version of the USB protocol.
Explicitly state that this is not the USB protocol and remove references
to those round version numbers that resemble USB protocol versions.
https://bugzilla.redhat.com/show_bug.cgi?id=1177237
To prevent a confusion about missing chardev argument in qemu
command line add a note about that behavior into documentation.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1129198
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
There was a mess in the way how we store unlimited value for memory
limits and how we handled values provided by user. Internally there
were two possible ways how to store unlimited value: as 0 value or as
VIR_DOMAIN_MEMORY_PARAM_UNLIMITED. Because we chose to store memory
limits as unsigned long long, we cannot use -1 to represent unlimited.
It's much easier for us to say that everything greater than
VIR_DOMAIN_MEMORY_PARAM_UNLIMITED means unlimited and leave 0 as valid
value despite that it makes no sense to set limit to 0.
Remove unnecessary function virCompareLimitUlong. The update of test
is to prevent the 0 to be miss-used as unlimited in future.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1146539
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Our documentation isn't 100% clear about hostdev 'managed' attribute usage,
because it only makes sense to use it with PCI devices, yet we format
this attribute to all hostdev devices. By adding a note into the docs,
we can possibly avoid confusion from customer's side and also avoid a solution
using ternary logic.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1155887
At least Xen supports backend drivers in another domain (aka "driver
domain"). This patch introduces an XML config option for specifying the
backend domain name for <disk> and <interface> devices. E.g.
<disk>
<backenddomain name='diskvm'/>
...
</disk>
<interface type='bridge'>
<backenddomain name='netvm'/>
...
</interface>
In the future, same option will be needed for USB devices (hostdev
objects), but for now libxl doesn't have support for PVUSB.
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Add an XML attribute to allow disabling merge of rx buffers
on the host:
<interface ...>
...
<model type='virtio'/>
<driver ...>
<host mrg_rxbuf='off'/>
</driver>
</interface>
https://bugzilla.redhat.com/show_bug.cgi?id=1186886
In order for QEMU vCPU (and other) threads to run with RT scheduler,
libvirt needs to take care of that so QEMU doesn't have to run privileged.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1178986
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Change the wording in the device-address-part of the docmunentation since
the ccw bus address support added to the optional address parameter of
virsh attach-disk for S390.
Signed-off-by: Stefan Zimmermann <stzi@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
It is only usable for NETWORK and BRIDGE type interfaces.
Error out when trying to start a domain where the custom
tap device path is specified for interfaces of other types,
or when the daemon is not privileged.
Note that this cannot be checked at definition time, because
the comparison is against actual type.
https://bugzilla.redhat.com/show_bug.cgi?id=1147195
It is only supported for virtio adapters.
Silently drop it if it was specified for other models,
as is done for other virtio attributes.
Also mention this in the documentation.
https://bugzilla.redhat.com/show_bug.cgi?id=1147195
https://bugzilla.redhat.com/show_bug.cgi?id=1170492
In one of our previous commits (dc8b7ce7) we've done a functional
change even though it was intended as pure refactor. The problem is,
that the following XML:
<vcpu placement='static' current='2'>6</vcpu>
<cputune>
<emulatorpin cpuset='1-3'/>
</cputune>
<numatune>
<memory mode='strict' placement='auto'/>
</numatune>
gets translated into this one:
<vcpu placement='auto' current='2'>6</vcpu>
<cputune>
<emulatorpin cpuset='1-3'/>
</cputune>
<numatune>
<memory mode='strict' placement='auto'/>
</numatune>
We should not change the vcpu placement mode. Moreover, we're doing
something similar in case of emulatorpin and iothreadpin. If they were
set, but vcpu placement was auto, we've mistakenly removed them from
the domain XML even though we are able to set them independently on
vcpus.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Ploop is a pseudo device which makeit possible to access
to an image in a file as a block device. Like loop devices,
but with additional features, like snapshots, write tracker
and without double-caching.
It used in PCS for containers and in OpenVZ. You can manage
ploop devices and images with ploop utility
(http://git.openvz.org/?p=ploop).
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
QEMU supports feature specification with -cpu host and we just skip
using that. Since QEMU developers themselves would like to use this
feature, this patch modifies the code to work.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1178850
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
It was brought to my attention that some -boot options may not
work with UEFI. For instance, rebootTimeout is very SeaBIOS
specific,splash logo is not implemented yet on OVMF, and so on.
We should document this limitation at least.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>