Added <capabilities> in the <features> section of LXC domains
configuration. This section can contain elements named after the
capabilities like:
<mknod state="on"/>, keep CAP_MKNOD capability
<sys_chroot state="off"/> drop CAP_SYS_CHROOT capability
Users can restrict or give more capabilities than the default using
this mechanism.
Introduce a new function to parse the provided scsi_host parent address
and unique_id value in order to find the /sys/class/scsi_host directory
which will allow a stable SCSI host address
Add a test to scsihosttest to lookup the host# name by using the PCI address
and unique_id value
Add an optional unique_id parameter to nodedev. Allows for easier lookup
and display of the unique_id value in order to document for use with
scsi_host code.
Between reboots and kernel reloads, the SCSI host number used for SCSI
storage pools may change requiring modification to the storage pool XML
in order to use a specific SCSI host adapter.
This patch introduces the "parentaddr" element and "unique_id" attribute
for the SCSI host adapter in order to uniquely identify the adapter
between reboots and kernel reloads. For now the goal is to only parse
and format the XML. Both will be required to be provided in order to
uniquely identify the desired SCSI host.
The new XML is expected to be as follows:
<adapter type='scsi_host'>
<parentaddr unique_id='3'>
<address domain='0x0000' bus='0x00' slot='0x1f' func='0x2'/>
</parentaddr>
</adapter>
where "parentaddr" is the parent device of the SCSI host using the PCI
address on which the device resides and the value from the unique_id file
for the device. Both the PCI address and unique_id values will be used
to traverse the /sys/class/scsi_host/ directories looking at each link
to match the PCI address reformatted to the directory link format where
"domain🚌slot:function" is found. Then for each matching directory
the unique_id file for the scsi_host will be used to match the unique_id
value in the xml.
For a PCI address listed above, this will be formatted to "0000:00:1f.2"
and the links in /sys/class/scsi_host will be used to find the host#
to be used for the 'scsi_host' device. Each entry is a link to the
/sys/bus/pci/devices directories, e.g.:
% ls -al /sys/class/scsi_host/host2
lrwxrwxrwx. 1 root root 0 Jun 1 00:22 /sys/class/scsi_host/host2 -> ../../devices/pci0000:00/0000:00:1f.2/ata3/host2/scsi_host/host2
% cat /sys/class/scsi_host/host2/unique_id
3
The "parentaddr" and "name" attributes are mutually exclusive to identify
the SCSI host number. Use of the "parentaddr" element will be the preferred
mechanism.
This patch only supports to parse and format the XMLs. Later patches will
add code to find out the scsi host number.
Gluster volumes don't start with a leading slash. Our schema for netfs
gluster pools enforces it though. Luckily mount.glusterfs skips it.
Allow a slashless volume name for glusterfs netfs mounts in the schema.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1101999
libvirt supports pci domain already, so update the documentation.
Otherwise users who lookup the documentation for how to use hostdev may
miss the domain and encounter error when pass-through a pci device in a
domain other than 0.
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Disk type 'lun' enables SCSI command passthrough for a disk. We stated
that it works only with "block" disks. Qemu supports it also when using
the iSCSI protocol.
LXC network devices can now be assigned a custom NIC device name on the
container side. For example, this is configured with:
<interface type='network'>
<source network='default'/>
<guest dev="eth1"/>
</interface>
In this example the network card will appear as eth1 in the guest.
The previous commit 09d4d26 put the interleave at the wrong point;
it didn't allow interleaving with <memory>.
* docs/schema/domaincommon.rng (numatune): Fix interleave location.
* tests/qemuxml2argvdata/qemuxml2argv-numatune-memnode.xml: Adjust test.
Signed-off-by: Eric Blake <eblake@redhat.com>
In XML format, by definition, order of fields should not matter, so
order of parsing the elements doesn't affect the end result. When
specifying guest NUMA cells, we depend only on the order of the 'cell'
elements. With this patch all older domain XMLs are parsed as before,
but with the 'id' attribute they are parsed and formatted according to
that field. This will be useful when we have tuning settings for
particular guest NUMA node.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This patch adds support for the QEMU vhost-user feature to libvirt.
vhost-user enables the communication between a QEMU virtual machine
and other userspace process using the Virtio transport protocol.
It uses a char dev (e.g. Unix socket) for the control plane,
while the data plane based on shared memory.
The XML looks like:
<interface type='vhostuser'>
<mac address='52:54:00:3b:83:1a'/>
<source type='unix' path='/tmp/vhost.sock' mode='server'/>
<model type='virtio'/>
</interface>
Signed-off-by: Michele Paolino <m.paolino@virtualopensystems.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Add 'nocow' to storage volume xml so that user can have an option
to set NOCOW flag to the newly created volume. It's useful on btrfs
file system to enhance performance.
Btrfs has low performance when hosting VM images, even more when the guest
in those VM are also using btrfs as file system. One way to mitigate this
bad performance is to turn off COW attributes on VM files. Generally, there
are two ways to turn off COW on btrfs: a) by mounting fs with nodatacow,
then all newly created files will be NOCOW. b) per file. Add the NOCOW file
attribute. It could only be done to empty or new files.
This patch tries the second way, according to 'nocow' option, it could set
NOCOW flag per file:
for raw file images, handle 'nocow' in libvirt code; for non-raw file images,
pass 'nocow=on' option to qemu-img, and let qemu-img to handle that (requires
qemu-img version >= 2.1).
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Our documentation for features was rather sparse; this fleshes out
more of the details for other existing capabilities (and cost me
some time trawling git history).
* docs/formatcaps.html.in: Document it feature bits.
Signed-off-by: Eric Blake <eblake@redhat.com>
The link to the page "how to get your code into an open source
project" has been fixed.
Signed-off-by: Michele Paolino <m.paolino@virtualopensystems.com>
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Fix a couple of typos ('chap' should have been 'iscsi' and there was
a stray 'iqn.2013-07.com.example:iscsi-pool' entry. Clean up the
description of the <auth> element for the disk
This new module holds and formats capabilities for emulator. If you
are about to create a new domain, you may want to know what is the
host or hypervisor capable of. To make sure we don't regress on the
XML, the formatting is not something left for each driver to
implement, rather there's general format function.
The domain capabilities is a lockable object (even though the locking
is not necessary yet) which uses reference counter.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Check if the buffer is in error state and report an error if it is.
This replaces the pattern:
if (virBufferError(buf)) {
virReportOOMError();
goto cleanup;
}
with:
if (virBufferCheckError(buf) < 0)
goto cleanup;
Document typical buffer usage to favor this.
Also remove the redundant FreeAndReset - if an error has
been set via virBufferSetError, the content is already freed.
This introduces two new attributes "cmd_per_lun" and "max_sectors" same
with the names QEMU uses for virtio-scsi. An example of the XML:
<controller type='scsi' index='0' model='virtio-scsi' cmd_per_lun='50'
max_sectors='512'/>
The corresponding QEMU command line:
-device virtio-scsi-pci,id=scsi0,cmd_per_lun=50,max_sectors=512,
bus=pci.0,addr=0x3
Signed-off-by: Mike Perez <thingee@gmail.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
We publish libvirt-api.xml for others to use, and in fact, the
libvirt-python bindings use it to generate python constants that
correspond to our enum values. However, we had an off-by-one bug
that any enum that relied on C's rules for implicit initialization
of the first enum member to 0 got listed in the xml as having a
value of 1 (and all later members of the enum were equally
botched).
The fix is simple - since we add one to the previous value when
encountering an enum without an initializer, the previous value
must start at -1 so that the first enum member is assigned 0.
The python generator code has had the off-by-one ever since DV
first wrote it years ago, but most of our public enums were immune
because they had an explicit = 0 initializer. The only affected
enums are:
- virDomainEventGraphicsAddressType (such as
VIR_DOMAIN_EVENT_GRAPHICS_ADDRESS_IPV4), since commit 987e31e
(libvirt v0.8.0)
- virDomainCoreDumpFormat (such as VIR_DOMAIN_CORE_DUMP_FORMAT_RAW),
since commit 9fbaff0 (libvirt v1.2.3)
- virIPAddrType (such as VIR_IP_ADDR_TYPE_IPV4), since commit
03e0e79 (not yet released)
Thanks to Nehal J Wani for reporting the problem on IRC, and
for helping me zero in on the culprit function.
* docs/apibuild.py (CParser.parseEnumBlock): Fix implicit enum
values.
Signed-off-by: Eric Blake <eblake@redhat.com>
The interface state for bonds and vlans does seem to reflect the state
of the underlying physical devices, at least in some cases, so it
makes sense to allow reporting it (netcf now does).
The link state/speed for bridge devices is meaningless though, so we
don't even look for it.
The interface xml schema was written with strict rules about the
ordering of the elements. This was never intentional, but just due to
omission of <interleave> in the appropriate places. This patch just
adds in <interleave> wherever there is more than one element, and
re-indents everything else appropriately.
In section "Block / character devices" of "Host device assignment",
the description of hostdev element has some error:
For a block device, the type should be "storage", not "block";
For a character device, the type should be "misc", not "char".
Signed-off-by: Jincheng Miao <jmiao@redhat.com>
There are two places where you'll find info on page sizes. The first
one is under <cpu/> element, where all supported pages sizes are
listed. Then the second one is under each <cell/> element which refers
to concrete NUMA node. At this place, the size of page's pool is
reported. So the capabilities XML looks something like this:
<capabilities>
<host>
<uuid>01281cda-f352-cb11-a9db-e905fe22010c</uuid>
<cpu>
<arch>x86_64</arch>
<model>Westmere</model>
<vendor>Intel</vendor>
<topology sockets='1' cores='1' threads='1'/>
...
<pages unit='KiB' size='4'/>
<pages unit='KiB' size='2048'/>
<pages unit='KiB' size='1048576'/>
</cpu>
...
<topology>
<cells num='4'>
<cell id='0'>
<memory unit='KiB'>4054408</memory>
<pages unit='KiB' size='4'>1013602</pages>
<pages unit='KiB' size='2048'>3</pages>
<pages unit='KiB' size='1048576'>1</pages>
<distances/>
<cpus num='1'>
<cpu id='0' socket_id='0' core_id='0' siblings='0'/>
</cpus>
</cell>
<cell id='1'>
<memory unit='KiB'>4071072</memory>
<pages unit='KiB' size='4'>1017768</pages>
<pages unit='KiB' size='2048'>3</pages>
<pages unit='KiB' size='1048576'>1</pages>
<distances/>
<cpus num='1'>
<cpu id='1' socket_id='0' core_id='0' siblings='1'/>
</cpus>
</cell>
...
</cells>
</topology>
...
</host>
<guest/>
</capabilities>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Commit 7c6fc39 introduced a regression in the XML produced for older
clients. The argument at the time was that clients shouldn't be
depending on output-only data for something that is only going to
be triggered for a transient guest; but John Ferlan reported that
the automated testsuite was such a client. It's better to be safe
than sorry by guaranteeing back-compat cruft. Note that later
patches will be using <mirror> for active block commit, but there
we don't have to worry about back-compat.
* src/conf/domain_conf.c (virDomainDiskDefFormat): Restore old
style output when necessary.
* docs/schemas/domaincommon.rng: Validate back-compat style.
* docs/formatdomain.html.in: Update the documentation.
* tests/qemuxml2xmloutdata/qemuxml2xmlout-disk-mirror-old.xml:
Update tests.
* tests/qemuxml2argvdata/qemuxml2argv-disk-mirror.xml: Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
This new element is there to represent PCI-Express capabilities
of a PCI devices, like link speed, number of lanes, etc.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
While exposing the info under <interface/> in previous patch works, it
may work only in cases where interface is configured on the host.
However, orchestrating application may want to know the link state and
speed even in that case. That's why we ought to expose this in nodedev
XML too:
virsh # nodedev-dumpxml net_eth0_f0_de_f1_2b_1b_f3
<device>
<name>net_eth0_f0_de_f1_2b_1b_f3</name>
<path>/sys/devices/pci0000:00/0000:00:19.0/net/eth0</path>
<parent>pci_0000_00_19_0</parent>
<capability type='net'>
<interface>eth0</interface>
<address>f0🇩🇪f1:2b:1b:f3</address>
<link speed='1000' state='up'/>
<capability type='80203'/>
</capability>
</device>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Currently it is not possible to determine the speed of an interface
and whether a link is actually detected from the API. Orchestrating
platforms want to be able to determine when the link has failed and
where multiple speeds may be available which one the interface is
actually connected at. This commit introduces an extension to our
interface XML (without implementation to interface driver backends):
<interface type='ethernet' name='eth0'>
<start mode='none'/>
<mac address='aa:bb:cc:dd:ee:ff'/>
<link speed='1000' state='up'/>
<mtu size='1492'/>
...
</interface>
Where @speed is negotiated link speed in Mbits per second, and state
is the current NIC state (can be one of the following: "unknown",
"notpresent", "down", "lowerlayerdown","testing", "dormant", "up").
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Now that we track a disk mirror as a virStorageSource, we might
as well update the XML to theoretically allow any type of
mirroring destination (not just a local file). A later patch
will also be reusing <mirror> to track the block commit of the
top layer of a chain, which is another case where libvirt needs
to update the backing chain after the job is finally pivoted,
and since backing chains can have network backing files as the
destination to commit into, it makes more sense to display that
in the XML.
This patch changes output-only XML; it was already documented
that <mirror> does not affect a domain definition at this point
(because qemu doesn't provide persistent bitmaps yet). Any
application that was starting a block copy job with older libvirt
and then relying on the domain XML to determine if it was
complete will no longer be able to access the file= and format=
attributes of mirror that were previously used. However, this is
not going to be a problem in practice: the only time a block copy
job works is on a transient domain, and any app that is managing
a transient domain probably already does enough of its own
bookkeeping to know which file it is mirroring into without
having to re-read it from the libvirt XML. The one thing that
was likely to be used in a mirroring job was the ready=
attribute, which is unchanged. Meanwhile, I made sure the schema
and parser still accept the old format, even if we no longer
output it, so that upgrading from an older version of libvirt is
seamless.
* docs/schemas/domaincommon.rng (diskMirror): Alter definition.
* src/conf/domain_conf.c (virDomainDiskDefParseXML): Parse two
styles of mirror elements.
(virDomainDiskDefFormat): Output new style.
* tests/qemuxml2argvdata/qemuxml2argv-disk-mirror-old.xml: New
file, copied from...
* tests/qemuxml2argvdata/qemuxml2argv-disk-mirror.xml: ...here
before modernizing.
* tests/qemuxml2xmloutdata/qemuxml2xmlout-disk-mirror-old*: New
files.
* tests/qemuxml2xmltest.c (mymain): Test both styles.
Signed-off-by: Eric Blake <eblake@redhat.com>
A PCI device can be associated with a specific NUMA node. Later, when
a guest is pinned to one NUMA node the PCI device can be assigned on
different NUMA node. This makes DMA transfers travel across nodes and
thus results in suboptimal performance. We should expose the NUMA node
locality for PCI devices so management applications can make better
decisions.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
At the moment we are missing even basic documentation on our
capabilities XML. Without demand on completeness, I'm
reorganizing the document structure and adding very basic
documentation to two major components of the capabilities XML.
These stubs are intended to be enhanced in the future.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
If user or management application wants to create a guest,
it may be useful to know the cost of internode latencies
before the guest resources are pinned. For example:
<capabilities>
<host>
...
<topology>
<cells num='2'>
<cell id='0'>
<memory unit='KiB'>4004132</memory>
<distances>
<sibling id='0' value='10'/>
<sibling id='1' value='20'/>
</distances>
<cpus num='2'>
<cpu id='0' socket_id='0' core_id='0' siblings='0'/>
<cpu id='2' socket_id='0' core_id='2' siblings='2'/>
</cpus>
</cell>
<cell id='1'>
<memory unit='KiB'>4030064</memory>
<distances>
<sibling id='0' value='20'/>
<sibling id='1' value='10'/>
</distances>
<cpus num='2'>
<cpu id='1' socket_id='0' core_id='0' siblings='1'/>
<cpu id='3' socket_id='0' core_id='2' siblings='3'/>
</cpus>
</cell>
</cells>
</topology>
...
</host>
...
</capabilities>
We can see the distance from node1 to node0 is 20 and within nodes 10.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Our documentation generator is a bit messy, to say the least. For
instance, the description to return values of a function is
searched within C comment. Currently, all lines that start with
'returns' or 'Returns' are viewed as return value description.
However, there are some valid uses where the 'returns' word is in
the middle of a sentence describing function behavior not the
return value. And there are no places where 'returns' is used to
describe return values. For instance:
virDomainDetachDeviceFlags, virConnectNetworkEventRegisterAny and
virDomainGetDiskErrors. This leads to HTML documemtation being
generated incorrectly.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
<interface type='hostdev' managed='yes'> is supported, but
nowhere mentions 'managed' in <interface type='hostdev'> syntax.
Update documentation to cover it.
Signed-off-by: Chunyan Liu <cyliu@suse.com>
The XML for quite a longish backing chain is shown below:
<disk type='network' device='disk'>
<driver name='qemu' type='qcow2'/>
<source protocol='nbd' name='bar'>
<host transport='unix' socket='/var/run/nbdsock'/>
</source>
<backingStore type='block' index='1'>
<format type='qcow2'/>
<source dev='/dev/HostVG/QEMUGuest1'/>
<backingStore type='file' index='2'>
<format type='qcow2'/>
<source file='/tmp/image2.qcow'/>
<backingStore type='file' index='3'>
<format type='qcow2'/>
<source file='/tmp/image3.qcow'/>
<backingStore type='file' index='4'>
<format type='qcow2'/>
<source file='/tmp/image4.qcow'/>
<backingStore type='file' index='5'>
<format type='qcow2'/>
<source file='/tmp/image5.qcow'/>
<backingStore type='file' index='6'>
<format type='raw'/>
<source file='/tmp/Fedora-17-x86_64-Live-KDE.iso'/>
<backingStore/>
</backingStore>
</backingStore>
</backingStore>
</backingStore>
</backingStore>
</backingStore>
<target dev='vdb' bus='virtio'/>
</disk>
Various disk types and formats can be mixed in one chain. The
<backingStore/> empty element marks the end of the backing chain and it
is there mostly for future support of parsing the chain provided by a
user. If it's missing, we are supposed to probe for the rest of the
chain ourselves, otherwise complete chain was provided by the user. The
index attributes of backingStore elements can be used to unambiguously
identify a specific part of the image chain.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
When the default was changed from kvm to vfio, the documentation for
hostdev and interface was changed, but the documentation in <network>
was forgotten.
Also document when the default was changed from "always kvm" to "vfio
if available, else kvm" (1.0.5).
I noticed that depending on the <driver> attributes the user passed
in, the output may omit the <driver> element altogether. For example,
the rerror_policy has had this problem since commit 4bb4109 in Oct
2011. But in adding testsuite coverage to expose it, I found another
problem: the C code is just fine without a driver name, but the
XML validator required either a name or a cache mode.
* src/conf/domain_conf.c (virDomainDiskDefFormat): Update
conditional.
* docs/schemas/domaincommon.rng (diskDriver): Simplify.
* tests/qemuxml2argvdata/qemuxml2argv-disk-drive-copy-on-read.xml:
* tests/qemuxml2argvdata/qemuxml2argv-disk-drive-copy-on-read.args:
New files.
* tests/qemuxml2argvdata/qemuxml2argv-disk-drive-discard.xml:
Enhance test.
* tests/qemuxml2xmloutdata/qemuxml2xmlout-disk-drive-discard.xml:
Likewise.
* tests/qemuxml2argvtest.c (mymain): New test.
* tests/qemuxml2xmltest.c (mymain): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
To make <disk> schema more maintainable and to allow for moving the
pieces to a common file in the future. It relies on the ability to
override definitions as part of an include, set up in the previous
patch.
The diff is a bit hard to read, because it mixes reindentation
with refactoring; 'git diff -b --patience' may help.
* docs/schemas/domaincommon.rng (disk): Refactor into pieces.
(diskSource, diskSourceFile, diskSourceBlock, diskSourceDir)
(diskSourceVolume: New defines.
(diskSourceNetwork): Revise scope.
* docs/schemas/domainsnapshot.rng (disksnapshot): Adjust.
* tests/domainsnapshotxml2xmlin/disk-seclabel-invalid.xml,
tests/domainsnapshotxml2xmlin/disk-network-seclabel-invalid.xml: New
tests to check seclabel is forbidden in domain snapshot by schema.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
This patch is my first experience playing with nested grammars,
as documented in http://relaxng.org/tutorial-20011203.html#IDA3PZR.
I plan on doing more overrides in order to make the RelaxNG
grammar mirror the C code refactoring into a common
virStorageSource, but where different clients of that source do
not support the same subset of functionality. By starting with
something fairly easy to validate, I can make sure my later
patches will be possible.
This patch adds a use of the no-op <ref
name='sourceStartupPolicy'/> to the disksnapshot definition, so
that the snapshot version of a type='file' <source> more closely
resembles the version in domaincommon. A future patch will merge
the two files into using a common define, but this patch is
sufficient for testing that adding <source
startupPolicy='optional'/> in any of the
tests/domainsnapshotxml2xmlin/*.xml files still gets rejected
unless it occurs within the <domain> subelement, because the
definition of startupPolicy is empty outside of domain.rng.
* docs/schemas/storagecommon.rng (storageStartupPolicy)
(storageSourceExtra): Create no-op defaults.
* docs/schemas/domainsnapshot.rng (domain): Use nested grammar
to avoid restricting <domain>.
(storageSourceExtra): Create new override.
(disksnapshot): Access overrides through common names.
* docs/schemas/domaincommon.rng (disk): Access overrides through
common names.
* docs/schemas/domain.rng (storageStartupPolicy)
(storageSourceExtra): Create new overrides.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Domain snapshots should only permit an external snapshot into
a storage format that permits a backing chain, since the new
snapshot file necessarily must be backed by the existing file.
The C code for the qemu driver is a little bit stricter in
currently enforcing only qcow2 or qed, but at the XML parser
level, including virt-xml-validate, it is fairly easy to
enforce that a user can't request a 'raw' external snapshot.
* docs/schemas/storagecommon.rng (storageFormat): Split out...
(storageFormatBacking): ...new sublist.
* docs/schemas/domainsnapshot.rng (disksnapshotdriver): Use new
type.
* src/util/virstoragefile.h (virStorageFileFormat): Rearrange for
easier code management.
* src/util/virstoragefile.c (virStorageFileFormat, fileTypeInfo):
Likewise.
* src/conf/snapshot_conf.c (virDomainSnapshotDiskDefParseXML): Use
new marker to limit selection of formats.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
We had incomplete RelaxNG support for storage formats listed
in virstoragefile.h: commit 027bf2e added 'vdi' but forgot
to update the <volume> and <domain> xml lists; the <volume>
list was also missing 'fat' and 'vhd'. Maintaining two lists
is a recipe for them getting out of sync, so make the list
common so that both contexts benefit the next time we add a
format in a single location.
* docs/schemas/domaincommon.rng (storageFormat): Move...
* docs/schemas/storagecommon.rng: ...here, and add vdi.
* docs/schemas/storagevol.rng (formatfile): Use common list.
Signed-off-by: Eric Blake <eblake@redhat.com>
In general, we try to make virt-xml-validate tolerant of input
elements in any order when possible. However, as written, the
RNG grammar did not permit <source> unless there was an explicit
type= attribute (even though the C code manages just fine by
defaulting to type='file'). After making the attribute optional
on the 'file' branch, I noticed that the use of diskspec was now
redundant with the branch when no <source> was supplied.
View this patch with 'git diff -b' for a better picture of the
schema change.
* docs/schemas/domaincommon.rng (disk): Hoist 'diskspec' out of
choice, make type='file' default, and still preserve interleave.
* tests/qemuxml2xmloutdata/qemuxml2xmlout-disk-source-pool.xml:
* tests/qemuxml2xmloutdata/qemuxml2xmlout-disk-drive-discard.xml:
New files.
* tests/qemuxml2argvdata/qemuxml2argv-disk-source-pool.xml:
* tests/qemuxml2argvdata/qemuxml2argv-disk-drive-discard.xml:
Reorder XML.
* tests/qemuxml2xmltest.c (mymain): Cover new files.
Signed-off-by: Eric Blake <eblake@redhat.com>
Having two tiny files with a couple definitions didn't make
as much sense as one common file, especially since I plan to
add more definitions and use it in more places.
* docs/schemas/storageencryption.rng: Merge this...
* docs/schemas/storagefilefeatures.rng: ...and this, into...
* docs/schemas/storagecommon.rng: ...this new file.
* docs/schemas/Makefile.am (schema_DATA): Reflect renames.
* docs/schemas/storagevol.rng: Likewise.
* docs/schemas/domaincommon.rng: Likewise.
* libvirt.spec.in: Likewise.
* mingw-libvirt.spec.in: Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
According to our documentation the "key" value has the following
meaning: "Providing an identifier for the volume which identifies a
single volume." The currently used keys for gluster volumes consist of
the gluster volume name and file path. This can't be considered unique
as a different storage server can serve a volume with the same name.
Unfortunately I wasn't able to figure out a way to retrieve the gluster
volume UUID which would avoid the possibility of having two distinct
keys identifying a single volume.
Use the full URI as the key for the volume to avoid the more critical
ambiguity problem and document the possible change to UUID.
This patch adds an element to QEMU's capability XML, to
show if the underlying QEMU binary supports the live disk
snapshotting or not.
This allows any client to know ahead of time if the feature
is available.
Without this information available, the only way to check
for the snapshot support is to request one and check for
errors.
Signed-off-by: Francesco Romani <fromani@redhat.com>
A earlier commit changed the global log buffer so that it only
records messages that are explicitly requested via the log
filters setting. This removes the performance burden, and
improves the signal/noise ratio for messages in the global
buffer. At the same time though, it is somewhat pointless, since
all the recorded log messages are already going to be sent to an
explicit log output like syslog, stderr or the journal. The
global log buffer is thus just duplicating this data on stderr
upon crash.
The log_buffer_size config parameter is left in the augeas
lens to prevent breakage for users on upgrade. It is however
completely ignored hereafter.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Any source file which calls the logging APIs now needs
to have a VIR_LOG_INIT("source.name") declaration at
the start of the file. This provides a static variable
of the virLogSource type.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=862887
Add a netmask for the source and destination IP address for the
ebtables --arp-ip-src and --arp-ip-dst options. Extend the XML
parser with support for XML attributes for these netmasks similar
to already supported netmasks. Extend the documentation.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
This is a request for adding a VMmanager application as requested and
described by Ksenya Phil.
Signed-off-by: Ksenya Phil <philka2003@mail.ru>
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
With most of our storage backends it's possible to have two separate
volume keys to point to a single volume. (By creating sym/hard-links to
local files or by mounting remote filesystems to two different locations
and creating pools on top of them) Document this possibility.
Auditing all callers of virCommandRun and virCommandWait that
passed a non-NULL pointer for exit status turned up some
interesting observations. Many callers were merely passing
a pointer to avoid the overall command dying, but without
caring what the exit status was - but these callers would
be better off treating a child death by signal as an abnormal
exit. Other callers were actually acting on the status, but
not all of them remembered to filter by WIFEXITED and convert
with WEXITSTATUS; depending on the platform, this can result
in a status being reported as 256 times too big. And among
those that correctly parse the output, it gets rather verbose.
Finally, there were the callers that explicitly checked that
the status was 0, and gave their own message, but with fewer
details than what virCommand gives for free.
So the best idea is to move the complexity out of callers and
into virCommand - by default, we return the actual exit status
already cleaned through WEXITSTATUS and treat signals as a
failed command; but the few callers that care can ask for raw
status and act on it themselves.
* src/util/vircommand.h (virCommandRawStatus): New prototype.
* src/libvirt_private.syms (util/command.h): Export it.
* docs/internals/command.html.in: Document it.
* src/util/vircommand.c (virCommandRawStatus): New function.
(virCommandWait): Adjust semantics.
* tests/commandtest.c (test1): Test it.
* daemon/remote.c (remoteDispatchAuthPolkit): Adjust callers.
* src/access/viraccessdriverpolkit.c (virAccessDriverPolkitCheck):
Likewise.
* src/fdstream.c (virFDStreamCloseInt): Likewise.
* src/lxc/lxc_process.c (virLXCProcessStart): Likewise.
* src/qemu/qemu_command.c (qemuCreateInBridgePortWithHelper):
Likewise.
* src/xen/xen_driver.c (xenUnifiedXendProbe): Simplify.
* tests/reconnect.c (mymain): Likewise.
* tests/statstest.c (mymain): Likewise.
* src/bhyve/bhyve_process.c (virBhyveProcessStart)
(virBhyveProcessStop): Don't overwrite virCommand error.
* src/libvirt.c (virConnectAuthGainPolkit): Likewise.
* src/openvz/openvz_driver.c (openvzDomainGetBarrierLimit)
(openvzDomainSetBarrierLimit): Likewise.
* src/util/virebtables.c (virEbTablesOnceInit): Likewise.
* src/util/viriptables.c (virIpTablesOnceInit): Likewise.
* src/util/virnetdevveth.c (virNetDevVethCreate): Fix debug
message.
* src/qemu/qemu_capabilities.c (virQEMUCapsInitQMP): Add comment.
* src/storage/storage_backend_iscsi.c
(virStorageBackendISCSINodeUpdate): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
When probing QEMU capabilities fails for a binary generate a
log message with MESSAGE_ID==8ae2f3fb-2dbe-498e-8fbd-012d40afa361.
This can be directly queried from journald based on the UUID
instead of needing string grep. This lets tools like libguestfs'
bug reporting tool trivially do automated sanity tests on the
host they're running on.
$ journalctl MESSAGE_ID=8ae2f3fb-2dbe-498e-8fbd-012d40afa361
Feb 21 17:11:01 localhost.localdomain lt-libvirtd[9196]:
Failed to probe capabilities for /bin/qemu-system-alpha:
internal error: Child process (LC_ALL=C LD_LIBRARY_PATH=
/home/berrange/src/virt/libvirt/src/.libs PATH=/usr/lib64/
ccache:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:
/usr/bin:/root/bin HOME=/root USER=root LOGNAME=root
/bin/qemu-system-alpha -help) unexpected exit status 127:
/bin/qemu-system-alpha: error while loading shared libraries:
libglapi.so.0: cannot open shared object file: No such file
or directory
$ journalctl MESSAGE_ID=8ae2f3fb-2dbe-498e-8fbd-012d40afa361 --output=json
{ ...snip...
"LIBVIRT_SOURCE" : "file",
"PRIORITY" : "3",
"CODE_FILE" : "qemu/qemu_capabilities.c",
"CODE_LINE" : "2770",
"CODE_FUNC" : "virQEMUCapsLogProbeFailure",
"MESSAGE_ID" : "8ae2f3fb-2dbe-498e-8fbd-012d40afa361",
"LIBVIRT_QEMU_BINARY" : "/bin/qemu-system-xtensa",
"MESSAGE" : "Failed to probe capabilities for /bin/qemu-system-xtensa:
internal error: Child process (LC_ALL=C LD_LIBRARY_PATH=/home/berrange
/src/virt/libvirt/src/.libs PATH=/usr/lib64/ccache:/usr/local/sbin:
/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin HOME=/root
USER=root LOGNAME=root /bin/qemu-system-xtensa -help) unexpected
exit status 127: /bin/qemu-system-xtensa: error while loading shared
libraries: libglapi.so.0: cannot open shared object file: No such
file or directory\n" }
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When a virError is raised, pass the error domain and code
onto the systemd journald using metadata fields.
This allows error messages to be queried by code eg
$ journalctl LIBVIRT_CODE=43
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The logging doc had a hand-written table of contents
instead of using the automatic XSL generated one.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Recent autotest/virt-test testing on f20 discovered an anomaly in how
the bandwidth options are documented and used. This was discovered due
to a bug fix in the /sbin/tc utility found in iproute-3.11.0.1 (on f20)
in which overflow was actually caught and returned as an error. The fix
was first introduced in iproute-3.10 (search on iproute2 commit 'a303853e').
The autotest/virt-test test for virsh domiftune was attempting to send
the largest unsigned integer value (4294967295) for maximum value
testing. The libvirt xml implementation was designed to manage values
in kilobytes thus when this value was passed to /sbin/tc, it (now)
properly rejected the 4294967295kbps value.
Investigation of the problem discovered that formatdomain.html.in and
formatnetwork.html.in described the elements and property types slightly
differently, although they use the same code - virNetDevBandwidthParseRate()
(shared by portgroups, domains, and networks xml parsers). Rather than
have the descriptions in two places, this patch will combine and reword
the description under formatnetwork.html.in and have formatdomain.html.in
link to that description.
This documentation faux pas was continued into the virsh man page where
the bandwidth description for both 'attach-interface' and 'domiftune'
did not indicate the format of each value, thus leading to the test using
largest unsigned integer value assuming "bps" rather than "kbps", which
ultimately was wrong.
The previous OOM testing support would re-run the entire "main"
method each iteration, failing a different malloc each time.
When a test suite has 'n' allocations, the number of repeats
requires is (n * (n + 1) ) / 2. This gets very large, very
quickly.
This new OOM testing support instead integrates at the
virtTestRun level, so each individual test case gets repeated,
instead of the entire test suite. This means the values of
'n' are orders of magnitude smaller.
The simple usage is
$ VIR_TEST_OOM=1 ./qemuxml2argvtest
...
29) QEMU XML-2-ARGV clock-utc ... OK
Test OOM for nalloc=36 .................................... OK
30) QEMU XML-2-ARGV clock-localtime ... OK
Test OOM for nalloc=36 .................................... OK
31) QEMU XML-2-ARGV clock-france ... OK
Test OOM for nalloc=38 ...................................... OK
...
the second lines reports how many mallocs have to be failed, and thus
how many repeats of the test will be run.
If it crashes, then running under valgrind will often show the problem
$ VIR_TEST_OOM=1 ../run valgrind ./qemuxml2argvtest
When debugging problems it is also helpful to select an individual
test case
$ VIR_TEST_RANGE=30 VIR_TEST_OOM=1 ../run valgrind ./qemuxml2argvtest
When things get really tricky, it is possible to request that just
specific allocs are failed. eg to fail allocs 5 -> 12, use
$ VIR_TEST_RANGE=30 VIR_TEST_OOM=1:5-12 ../run valgrind ./qemuxml2argvtest
In the worse case, you might want to know the stack trace of the
alloc which was failed then VIR_TEST_OOM_TRACE can be set. If it
is set to 1 then it will only print if it thinks a mistake happened.
This is often not reliable, so setting it to 2 will make it print
the stack trace for every alloc that is failed.
$ VIR_TEST_OOM_TRACE=2 VIR_TEST_RANGE=30 VIR_TEST_OOM=1:5-5 ../run valgrind ./qemuxml2argvtest
30) QEMU XML-2-ARGV clock-localtime ... OK
Test OOM for nalloc=36 !virAllocN
/home/berrange/src/virt/libvirt/src/util/viralloc.c:180
virHashCreateFull
/home/berrange/src/virt/libvirt/src/util/virhash.c:144
virDomainDefParseXML
/home/berrange/src/virt/libvirt/src/conf/domain_conf.c:11745
virDomainDefParseNode
/home/berrange/src/virt/libvirt/src/conf/domain_conf.c:12646
virDomainDefParse
/home/berrange/src/virt/libvirt/src/conf/domain_conf.c:12590
testCompareXMLToArgvFiles
/home/berrange/src/virt/libvirt/tests/qemuxml2argvtest.c:106
virtTestRun
/home/berrange/src/virt/libvirt/tests/testutils.c:250
mymain
/home/berrange/src/virt/libvirt/tests/qemuxml2argvtest.c:418 (discriminator 2)
virtTestMain
/home/berrange/src/virt/libvirt/tests/testutils.c:750
??
??:0
_start
??:?
FAILED
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
There is no keyboard support currently in libvirt.
For some platforms (PPC64 QEMU) this makes graphics unusable,
since the keyboard is not implicit and it can't be added via libvirt.
Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
There might be some use cases, where user wants to prepare the host or
its environment prior to starting a network and do some cleanup after
the network has been shut down. Consider all the functionality that
libvirt doesn't currently have as an example what a hook script can
possibly do.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Add support for gluster backed images as sources for snapshots in the
qemu driver. This will also simplify adding further network backed
volumes as sources for snapshot in case qemu will support them.
Add a new character device backend called 'spiceport' that uses
spice's channel for communications and apart from spicevmc can be used
as a backend for any character device from libvirt's point of view.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Some grammar fixes.
s/namespace,set/namespace, set
s/container being allowed/container are allowed
s/the <code>uid/The <code>uid
Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com>
Add a new <timer> for the HyperV reference time counter enlightenment
and the iTSC reference page for Windows guests.
This feature provides a paravirtual approach to track timer events for
the guest (similar to kvmclock) with the option to use real hardware
clock on systems with a iTSC with compensation across various hosts.
According to the documentation describing various tunables for domain
timers not all the fields are supported by all the driver types. Express
these in the RNG:
- rtc, platform: Only these support the "track" attribute.
- tsc: only one to support "frequency" and "mode" attributes
- hpet, pit: tickpolicy/catchup attribute/element
- kvmclock: no extra attributes are supported
Additionally the attributes of the <catchup> element for
tickpolicy='catchup' are optional according to the parsing code. Express
this in the XML and fix a spurious space added while formatting the
<catchup> element and add tests for it.
https://bugzilla.redhat.com/show_bug.cgi?id=1057321
pointed out that we weren't honoring the <bandwidth> element in
libvirt networks using <forward mode='bridge'/>. In fact, these
networks are just a method of giving a libvirt network name to an
existing Linux host bridge on the system, and libvirt doesn't have
enough information to know where to set such limits. We are working on
a method of supporting network bandwidths for some specific cases of
<forward mode='bridge'/>, but currently libvirt doesn't support it. So
the proper thing to do now is just log an error when someone tries to
put a <bandwidth> element in that type of network. (It's unclear if we
will be able to do proper bandwidth limiting for macvtap networks, and
most definitely we will not be able to support it for hostdev
networks).
While looking through the network XML documentation and comparing it
to the networkValidate function, I noticed that we also ignore the
presence of a mac address in the config in the same cases, rather than
failing so that the user will understand that their desired action has
not been taken.
This patch updates networkValidate() (which is called any time a
persistent network is defined, or a transient network created) to log
an error and fail if it finds either a <bandwidth> or <mac> element
and the network forward mode is anything except 'route'. 'nat', or
nothing. (Yes, neither of those elements is acceptable for any macvtap
mode, nor for a hostdev network).
NB: This does *not* cause failure to start any existing network that
contains one of those elements, so someone might have erroneously
defined such a network in the past, and that network will continue to
function unmodified. I considered it too disruptive to suddenly break
working configs on the next reboot after a libvirt upgrade.
While at it, also relinquish active commit rights:
[x years between commits] is probably a poster child example of inactivity :)
Signed-off-by: Eric Blake <eblake@redhat.com>