https://bugzilla.redhat.com/show_bug.cgi?id=1135396
There are two ways how to tell qemu to use huge pages. The first one
is suitable for domains with NUMA nodes: the path to hugetlbfs mount
is appended to NUMA node definition on the command line. The second
one is suitable for UMA domains: here there's this global '-mem-path'
argument that accepts path to the hugetlbfs mount point. However, the
latter case was not used for all the cases that it should be. For
instance:
<memoryBacking>
<hugepages>
<page size='2048' unit='KiB' nodeset='0'/>
</hugepages>
</memoryBacking>
didn't trigger the '-mem-path' so the huge pages - despite being
configured - were not used at all.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
As of 136ad4974 it is possible to specify different huge pages per
guest NUMA node. However, there's no check if nodeset specified in
./hugepages/page contains only those guest NUMA nodes that exist.
In other words with current code it is possible to define meaningless
combination:
<memoryBacking>
<hugepages>
<page size='1048576' unit='KiB' nodeset='0,2-3'/>
<page size='2048' unit='KiB' nodeset='1,4'/>
</hugepages>
</memoryBacking>
<vcpu placement='static'>4</vcpu>
<cpu>
<numa>
<cell id='0' cpus='0' memory='1048576'/>
<cell id='1' cpus='1' memory='1048576'/>
<cell id='2' cpus='2' memory='1048576'/>
<cell id='3' cpus='3' memory='1048576'/>
</numa>
</cpu>
Notice the node 4 in <hugepages/>?
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
As of f05b6a918e the test produces the list of paths that can
be passed to <loader/> and libvirt knows about them. However,
during the process of generating the list the paths are checked
for their presence. This may produce different results on
different systems. Therefore, the path - if missing - is
added to pretend it's there.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Check to see if the UEFI binary mentioned in qemu.conf actually
exists, and if so expose it in domcapabilities like
<loader ...>
<value>/path/to/ovmf</value>
</loader>
We introduce some generic domcaps infrastructure for handling
a dynamic list of string values, it may be of use for future bits.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Up till now the virQEMUCapsFillDomainCaps() was type of void as
there was no way for it to fail. This is, however, going to
change in the next commit.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
As of 542899168c we learned libvirt to use UEFI for domains.
However, management applications may firstly query if libvirt
supports it. And this is where virConnectGetDomainCapabilities()
API comes handy.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
For tuning the network, alternative devices
for creating tap and vhost devices can be specified via:
<backend tap='/dev/net/tun' vhost='/dev/net-vhost'/>
We already are checking for negative value, reporting an error, but
using wrong function and the check only succeeds when a value that
cannot be converted to number successfully is encountered. This patch
provides just a minor change in call of the right version
of function virStrToLong.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1138539
I noticed this with the recent iothread pinning code, but the
problem existed longer than that. The XML validation required
users to supply <cputune> children in a strict order, even though
there was no conceptual reason why they can't occur in any order.
docs/ changes best viewed with -w
* docs/schemas/domaincommon.rng (cputune): Add interleave.
* tests/qemuxml2argvdata/qemuxml2argv-cputune-iothreads.xml: Swap
up order, copying canonical form...
* tests/qemuxml2xmloutdata/qemuxml2xmlout-cputune-iothreads.xml:
...here.
* tests/qemuxml2xmltest.c (mymain): Mark the difference.
Signed-off-by: Eric Blake <eblake@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1101574
Add an option 'iothreadpin' to the <cpuset> to allow for setting the
CPU affinity for each IOThread.
The iothreadspin will mimic the vcpupin with respect to being able to
assign each iothread to a specific CPU, although iothreads ids start
at 1 while vcpu ids start at 0. This matches the iothread naming scheme.
Coverity complained that checking the return of virDomainCreate()
was not consistent amongst the callers - so added the return check
to the objecteventtest.c and adjust the virt-login-shell to compare
< 0 rather than just non zero for the failure condition.
Upstream qemu 1.4 added some drive-mirror tunables not present
when it was first introduced in 1.3. Management apps may want
to set these in some cases (for example, without tuning
granularity down to sector size, a copy may end up occupying
more bytes than the original because an entire cluster is
copied even when only a sector within the cluster is dirty,
although tuning it down results in more CPU time to do the
copy). I haven't personally needed to use the parameters, but
since they exist, and since the new API supports virTypedParams,
we might as well expose them.
Since the tuning parameters aren't often used, and omitted from
the QMP command when unspecified, I think it is safe to rely on
qemu 1.3 to issue an error about them being unsupported, rather
than trying to create a new capability bit in libvirt.
Meanwhile, all versions of qemu from 1.4 to 2.1 have a bug where
a bad granularity (such as non-power-of-2) gives a poor message:
error: internal error: unable to execute QEMU command 'drive-mirror': Invalid parameter 'drive-virtio-disk0'
because of abuse of QERR_INVALID_PARAMETER (which is supposed to
name the parameter that was given a bad value, rather than the
value passed to some other parameter). I don't see that a
capability check will help, so we'll just live with it (and it
has since been improved in upstream qemu).
* src/qemu/qemu_monitor.h (qemuMonitorDriveMirror): Add
parameters.
* src/qemu/qemu_monitor.c (qemuMonitorDriveMirror): Likewise.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONDriveMirror):
Likewise.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONDriveMirror):
Likewise.
* src/qemu/qemu_driver.c (qemuDomainBlockCopyCommon): Likewise.
(qemuDomainBlockRebase, qemuDomainBlockCopy): Adjust callers.
* src/qemu/qemu_migration.c (qemuMigrationDriveMirror): Likewise.
* tests/qemumonitorjsontest.c (qemuMonitorJSONDriveMirror): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
Rename the VIR_MOCK_IMPL* macros to VIR_MOCK_WRAP*
and add new VIR_MOCK_IMPL macros which let you directly
implement overrides in the preloaded source.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Test suites using the port allocator don't want to have different
behaviour depending on whether a port is in use on the host. Add
a VIR_PORT_ALLOCATOR_SKIP_BIND_CHECK which test suites can use
to skip the bind() test. The port allocator will thus only track
ports in use by the test suite process itself. This is fine when
using the port allocator to generate guest configs which won't
actually be launched
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Coverity complains that the various checks for autoincrement and changed
variables are DEADCODE - seems to me to be a false positive - so mark it.
Signed-off-by: John Ferlan <jferlan@redhat.com>
When using split UEFI image, it may come handy if libvirt manages per
domain _VARS file automatically. While the _CODE file is RO and can be
shared among multiple domains, you certainly don't want to do that on
the _VARS file. This latter one needs to be per domain. So at the
domain startup process, if it's determined that domain needs _VARS
file it's copied from this master _VARS file. The location of the
master file is configurable in qemu.conf.
Temporary, on per domain basis the location of master NVRAM file can
be overridden by this @template attribute I'm inventing to the
<nvram/> element. All it does is holding path to the master NVRAM file
from which local copy is created. If that's the case, the map in
qemu.conf is not consulted.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
QEMU now supports UEFI with the following command line:
-drive file=/usr/share/OVMF/OVMF_CODE.fd,if=pflash,format=raw,unit=0,readonly=on \
-drive file=/usr/share/OVMF/OVMF_VARS.fd,if=pflash,format=raw,unit=1 \
where the first line reflects <loader> and the second one <nvram>.
Moreover, these two lines obsolete the -bios argument.
Note that UEFI is unusable without ACPI. This is handled properly now.
Among with this extension, the variable file is expected to be
writable and hence we need security drivers to label it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Up to now, users can configure BIOS via the <loader/> element. With
the upcoming implementation of UEFI this is not enough as BIOS and
UEFI are conceptually different. For instance, while BIOS is ROM, UEFI
is programmable flash (although all writes to code section are
denied). Therefore we need new attribute @type which will
differentiate the two. Then, new attribute @readonly is introduced to
reflect the fact that some images are RO.
Moreover, the OVMF (which is going to be used mostly), works in two
modes:
1) Code and UEFI variable store is mixed in one file.
2) Code and UEFI variable store is separated in two files
The latter has advantage of updating the UEFI code without losing the
configuration. However, in order to represent the latter case we need
yet another XML element: <nvram/>. Currently, it has no additional
attributes, it's just a bare element containing path to the variable
store file.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
To date, anyone performing a block copy and pivot ends up with
the destination being treated as <disk type='file'>. While this
works for data access for a block device, it has at least one
noticeable shortcoming: virDomainGetBlockInfo() reports allocation
differently for block devices visited as files (the size of the
device) than for block devices visited as <disk type='block'>
(the maximum sector used, as reported by qemu); and this difference
is significant when trying to manage qcow2 format on block devices
that can be grown as needed.
Of course, the more powerful virDomainBlockCopy() API can already
express the ability to set the <disk> type. But a new API can't
be backported, while a new flag to an existing API can; and it is
also rather inconvenient to have to resort to the full power of
generating XML when just adding a flag to the older call will do
the trick. So this patch enhances blockcopy to let the user flag
when the resulting XML after the copy must list the device as
type='block'.
* include/libvirt/libvirt.h.in (VIR_DOMAIN_BLOCK_REBASE_COPY_DEV):
New flag.
* src/libvirt.c (virDomainBlockRebase): Document it.
* tools/virsh-domain.c (opts_block_copy, blockJobImpl): Add
--blockdev option.
* tools/virsh.pod (blockcopy): Document it.
* src/qemu/qemu_driver.c (qemuDomainBlockRebase): Allow new flag.
(qemuDomainBlockCopy): Remember the flag, and make sure it is only
used on actual block devices.
* tests/qemuxml2argvdata/qemuxml2argv-disk-mirror.xml: Test it.
Signed-off-by: Eric Blake <eblake@redhat.com>
Commit fba6bc4 introduced support for the 'invtsc' feature,
which blocks migration. We should not include it in the
host-model CPU by default, because it's intended to be used
with migration.
https://bugzilla.redhat.com/show_bug.cgi?id=1138221
This commit is rather big. Firstly, the in memory config
representation is adjusted like if security_driver was set to "none".
The rest is then just adaptation to the new code that will generate
different seclabels.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Our style overwhelmingly uses hanging braces (the open brace
hangs at the end of the compound condition, rather than on
its own line), with the primary exception of the top level function
body. Fix the few remaining outliers, before adding a syntax
check in a later patch.
* src/interface/interface_backend_netcf.c (netcfStateReload)
(netcfInterfaceClose, netcf_to_vir_err): Correct use of { in
compound statement.
* src/conf/domain_conf.c (virDomainHostdevDefFormatSubsys)
(virDomainHostdevDefFormatCaps): Likewise.
* src/network/bridge_driver.c (networkAllocateActualDevice):
Likewise.
* src/util/virfile.c (virBuildPathInternal): Likewise.
* src/util/virnetdev.c (virNetDevGetVirtualFunctions): Likewise.
* src/util/virnetdevmacvlan.c
(virNetDevMacVLanVPortProfileCallback): Likewise.
* src/util/virtypedparam.c (virTypedParameterAssign): Likewise.
* src/util/virutil.c (virGetWin32DirectoryRoot)
(virFileWaitForDevices): Likewise.
* src/vbox/vbox_common.c (vboxDumpNetwork): Likewise.
* tests/seclabeltest.c (main): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
Coverity determined that 'log' and 'newenv' were not freed in
some cases. Free them in 'error' branch and normal branch.
Signed-off-by: Wang Rui <moon.wangrui@huawei.com>
For virtio-blk-pci disks with the disk iothread attribute that are
running the correct emulator, add the "iothread=iothread#" to the
-device command line in order to enable iothreads for the disk as
long as the command is available, the disk iothread value provided is
valid, and is supported for the disk device being added
Add a new capability to ensure the iothreads feature exists for the qemu
emulator being run - requires the "query-iothreads" QMP command. Using the
domain XML add correspoding command argument in order to generate the
threads. The iothreads will use a name space "iothread#" where, the
future patch to add support for using an iothread to a disk definition to
merely define which of the available threads to use.
Add tests to ensure the xml/argv processing is correct. Note that no
change was made to qemuargv2xmltest.c as processing the -object element
would require knowing more than just iothreads.
Even though we kept adding new and new modules (e.g. vbox or bhyve)
the test wasn't updated. Do that now. Moreover, while it's not
crucial, it's nice to reorder test cases to match the order in which
the daemon loads the modules.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
QEMU 2.1 added support for the kvm=off option to the -cpu command,
allowing the KVM hypervisor signature to be hidden from the guest.
This enables disabling of some paravirualization features in the
guest as well as allowing certain drivers which test for the
hypervisor to load. Domain XML syntax is as follows:
<domain type='kvm>
...
<features>
...
<kvm>
<hidden state='on'/>
</kvm>
</features>
...
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
I noticed a line 'int nparams = 0;;' in remote_dispatch.h, and
tracked down where it was generated. While at it, I found a
couple of other double semicolons. Additionally, I noticed that
commit df0b57a95 left a stale reference to the file name
remote_dispatch_bodies.h.
* src/conf/numatune_conf.c (virDomainNumatuneNodeParseXML): Drop
empty statement.
* tests/virdbustest.c (testMessageStruct, testMessageSimple):
Likewise.
* src/rpc/gendispatch.pl (remote_dispatch_bodies.h): Likewise, and
update stale comments.
Signed-off-by: Eric Blake <eblake@redhat.com>
When trying to set numatune mode directly using virsh numatune command,
correct error is raised, however numatune structure was not deallocated,
thus resulting in creating an empty numatune element in the guest XML,
if none was present before. Running the same command aftewards results
in a successful change with broken XML structure. Patch fixes the
deallocation problem as well as checking for invalid attribute
combination VIR_DOMAIN_NUMATUNE_PLACEMENT_AUTO + a nonempty nodeset.
Resolves https://bugzilla.redhat.com/show_bug.cgi?id=1129998
That sets a new flag, but that flag does mean the child will get
LISTEN_FDS and LISTEN_PID environment variables properly set and
passed FDs reordered so that it corresponds with LISTEN_FDS (they must
start right after STDERR_FILENO).
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
When formatting the forward mode addresses or interfaces the switch was
done based on the type of the network rather than of the type of the
individual <interface>/<address> element. In case a user would specify
an incorrect network type ("passhtrough") with <address> elements,
libvirtd would crash as it would attempt to format an <interface>.
Use the type of the individual element to format the XML.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1132347
https://bugzilla.redhat.com/show_bug.cgi?id=1103245
An advice appeared there on the qemu-devel list [1]. When a domain is
suspended and then resumed guest kernel is not aware of this. So we've
introduced virDomainSetTime API that resets the time within guest
using qemu-ga. On the other hand, qemu itself is trying to make RTC
beat faster to catch the difference. But if we don't tell qemu that
guest's time was reset via the other method, both mechanisms are
applied resulting in again wrong guest time. In order to avoid summing
both corrections we need to tell qemu that it should not use the RTC
injection if the guest time is set via guest agent.
1: http://www.mail-archive.com/qemu-devel@nongnu.org/msg236435.html
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Update bhyveBuildDiskArgStr to support volumes:
- Make virBhyveProcessBuildBhyveCmd and
virBhyveProcessBuildLoadCmd take virConnectPtr as the
first argument instead of bhyveConnPtr as virConnectPtr is
needed for virStorageTranslateDiskSourcePool,
- Add virStorageTranslateDiskSourcePool call to
virBhyveProcessBuildBhyveCmd and
virBhyveProcessBuildLoadCmd,
- Allow disks of type VIR_STORAGE_TYPE_VOLUME
Currently, qemu driver uses qemuTranslateDiskSourcePool()
to translate disk volume information. This function is
general enough and could be used for other drivers as well,
so move it to conf/domain_conf.c along with its helpers.
- qemuTranslateDiskSourcePool: move to storage/storage_driver.c
and rename to virStorageTranslateDiskSourcePool,
- qemuAddISCSIPoolSourceHost: move to storage/storage_driver.c
and rename to virStorageAddISCSIPoolSourceHost,
- qemuTranslateDiskSourcePoolAuth: move to storage/storage_driver.c
and rename to virStorageTranslateDiskSourcePoolAuth,
- Update users of qemuTranslateDiskSourcePool to use a
new name.
Wrap formatting code common to xm and xl in xenFormatConfigCommon
and export it.
Signed-off-by: Kiarie Kahurani <davidkiarie4@gmail.com>
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
src/xenxs contains parsing/formating functions for the various xen
config formats, and is better named src/xenconfig.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1128751
There's this <driver/> element under <interface/> which can have
several attributes. However, the driver element is currently formated
only if the driver's name or txmode has been specified. This makes
only a little sense as we parse even partial <driver/>, for instance:
<interface type='user'>
<mac address='52:54:00:e5:48:58'/>
<model type='virtio'/>
<driver ioeventfd='on' event_idx='on' queues='5'/>
</interface>
But such XML would never get formatted back.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
A command to freeze a part of mounted file systems is implemented in
upstream QEMU-guest-agent with a name of 'guest-fsfreeze-freeze-list'.
This fixes the name of the command used to partial fsfreeze in qemu driver
when 'mountpoints' option is specified to virDomainFSFreeze API.
Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
Introduce a new structure to handle an iSCSI host device based on the
existing virDomainHostdevSubsysSCSI by adding a "protocol='iscsi'" to
the <source/> element. The existing scsi_host subsystem RNG was modified
to read an optional "protocol='adapter'", although it won't be written
out nor is it documented as an option (by choice).
The new hostdev structure mimics the existing <disk/> element for an
iSCSI device (network) device. New XML is:
<hostdev mode='subsystem' type='scsi' managed='yes'>
<source protocol='iscsi' name='iqn.1992-01.com.example'>
<host name='example.org' port='3260'/>
<auth username='myname'>
<secret type='iscsi' usage='mycluster_myname'/>
</auth>
</source>
<address type='drive' controller='0' bus='0' target='2' unit='5'/>
</hostdev>
The controller element will mimic the existing scsi_host code insomuch
as when 'lsi' and 'virtio-scsi' are used.
A future patch is going to wire up qemu active block commit jobs;
but as they have similar events and are canceled/pivoted in the
same way as block copy jobs, it is easiest to track all bookkeeping
for the commit job by reusing the <mirror> element. This patch
adds domain XML to track which job was responsible for creating a
mirroring situation, and adds a job='copy' attribute to all
existing uses of <mirror>. Along the way, it also massages the
qemu monitor backend to read the new field in order to generate
the correct type of libvirt job (even though it requires a
future patch to actually cause a qemu event that can be reported
as an active commit). It also prepares to update persistent XML
to match changes made to live XML when a copy completes.
* docs/schemas/domaincommon.rng: Enhance schema.
* docs/formatdomain.html.in: Document it.
* src/conf/domain_conf.h (_virDomainDiskDef): Add a field.
* src/conf/domain_conf.c (virDomainBlockJobType): String conversion.
(virDomainDiskDefParseXML): Parse job type.
(virDomainDiskDefFormat): Output job type.
* src/qemu/qemu_process.c (qemuProcessHandleBlockJob): Distinguish
active from regular commit.
* src/qemu/qemu_driver.c (qemuDomainBlockCopy): Set job type.
(qemuDomainBlockPivot, qemuDomainBlockJobImpl): Clean up job type
on completion.
* tests/qemuxml2xmloutdata/qemuxml2xmlout-disk-mirror-old.xml:
Update tests.
* tests/qemuxml2argvdata/qemuxml2argv-disk-mirror.xml: Likewise.
* tests/qemuxml2argvdata/qemuxml2argv-disk-active-commit.xml: New
file.
* tests/qemuxml2xmltest.c (mymain): Drive new test.
Signed-off-by: Eric Blake <eblake@redhat.com>
If all features are set to default (including the capabilities policy),
but some capabilities are toggled, we need to output the <features>
element when formatting the config.
Doing a blockcopy operation across a libvirtd restart is not very
robust at the moment. In particular, we are clearing the <mirror>
element prior to telling qemu to finish the job. Also, thanks to the
ability to request async completion, the user can easily regain
control prior to qemu actually finishing the effort, and they should
be able to poll the domain XML to see if the job is still going.
A future patch will fix things to actually wait until qemu is done
before modifying the XML to reflect the job completion. But since
qemu issues identical BLOCK_JOB_COMPLETE events regardless of whether
the job was cancelled (kept the original disk) or completed (pivoted
to the new disk), we have to track which of the two operations were
used to end the job. Furthermore, we'd like to avoid attempts to
end a job where we are already waiting on an earlier request to qemu
to end the job. Likewise, if we miss the qemu event (perhaps because
it arrived during a libvirtd restart), we still need enough state
recorded to be able to determine how to modify the domain XML once
we reconnect to qemu and manually learn whether the job still exists.
Although this patch doesn't actually fix the problem, it is a
preliminary step that makes it possible to track whether a job
has already begun steps towards completion.
* src/conf/domain_conf.h (virDomainDiskMirrorState): New enum.
(_virDomainDiskDef): Convert bool mirroring to new enum.
* src/conf/domain_conf.c (virDomainDiskDefParseXML)
(virDomainDiskDefFormat): Handle new values.
* src/qemu/qemu_process.c (qemuProcessHandleBlockJob): Adjust
client.
* src/qemu/qemu_driver.c (qemuDomainBlockPivot)
(qemuDomainBlockJobImpl): Likewise.
* docs/schemas/domaincommon.rng (diskMirror): Expose new values.
* docs/formatdomain.html.in (elementsDisks): Document it.
* tests/qemuxml2argvdata/qemuxml2argv-disk-mirror.xml: Test it.
Signed-off-by: Eric Blake <eblake@redhat.com>
Use better detection of hugetlbfs mount points. Yes, there can be
multiple mount points each serving different huge page size.
Since we already have ability to override the mount point in the
qemu.conf file, this crazy backward compatibility code is brought in.
Now we allow multiple mount points, so the "hugetlbfs_mount" option
must take an list of strings (mount points). But previously, it was
just a string, so we must accept both types now.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
* docs/schemas/domaincommon.rng: Add bhyve domain type, nmdm
serial type and master and slave optional attributes for
serial that are used by nmdm
* tests/domainschematest: Add bhyvexml2argvdata directory
to validate bhyve XMLs
Libvirt documents that the default entropy source for the 'random'
backend of a RNG device is /dev/random. Instead of storing and
propagating NULL across our code and checking it in multiple places fill
the default in the post parse callback and use that in the other places.
virTimeFieldsThenRaw will never return negative result, so I clean up
the related meaningless judgements to make it better.
Signed-off-by: James <james.wangyufei@huawei.com>
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Add support for CDROM devices for bhyve driver using
bhyve(8)'s 'ahci-cd' device type.
As bhyve currently does not support media insertion at runtime,
disallow to start a domain with an empty source path for cdrom
devices.
Added <capabilities> in the <features> section of LXC domains
configuration. This section can contain elements named after the
capabilities like:
<mknod state="on"/>, keep CAP_MKNOD capability
<sys_chroot state="off"/> drop CAP_SYS_CHROOT capability
Users can restrict or give more capabilities than the default using
this mechanism.
In the fbd91d49 commit, new scsihostdata dir is added to EXTRA_DIST in
the tests/Makefile.am. However, the directory itself is not created
anywhere, nor in the commit.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Introduce a new function to parse the provided scsi_host parent address
and unique_id value in order to find the /sys/class/scsi_host directory
which will allow a stable SCSI host address
Add a test to scsihosttest to lookup the host# name by using the PCI address
and unique_id value
Add an optional unique_id parameter to nodedev. Allows for easier lookup
and display of the unique_id value in order to document for use with
scsi_host code.
Introduce a new function to read the current scsi_host entry and return
the value found in the 'unique_id' file.
Add a 'scsihosttest' test (similar to the fchosttest, but incorporating some
of the concepts of the mocked pci test library) in order to read the
unique_id file like would be found in the /sys/class/scsi_host tree.
Between reboots and kernel reloads, the SCSI host number used for SCSI
storage pools may change requiring modification to the storage pool XML
in order to use a specific SCSI host adapter.
This patch introduces the "parentaddr" element and "unique_id" attribute
for the SCSI host adapter in order to uniquely identify the adapter
between reboots and kernel reloads. For now the goal is to only parse
and format the XML. Both will be required to be provided in order to
uniquely identify the desired SCSI host.
The new XML is expected to be as follows:
<adapter type='scsi_host'>
<parentaddr unique_id='3'>
<address domain='0x0000' bus='0x00' slot='0x1f' func='0x2'/>
</parentaddr>
</adapter>
where "parentaddr" is the parent device of the SCSI host using the PCI
address on which the device resides and the value from the unique_id file
for the device. Both the PCI address and unique_id values will be used
to traverse the /sys/class/scsi_host/ directories looking at each link
to match the PCI address reformatted to the directory link format where
"domain🚌slot:function" is found. Then for each matching directory
the unique_id file for the scsi_host will be used to match the unique_id
value in the xml.
For a PCI address listed above, this will be formatted to "0000:00:1f.2"
and the links in /sys/class/scsi_host will be used to find the host#
to be used for the 'scsi_host' device. Each entry is a link to the
/sys/bus/pci/devices directories, e.g.:
% ls -al /sys/class/scsi_host/host2
lrwxrwxrwx. 1 root root 0 Jun 1 00:22 /sys/class/scsi_host/host2 -> ../../devices/pci0000:00/0000:00:1f.2/ata3/host2/scsi_host/host2
% cat /sys/class/scsi_host/host2/unique_id
3
The "parentaddr" and "name" attributes are mutually exclusive to identify
the SCSI host number. Use of the "parentaddr" element will be the preferred
mechanism.
This patch only supports to parse and format the XMLs. Later patches will
add code to find out the scsi host number.
Gluster volumes don't start with a leading slash. Our schema for netfs
gluster pools enforces it though. Luckily mount.glusterfs skips it.
Allow a slashless volume name for glusterfs netfs mounts in the schema.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1101999
LXC network devices can now be assigned a custom NIC device name on the
container side. For example, this is configured with:
<interface type='network'>
<source network='default'/>
<guest dev="eth1"/>
</interface>
In this example the network card will appear as eth1 in the guest.
The previous commit 09d4d26 put the interleave at the wrong point;
it didn't allow interleaving with <memory>.
* docs/schema/domaincommon.rng (numatune): Fix interleave location.
* tests/qemuxml2argvdata/qemuxml2argv-numatune-memnode.xml: Adjust test.
Signed-off-by: Eric Blake <eblake@redhat.com>
Currently, we only bind the whole QEMU domain to memory nodes
specified in nodemask altogether. That, however, doesn't make much
sense when one wants to control from where the memory for particular
guest nodes should be allocated. QEMU allows us to do that by
specifying 'host-nodes' parameter for the 'memory-backend-ram' object,
so let's use that.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
When qemu switched to using OptsVisitor for -numa parameter, it did
two things in the same patch. One of them is that the numa parameter
is now visible in "query-command-line-options", the second one is that
it enabled using disjoint cpu ranges for -numa specification. This
will be used in later patch.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
That can be lately achieved with by having .param == NULL in the
virQEMUCapsCommandLineProps struct.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
There were numerous places where numatune configuration (and thus
domain config as well) was changed in different ways. On some
places this even resulted in persistent domain definition not to be
stable (it would change with daemon's restart).
In order to uniformly change how numatune config is dealt with, all
the internals are now accessible directly only in numatune_conf.c and
outside this file accessors must be used.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
In XML format, by definition, order of fields should not matter, so
order of parsing the elements doesn't affect the end result. When
specifying guest NUMA cells, we depend only on the order of the 'cell'
elements. With this patch all older domain XMLs are parsed as before,
but with the 'id' attribute they are parsed and formatted according to
that field. This will be useful when we have tuning settings for
particular guest NUMA node.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This patch adds support for the QEMU vhost-user feature to libvirt.
vhost-user enables the communication between a QEMU virtual machine
and other userspace process using the Virtio transport protocol.
It uses a char dev (e.g. Unix socket) for the control plane,
while the data plane based on shared memory.
The XML looks like:
<interface type='vhostuser'>
<mac address='52:54:00:3b:83:1a'/>
<source type='unix' path='/tmp/vhost.sock' mode='server'/>
<model type='virtio'/>
</interface>
Signed-off-by: Michele Paolino <m.paolino@virtualopensystems.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Add file in storagevolxml2xmlin and storagevolxml2xmlout, let
storagevolxml2xmltest and storagevolschematest cover 'nocow'.
Add test case to storagevolxml2argvtest to cover 'nocow'.
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Assign the value we're comparing:
(val = func()) < 0
instead of assigning the comparison value:
(val = func() < 0)
Both were introduced along with the code,
the TLS tests by commit bd789df in 0.9.4
net events by commit de87691 in 1.2.2.
Note that the event id type fix is a no-op:
vshNetworkEventIdTypeFromString can only return
-1 (failure) and the event is never used or
0 (the only possible event) and the value of 0 < 0 is still 0.
Rename linuxDomainInterfaceStats to virNetInterfaceStats in order
to allow adding platform specific implementations without
making consumer worrying about specific implementation to be used.
Also, rename util/virstatslinux.c to util/virstats.c so placing
other platform specific implementations into this file don't
look unexpected from the file name.
https://bugzilla.redhat.com/show_bug.cgi?id=1113860
We've always done that. Well, until 990e46c45. Point is, if we don't
format model, we may lose a domain on libvirtd restart. If the
seclabel is implicit however, we should skip it's formatting.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1066894
With current code it's possible to have for instance:
virsh dumpxml mydomain | grep seclabel
<seclabel type='dynamic' model='selinux' relabel='yes'/>
<seclabel type='dynamic' model='selinux' relabel='yes'/>
<seclabel type='dynamic' model='selinux' relabel='yes'/>
<seclabel type='dynamic' model='selinux' relabel='yes'/>
<seclabel type='dynamic' model='selinux' relabel='yes'/>
what doesn't make any sense. We should reject the XML in the config
parsing phase.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
To allow changing the name that is recorded in the overlay of the TOP
image used in a block commit operation, we need to specify the backing
name to qemu. This is done via the "backing-file" attribute to the
block-commit command.
Replace the inline "auth" struct in virStorageSource with a pointer
to a virStorageAuthDefPtr and utilize between the domain_conf, qemu_conf,
and qemu_command sources for finding the auth data for a domain disk
Ressurect the disk-drive-network-iscsi-auth and disk-drive-network-rbd-auth
tests. Make adjustments to the args and xml file to be compatible with
other changes made to the non "-auth" so that the only difference is the
authentication information.
Adjust the qemuargv2xmltest.c to filter out "<secret" and "</auth>" since
the args -> xml has no concept of usage it doesn't get printed. This results
in the </auth> being printed on the same line as "<secret" and the secret
XML is not closed - a bit of an issue, but soon to be fixed.
Use the probing functionality added in the last patch to turn on
a capability bit when active commit is present, and gate active
commit on that capability.
For my own reference: the difference between BLOCKJOB_SYNC and
BLOCKJOB_ASYNC is whether qemu generated an event at the
conclusion of blockpull; basically, RHEL 6.2 was the only release
of qemu that has the sync semantics and lacks the event. RHEL
6.3 added blockcopy, but also picked up on the upstream style
of qemu generating events. As no one is likely to backport
active commit to RHEL 6.2, it's safe for blockcommit to always
require async blockjob support.
Modifying qemucapabilitiestest is painful; the .replies files would
be so much easier if they had comments correlating which command
generated the given reply. Maybe I'll fix that up later...
* src/qemu/qemu_capabilities.h (QEMU_CAPS_ACTIVE_COMMIT): New
capability.
* src/qemu/qemu_driver.c (qemuDomainBlockCommit): Use the new bit
* src/qemu/qemu_capabilities.c (virQEMUCaps): Name the new bit.
(virQEMUCapsProbeQMPCommands): Set it.
* tests/qemucapabilitiesdata/caps_1.3.1-1.replies: Update.
* tests/qemucapabilitiesdata/caps_1.4.2-1.replies: Likewise.
* tests/qemucapabilitiesdata/caps_1.5.3-1.replies: Likewise.
* tests/qemucapabilitiesdata/caps_1.6.0-1.replies: Likewise.
* tests/qemucapabilitiesdata/caps_1.6.50-1.replies: Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
We are about to turn on support for active block commit. Although
qemu 2.0 was the first version to mostly support it, that version
mis-handles 0-length files, and doesn't have anything available for
easy probing. But qemu 2.1 fixed bugs, and made life simpler by
letting the 'top' argument be optional. Unless someone begs for
active commit with qemu 2.0, for now we are just going to enable
it only by probing for qemu 2.1 behavior (anyone backporting active
commit can also backport the optional argument behavior). This
requires qemu.git commit 7676e2c597000eff3a7233b40cca768b358f9bc9.
Although all our actual uses of block-commit supply arguments for
both base and top, we can omit both arguments and use a bogus
device string to trigger an interesting behavior in qemu. All QMP
commands first do argument validation, failing with GenericError
if a mandatory argument is missing. Once that passes, the code
in the specific command gets to do further checking, and the qemu
developers made sure that if device is the only supplied argument,
then the block-commit code will look up the device first, with a
failure of DeviceNotFound, before attempting any further argument
validation (most other validations fail with GenericError). Thus,
the category of error class can reliably be used to decipher
whether the top argument was optional, which in turn implies a
working active commit. Since we expect our bogus device string to
trigger an error either way, the code is written to return a
distinct return value without spamming the logs.
* src/qemu/qemu_monitor.h (qemuMonitorSupportsActiveCommit): New
prototype.
* src/qemu/qemu_monitor.c (qemuMonitorSupportsActiveCommit):
Implement it.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONBlockCommit):
Allow NULL for top and base, for probing purposes.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONBlockCommit):
Likewise, implementing the probe.
* tests/qemumonitorjsontest.c (mymain): Enable...
(testQemuMonitorJSONqemuMonitorSupportsActiveCommit): ...a new test.
Signed-off-by: Eric Blake <eblake@redhat.com>