RDMA Live migration requires registering memory with the hardware, and
thus QEMU offers a new 'capability' to pre-register / mlock() the guest
memory in advance for higher RDMA performance before the migration
begins. This capability is disabled by default, which means QEMU will
register the memory with the hardware in an on-demand basis.
This patch exposes this capability with the following example usage:
virsh migrate --live --rdma-pin-all --migrateuri rdma://hostname domain qemu+ssh://hostname/system
Signed-off-by: Michael R. Hines <mrhines@us.ibm.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
This patch adds support for RDMA protocol in migration URIs.
USAGE: $ virsh migrate --live --migrateuri rdma://hostname domain qemu+ssh://hostname/system
Since libvirt runs QEMU in a pretty restricted environment, several
files needs to be added to cgroup_device_acl (in qemu.conf) for QEMU to
be able to access the host's infiniband hardware. Full documenation of
the feature can be found on QEMU wiki:
http://wiki.qemu.org/Features/RDMALiveMigration
Signed-off-by: Michael R. Hines <mrhines@us.ibm.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Currently we only support TCP protocol for native QEMU migration but
this is going to be changed. Let's make the code more general and remove
hardcoded TCP protocol from several places.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
For compatibility with old libvirt we need to support both tcp:host and
tcp://host migration URIs. Let's make the code that parses them a bit
cleaner.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
RDMA migration uses the 'setup' state in QEMU to optionally lock
all memory before the migration starts. The total time spent in
this state is exposed as VIR_DOMAIN_JOB_SETUP_TIME.
Additionally, QEMU also exports migration throughput (mbps) for both
memory and disk, so let's add them too: VIR_DOMAIN_JOB_MEMORY_BPS,
VIR_DOMAIN_JOB_DISK_BPS.
Signed-off-by: Michael R. Hines <mrhines@us.ibm.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
commit 72f919f558 introduced an user
friendly error message when trying to use IDE disks as readonly.
Do the same thing for the SATA bus.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1112939
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Add a new parameter that will allow to return the XML stored in the save
image for further manipulation and adjust the callers. This option will
be used in later patches.
There is no need to acquire the driver-wide lock in
libxlDomainDefineXML. When switching to jobs in the libxl
driver, most driver-wide locks were removed. The locking here
was preserved since I mistakenly thought virDomainObjListAdd
needed protection. This is not the case, so remove the
unnecessary locking.
Commit id '9a2f36ec' added a build conditional of CAP_SYS_RAWIO
in order to determine whether or not a disk definition using rawio
should be allowed on platforms without CAP_SYS_RAWIO. If one was
found, virReportError was used but the code didn't goto cleanup.
This patch adds the goto.
We are not detecting the presence of FIPS from QEMU, but from procfs and
that means it's not QEMU capability. It was decided that we will pass
this flag to QEMU even if it's not supported by old QEMU binaries.
This patch also reverts changes done by commit a21cfb0f to
qemucapabilitestest and implements a new test case in qemuxml2argvtest.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1135431
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1141879
A long time ago I've implemented support for so called multiqueue
net. The idea was to let guest network traffic be processed by
multiple host CPUs and thus increasing performance. However, this
behavior is enabled by QEMU via special ioctl() iterated over the
all tap FDs passed in by libvirt. Unfortunately, SELinux comes in
and disallows the ioctl() call because the /dev/net/tun has label
system_u:object_r:tun_tap_device_t:s0 and 'attach_queue' ioctl()
is not allowed on tun_tap_device_t type. So after discussion with
a SELinux developer we've decided that the FDs passed to the QEMU
should be labelled with svirt_t type and SELinux policy will
allow the ioctl(). Therefore I've made a patch
(cf976d9dcf) that does exactly this. The patch
was fixed then by a443193139 and
b635b7a1af. However, things are not
that easy - even though the API to label FD is called
(fsetfilecon_raw) the underlying file is labelled too! So
effectively we are mangling /dev/net/tun label. Yes, that broke
dozen of other application from openvpn, or boxes, to qemu
running other domains.
The best solution would be if SELinux provides a way to label an
FD only, which could be then labeled when passed to the qemu.
However that's a long path to go and we should fix this
regression AQAP. So I went to talk to the SELinux developer again
and we agreed on temporary solution that:
1) All the three patches are reverted
2) SELinux temporarily allows 'attach_queue' on the
tun_tap_device_t
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
- Provide an implementation for buildPool and deletePool operations
for the ZFS storage backend.
- Add VIR_STORAGE_POOL_SOURCE_DEVICE flag to ZFS pool poolOptions
as now we can specify devices to build pool from
- storagepool.rng: add an optional 'sourceinfodev' to 'sourcezfs' and
add an optional 'target' to 'poolzfs' entity
- Add a couple of tests to storagepoolxml2xmltest
If the qemu being used doesn't support JSON, then querying for IOThread
data would fail. In that case, ensure the *iothreads is NULL and return 0
as the count of iothreads available.
Currently, build with clang fails with:
CC qemu/libvirt_driver_qemu_impl_la-qemu_command.lo
qemu/qemu_command.c:6580:58: error: implicit conversion from enumeration type
'virMemAccess' to different enumeration type 'virTristateSwitch'
[-Werror,-Wenum-conversion]
virTristateSwitch memAccess = def->cpu->cells[i].memAccess;
~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~^~~~~~~~~
1 error generated.
Fix that by using virMemAccess instead of virTristateSwitch.
Commit f05b6a91 added virQEMUDriverConfigPtr argument to the
virQEMUCapsFillDomainCaps function and it uses forward declaration
of virQEMUDriverConfig and virQEMUDriverConfigPtr that casues clang
build to fail:
gmake[3]: Entering directory `/usr/home/novel/code/libvirt/src'
CC qemu/libvirt_driver_qemu_impl_la-qemu_capabilities.lo
In file included from qemu/qemu_capabilities.c:43:
In file included from qemu/qemu_hostdev.h:27:
qemu/qemu_conf.h:63:37: error: redefinition of typedef 'virQEMUDriverConfig'
is a C11 feature [-Werror,-Wtypedef-redefinition]
typedef struct _virQEMUDriverConfig virQEMUDriverConfig;
^
qemu/qemu_capabilities.h:328:37: note: previous definition is here
typedef struct _virQEMUDriverConfig virQEMUDriverConfig;
^
Fix that by passing loader and nloader config attributes directly
instead of passing complete config.
Commit f36a94f introduced a double free on all success paths
in qemuSharedDeviceEntryInsert.
Only call qemuSharedDeviceEntryFree on the error path and
set entry to NULL before jumping there if the entry already
is in the hash table.
https://bugzilla.redhat.com/show_bug.cgi?id=1142722
Clean up all _virDomainMemoryStat.
Signed-off-by: James <james.wangyufei@huawei.com>
Signed-off-by: Wang Rui <moon.wangrui@huawei.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Clean up all _virDomainBlockStats.
Signed-off-by: James <james.wangyufei@huawei.com>
Signed-off-by: Wang Rui <moon.wangrui@huawei.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Clean up all _virDomainInterfaceStats.
Signed-off-by: Wang Yufei <james.wangyufei@huawei.com>
Signed-off-by: Wang Rui <moon.wangrui@huawei.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Live definition was used to look up the disk index while persistent one
was indexed leading to a crash in qemuDomainGetBlockIoTune. Use the
correct def and report a nice error.
Unfortunately it's accessible via read-only connection, though it can
only crash libvirtd in the cases where the guest is hot-plugging disks
without reflecting those changes to the persistent definition. So
avoiding hotplug, or doing hotplug where persistent is always modified
alongside live definition, will avoid the out-of-bounds access.
Introduced in: eca96694a7f992be633d48d5ca03cedc9bbc3c9aa (v0.9.8)
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1140724
Reported-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1135396
There are two ways how to tell qemu to use huge pages. The first one
is suitable for domains with NUMA nodes: the path to hugetlbfs mount
is appended to NUMA node definition on the command line. The second
one is suitable for UMA domains: here there's this global '-mem-path'
argument that accepts path to the hugetlbfs mount point. However, the
latter case was not used for all the cases that it should be. For
instance:
<memoryBacking>
<hugepages>
<page size='2048' unit='KiB' nodeset='0'/>
</hugepages>
</memoryBacking>
didn't trigger the '-mem-path' so the huge pages - despite being
configured - were not used at all.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
As of 136ad4974 it is possible to specify different huge pages per
guest NUMA node. However, there's no check if nodeset specified in
./hugepages/page contains only those guest NUMA nodes that exist.
In other words with current code it is possible to define meaningless
combination:
<memoryBacking>
<hugepages>
<page size='1048576' unit='KiB' nodeset='0,2-3'/>
<page size='2048' unit='KiB' nodeset='1,4'/>
</hugepages>
</memoryBacking>
<vcpu placement='static'>4</vcpu>
<cpu>
<numa>
<cell id='0' cpus='0' memory='1048576'/>
<cell id='1' cpus='1' memory='1048576'/>
<cell id='2' cpus='2' memory='1048576'/>
<cell id='3' cpus='3' memory='1048576'/>
</numa>
</cpu>
Notice the node 4 in <hugepages/>?
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This patch implements the VIR_DOMAIN_STATS_BLOCK group of statistics.
To do so, a helper function to get the block stats of all the disks of
a domain is added.
Signed-off-by: Francesco Romani <fromani@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
This patch implements the VIR_DOMAIN_STATS_INTERFACE group of
statistics.
Signed-off-by: Francesco Romani <fromani@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
This patch implements the VIR_DOMAIN_STATS_VCPU group of statistics. To
do so, this patch also extracts a helper to gather the vCPU information.
Signed-off-by: Francesco Romani <fromani@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
This patch implements the VIR_DOMAIN_STATS_CPU_TOTAL group of
statistics.
Signed-off-by: Francesco Romani <fromani@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Future patches which will implement more bulk stats groups for QEMU will
need to access the connection object.
To accommodate that, a few changes are needed:
* enrich internal prototype to pass qemu driver object
* add per-group flag to mark if one collector needs monitor access or not
* If at least one collector of the requested stats needs monitor access
we must start a query job for each domain. The specific collectors
will run nested monitor jobs inside that.
* If the job can't be acquired we pass flags to the collector so
specific collectors that need monitor access can be skipped in order
to gather as much data as is possible.
Signed-off-by: Francesco Romani <fromani@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Check to see if the UEFI binary mentioned in qemu.conf actually
exists, and if so expose it in domcapabilities like
<loader ...>
<value>/path/to/ovmf</value>
</loader>
We introduce some generic domcaps infrastructure for handling
a dynamic list of string values, it may be of use for future bits.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Up till now the virQEMUCapsFillDomainCaps() was type of void as
there was no way for it to fail. This is, however, going to
change in the next commit.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
qemu for IBM Power processor architecture is adding functionality for
supporting multiple 'pseries' machine type versions, each with different
capabilities. This patch is for supporting the same
Signed-off-by: Pradipta Kr. Banerjee <bpradip@in.ibm.com>
As of 542899168c we learned libvirt to use UEFI for domains.
However, management applications may firstly query if libvirt
supports it. And this is where virConnectGetDomainCapabilities()
API comes handy.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
virStorageSourceInitChainElement initializes a new storage chain element
for use as a new disk source. If the new element doesn't contain the
driver name, copy it from the old source.
This fixes issue where a disk would forget the driver after a snapshot.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1140984
Commit b606bbb4 broke reporting of errors when setting of guest time
fails via the guest agent as the return value is not checked and later
overwritten by the return value qemuMonitorRTCResetReinjection();
Fix this by checking the return value before resetting the RTC
reinjection.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1142294
For tuning the network, alternative devices
for creating tap and vhost devices can be specified via:
<backend tap='/dev/net/tun' vhost='/dev/net-vhost'/>
Instead of checking upfront if the <driver> element will be needed
in a big condition, just format all the attributes into a string
and output the <driver> element if the string is not empty.
We already are checking for negative value, reporting an error, but
using wrong function and the check only succeeds when a value that
cannot be converted to number successfully is encountered. This patch
provides just a minor change in call of the right version
of function virStrToLong.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1138539
The first one occurs in openvzDomainMigratePrepare3Params() where in
case no remote uri is given, the distant hostname is used. The name is
obtained via virGetHostname() which require callers to free the
returned value.
The second leak lies in openvzDomainMigratePerform3Params(). There's a
virCommand used later. However, at the beginning of the function
virCheckFlags() is called which returns. So the command created was
leaked.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Currently, the setns() wrapper is supported only for x86_64 and i686
which leaves us failing to build on other platforms like arm, aarch64
and so on. This means, that the wrapper needs to be extended to those
platforms and make to fail on runtime not compile time.
The syscall numbers for other platforms was fetched using this
command:
kernel.git $ git grep "define.*__NR_setns" | grep -e arm -e powerpc -e s390
arch/arm/include/uapi/asm/unistd.h:#define __NR_setns (__NR_SYSCALL_BASE+375)
arch/arm64/include/asm/unistd32.h:#define __NR_setns 375
arch/powerpc/include/uapi/asm/unistd.h:#define __NR_setns 350
arch/s390/include/uapi/asm/unistd.h:#define __NR_setns 339
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The backing store string location offset 0 determines that the file
isn't present. The string size shouldn't be then checked:
from qemu.git/docs/specs/qcow2.txt
== Header ==
The first cluster of a qcow2 image contains the file header:
Byte 0 - 3: magic
QCOW magic string ("QFI\xfb")
4 - 7: version
Version number (valid values are 2 and 3)
8 - 15: backing_file_offset
Offset into the image file at which the backing file name
is stored (NB: The string is not null terminated). 0 if the
image doesn't have a backing file.
16 - 19: backing_file_size
Length of the backing file name in bytes. Must not be
longer than 1023 bytes. Undefined if the image doesn't have
a backing file. ^^^^^^^^^
This patch intentionally leaves the backing file string size check in
place in case a malformatted file would be presented to libvirt. Also
according to the docs the string size is maximum 1023 bytes, thus this
patch adds a check to verify that.
I was also able to verify that the check was done the same way in the
legacy qcow fromat (in qemu's code).
If there are no iothreads, then return from qemuProcessDetectIOThreadPIDs
without error; otherwise, the following occurs:
error: Failed to start domain $dom
error: An error occurred, but the cause is unknown
https://bugzilla.redhat.com/show_bug.cgi?id=1101574
Add an option 'iothreadpin' to the <cpuset> to allow for setting the
CPU affinity for each IOThread.
The iothreadspin will mimic the vcpupin with respect to being able to
assign each iothread to a specific CPU, although iothreads ids start
at 1 while vcpu ids start at 0. This matches the iothread naming scheme.
Modify qemuProcessStart() in order to allowing setting affinity to
specific CPU's for IOThreads. The process followed is similar to
that for the vCPU's.
This involves adding a function to fetch the IOThread id's via
qemuMonitorGetIOThreads() and adding them to iothreadpids[] list.
Then making sure all the cgroup data has been properly set up and
finally assigning affinity.
In order to support cpuset setting, introduce qemuSetupCgroupIOThreadsPin
and qemuSetupCgroupForIOThreads to mimic the existing Vcpu API's.
These will support having an 'iotrhreadpin' element in the 'cpuset' in
order to pin named IOThreads to specific CPU's. The IOThread pin names
will follow the IOThread naming scheme starting at 1 (eg "iothread1")
up through an including the def->iothreads value.
When spanning tree protocol is allowed in bridge settings, forward delay
value is set as well (default is 0 if omitted). Until now, there was no
check for delay value validity. Delay makes sense only as a positive
numerical value.
Note: However, even if you provide positive numerical value, brctl
utility only uses values from range <2,30>, so the number provided can
be modified (kernel most likely) to fall within this range.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1125764
Coverity complains about the calculation of the buf & len within
the PROBE macro. So to quiet things down, do the calculation prior
to usage in either write() or qemuMonitorIOWriteWithFD() calls and
then have the PROBE use the calculated values - which works.
Seems when commit id 'ea130e3b' added the checks to ensure each of
the hard_limit, soft_limit, and swap_hard_limit wasn't set at
VIR_DOMAIN_MEMORY_PARAM_UNLIMITED - a copy/paste error of using
the 'hard_limit' for each comparison was done. Adjust the code.
Coverity complains that because of how 'offset' is initialized to
0 (zero), the resulting math and comparison on rem is pointless.
According to the origin commit id '3ec128989', the code is a
replacement for gmtime(), but without the localtime() or GMT
calculations - so just remove this code and add a comment
indicating the removal
Since 98b9acf5aa
This was a false positive where Coverity was complaining that the
remoteDeserializeTypedParameters() could allocate 'params', but
none of the callers could return the allocated memory back to their
caller since on input the param was passed by value. Additionally,
the flow of the code was that if params was NULL on entry, then each
function would return 'nparams' as the number of params entries the
caller would need to allocate in order to call the function again
with 'nparams' and 'params' being set. By the time the deserialize
routine was called params would have something. For other callers
where the 'params' was passed by reference as NULL since it's expected
that the deserialize allocates the memory and then have that passed
back to the original caller to dispose there was no Coverity issue.
As it turns out Coverity didn't quite seem to understand the
relationship between 'nparams' and 'params'; however, if the
!userAllocated path of the deserialize code compared against
limit in any manner, then the Coverity error went away which
was quite strange, but useful.
As it turns out one code path remoteDomainGetJobStats had a
comparison against 'limit' while another remoteConnectGetAllDomainStats
did not assuming that limit would be checked. So I refactored the
code a bit to cause the limit check to occur in deserialize for
both conditions and then only made the check of current returned
size against the incoming *nparams fail the non allocation case.
This means the job code doesn't need to check the limit any more,
while the stats code now does check the limit.
Additionally, to help perhaps decipher which of the various
callers to the deserialize code caused the failure - I used
a #define to pass the __FUNCNAME__ of the caller along so that
error messages could have something like:
error: remoteConnectGetAllDomainStats: too many parameters '2' for nparams '0'
error: Reconnected to the hypervisor
(it's a contrived error just to show the funcname in the error)
The manufacurer and product from USB device itself are usually not particularly
useful -- they tend to be missing, or ugly (all-uppercase, padded with spaces,
etc.). Prefer what's in the usb id database and fall back to descriptors only
if the device is too new to be in database.
https://bugzilla.redhat.com/show_bug.cgi?id=1138887
This patch adds initial migration support to the OpenVZ driver,
using the VIR_DRV_FEATURE_MIGRATION_PARAMS family of migration
functions.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
We stupidly modeled block job bandwidth after migration
bandwidth, which in turn was an 'unsigned long' and therefore
subject to 32-bit vs. 64-bit interpretations. To work around
the fact that 10-gigabit interfaces are possible but don't fit
within 32 bits, the original interface took the number scaled
as MiB/sec. But this scaling is rather coarse, and it might
be nice to tune bandwidth finer than in megabyte chunks.
Several of the block job calls that can set speed are fed
through a common interface, so it was easier to adjust them all
at once. Note that there is intentionally no flag for the new
virDomainBlockCopy; there, since the API already uses a 64-bit
type always, instead of a possible 32-bit type, and is brand
new, it was easier to just avoid scaling issues. As with the
previous patch that adjusted the query side (commit db33cc24),
omitting the new flag preserves old behavior, and the
documentation now mentions limits of what happens when a 32-bit
machine is on either client or server side.
* include/libvirt/libvirt.h.in (virDomainBlockJobSetSpeedFlags)
(virDomainBlockPullFlags)
(VIR_DOMAIN_BLOCK_REBASE_BANDWIDTH_BYTES)
(VIR_DOMAIN_BLOCK_COMMIT_BANDWIDTH_BYTES): New enums.
* src/libvirt.c (virDomainBlockJobSetSpeed, virDomainBlockPull)
(virDomainBlockRebase, virDomainBlockCommit): Document them.
* src/qemu/qemu_driver.c (qemuDomainBlockJobSetSpeed)
(qemuDomainBlockPull, qemuDomainBlockRebase)
(qemuDomainBlockCommit, qemuDomainBlockJobImpl): Support new flag.
Signed-off-by: Eric Blake <eblake@redhat.com>
Upstream qemu 1.4 added some drive-mirror tunables not present
when it was first introduced in 1.3. Management apps may want
to set these in some cases (for example, without tuning
granularity down to sector size, a copy may end up occupying
more bytes than the original because an entire cluster is
copied even when only a sector within the cluster is dirty,
although tuning it down results in more CPU time to do the
copy). I haven't personally needed to use the parameters, but
since they exist, and since the new API supports virTypedParams,
we might as well expose them.
Since the tuning parameters aren't often used, and omitted from
the QMP command when unspecified, I think it is safe to rely on
qemu 1.3 to issue an error about them being unsupported, rather
than trying to create a new capability bit in libvirt.
Meanwhile, all versions of qemu from 1.4 to 2.1 have a bug where
a bad granularity (such as non-power-of-2) gives a poor message:
error: internal error: unable to execute QEMU command 'drive-mirror': Invalid parameter 'drive-virtio-disk0'
because of abuse of QERR_INVALID_PARAMETER (which is supposed to
name the parameter that was given a bad value, rather than the
value passed to some other parameter). I don't see that a
capability check will help, so we'll just live with it (and it
has since been improved in upstream qemu).
* src/qemu/qemu_monitor.h (qemuMonitorDriveMirror): Add
parameters.
* src/qemu/qemu_monitor.c (qemuMonitorDriveMirror): Likewise.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONDriveMirror):
Likewise.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONDriveMirror):
Likewise.
* src/qemu/qemu_driver.c (qemuDomainBlockCopyCommon): Likewise.
(qemuDomainBlockRebase, qemuDomainBlockCopy): Adjust callers.
* src/qemu/qemu_migration.c (qemuMigrationDriveMirror): Likewise.
* tests/qemumonitorjsontest.c (qemuMonitorJSONDriveMirror): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
The hard part of managing the disk copy is already coded; all
this had to do was convert the XML and virTypedParameters into
the internal representation.
With this patch, all blockcopy operations that used the old
API should also work via the new API. Additional extensions,
such as supporting the granularity tunable or a network rather
than file destination, will be added as later patches.
* src/qemu/qemu_driver.c (qemuDomainBlockCopy): New function.
Signed-off-by: Eric Blake <eblake@redhat.com>
In order to implement the new virDomainBlockCopy, the existing
block copy internal implementation needs to be adjusted. The
new function will parse XML into a storage source, and parse
typed parameters into integers, then call into the same common
backend. For now, it's easier to keep the same implementation
limits that only local file destinations are suported, but now
the check needs to be explicit. Similar to qemuDomainBlockJobImpl
consuming 'vm', this code also consumes the caller's 'mirror'
description of the destination.
* src/qemu/qemu_driver.c (qemuDomainBlockCopy): Rename...
(qemuDomainBlockCopyCommon): ...and adjust parameters.
(qemuDomainBlockRebase): Adjust caller.
Signed-off-by: Eric Blake <eblake@redhat.com>
When a domain is undefined, there are options to remove it's
managed save state or snapshots. However, there's another file
that libvirt creates per domain: the NVRAM variable store file.
Make sure that the file is not left behind if the domain is
undefined.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
If we end up at the cleanup lable before we've VIR_EXPAND_N the list,
then calling virQEMUCapsFreeStringList() with a NULL proplist could
theoretically deref proplist if nproplist was set. Coverity doesn't
seem to acknowledge the relationship between proplist and nproplist
assuming in virQEMUCapsFreeStringList that nproplist could be at
least 1 and thus have a null deref. It only seems to follow the
NULL proplist.
Signed-off-by: John Ferlan <jferlan@redhat.com>
With the virGetGroupList() change in place - Coverity further complains
that if we fail to virFork(), the groups will be leaked - which aha seems
to be the case. Adjust the logic to save off the -errno, free the groups,
and then return the value we saved
Signed-off-by: John Ferlan <jferlan@redhat.com>
This ends up being a very bizarre false positive. With an assist from
eblake, the claim is that mgetgroups() could return a -1 value, but yet
still have a groups buffer allocated, yet the example shown doesn't
seem to prove that.
Rather than fret about it, by adding a well placed sa_assert() on the
returned *list value we can "assure" ourselves that the mgetgroups()
failure path won't signal this condition.
Signed-off-by: John Ferlan <jferlan@redhat.com>
If a (floppy) drive isn't selected for snapshot explicitly and is empty
don't try to snapshot it. For external snapshots this would fail as we
can't generate a name for the snapshot from an empty drive.
Reported-by: Pavel Hrdina <phrdina@redhat.com>
To express empty drive we historically use storage source with empty
path. Unfortunately NBD disks may be declared without a path.
Add a helper to wrap this logic.
The libxl driver was blindly assigning libvirt's
virDomainLifecycleAction to libxl's libxl_action_on_shutdown, when
in fact the various actions take on different values in these enums.
Introduce helpers to properly map the enum values.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Test suites using the port allocator don't want to have different
behaviour depending on whether a port is in use on the host. Add
a VIR_PORT_ALLOCATOR_SKIP_BIND_CHECK which test suites can use
to skip the bind() test. The port allocator will thus only track
ports in use by the test suite process itself. This is fine when
using the port allocator to generate guest configs which won't
actually be launched
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
I've noticed two problem with the automatically created NVRAM varstore
file. The first, even though I run qemu as root:root for some reason I
get Permission denied when trying to open the _VARS.fd file. The
problem is, the upper directory misses execute permissions, which in
combination with us dropping some capabilities result in EPERM.
The next thing is, that if I switch SELinux to enforcing mode, I get
another EPERM because the vars file is not labeled correctly. It is
passed to qemu as disk and hence should be labelled as disk. QEMU may
write to it eventually, so this is different to kernel or initrd.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
With all the changes in my previous foray into this code, I forgot to
remove the libxlDomainEventQueue(driver, event); call inside the
dom == NULL condition.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Coverity notes that if the virConnectListAllDomains returns a negative
value then the loop at the cleanup label that ends on numDomains will
have issues.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Coverity notes that if qemuMonitorGetMachines() returns a negative
nmachines value, then the code at the cleanup label will have issues.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Coverity notes that if the call to virBitmapParse() returns a negative
value, then when we jump to the error label, the call to
virCapabilitiesClearHostNUMACellCPUTopology() will have issues
with the negative nb_cpus
Signed-off-by: John Ferlan <jferlan@redhat.com>
If the virNumaGetNodeCPUs() call fails with -1, then jumping to cleanup
with 'cpus == NULL' and calling virCapabilitiesClearHostNUMACellCPUTopology
will cause issues.
Signed-off-by: John Ferlan <jferlan@redhat.com>
In qemuProcessInitPCIAddresses() if qemuMonitorGetAllPCIAddresses()
returns a negative (or zero) value, then no need to call the
qemuProcessDetectPCIAddresses().
Signed-off-by: John Ferlan <jferlan@redhat.com>
The code compares def->forwarders when deciding to return 0 at a
couple of points, then uses "def->nfwds" as a way to index into
the def->forwarders array. That reference results in Coverity
complaining that def->forwarders being NULL was checked as part
of an arithmetic OR operation where failure could be any one 5
conditions, but that is not checked when entering the loop to
dereference the array. Changing the comparisons to use nfwds
will clear the warnings
Signed-off-by: John Ferlan <jferlan@redhat.com>
If the qemuMigrationEatCookie() fails to set mig, we jump to cleanup:
which will call qemuMigrationCancelDriveMirror() without first checking
if mig == NULL
Signed-off-by: John Ferlan <jferlan@redhat.com>
Perhaps a false positive, but since Coverity doesn't understand the
relationship between the 'count' and the 'strings', rather than leave
the chance the on input 'strings' is NULL and causes a deref - just
check for it and return
Signed-off-by: John Ferlan <jferlan@redhat.com>
If the VIR_STRDUP(exptime,...) fails, then we will jump to cleanup,
no need to check if exptime is set which causes Coverity to issue
a complaint in the virStrToLong_ll call because there wasn't a check
for a NULL value while there was one for the reference right after
the VIR_STRDUP().
Signed-off-by: John Ferlan <jferlan@redhat.com>
If we jump to cleanup before allocating the 'result', then the call
to virBlkioDeviceArrayClear will deref result causing a problem.
Signed-off-by: John Ferlan <jferlan@redhat.com>
If we jump to cleanup before allocating 'result', then the call to
virBlkioDeviceArrayClear() could dereference result
Signed-off-by: John Ferlan <jferlan@redhat.com>
If the virJSONValueNewObject() fails, then rather than going to error
and getting a Coverity false positive since it doesn't seem to understand
the relationship between nkeywords, keywords, and values and seems to
believe calling qemuFreeKeywords will cause a NULL deref - just return NULL
Signed-off-by: John Ferlan <jferlan@redhat.com>
Adjust the parentheses in/for the waitpid loops; otherwise, Coverity
points out:
(1) Event assignment: Assigning: "waitret" = "waitpid(pid, &status, 0) == -1"
(2) Event between: At condition "waitret == -1", the value of "waitret"
must be between 0 and 1.
(3) Event dead_error_condition: The condition "waitret == -1" cannot
be true.
(4) Event dead_error_begin: Execution cannot reach this statement:
"ret = -*__errno_location();".
Signed-off-by: John Ferlan <jferlan@redhat.com>
Coverity complains that when multiplying to 32 bit values that eventually
will be stored in a 64 bit value that it's possible the math could
overflow unless one of the values being multiplied is type cast to
the proper size.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Coverity complains that checking for !domlist after setting doms = domlist
and making a deref of doms just above
It seems the call in question was intended to me made in the case that
'doms' was passed in and not when the virDomainObjListExport() call
allocated domlist and already called virConnectGetAllDomainStatsCheckACL().
Thus rather than check for !domlist - check that "doms != domlist" in
order to avoid the Coverity message.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Handle a few places where Coverity complains about the value being
unused. For two of them (Close cases) - the comments above the close
indicate there is no harm to ignore the error - so added an ignore_value.
For the other condition, added an rc check like other callers.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Since cd4d547576
Coverity notes that setting 'ret = -3' prior to the unconditional
setting of 'ret = 0' will cause the value to be UNUSED.
Since the comment indicates that it is expect to allow the code
to continue, just remove the ret = -3 setting.
Signed-off-by: John Ferlan <jferlan@redhat.com>
In qemuDomainSetBlkioParameters(), Coverity points out that the calls
to qemuDomainParseBlkioDeviceStr() are slightly different and points
out there may be a cut-n-paste error.
In the first call (AFFECT_LIVE), the second parameter is "param->field";
however, for the second call (AFFECT_CONFIG), the second parameter is
"params->field". It seems the "param->field" is correct especially since
each path as a setting of "param" to "¶ms[i]". Furthermore, there
were a few more instances of using "params[i]" instead of "param->"
which I cleaned up.
Signed-off-by: John Ferlan <jferlan@redhat.com>
After a4431931 the TAP FDs ale labeled with image label instead
of the process label. On the other hand, the commit was
incomplete as a few lines above, there's still old check for the
process label presence while it should be check for the image
label instead.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
From time to time weird bugreports occur on the list, e.g [1].
Even though the kernel supports setns syscall, there's an older
glibc in the system that misses a wrapper over the syscall.
Hence, after the configure phase we think there's no setns
support in the system, which is obviously wrong. On the other
hand, we can't rely on linux distributions to provide newer glibc
soon. Therefore we need to introduce the wrapper on or own.
1: https://www.redhat.com/archives/libvir-list/2014-September/msg00492.html
Signed-off-by: Stephan Sachse <ste.sachse@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
virProcessTranslateStatus is used on error paths that should not spoil
the returned error. As the errors are ignored, use the quiet versions of
virAsprintf to create the message.
When using split UEFI image, it may come handy if libvirt manages per
domain _VARS file automatically. While the _CODE file is RO and can be
shared among multiple domains, you certainly don't want to do that on
the _VARS file. This latter one needs to be per domain. So at the
domain startup process, if it's determined that domain needs _VARS
file it's copied from this master _VARS file. The location of the
master file is configurable in qemu.conf.
Temporary, on per domain basis the location of master NVRAM file can
be overridden by this @template attribute I'm inventing to the
<nvram/> element. All it does is holding path to the master NVRAM file
from which local copy is created. If that's the case, the map in
qemu.conf is not consulted.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
QEMU now supports UEFI with the following command line:
-drive file=/usr/share/OVMF/OVMF_CODE.fd,if=pflash,format=raw,unit=0,readonly=on \
-drive file=/usr/share/OVMF/OVMF_VARS.fd,if=pflash,format=raw,unit=1 \
where the first line reflects <loader> and the second one <nvram>.
Moreover, these two lines obsolete the -bios argument.
Note that UEFI is unusable without ACPI. This is handled properly now.
Among with this extension, the variable file is expected to be
writable and hence we need security drivers to label it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Up to now, users can configure BIOS via the <loader/> element. With
the upcoming implementation of UEFI this is not enough as BIOS and
UEFI are conceptually different. For instance, while BIOS is ROM, UEFI
is programmable flash (although all writes to code section are
denied). Therefore we need new attribute @type which will
differentiate the two. Then, new attribute @readonly is introduced to
reflect the fact that some images are RO.
Moreover, the OVMF (which is going to be used mostly), works in two
modes:
1) Code and UEFI variable store is mixed in one file.
2) Code and UEFI variable store is separated in two files
The latter has advantage of updating the UEFI code without losing the
configuration. However, in order to represent the latter case we need
yet another XML element: <nvram/>. Currently, it has no additional
attributes, it's just a bare element containing path to the variable
store file.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
After the previous commit, migration statistics on the source and
destination hosts are not equal because the destination updated time
statistics. Let's send the result back so that the same data can be
queried on both sides of the migration.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Total time of a migration and total downtime transfered from a source to
a destination host do not count with the transfer time to the
destination host and with the time elapsed before guest CPUs are
resumed. Thus, source libvirtd remembers when migration started and when
guest CPUs were paused. Both timestamps are transferred to destination
libvirtd which uses them to compute total migration time and total
downtime. Obviously, this requires the time to be synchronized between
the two hosts. The reported times are useless otherwise but they would
be equally useless if we didn't do this recomputation so don't lose
anything by doing it.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
When migrating a transient domain or with VIR_MIGRATE_UNDEFINE_SOURCE
flag, the domain may disappear from source host. And so will migration
statistics associated with the domain. We need to transfer the
statistics at the end of a migration so that they can be queried at the
destination host.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
virDomainGetJobStats gains new VIR_DOMAIN_JOB_STATS_COMPLETED flag that
can be used to fetch statistics of a completed job rather than a
currently running job.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Job statistics data were tracked in several structures and variables.
Let's make a new qemuDomainJobInfo structure which can be used as a
single source of statistics data as a preparation for storing data about
completed a job.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The new blockcopy API wants to reuse only a subset of the disk
hotplug parser - namely, we only care about the embedded
virStorageSourcePtr inside a <disk> XML. Strange as it may
seem, it was easier to just parse an entire disk definition,
then throw away everything but the embedded source, than it
was to disentangle the source parsing code from the rest of
the overall disk parsing function. All that I needed was a
couple of tweaks and a new internal flag that determines
whether the normally-mandatory target element can be
gracefully skipped, since everything else was already optional.
* src/conf/domain_conf.h (virDomainDiskSourceParse): New
prototype.
* src/conf/domain_conf.c (VIR_DOMAIN_XML_INTERNAL_DISK_SOURCE):
New flag.
(virDomainDiskDefParseXML): Honor flag to make target optional.
(virDomainDiskSourceParse): New function.
Signed-off-by: Eric Blake <eblake@redhat.com>
qemu now checks for invalid address type for a panic device, which is
currently implemented only to use ISA address type, thus rejecting
any other options, except for leaving XML attributes blank, in that case,
defaults are used (this behaviour remains the same from earlier verions).
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1138125
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
When QEMU fails during incoming migration after we successfully started
it (i.e., during Perform or Finish phase), we report a rather unhelpful
message
Unable to read from monitor: Connection reset by peer
We already have a code that takes error messages from QEMU's error
output but we disable it once QEMU successfully starts. This patch
postpones this until the end of Finish phase during incoming migration
so that we can report a much better error message:
internal error: early end of file from monitor: possible problem:
Unknown savevm section or instance '0000:00:05.0/virtio-balloon' 0
load of migration failed
https://bugzilla.redhat.com/show_bug.cgi?id=1090093
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Return failure right away when the domain object can't be looked up
instead of jumping to cleanup. This allows to remove the condition
before unlocking the domain object.
The code would lookup the snapshot object before acquiring the job. This
could lead to a crash as one thread could delete the snapshot object,
while a second thread already had the reference.
Signed-off-by: Jincheng Miao <jmiao@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Creating snapshots modifies the domain state. Currently we wouldn't
enter the job for certain operations although they would modify the
state. Refactor job handling so that everything is covered by an async
job.
For security type='none' libvirt according to the docs should not
generate seclabel be it for selinux or any model. So, skip the
reservation of labels when type is none.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
Fairly straightforward - I got lucky that the generated functions
worked out of the box :)
* src/remote/remote_protocol.x (remote_domain_block_copy_args):
New struct.
(REMOTE_PROC_DOMAIN_BLOCK_COPY): New RPC.
* src/remote/remote_driver.c (remote_driver): Wire it up.
* src/remote_protocol-structs: Regenerate.
Signed-off-by: Eric Blake <eblake@redhat.com>
The usual portability fixes; and this includes a fix that adds
a new syntax check for double semicolons (commit 28de556 fixed
some, but gnulib found a better check).
* .gnulib: Update to latest.
* src/xenconfig/xen_common.c (xenFormatConfigCommon): Fix offender.
Signed-off-by: Eric Blake <eblake@redhat.com>
Since 9f781da69d
Resolve a libvirtd crash in virStoragePoolSourceFindDuplicate()
when there is an existing SCSI pool defined with adapter type as
'scsi_host' and defining a new SCSI pool with adapter type as
'fc_host' and parent attribute missing or vice versa.
For example, if there is an existing SCSI pool with adapter type
as 'scsi_host' defined using the following XML
<pool type='scsi'>
<name>TEST_SCSI_POOL</name>
<source>
<adapter type='scsi_host' name='scsi_host1'/>
</source>
<target>
<path>/dev/disk/by-path</path>
</target>
</pool>
When defining another SCSI pool with adapter type as 'fc_host' using the
following XML will crash libvirtd
<pool type='scsi'>
<name>TEST_SCSI_FC_POOL</name>
<source>
<adapter type='fc_host' wwnn='1234567890abcdef' wwpn='abcdef1234567890'/>
</source>
<target>
<path>/dev/disk/by-path</path>
</target>
</pool>
Same is true for the reverse case as well where there exists a SCSI pool
with adapter type as 'fc_host' and another SCSI pool is defined with
adapter type as 'scsi_host'.
This happens because for fc_host 'name' is optional attribute whereas for
scsi_host its mandatory. However the check in libvirt for finding duplicate
storage pools didn't take that into account while comparing
Signed-off-by: Pradipta Kr. Banerjee <bpradip@in.ibm.com>
To date, anyone performing a block copy and pivot ends up with
the destination being treated as <disk type='file'>. While this
works for data access for a block device, it has at least one
noticeable shortcoming: virDomainGetBlockInfo() reports allocation
differently for block devices visited as files (the size of the
device) than for block devices visited as <disk type='block'>
(the maximum sector used, as reported by qemu); and this difference
is significant when trying to manage qcow2 format on block devices
that can be grown as needed.
Of course, the more powerful virDomainBlockCopy() API can already
express the ability to set the <disk> type. But a new API can't
be backported, while a new flag to an existing API can; and it is
also rather inconvenient to have to resort to the full power of
generating XML when just adding a flag to the older call will do
the trick. So this patch enhances blockcopy to let the user flag
when the resulting XML after the copy must list the device as
type='block'.
* include/libvirt/libvirt.h.in (VIR_DOMAIN_BLOCK_REBASE_COPY_DEV):
New flag.
* src/libvirt.c (virDomainBlockRebase): Document it.
* tools/virsh-domain.c (opts_block_copy, blockJobImpl): Add
--blockdev option.
* tools/virsh.pod (blockcopy): Document it.
* src/qemu/qemu_driver.c (qemuDomainBlockRebase): Allow new flag.
(qemuDomainBlockCopy): Remember the flag, and make sure it is only
used on actual block devices.
* tests/qemuxml2argvdata/qemuxml2argv-disk-mirror.xml: Test it.
Signed-off-by: Eric Blake <eblake@redhat.com>
While reviewing the new virDomainBlockCopy API, Peter Krempa
pointed out that our existing design of using MiB/s for block
job bandwidth is rather coarse, especially since qemu tracks
it in bytes/s; so virDomainBlockCopy only accepts bytes/s.
But once the new API is implemented for qemu, we will be in
the situation where it is possible to set a value that cannot
be accurately reflected back to the user, because the existing
virDomainGetBlockJobInfo defaults to the coarser units.
Fortunately, we have an escape hatch; and one that has already
served us well in the past: we can use the flags argument to
specify which scale to use (see virDomainBlockResize for prior
art). This patch fixes the query side of the API; made easier
by previous patches that split the query side out from the
modification code. Later patches will address the virsh
interface, as well retrofitting all other blockjob APIs to
also accept a flag for toggling bandwidth units.
* include/libvirt/libvirt.h.in (_virDomainBlockJobInfo)
(VIR_DOMAIN_BLOCK_COPY_BANDWIDTH): Document sizing issues.
(virDomainBlockJobInfoFlags): New enum.
* src/libvirt.c (virDomainGetBlockJobInfo): Document new flag.
* src/qemu/qemu_monitor.h (qemuMonitorBlockJobInfo): Add parameter.
* src/qemu/qemu_monitor.c (qemuMonitorBlockJobInfo): Likewise.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONBlockJobInfo):
Likewise.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONBlockJobInfo)
(qemuMonitorJSONGetBlockJobInfoOne): Likewise. Don't scale here.
* src/qemu/qemu_migration.c (qemuMigrationDriveMirror): Update
callers.
* src/qemu/qemu_driver.c (qemuDomainBlockPivot)
(qemuDomainBlockJobImpl): Likewise.
(qemuDomainGetBlockJobInfo): Likewise, and support new flag.
Signed-off-by: Eric Blake <eblake@redhat.com>
The previous patch hoisted some bounds checks to the callers;
but someone that is not aware of the hoisted check could now
try passing an integer between LLONG_MAX and ULLONG_MAX. As a
safety measure, add new json conversion modes that let libvirt
error out early instead of pass bad numbers to qemu, if the
caller ever makes a mistake due to later refactoring.
Convert the various blockjob QMP calls to use the new modes,
and switch some of them to be optional (QMP has always supported
an omitted "speed" the same as "speed":0, for everything except
block-job-set-speed).
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONMakeCommandRaw):
Add 'j'/'y' and 'J'/'Y' to error out on negative input.
(qemuMonitorJSONDriveMirror, qemuMonitorJSONBlockCommit)
(qemuMonitorJSONBlockJob): Use it.
Signed-off-by: Eric Blake <eblake@redhat.com>
qemu treats blockjob bandwidth as a 64-bit number, in the units
of bytes/second. But we stupidly modeled block job bandwidth
after migration bandwidth, which in turn was an 'unsigned long'
and therefore subject to 32-bit vs. 64-bit interpretations, and
with a scale of MiB/s. Our code already has to convert between
the two scales, and report overflow as appropriate; although
this conversion currently lives in the monitor code. In fact,
our conversion code limited things to 63 bits, because we
checked against LLONG_MAX and reject what would be negative
bandwidth if treated as signed.
On the bright side, our use of MiB/s means that even with a
32-bit unsigned long, we still have no problem representing a
bandwidth of 2GiB/s, which is starting to be more feasible as
10-gigabit or even faster interfaces are used. And once you
get past the physical speeds of existing interfaces, any larger
bandwidth number behaves the same - effectively unlimited.
But on the low side, the granularity of 1MiB/s tuning is rather
coarse. So the new virDomainBlockJob API decided to go with
a direct 64-bit bytes/sec number instead of the scaled number
that prior blockjob APIs had used. But there is no point in
rounding this number to MiB/s just to scale it back to bytes/s
for handing to qemu.
In order to make future code sharing possible between the old
virDomainBlockRebase and the new virDomainBlockCopy, this patch
moves the scaling and overflow detection into the driver code.
Several of the block job calls that can set speed are fed
through a common interface, so it was easier to adjust all block
jobs at once, for consistency. This patch is just code motion;
there should be no user-visible change in behavior.
* src/qemu/qemu_monitor.h (qemuMonitorBlockJob)
(qemuMonitorBlockCommit, qemuMonitorDriveMirror): Change
parameter type and scale.
* src/qemu/qemu_monitor.c (qemuMonitorBlockJob)
(qemuMonitorBlockCommit, qemuMonitorDriveMirror): Move scaling
and overflow detection...
* src/qemu/qemu_driver.c (qemuDomainBlockJobImpl)
(qemuDomainBlockRebase, qemuDomainBlockCommit): ...here.
(qemuDomainBlockCopy): Use bytes/sec.
Signed-off-by: Eric Blake <eblake@redhat.com>
Another layer of overly-multiplexed code that deserves to be
split into obviously separate paths for query vs. modify.
This continues the cleanup started in commit cefe0ba.
In the process, make some tweaks to simplify the logic when
parsing the JSON reply. There should be no user-visible
semantic changes.
* src/qemu/qemu_monitor.h (qemuMonitorBlockJob): Drop parameter.
(qemuMonitorBlockJobInfo): New prototype.
(BLOCK_JOB_INFO): Drop enum.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONBlockJob)
(qemuMonitorJSONBlockJobInfo): Likewise.
* src/qemu/qemu_monitor.c (qemuMonitorBlockJob): Split...
(qemuMonitorBlockJobInfo): ...into second function.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONBlockJob): Move
block info portions...
(qemuMonitorJSONGetBlockJobInfo): ...here, and rename...
(qemuMonitorJSONBlockJobInfo): ...and export.
(qemuMonitorJSONGetBlockJobInfoOne): Alter return semantics.
* src/qemu/qemu_driver.c (qemuDomainBlockPivot)
(qemuDomainBlockJobImpl, qemuDomainGetBlockJobInfo): Adjust
callers.
* src/qemu/qemu_migration.c (qemuMigrationDriveMirror)
(qemuMigrationCancelDriveMirror): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
Commit fba6bc4 introduced support for the 'invtsc' feature,
which blocks migration. We should not include it in the
host-model CPU by default, because it's intended to be used
with migration.
https://bugzilla.redhat.com/show_bug.cgi?id=1138221
https://bugzilla.redhat.com/show_bug.cgi?id=1027096#c8
There are two ways in which security model can make it way into
<seclabel/>. One is as the @model attribute, the second one is
via security_driver knob in qemu.conf. Then, while parsing
<seclabel/> several checks and fix ups of old, stale combinations
are performed. However, iff @model is specified. They are not
done in the latter case. So it's still possible to feed libvirt
with senseless combinations (if qemu.conf is adjusted correctly).
One example of a seclabel that needs some adjustment (in case
security_driver=none in qemu.conf) is:
<seclabel type='dynamic' relabel='yes'/>
The fixup code is copied from virSecurityLabelDefParseXML
(covering the former case) into virSecurityLabelDefsParseXML
(which handles the latter case).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The qemu implementation for virDomainGetBlockJobInfo() has a
minor bug: it grabs the qemu job with intent to QEMU_JOB_MODIFY,
which means it cannot be run in parallel with any other
domain-modifying command. Among others, virDomainBlockJobAbort()
is such a modifying command, and it defaults to being
synchronous, and can wait as long as several seconds to ensure
that the job has actually finished. Due to the job rules, this
means a user cannot obtain status about the job during that
timeframe, even though we know that some client management code
exists which is using a polling loop on status to see when a job
finishes.
This bug has been present ever since blockpull support was first
introduced (commit b976165, v0.9.4 in Jul 2011), all because we
stupidly tried to cram too much multiplexing through a single
helper routine, but was made worse in 97c59b9 (v1.2.7) when
BlockJobAbort was fixed to wait longer. It's time to disentangle
some of the mess in qemuDomainBlockJobImpl, and in the process
relax block job query to use QEMU_JOB_QUERY, since it can safely
be used in parallel with any long running modify command.
Technically, there is one case where getting block job info can
modify domain XML - we do snooping to see if a 2-phase job has
transitioned into the second phase, for an optimization in the
case of old qemu that lacked an event for the transition. I
claim this optimization is safe (the jobs are all about modifying
qemu state, not necessarily xml state); but if it proves to be
a problem, we could use the difference between the capabilities
QEMU_CAPS_BLOCKJOB_{ASYNC,SYNC} to determine whether we even
need snooping, and only request a modifying job in the case of
older qemu.
* src/qemu/qemu_driver.c (qemuDomainBlockJobImpl): Move info
handling...
(qemuDomainGetBlockJobInfo): ...here, and relax job type.
(qemuDomainBlockJobAbort, qemuDomainBlockJobSetSpeed)
(qemuDomainBlockRebase, qemuDomainBlockPull): Adjust callers.
Signed-off-by: Eric Blake <eblake@redhat.com>
The existing virDomainBlockRebase code rejected the combination of
_RELATIVE and _COPY flags, but only by accident. It makes sense
to add support for the combination someday, at least for the case
of _SHALLOW and not _REUSE_EXT; but to implement it, libvirt would
have to pre-create the file with a relative backing name, and I'm
not ready to code that in yet.
Meanwhile, the code to forward on to the block copy code is getting
longer, and reorganizing the function to have the block pull done
early makes it easier to add even more block copy prep code.
This patch should have no semantic difference other than the quality
of the error message on the unsupported flag combination. Pre-patch:
error: unsupported flags (0x10) in function qemuDomainBlockCopy
Post-patch:
error: argument unsupported: Relative backing during copy not supported yet
* src/qemu/qemu_driver.c (qemuDomainBlockRebase): Reorder code,
and improve error message of relative copy.
Signed-off-by: Eric Blake <eblake@redhat.com>
Our style overwhelmingly uses hanging braces (the open brace
hangs at the end of the compound condition, rather than on
its own line), with the primary exception of the top level function
body. Fix the few remaining outliers, before adding a syntax
check in a later patch.
* src/interface/interface_backend_netcf.c (netcfStateReload)
(netcfInterfaceClose, netcf_to_vir_err): Correct use of { in
compound statement.
* src/conf/domain_conf.c (virDomainHostdevDefFormatSubsys)
(virDomainHostdevDefFormatCaps): Likewise.
* src/network/bridge_driver.c (networkAllocateActualDevice):
Likewise.
* src/util/virfile.c (virBuildPathInternal): Likewise.
* src/util/virnetdev.c (virNetDevGetVirtualFunctions): Likewise.
* src/util/virnetdevmacvlan.c
(virNetDevMacVLanVPortProfileCallback): Likewise.
* src/util/virtypedparam.c (virTypedParameterAssign): Likewise.
* src/util/virutil.c (virGetWin32DirectoryRoot)
(virFileWaitForDevices): Likewise.
* src/vbox/vbox_common.c (vboxDumpNetwork): Likewise.
* tests/seclabeltest.c (main): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
I'm about to add a syntax check that enforces our documented
HACKING style of always using matching {} on if-else statements.
This patch focuses on drivers that had several issues.
* src/lxc/lxc_fuse.c (lxcProcGetattr, lxcProcReadMeminfo): Correct
use of {}.
* src/lxc/lxc_driver.c (lxcDomainMergeBlkioDevice): Likewise.
* src/phyp/phyp_driver.c (phypConnectNumOfDomainsGeneric)
(phypUUIDTable_Init, openSSHSession, phypStoragePoolListVolumes)
(phypConnectListStoragePools, phypDomainSetVcpusFlags)
(phypStorageVolGetXMLDesc, phypStoragePoolGetXMLDesc)
(phypConnectListDefinedDomains): Likewise.
* src/vbox/vbox_common.c (vboxAttachSound, vboxDumpDisplay)
(vboxDomainRevertToSnapshot, vboxDomainSnapshotDelete): Likewise.
* src/vbox/vbox_tmpl.c (vboxStorageVolGetXMLDesc): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
We lacked of HOME environment variable,
set 'HOME=/' as default.
The kernel sets up $HOME for the init process.
Therefore any init can assume that $HOME is set.
libvirt currently violates that implicit rule.
Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
When FIPS mode is on, gnutls_dh_params_generate2 will fail if 1024 is
specified as the prime's number of bits, a bigger value works in both
cases.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Memory is allocated for 'mnt_src' by VIR_STRDUP in the loop. Next
loop it will be allocated again. So we need to free 'mnt_src'
before continue the loop.
Signed-off-by: Wang Rui <moon.wangrui@huawei.com>
Commit 0e1a1a8c introduced umask for virCommand, but the variables
used emit a warning on older compilers about shadowed global
declaration.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Add umask to _virCommand, allow user to set umask to command.
Set umask(002) to qemu process to overwrite the default umask
of 022 set by many distros, so that unix sockets created for
virtio-serial has expected permissions.
Fix problem reported here:
https://sourceware.org/bugzilla/show_bug.cgi?id=13078#c11https://bugzilla.novell.com/show_bug.cgi?id=888166
To use virtio-serial device, unix socket created for chardev with
default umask(022) has insufficient permissions.
e.g.:
-device virtio-serial \
-chardev socket,path=/tmp/foo,server,nowait,id=foo \
-device virtserialport,chardev=foo,name=org.fedoraproject.port.0
srwxr-xr-x 1 qemu qemu 0 21. Jul 14:19 /tmp/somefile.sock
Other users in the same group (like real user, test engines, etc)
cannot write to this socket.
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Currently, there is one flag passed in during macvtap creation
(withTap) -- Let's convert this field to an unsigned int flag
field for future expansion.
Signed-off-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
The cleanup in commit cf976d9d used secdef->label to label the tap
FDs, but that is not possible since it's process-only label (svirt_t)
and not a object label (e.g. svirt_image_t). Starting a domain failed
with EPERM, but simply using secdef->imagelabel instead of
secdef->label fixes it.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Since 1b807f92, connecting with virsh to an already running session
libvirtd fails with:
$ virsh list --all
error: failed to connect to the hypervisor
error: no valid connection
error: Failed to connect socket to
'/run/user/1000/libvirt/libvirt-sock': Transport endpoint is already
connected
This is caused by a logic error in virNetSocketNewConnectUnix: even if
the connection to the daemon socket succeeded, we still try to spawn the
daemon and then connect to it.
This commit changes the logic to not try to spawn libvirtd if we
successfully connected to its socket.
Most of this commit is whitespace changes, use of -w is recommended to
look at it.
Currently, after calling commands to create a new volumes,
virStorageBackendZFSCreateVol calls virStorageBackendZFSFindVols that
calls virStorageBackendZFSParseVol.
virStorageBackendZFSParseVol checks if a volume already exists by
trying to get it using virStorageVolDefFindByName.
For a just created volume it returns NULL, so volume is reported as
new and appended to pool->volumes. This causes a volume to be listed
twice as storageVolCreateXML appends this new volume to the list as
well.
Fix that by passing a new volume definition to
virStorageBackendZFSParseVol so it could determine if it needs to add
this volume to the list.
In qemuDomainSnapshotCreateDiskActive() if we jumped to cleanup from a
failed actions = virJSONValueNewArray(), then 'cfg' would be NULL.
So just return -1, which in turn removes the need for cleanup:
Coverity complained about the following:
(3) Event ptr_arith:
Performing pointer arithmetic on "cur_fd" in expression "cur_fd++".
130 return virNetServerServiceNewFD(*cur_fd++,
The complaint is that pointer arithmetic taking place instead of the
expected auto increment of the variable... Adding some well placed
parentheses ensures our order of operation.
For virtio-blk-pci disks with the disk iothread attribute that are
running the correct emulator, add the "iothread=iothread#" to the
-device command line in order to enable iothreads for the disk as
long as the command is available, the disk iothread value provided is
valid, and is supported for the disk device being added
Add a new disk "driver" attribute "iothread" to be parsed as the thread
number for the disk to use. In order to more easily facilitate the usage
and configuration of the iothread, a "zero" for the attribute indicates
iothreads are not supported for the device and a positive value indicates
the specific thread to try and use.
Add a new capability to ensure the iothreads feature exists for the qemu
emulator being run - requires the "query-iothreads" QMP command. Using the
domain XML add correspoding command argument in order to generate the
threads. The iothreads will use a name space "iothread#" where, the
future patch to add support for using an iothread to a disk definition to
merely define which of the available threads to use.
Add tests to ensure the xml/argv processing is correct. Note that no
change was made to qemuargv2xmltest.c as processing the -object element
would require knowing more than just iothreads.
Introduce XML to allowing adding iothreads to the domain. These can be
used by virtio-blk-pci devices in order to assign a specific thread to
handle the workload for the device. The iothreads are the official
implementation of the virtio-blk Data Plane that's been in tech preview
for QEMU.
Coverity noted that all callers to libxlDomainEventQueue() could ensure
the second parameter (event) was true before calling except this case.
As I look at the code and how events are used - it seems that prior to
generating an event for the dom == NULL condition, the resume/suspend
event should be queue'd after the virDomainSaveStatus() call which will
goto cleanup and queue the saved event anyway.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Implement the API function for virDomainListGetStats and
virConnectGetAllDomainStats in a modular way and implement the
VIR_DOMAIN_STATS_STATE group of statistics.
Although it may look like the function looks universal I'd rather not
expose it to other drivers as the coming stats groups are likely to do
qemu specific stuff to obtain the stats.
One useless warning, but the other one rather pertinent. On entry
the 'trans' variable is initialized to VIR_DOMAIN_DISK_TRANS_DEFAULT.
When the "trans" was found in the parsing loop it def->geometry.trans
was assigned to the return from virDomainDiskGeometryTransTypeFromString
and then 'trans' was used to do the comparison to see if it was valid.
So remove 'trans' and use def->geometry.trans properly
In libxlDomainMigrationPrepare() if the uri_in is false, then
'hostname' is allocated and used "generically" in the routine,
but not freed. Conversely, if uri_in is true, then a uri is
allocated and hostname is set to the uri->hostname value and
likewise generically used.
At function exit, hostname wasn't free'd in the !uri_in path,
so that was added. To just make it clearer on usage the else
path became the call to virURIFree() although I suppose technically
it didn't have to since it would be a call using (NULL)