When creating a disk image snapshot the libvirt code would blindly copy
the parents label to the newly created image. This runs into problems
when you start a VM from an image hosted on NFS (or other storage system
that doesn't support selinux labels) and the snapshot destination is on
a storage system that does support selinux labels. Libvirt's code in
that case generates a different security label for the image hosted on
NFS. This label is valid only for NFS images and doesn't allow access in
case of a locally stored image.
To fix this issue libvirt needs to refrain from copying security
information in cases where the default domain seclabel is a better
choice.
This patch repurposes the now unused @force argument of
virStorageSourceInitChainElement to denote whether a copy of the
security labelling stuff should be attempted or not. This allows to
fine-control the copy operation for cases where we need to keep the
label of the old disk vs. the cases where we need to keep the label
unset to use the default domain imagelabel.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1151718
I noticed this while working on qemuDomainGetBlockInfo. Assigning
a bool value to an int variable compiles fine, but raises red flags
on the maintenance front as it becomes too easy to assign -1 or 2
or any other non-bool value to the same variable.
* cfg.mk (sc_prohibit_int_assign_bool): New rule.
* src/conf/snapshot_conf.c (virDomainSnapshotRedefinePrep): Fix
offenders.
* src/qemu/qemu_driver.c (qemuDomainGetBlockInfo)
(qemuDomainSnapshotCreateXML): Likewise.
* src/test/test_driver.c (testDomainSnapshotAlignDisks):
Likewise.
* src/util/vircgroup.c (virCgroupSupportsCpuBW): Likewise.
* src/util/virpci.c (virPCIDeviceBindToStub): Likewise.
* src/util/virutil.c (virIsCapableVport): Likewise.
* tools/virsh-domain-monitor.c (cmdDomMemStat): Likewise.
* tools/virsh-domain.c (cmdBlockResize, cmdScreenshot)
(cmdInjectNMI, cmdSendKey, cmdSendProcessSignal)
(cmdDetachInterface): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
For some reason, commit id '72b4151f' triggered a Coverity uninitialized
'reply' variable check when referenced within the for loop.
It seems Coverity doesn't know that flags will have to be either AFFECT_LIVE
or AFFECT_CONFIG after the virDomainLiveConfigHelperMethod call.
By adding a "sa_assert()" to confirm that fact, Coverity is happy again.
https://bugzilla.redhat.com/show_bug.cgi?id=1164080
After a disk is hotunplugged a subsequent call to qemuDomainGetBlockIoTune
to get the --config settings of that disk will fail because the disk is no
longer found by qemuDiskPathToAlias causing an unexpected failure.
Since only the --live flag needs to have the disk device pointer, move the
fetch inside the (flags & VIR_DOMAIN_AFFECT_LIVE) condition. This will also
affect the results if no flags are provided or the --current flag is provided.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Commit 6e5c79a1 tried to fix deadlock between nwfilter{Define,Undefine}
and starting of guest, but this same deadlock exists for
updating/attaching network device to domain.
The deadlock was introduced by removing global QEMU driver lock because
nwfilter was counting on this lock and ensure that all driver locks are
locked inside of nwfilter{Define,Undefine}.
This patch extends usage of virNWFilterReadLockFilterUpdates to prevent
the deadlock for all possible paths in QEMU driver. LXC and UML drivers
still have global lock.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1143780
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Add support for bps_max and friends in the driver part.
In the part checking if a qemu is running, check if the running binary
support bps_max, if not print an error message, if yes add it to
"info" variable
Signed-off-by: Matthias Gatto <matthias.gatto@outscale.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1160084
As of b6d4dad1 (1.2.5) libvirt keeps track if domain disks have been
frozen. However, this falls into that set of information which don't
survive domain restart. Therefore, we need to clear the flag upon some
state transitions. Moreover, once we clear the flag we must update the
status file too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Since there was a valid note to patch 43b67f2e about the best spot to
check for bandwidth set call while having libvirt daemon run in session
mode, this patch reverts previous changes dealing with bandwith
(also reverts adding variable @cfg in qemuDomainGetNumaParameters which
does not have any use at the moment, but getting and unreferencing
driver's config) in qemu_driver.c and qemu_command.c. There will be
another patch in the series which introduces the fix itself.
https://bugzilla.redhat.com/show_bug.cgi?id=1159219
Users might want to update startupPolicy via the
virDomainUpdateDeviceFlags API too. This patch
implements the feature on config layer.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Particularly in qemuBuildNumaArgStr(), there was a need for the advice
due to memory backing, which needs to know the nodeset it will be pinned
to. With newer qemu this caused the following error when starting
domain:
error: internal error: Advice from numad is needed in case of
automatic numa placement
even when starting perfectly valid domain, e.g.:
...
<vcpu placement='auto'>4</vcpu>
<numatune>
<memory mode='strict' placement='auto'/>
</numatune>
<cpu>
<numa>
<cell id='0' cpus='0' memory='524288'/>
<cell id='1' cpus='1' memory='524288'/>
</numa>
</cpu>
...
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1138545
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Now that all offenders have been cleaned, turn on a syntax-check
rule to prevent future offenders.
* cfg.mk (sc_prohibit_static_zero_init): New rule.
* src/qemu/qemu_driver.c (qemuDomainBlockJobImpl): Avoid false
positive.
Signed-off-by: Eric Blake <eblake@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1141621
As part of attach processing, assign the device aliases by calling
qemuAssignDeviceAliases during qemuDomainQemuAttach once all the devices
are found after the qemuParseCommandLinePid processing.
This will alleviate a symptom that caused a libvirtd crash during an
attempted device detach.
This patch adds functionality to processNicRxFilterChangedEvent().
The old and new multicast lists are compared and the filters in
the macvtap are programmed to match the guest's filters.
Signed-off-by: Tony Krowiak <akrowiak@linux.vnet.ibm.com>
https://bugzilla.redhat.com/show_bug.cgi?id=956506 documents that
given a domain where an internal snapshot parent has an external
snapshot child, we lacked a safety check when trying to use the
--children-only option to snapshot-delete:
$ virsh start dom
$ virsh snapshot-create-as dom internal
$ virsh snapshot-create-as dom external --disk-only
$ virsh snapshot-delete dom external
error: Failed to delete snapshot external
error: unsupported configuration: deletion of 1 external disk snapshots not supported yet
$ virsh snapshot-delete dom internal --children
error: Failed to delete snapshot internal
error: unsupported configuration: deletion of 1 external disk snapshots not supported yet
$ virsh snapshot-delete dom internal --children-only
Domain snapshot internal children deleted
While I'd still like to see patches that actually do proper external
snapshot deletion, we should at least fix the inconsistency in the
meantime. With this patch:
$ virsh snapshot-delete dom internal --children-only
error: Failed to delete snapshot internal
error: unsupported configuration: deletion of 1 external disk snapshots not supported yet
* src/qemu/qemu_driver.c (qemuDomainSnapshotDelete): Fix condition.
Signed-off-by: Eric Blake <eblake@redhat.com>
To prepare for introducing a single global driver, rename the
virDriver struct to virHypervisorDriver and the registration
API to virRegisterHypervisorDriver()
Tuning NUMA or network interface parameters requires root
privileges to manage cgroups. Thus an attempt to set some of these
parameters in session mode on a running domain should be invalid
followed by an error. An example might be memory tuning which raises
an error in such case.
The following behavior in session mode will be present after applying
this patch:
Tuning | SET | GET |
----------|---------------|--------|
NUMA | shut off only | always |
Memory | never | never |
Interface | never | always |
Resolves https://bugzilla.redhat.com/show_bug.cgi?id=1126762
The documentation for the restore hook states that returning an empty
XML is equivalent with copying the input. There was a bug in the code
checking the returned string by checking the string instead of the
contents. Use the new helper to check if the string is empty.
virt-manager on Fedora sets up i686 hosts with "/usr/bin/qemu-kvm" emulator,
which in turn unconditionally execs qemu-system-x86_64 querying capabilities
then fails:
Error launching details: invalid argument: architecture from emulator 'x86_64' doesn't match given architecture 'i686'
Traceback (most recent call last):
File "/usr/share/virt-manager/virtManager/engine.py", line 748, in _show_vm_helper
details = self._get_details_dialog(uri, vm.get_connkey())
File "/usr/share/virt-manager/virtManager/engine.py", line 726, in _get_details_dialog
obj = vmmDetails(conn.get_vm(connkey))
File "/usr/share/virt-manager/virtManager/details.py", line 399, in __init__
self.init_details()
File "/usr/share/virt-manager/virtManager/details.py", line 784, in init_details
domcaps = self.vm.get_domain_capabilities()
File "/usr/share/virt-manager/virtManager/domain.py", line 518, in get_domain_capabilities
self.get_xmlobj().os.machine, self.get_xmlobj().type)
File "/usr/lib/python2.7/site-packages/libvirt.py", line 3492, in getDomainCapabilities
if ret is None: raise libvirtError ('virConnectGetDomainCapabilities() failed', conn=self)
libvirtError: invalid argument: architecture from emulator 'x86_64' doesn't match given architecture 'i686'
Journal:
Oct 16 21:08:26 goatlord.localdomain libvirtd[1530]: invalid argument: architecture from emulator 'x86_64' doesn't match given architecture 'i686'
After set domain's numa parameters for running domain, save the change,
save the change into live xml is needed to survive restarting the libvirtd,
same story with bug 1146511; meanwihle add call
qemuDomainObjBeginJob/qemuDomainObjEndJob in qemuDomainSetNumaParameters
Signed-off-by: Shanzhi Yu <shyu@redhat.com>
After set the blkio parameters for running domain, save the change into
live xml is needed to survive restarting the libvirtd, same story with
bug 1146511, meanwhile add call qemuDomainObjBeginJob/qemuDomainObjEndJob
in qemuDomainSetBlkioParameters
Signed-off-by: Shanzhi Yu <shyu@redhat.com>
This patch fills in the functionality of
processNicRxFilterChangedEvent(). It now checks if it is appropriate
to respond to the NIC_RX_FILTER_CHANGED event (based on device type
and configuration) and takes appropriate action. Currently it checks
if the guest interface has been configured with
trustGuestRxFilters='yes', and if the host side device is macvtap. If
so, and the MAC address on the guest has changed, the MAC address of
the macvtap device is changed to match.
The result of this is that networking from the guest will continue to
work if the mac address of a macvtap-connected network device is
changed from within the guest, as long as trustGuestRxFilters='yes'
(previously changing the MAC address in the guest would break
networking).
NIC_RX_FILTER_CHANGED is sent by qemu any time a NIC driver in the
guest modified the NIC's RX Filter (for example, if the MAC address of
the NIC is changed by the guest).
This patch doesn't do anything useful with that event; it just sets up
all the plumbing to get news of the event into a worker thread with
all proper locking/reference counting, and provide an easy place to
add in desired functionality.
See src/qemu/EVENTHANDLERS.txt for information/instructions on adding
a libvirt-internal handler for a qemu event (using
NIC_RX_FILTER_CHANGED as an example).
Prior patch removed the need for the virConnectPtr in the unplug
detach host path which caused ripple effect to remove in multiple
callers. The previous patch just left things as ATTRIBUTE_UNUSED -
this patch will remove the variable.
check domain's status before call virQEMUCapsGet to report a accurate
error when domain is shut off
Resolve: https://bugzilla.redhat.com/show_bug.cgi?id=1147847
Signed-off-by: Shanzhi Yu <shyu@redhat.com>
Up until now, we set memballoon period in monitor successfully, however
we did not update domain definition structure, thus dumpxml was omitting
period attribute in memballoon element
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1140960
When trying to update bandwidth limits on a running domain, limits get
updated in our internal structures, however XML parser reads
bandwidth limits from network 'actual' definition. Committing this patch
it is now available to update bandwidth 'actual' definition as well,
thus updating domain runtime XML.
Management software wants to be able to allocate disk space on demand.
To support this they need keep track of the space occupation of the
block device. This information is reported by qemu as part of block
stats.
This patch extend the block information in the bulk stats with the
allocation information.
To keep the same behaviour a helper is extracted from
qemuMonitorJSONGetBlockExtent in order to get per-device allocation
information.
Signed-off-by: Francesco Romani <fromani@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
This removes the artificial and unnecessary restriction that
virDomainSetMaxDowntime() only be called while a migration is in
progress.
https://bugzilla.redhat.com/show_bug.cgi?id=1146618
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The current block stats code matched up the disk name with the actual
stats by the order in the data returned from qemu. This unfortunately
isn't right as qemu may return the disks in any order. Fix this by
returning a hash of stats and index them by the disk alias.
For the new VIR_DOMAIN_EVENT_ID_TUNABLE event we have a bunch of
constants added
VIR_DOMAIN_EVENT_CPUTUNE_<blah>
VIR_DOMAIN_EVENT_BLKDEVIOTUNE_<blah>
This naming convention is bad for two reasons
- There is no common prefix unique for the events to both
relate them, and distinguish them from other event
constants
- The values associated with the constants were chosen
to match the names used with virConnectGetAllDomainStats
so having EVENT in the constant name is not applicable in
that respect
This patch proposes renaming the constants to
VIR_DOMAIN_TUNABLE_CPU_<blah>
VIR_DOMAIN_TUNABLE_BLKDEV_<blah>
ie, given them a common VIR_DOMAIN_TUNABLE prefix.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Use the universal tunable event to report changes to user. All
blkdeviotune values are prefixed with "blkdeviotune".
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
When you updated some blkdeviotune values for running domain the values
were stored only internally, but not saved into the live XML so they
won't survive restarting the libvirtd.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Request erroring out from the backing chain traveller and drop qemu's
internal backing chain integrity tester.
The backing chain traveller reports errors by itself with possibly more
detail than qemuDiskChainCheckBroken ever could.
We also need to make sure that we reconnect to existing qemu instances
even at the cost of losing the backing chain info (this really should be
stored in the XML rather than reloaded from disk, but that needs some
work).
Now we have universal tunable event so we can use it for reporting
changes to user. The cputune values will be prefixed with "cputune" to
distinguish it from other tunable events.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Add a new parameter that will allow to return the XML stored in the save
image for further manipulation and adjust the callers. This option will
be used in later patches.
We are not detecting the presence of FIPS from QEMU, but from procfs and
that means it's not QEMU capability. It was decided that we will pass
this flag to QEMU even if it's not supported by old QEMU binaries.
This patch also reverts changes done by commit a21cfb0f to
qemucapabilitestest and implements a new test case in qemuxml2argvtest.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1135431
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Commit f05b6a91 added virQEMUDriverConfigPtr argument to the
virQEMUCapsFillDomainCaps function and it uses forward declaration
of virQEMUDriverConfig and virQEMUDriverConfigPtr that casues clang
build to fail:
gmake[3]: Entering directory `/usr/home/novel/code/libvirt/src'
CC qemu/libvirt_driver_qemu_impl_la-qemu_capabilities.lo
In file included from qemu/qemu_capabilities.c:43:
In file included from qemu/qemu_hostdev.h:27:
qemu/qemu_conf.h:63:37: error: redefinition of typedef 'virQEMUDriverConfig'
is a C11 feature [-Werror,-Wtypedef-redefinition]
typedef struct _virQEMUDriverConfig virQEMUDriverConfig;
^
qemu/qemu_capabilities.h:328:37: note: previous definition is here
typedef struct _virQEMUDriverConfig virQEMUDriverConfig;
^
Fix that by passing loader and nloader config attributes directly
instead of passing complete config.
Clean up all _virDomainMemoryStat.
Signed-off-by: James <james.wangyufei@huawei.com>
Signed-off-by: Wang Rui <moon.wangrui@huawei.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Clean up all _virDomainBlockStats.
Signed-off-by: James <james.wangyufei@huawei.com>
Signed-off-by: Wang Rui <moon.wangrui@huawei.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Clean up all _virDomainInterfaceStats.
Signed-off-by: Wang Yufei <james.wangyufei@huawei.com>
Signed-off-by: Wang Rui <moon.wangrui@huawei.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Live definition was used to look up the disk index while persistent one
was indexed leading to a crash in qemuDomainGetBlockIoTune. Use the
correct def and report a nice error.
Unfortunately it's accessible via read-only connection, though it can
only crash libvirtd in the cases where the guest is hot-plugging disks
without reflecting those changes to the persistent definition. So
avoiding hotplug, or doing hotplug where persistent is always modified
alongside live definition, will avoid the out-of-bounds access.
Introduced in: eca96694a7f992be633d48d5ca03cedc9bbc3c9aa (v0.9.8)
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1140724
Reported-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
This patch implements the VIR_DOMAIN_STATS_BLOCK group of statistics.
To do so, a helper function to get the block stats of all the disks of
a domain is added.
Signed-off-by: Francesco Romani <fromani@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
This patch implements the VIR_DOMAIN_STATS_INTERFACE group of
statistics.
Signed-off-by: Francesco Romani <fromani@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
This patch implements the VIR_DOMAIN_STATS_VCPU group of statistics. To
do so, this patch also extracts a helper to gather the vCPU information.
Signed-off-by: Francesco Romani <fromani@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
This patch implements the VIR_DOMAIN_STATS_CPU_TOTAL group of
statistics.
Signed-off-by: Francesco Romani <fromani@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Future patches which will implement more bulk stats groups for QEMU will
need to access the connection object.
To accommodate that, a few changes are needed:
* enrich internal prototype to pass qemu driver object
* add per-group flag to mark if one collector needs monitor access or not
* If at least one collector of the requested stats needs monitor access
we must start a query job for each domain. The specific collectors
will run nested monitor jobs inside that.
* If the job can't be acquired we pass flags to the collector so
specific collectors that need monitor access can be skipped in order
to gather as much data as is possible.
Signed-off-by: Francesco Romani <fromani@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Check to see if the UEFI binary mentioned in qemu.conf actually
exists, and if so expose it in domcapabilities like
<loader ...>
<value>/path/to/ovmf</value>
</loader>
We introduce some generic domcaps infrastructure for handling
a dynamic list of string values, it may be of use for future bits.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Up till now the virQEMUCapsFillDomainCaps() was type of void as
there was no way for it to fail. This is, however, going to
change in the next commit.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Commit b606bbb4 broke reporting of errors when setting of guest time
fails via the guest agent as the return value is not checked and later
overwritten by the return value qemuMonitorRTCResetReinjection();
Fix this by checking the return value before resetting the RTC
reinjection.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1142294
Modify qemuProcessStart() in order to allowing setting affinity to
specific CPU's for IOThreads. The process followed is similar to
that for the vCPU's.
This involves adding a function to fetch the IOThread id's via
qemuMonitorGetIOThreads() and adding them to iothreadpids[] list.
Then making sure all the cgroup data has been properly set up and
finally assigning affinity.
We stupidly modeled block job bandwidth after migration
bandwidth, which in turn was an 'unsigned long' and therefore
subject to 32-bit vs. 64-bit interpretations. To work around
the fact that 10-gigabit interfaces are possible but don't fit
within 32 bits, the original interface took the number scaled
as MiB/sec. But this scaling is rather coarse, and it might
be nice to tune bandwidth finer than in megabyte chunks.
Several of the block job calls that can set speed are fed
through a common interface, so it was easier to adjust them all
at once. Note that there is intentionally no flag for the new
virDomainBlockCopy; there, since the API already uses a 64-bit
type always, instead of a possible 32-bit type, and is brand
new, it was easier to just avoid scaling issues. As with the
previous patch that adjusted the query side (commit db33cc24),
omitting the new flag preserves old behavior, and the
documentation now mentions limits of what happens when a 32-bit
machine is on either client or server side.
* include/libvirt/libvirt.h.in (virDomainBlockJobSetSpeedFlags)
(virDomainBlockPullFlags)
(VIR_DOMAIN_BLOCK_REBASE_BANDWIDTH_BYTES)
(VIR_DOMAIN_BLOCK_COMMIT_BANDWIDTH_BYTES): New enums.
* src/libvirt.c (virDomainBlockJobSetSpeed, virDomainBlockPull)
(virDomainBlockRebase, virDomainBlockCommit): Document them.
* src/qemu/qemu_driver.c (qemuDomainBlockJobSetSpeed)
(qemuDomainBlockPull, qemuDomainBlockRebase)
(qemuDomainBlockCommit, qemuDomainBlockJobImpl): Support new flag.
Signed-off-by: Eric Blake <eblake@redhat.com>
Upstream qemu 1.4 added some drive-mirror tunables not present
when it was first introduced in 1.3. Management apps may want
to set these in some cases (for example, without tuning
granularity down to sector size, a copy may end up occupying
more bytes than the original because an entire cluster is
copied even when only a sector within the cluster is dirty,
although tuning it down results in more CPU time to do the
copy). I haven't personally needed to use the parameters, but
since they exist, and since the new API supports virTypedParams,
we might as well expose them.
Since the tuning parameters aren't often used, and omitted from
the QMP command when unspecified, I think it is safe to rely on
qemu 1.3 to issue an error about them being unsupported, rather
than trying to create a new capability bit in libvirt.
Meanwhile, all versions of qemu from 1.4 to 2.1 have a bug where
a bad granularity (such as non-power-of-2) gives a poor message:
error: internal error: unable to execute QEMU command 'drive-mirror': Invalid parameter 'drive-virtio-disk0'
because of abuse of QERR_INVALID_PARAMETER (which is supposed to
name the parameter that was given a bad value, rather than the
value passed to some other parameter). I don't see that a
capability check will help, so we'll just live with it (and it
has since been improved in upstream qemu).
* src/qemu/qemu_monitor.h (qemuMonitorDriveMirror): Add
parameters.
* src/qemu/qemu_monitor.c (qemuMonitorDriveMirror): Likewise.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONDriveMirror):
Likewise.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONDriveMirror):
Likewise.
* src/qemu/qemu_driver.c (qemuDomainBlockCopyCommon): Likewise.
(qemuDomainBlockRebase, qemuDomainBlockCopy): Adjust callers.
* src/qemu/qemu_migration.c (qemuMigrationDriveMirror): Likewise.
* tests/qemumonitorjsontest.c (qemuMonitorJSONDriveMirror): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
The hard part of managing the disk copy is already coded; all
this had to do was convert the XML and virTypedParameters into
the internal representation.
With this patch, all blockcopy operations that used the old
API should also work via the new API. Additional extensions,
such as supporting the granularity tunable or a network rather
than file destination, will be added as later patches.
* src/qemu/qemu_driver.c (qemuDomainBlockCopy): New function.
Signed-off-by: Eric Blake <eblake@redhat.com>
In order to implement the new virDomainBlockCopy, the existing
block copy internal implementation needs to be adjusted. The
new function will parse XML into a storage source, and parse
typed parameters into integers, then call into the same common
backend. For now, it's easier to keep the same implementation
limits that only local file destinations are suported, but now
the check needs to be explicit. Similar to qemuDomainBlockJobImpl
consuming 'vm', this code also consumes the caller's 'mirror'
description of the destination.
* src/qemu/qemu_driver.c (qemuDomainBlockCopy): Rename...
(qemuDomainBlockCopyCommon): ...and adjust parameters.
(qemuDomainBlockRebase): Adjust caller.
Signed-off-by: Eric Blake <eblake@redhat.com>
When a domain is undefined, there are options to remove it's
managed save state or snapshots. However, there's another file
that libvirt creates per domain: the NVRAM variable store file.
Make sure that the file is not left behind if the domain is
undefined.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Test suites using the port allocator don't want to have different
behaviour depending on whether a port is in use on the host. Add
a VIR_PORT_ALLOCATOR_SKIP_BIND_CHECK which test suites can use
to skip the bind() test. The port allocator will thus only track
ports in use by the test suite process itself. This is fine when
using the port allocator to generate guest configs which won't
actually be launched
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Coverity notes that if the virConnectListAllDomains returns a negative
value then the loop at the cleanup label that ends on numDomains will
have issues.
Signed-off-by: John Ferlan <jferlan@redhat.com>
If we jump to cleanup before allocating the 'result', then the call
to virBlkioDeviceArrayClear will deref result causing a problem.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Coverity complains that checking for !domlist after setting doms = domlist
and making a deref of doms just above
It seems the call in question was intended to me made in the case that
'doms' was passed in and not when the virDomainObjListExport() call
allocated domlist and already called virConnectGetAllDomainStatsCheckACL().
Thus rather than check for !domlist - check that "doms != domlist" in
order to avoid the Coverity message.
Signed-off-by: John Ferlan <jferlan@redhat.com>
In qemuDomainSetBlkioParameters(), Coverity points out that the calls
to qemuDomainParseBlkioDeviceStr() are slightly different and points
out there may be a cut-n-paste error.
In the first call (AFFECT_LIVE), the second parameter is "param->field";
however, for the second call (AFFECT_CONFIG), the second parameter is
"params->field". It seems the "param->field" is correct especially since
each path as a setting of "param" to "¶ms[i]". Furthermore, there
were a few more instances of using "params[i]" instead of "param->"
which I cleaned up.
Signed-off-by: John Ferlan <jferlan@redhat.com>
virDomainGetJobStats gains new VIR_DOMAIN_JOB_STATS_COMPLETED flag that
can be used to fetch statistics of a completed job rather than a
currently running job.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Job statistics data were tracked in several structures and variables.
Let's make a new qemuDomainJobInfo structure which can be used as a
single source of statistics data as a preparation for storing data about
completed a job.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Return failure right away when the domain object can't be looked up
instead of jumping to cleanup. This allows to remove the condition
before unlocking the domain object.
The code would lookup the snapshot object before acquiring the job. This
could lead to a crash as one thread could delete the snapshot object,
while a second thread already had the reference.
Signed-off-by: Jincheng Miao <jmiao@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Creating snapshots modifies the domain state. Currently we wouldn't
enter the job for certain operations although they would modify the
state. Refactor job handling so that everything is covered by an async
job.
To date, anyone performing a block copy and pivot ends up with
the destination being treated as <disk type='file'>. While this
works for data access for a block device, it has at least one
noticeable shortcoming: virDomainGetBlockInfo() reports allocation
differently for block devices visited as files (the size of the
device) than for block devices visited as <disk type='block'>
(the maximum sector used, as reported by qemu); and this difference
is significant when trying to manage qcow2 format on block devices
that can be grown as needed.
Of course, the more powerful virDomainBlockCopy() API can already
express the ability to set the <disk> type. But a new API can't
be backported, while a new flag to an existing API can; and it is
also rather inconvenient to have to resort to the full power of
generating XML when just adding a flag to the older call will do
the trick. So this patch enhances blockcopy to let the user flag
when the resulting XML after the copy must list the device as
type='block'.
* include/libvirt/libvirt.h.in (VIR_DOMAIN_BLOCK_REBASE_COPY_DEV):
New flag.
* src/libvirt.c (virDomainBlockRebase): Document it.
* tools/virsh-domain.c (opts_block_copy, blockJobImpl): Add
--blockdev option.
* tools/virsh.pod (blockcopy): Document it.
* src/qemu/qemu_driver.c (qemuDomainBlockRebase): Allow new flag.
(qemuDomainBlockCopy): Remember the flag, and make sure it is only
used on actual block devices.
* tests/qemuxml2argvdata/qemuxml2argv-disk-mirror.xml: Test it.
Signed-off-by: Eric Blake <eblake@redhat.com>
While reviewing the new virDomainBlockCopy API, Peter Krempa
pointed out that our existing design of using MiB/s for block
job bandwidth is rather coarse, especially since qemu tracks
it in bytes/s; so virDomainBlockCopy only accepts bytes/s.
But once the new API is implemented for qemu, we will be in
the situation where it is possible to set a value that cannot
be accurately reflected back to the user, because the existing
virDomainGetBlockJobInfo defaults to the coarser units.
Fortunately, we have an escape hatch; and one that has already
served us well in the past: we can use the flags argument to
specify which scale to use (see virDomainBlockResize for prior
art). This patch fixes the query side of the API; made easier
by previous patches that split the query side out from the
modification code. Later patches will address the virsh
interface, as well retrofitting all other blockjob APIs to
also accept a flag for toggling bandwidth units.
* include/libvirt/libvirt.h.in (_virDomainBlockJobInfo)
(VIR_DOMAIN_BLOCK_COPY_BANDWIDTH): Document sizing issues.
(virDomainBlockJobInfoFlags): New enum.
* src/libvirt.c (virDomainGetBlockJobInfo): Document new flag.
* src/qemu/qemu_monitor.h (qemuMonitorBlockJobInfo): Add parameter.
* src/qemu/qemu_monitor.c (qemuMonitorBlockJobInfo): Likewise.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONBlockJobInfo):
Likewise.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONBlockJobInfo)
(qemuMonitorJSONGetBlockJobInfoOne): Likewise. Don't scale here.
* src/qemu/qemu_migration.c (qemuMigrationDriveMirror): Update
callers.
* src/qemu/qemu_driver.c (qemuDomainBlockPivot)
(qemuDomainBlockJobImpl): Likewise.
(qemuDomainGetBlockJobInfo): Likewise, and support new flag.
Signed-off-by: Eric Blake <eblake@redhat.com>
qemu treats blockjob bandwidth as a 64-bit number, in the units
of bytes/second. But we stupidly modeled block job bandwidth
after migration bandwidth, which in turn was an 'unsigned long'
and therefore subject to 32-bit vs. 64-bit interpretations, and
with a scale of MiB/s. Our code already has to convert between
the two scales, and report overflow as appropriate; although
this conversion currently lives in the monitor code. In fact,
our conversion code limited things to 63 bits, because we
checked against LLONG_MAX and reject what would be negative
bandwidth if treated as signed.
On the bright side, our use of MiB/s means that even with a
32-bit unsigned long, we still have no problem representing a
bandwidth of 2GiB/s, which is starting to be more feasible as
10-gigabit or even faster interfaces are used. And once you
get past the physical speeds of existing interfaces, any larger
bandwidth number behaves the same - effectively unlimited.
But on the low side, the granularity of 1MiB/s tuning is rather
coarse. So the new virDomainBlockJob API decided to go with
a direct 64-bit bytes/sec number instead of the scaled number
that prior blockjob APIs had used. But there is no point in
rounding this number to MiB/s just to scale it back to bytes/s
for handing to qemu.
In order to make future code sharing possible between the old
virDomainBlockRebase and the new virDomainBlockCopy, this patch
moves the scaling and overflow detection into the driver code.
Several of the block job calls that can set speed are fed
through a common interface, so it was easier to adjust all block
jobs at once, for consistency. This patch is just code motion;
there should be no user-visible change in behavior.
* src/qemu/qemu_monitor.h (qemuMonitorBlockJob)
(qemuMonitorBlockCommit, qemuMonitorDriveMirror): Change
parameter type and scale.
* src/qemu/qemu_monitor.c (qemuMonitorBlockJob)
(qemuMonitorBlockCommit, qemuMonitorDriveMirror): Move scaling
and overflow detection...
* src/qemu/qemu_driver.c (qemuDomainBlockJobImpl)
(qemuDomainBlockRebase, qemuDomainBlockCommit): ...here.
(qemuDomainBlockCopy): Use bytes/sec.
Signed-off-by: Eric Blake <eblake@redhat.com>
Another layer of overly-multiplexed code that deserves to be
split into obviously separate paths for query vs. modify.
This continues the cleanup started in commit cefe0ba.
In the process, make some tweaks to simplify the logic when
parsing the JSON reply. There should be no user-visible
semantic changes.
* src/qemu/qemu_monitor.h (qemuMonitorBlockJob): Drop parameter.
(qemuMonitorBlockJobInfo): New prototype.
(BLOCK_JOB_INFO): Drop enum.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONBlockJob)
(qemuMonitorJSONBlockJobInfo): Likewise.
* src/qemu/qemu_monitor.c (qemuMonitorBlockJob): Split...
(qemuMonitorBlockJobInfo): ...into second function.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONBlockJob): Move
block info portions...
(qemuMonitorJSONGetBlockJobInfo): ...here, and rename...
(qemuMonitorJSONBlockJobInfo): ...and export.
(qemuMonitorJSONGetBlockJobInfoOne): Alter return semantics.
* src/qemu/qemu_driver.c (qemuDomainBlockPivot)
(qemuDomainBlockJobImpl, qemuDomainGetBlockJobInfo): Adjust
callers.
* src/qemu/qemu_migration.c (qemuMigrationDriveMirror)
(qemuMigrationCancelDriveMirror): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
The qemu implementation for virDomainGetBlockJobInfo() has a
minor bug: it grabs the qemu job with intent to QEMU_JOB_MODIFY,
which means it cannot be run in parallel with any other
domain-modifying command. Among others, virDomainBlockJobAbort()
is such a modifying command, and it defaults to being
synchronous, and can wait as long as several seconds to ensure
that the job has actually finished. Due to the job rules, this
means a user cannot obtain status about the job during that
timeframe, even though we know that some client management code
exists which is using a polling loop on status to see when a job
finishes.
This bug has been present ever since blockpull support was first
introduced (commit b976165, v0.9.4 in Jul 2011), all because we
stupidly tried to cram too much multiplexing through a single
helper routine, but was made worse in 97c59b9 (v1.2.7) when
BlockJobAbort was fixed to wait longer. It's time to disentangle
some of the mess in qemuDomainBlockJobImpl, and in the process
relax block job query to use QEMU_JOB_QUERY, since it can safely
be used in parallel with any long running modify command.
Technically, there is one case where getting block job info can
modify domain XML - we do snooping to see if a 2-phase job has
transitioned into the second phase, for an optimization in the
case of old qemu that lacked an event for the transition. I
claim this optimization is safe (the jobs are all about modifying
qemu state, not necessarily xml state); but if it proves to be
a problem, we could use the difference between the capabilities
QEMU_CAPS_BLOCKJOB_{ASYNC,SYNC} to determine whether we even
need snooping, and only request a modifying job in the case of
older qemu.
* src/qemu/qemu_driver.c (qemuDomainBlockJobImpl): Move info
handling...
(qemuDomainGetBlockJobInfo): ...here, and relax job type.
(qemuDomainBlockJobAbort, qemuDomainBlockJobSetSpeed)
(qemuDomainBlockRebase, qemuDomainBlockPull): Adjust callers.
Signed-off-by: Eric Blake <eblake@redhat.com>
The existing virDomainBlockRebase code rejected the combination of
_RELATIVE and _COPY flags, but only by accident. It makes sense
to add support for the combination someday, at least for the case
of _SHALLOW and not _REUSE_EXT; but to implement it, libvirt would
have to pre-create the file with a relative backing name, and I'm
not ready to code that in yet.
Meanwhile, the code to forward on to the block copy code is getting
longer, and reorganizing the function to have the block pull done
early makes it easier to add even more block copy prep code.
This patch should have no semantic difference other than the quality
of the error message on the unsupported flag combination. Pre-patch:
error: unsupported flags (0x10) in function qemuDomainBlockCopy
Post-patch:
error: argument unsupported: Relative backing during copy not supported yet
* src/qemu/qemu_driver.c (qemuDomainBlockRebase): Reorder code,
and improve error message of relative copy.
Signed-off-by: Eric Blake <eblake@redhat.com>
In qemuDomainSnapshotCreateDiskActive() if we jumped to cleanup from a
failed actions = virJSONValueNewArray(), then 'cfg' would be NULL.
So just return -1, which in turn removes the need for cleanup:
Implement the API function for virDomainListGetStats and
virConnectGetAllDomainStats in a modular way and implement the
VIR_DOMAIN_STATS_STATE group of statistics.
Although it may look like the function looks universal I'd rather not
expose it to other drivers as the coming stats groups are likely to do
qemu specific stuff to obtain the stats.
In qemuDomainRevertToSnapshot(), it will check snap->def->state.
But when the state is PMSUSPENDED/NOSTATE/BLOCKED, it forgets to
call qemuDomainObjEndJob.
https://bugzilla.redhat.com/show_bug.cgi?id=1134154
Bug introduced in commit 1e833899.
Signed-off-by: Jincheng Miao <jmiao@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Let's fix this before we bake in a painful API. Since we know
that we have exactly one non-negative fd on success, we might
as well return the fd directly instead of forcing the user to
pass in a pointer. Furthermore, I found some memory and fd
leaks while reviewing the code - the idea is that on success,
libvirtd will have handed two fds in two different directions:
one to qemu, and one to the RPC client.
* include/libvirt/libvirt.h.in (virDomainOpenGraphicsFD): Drop
unneeded parameter.
* src/driver.h (virDrvDomainOpenGraphicsFD): Likewise.
* src/libvirt.c (virDomainOpenGraphicsFD): Adjust interface to
return fd directly.
* daemon/remote.c (remoteDispatchDomainOpenGraphicsFd): Adjust
semantics.
* src/qemu/qemu_driver.c (qemuDomainOpenGraphicsFD): Likewise,
and plug fd leak.
* src/remote/remote_driver.c (remoteDomainOpenGraphicsFD):
Likewise, and plug memory and fd leak.
Signed-off-by: Eric Blake <eblake@redhat.com>
Commit b606bbb41 reminded me that any time we drop locks to run
back-to-back guest interaction commands, we have to check that
the guest didn't disappear in between the two commands. A quick
audit found a couple of spots that were missing this check.
* src/qemu/qemu_driver.c (qemuDomainShutdownFlags)
(qemuDomainSetVcpusFlags): Check that domain is still up.
Signed-off-by: Eric Blake <eblake@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1078126
Using 'virsh attach-device --config' (or --persistent) to attach a
file backed lun device will succeed; however, subsequent domain restarts
will result in failure because the configuration of a file backed lun
is not supported.
Although allowing 'illegal configurations' is something that can be
allowed, it may not be practical in this case. Generally, when attaching
a device to a domain means the domain must be running. A way around
this is using the --config (or --persistent) option. When an attach
is done to a running domain, a temporary configuration is modified
first followed by the live update. The live update will make a number
of disk validity checks when building the qemu command to attach the
disk. If any fail, then change is rejected.
Rather than allow a potentially illegal combination, adjust the code
in the configuration path to make the same checks as the running path
will make with respect to disk validity checks. This way we avoid
having the potential for some subsequent start/reboot to fail because
an illegal combination was allowed.
NB: The live path still checks the configuration since it is possible
to just do --live guest modification...
https://bugzilla.redhat.com/show_bug.cgi?id=1103245
An advice appeared there on the qemu-devel list [1]. When a domain is
suspended and then resumed guest kernel is not aware of this. So we've
introduced virDomainSetTime API that resets the time within guest
using qemu-ga. On the other hand, qemu itself is trying to make RTC
beat faster to catch the difference. But if we don't tell qemu that
guest's time was reset via the other method, both mechanisms are
applied resulting in again wrong guest time. In order to avoid summing
both corrections we need to tell qemu that it should not use the RTC
injection if the guest time is set via guest agent.
1: http://www.mail-archive.com/qemu-devel@nongnu.org/msg236435.html
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When a user would try changing the persistent IO tuning settings for a
disk that was hotplugged to a vm in a transient way, the
qemuDomainSetBlockIoTune API would use the same index for both the
live and config disk array. The disk was missing from the config array
though causing a crash of libvirtd.
To fix the issue, determine the indexes separately.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1131819
Pass the source of the changed media instead of a complete disk
definition.
Note that the @disk argument now contains what @olddisk would contain.
The new source is passed as a virStorageSource struct.
When we are changing media (or doing other hotplug operations) we need
to setup cgroups, locking and seclabels on the new disk. This is a
multi-step process where every piece can fail. To simplify dealing with
this introduce qemuDomainPrepareDisk that similarly to
qemuDomainPrepareDiskChainElement initializes/tears down a whole new
disk to be used with the domain.
Additionally the function supports passing a different source struct for
media changes of cdroms that will be refactored later.
Currently, qemu driver uses qemuTranslateDiskSourcePool()
to translate disk volume information. This function is
general enough and could be used for other drivers as well,
so move it to conf/domain_conf.c along with its helpers.
- qemuTranslateDiskSourcePool: move to storage/storage_driver.c
and rename to virStorageTranslateDiskSourcePool,
- qemuAddISCSIPoolSourceHost: move to storage/storage_driver.c
and rename to virStorageAddISCSIPoolSourceHost,
- qemuTranslateDiskSourcePoolAuth: move to storage/storage_driver.c
and rename to virStorageTranslateDiskSourcePoolAuth,
- Update users of qemuTranslateDiskSourcePool to use a
new name.
In commit 45ad1adb I added a nicer message for tunings that need
cgroups when unavailable (unprivileged), but I added this check for
I/O tuning of block devices, which doesn't need cgroups, because it is
done by QEMU, so let's fix that.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Remove the pinning info when removing to CPU, otherwise when the VM will
be started our code will try to pin non-existing vcpus as the definition
wasn't updated.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1129372
During a QEMU live migration several warning messages about job
handling could be written to syslog on the destination host:
"entering monitor without asking for a nested job is dangerous"
The messages are written because the job handling during migration
uses hard coded asyncJob values in several places that are incorrect.
This patch passes the required asyncJob value around and prevents
the warnings as well as any issues that the warnings may be referring
to.
https://bugzilla.redhat.com/show_bug.cgi?id=1130089
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Saving a shutoff VM doesn't make sense and libvirtd crashes while
attempting to do that. Check that the domain is alive after entering
the save async job.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1129207
The virDomainSetInterfaceParameters implementation in qemu over
VIR_DOMAIN_AFFECT_CONFIG doesn't work as expected. When trying to
clear out the bandwidth settings for an interface, it has no
actual effect:
virsh # domiftune --config $domain $interface
inbound.average: 100
inbound.peak : 0
inbound.burst : 0
outbound.average: 10
outbound.peak : 0
outbound.burst : 0
virsh domiftune --config $domain $interface 0 0
virsh # domiftune --config $domain $interface
inbound.average: 100
inbound.peak : 0
inbound.burst : 0
outbound.average: 10
outbound.peak : 0
outbound.burst : 0
But according to virsh man page:
To clear inbound or outbound settings, use --inbound or
--outbound respectfully with average value of zero.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Commit febf84c2 tried to delay in-memory modification of the actual
domain disk structure until after the qemu event was received.
However, I missed that the code for block pivot had been temporarily
setting disk->src = disk->mirror prior to the qemu command, in order
to label the backing chain of a reused external blockcopy disk;
and calls into qemu while still in that state before finally undoing
things at the cleanup label. Since the qemu event handler then does:
virStorageSourceFree(disk->src);
disk->src = disk->mirror;
we have the sad race that a fast enough qemu event can cause a leak of
the original disk->src, as well as a use-after-free of the disk->mirror
contents, bad enough to crash libvirtd in some of my test runs, even
though the common case of the qemu event being much later won't trip
the race.
I'll go wear the brown paper bag of shame, for introducing a crasher
in between rc1 and rc2 of the freeze for 1.2.7 :( My only
consolation is that virDomainBlockJobAbort requires the domain:write
ACL, so it is not a CVE.
The valgrind report when the race occurs looks like:
==25612== Invalid read of size 4
==25612== at 0x50E7C90: virStorageSourceGetActualType (virstoragefile.c:1948)
==25612== by 0x209C0B18: qemuDomainDetermineDiskChain (qemu_domain.c:2473)
==25612== by 0x209D7F6A: qemuProcessHandleBlockJob (qemu_process.c:1087)
==25612== by 0x209F40C9: qemuMonitorEmitBlockJob (qemu_monitor.c:1357)
...
==25612== Address 0xe4b5610 is 0 bytes inside a block of size 200 free'd
==25612== at 0x4A07577: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==25612== by 0x50839E9: virFree (viralloc.c:582)
==25612== by 0x50E7E51: virStorageSourceFree (virstoragefile.c:2015)
==25612== by 0x209D7EFF: qemuProcessHandleBlockJob (qemu_process.c:1073)
==25612== by 0x209F40C9: qemuMonitorEmitBlockJob (qemu_monitor.c:1357)
* src/qemu/qemu_driver.c (qemuDomainBlockPivot): Don't corrupt
disk->src, and only label chain for blockcopy.
Signed-off-by: Eric Blake <eblake@redhat.com>
Commit 232a31b munged job info to report 'active commit' instead of
'commit' when generating events, but forgot to also munge the polling
variant of the command.
* src/qemu/qemu_driver.c (qemuDomainBlockJobImpl): Adjust type as
needed.
Signed-off-by: Eric Blake <eblake@redhat.com>
Otherwise this beautiful error would be overwritten when
the function is called with a really high rate number:
2014-07-28 12:51:47.920+0000: 2304: error : virCommandWait:2399 :
internal error: Child process (/sbin/tc class add dev vnet0 parent 1:
classid 1:1 htb rate 4294968kbps) unexpected exit status 1: Illegal "rate"
Usage: ... qdisc add ... htb [default N] [r2q N]
default minor id of class to which unclassified packets are sent {0}
r2q DRR quantums are computed as rate in Bps/r2q {10}
debug string of 16 numbers each 0-3 {0}
... class add ... htb rate R1 [burst B1] [mpu B] [overhead O]
[prio P] [slot S] [pslot PS]
[ceil R2] [cburst B2] [mtu MTU] [quantum Q]
rate rate allocated to this class (class can still borrow)
burst max bytes burst which can be accumulated during idle period {computed}
mpu minimum packet size used in rate computations
overhead per-packet size overhead used in rate computations
linklay adapting to a linklayer e.g. atm
ceil definite upper class rate (no borrows) {rate}
cburst burst but for ceil {computed}
mtu max packet size we create rate map for {1600}
prio priority of leaf; lowe
https://bugzilla.redhat.com/show_bug.cgi?id=1043735
With this in place, I can (finally!) now do:
virsh blockcommit $dom vda --shallow --verbose --pivot
and watch qemu shorten the backing chain by one, followed by
libvirt automatically updating the dumpxml output, effectively
undoing the work of virsh snapshot-commit --no-metadata --disk-only.
Commit is SOOOO much faster than blockpull, when I'm still fairly
close in time to when the temporary qcow2 wrapper file was created
via a snapshot operation!
* src/qemu/qemu_driver.c (qemuDomainBlockCommit): Implement live
commit.
Signed-off-by: Eric Blake <eblake@redhat.com>
A future patch is going to wire up qemu active block commit jobs;
but as they have similar events and are canceled/pivoted in the
same way as block copy jobs, it is easiest to track all bookkeeping
for the commit job by reusing the <mirror> element. This patch
adds domain XML to track which job was responsible for creating a
mirroring situation, and adds a job='copy' attribute to all
existing uses of <mirror>. Along the way, it also massages the
qemu monitor backend to read the new field in order to generate
the correct type of libvirt job (even though it requires a
future patch to actually cause a qemu event that can be reported
as an active commit). It also prepares to update persistent XML
to match changes made to live XML when a copy completes.
* docs/schemas/domaincommon.rng: Enhance schema.
* docs/formatdomain.html.in: Document it.
* src/conf/domain_conf.h (_virDomainDiskDef): Add a field.
* src/conf/domain_conf.c (virDomainBlockJobType): String conversion.
(virDomainDiskDefParseXML): Parse job type.
(virDomainDiskDefFormat): Output job type.
* src/qemu/qemu_process.c (qemuProcessHandleBlockJob): Distinguish
active from regular commit.
* src/qemu/qemu_driver.c (qemuDomainBlockCopy): Set job type.
(qemuDomainBlockPivot, qemuDomainBlockJobImpl): Clean up job type
on completion.
* tests/qemuxml2xmloutdata/qemuxml2xmlout-disk-mirror-old.xml:
Update tests.
* tests/qemuxml2argvdata/qemuxml2argv-disk-mirror.xml: Likewise.
* tests/qemuxml2argvdata/qemuxml2argv-disk-active-commit.xml: New
file.
* tests/qemuxml2xmltest.c (mymain): Drive new test.
Signed-off-by: Eric Blake <eblake@redhat.com>
We were not directly saving the domain XML to file after starting
or finishing a blockcopy. Without the startup write, a libvirtd
restart in the middle of a copy job would forget that the job was
underway. Then at pivot, we were indirectly writing new XML in
reaction to events that occur as we stop and restart the guest CPUs.
But there was a race: since pivot is an async action, it is possible
that libvirtd is restarted before the pivot completes, so if XML
changes during the event, that change was not written. The original
blockcopy code cleared out the <mirror> element prior to restarting
the CPUs, but this is also a race, observed if a user does an async
pivot and a dumpxml before the event occurs. Furthermore, this race
will interfere with active commit in a future patch, because that
code will rely on the <mirror> element at the time of the qemu event
to determine whether to inform the user of a normal commit or an
active commit.
Fix things by saving state any time we modify live XML, while
delaying XML disk modifications until after the event completes. We
still need a to teach libvirtd restarts to examine all existing
<mirror> elements to see if the job completed in the meantime (that
is, if libvirtd misses the event, the updated state still needs to be
updated in live XML), but that will be a later patch, in part because
we also need to to start taking advantage of newer qemu's ability to
keep the job around after completion rather than the current usage
where the job disappears both on error and on success.
* src/qemu/qemu_driver.c (qemuDomainBlockCopy): Track XML change
on disk.
(qemuDomainBlockJobImpl, qemuDomainBlockPivot): Move job-end XML
rewrites...
* src/qemu/qemu_process.c (qemuProcessHandleBlockJob): ...here.
Signed-off-by: Eric Blake <eblake@redhat.com>
Doing a blockcopy operation across a libvirtd restart is not very
robust at the moment. In particular, we are clearing the <mirror>
element prior to telling qemu to finish the job. Also, thanks to the
ability to request async completion, the user can easily regain
control prior to qemu actually finishing the effort, and they should
be able to poll the domain XML to see if the job is still going.
A future patch will fix things to actually wait until qemu is done
before modifying the XML to reflect the job completion. But since
qemu issues identical BLOCK_JOB_COMPLETE events regardless of whether
the job was cancelled (kept the original disk) or completed (pivoted
to the new disk), we have to track which of the two operations were
used to end the job. Furthermore, we'd like to avoid attempts to
end a job where we are already waiting on an earlier request to qemu
to end the job. Likewise, if we miss the qemu event (perhaps because
it arrived during a libvirtd restart), we still need enough state
recorded to be able to determine how to modify the domain XML once
we reconnect to qemu and manually learn whether the job still exists.
Although this patch doesn't actually fix the problem, it is a
preliminary step that makes it possible to track whether a job
has already begun steps towards completion.
* src/conf/domain_conf.h (virDomainDiskMirrorState): New enum.
(_virDomainDiskDef): Convert bool mirroring to new enum.
* src/conf/domain_conf.c (virDomainDiskDefParseXML)
(virDomainDiskDefFormat): Handle new values.
* src/qemu/qemu_process.c (qemuProcessHandleBlockJob): Adjust
client.
* src/qemu/qemu_driver.c (qemuDomainBlockPivot)
(qemuDomainBlockJobImpl): Likewise.
* docs/schemas/domaincommon.rng (diskMirror): Expose new values.
* docs/formatdomain.html.in (elementsDisks): Document it.
* tests/qemuxml2argvdata/qemuxml2argv-disk-mirror.xml: Test it.
Signed-off-by: Eric Blake <eblake@redhat.com>
Use better detection of hugetlbfs mount points. Yes, there can be
multiple mount points each serving different huge page size.
Since we already have ability to override the mount point in the
qemu.conf file, this crazy backward compatibility code is brought in.
Now we allow multiple mount points, so the "hugetlbfs_mount" option
must take an list of strings (mount points). But previously, it was
just a string, so we must accept both types now.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Since 24e5cafba6 (thankfully unreleased)
when a VM with an empty disk drive would be started the code would call
stat() on NULL path as a check was missing from the callback rendering
machines unstartable.
Report success when the path is empty (denoting an empty drive).
If user hasn't provided any @emulatorbin, the qemuCaps are
searched by @arch provided (which in fact can be guessed from the
host). However, there's no guarantee that the qemu binary for
@arch will exist. Therefore qemu capabilities may be nonexistent
too. If that's the case, we should throw an error message prior
jumping onto 'cleanup' label as the helper lookup function
remains silent on no search result.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
To integrate the security driver with the storage driver we need to
pass a callback for a function that will chown storage volumes.
Introduce and document the callback prototype.
Up to now, users have to pass two arguments at least: domain virt type
('qemu' vs 'kvm') and one of emulatorbin or architecture. This is not
much user friendly. Nowadays users mostly use KVM and share the host
architecture with the guest. So now, the API (and subsequently virsh
command) can be called with all NULLs (without any arguments).
Before this patch:
# virsh domcapabilities
error: failed to get emulator capabilities
error: virttype_str in qemuConnectGetDomainCapabilities must not be NULL
# virsh domcapabilities kvm
error: failed to get emulator capabilities
error: invalid argument: at least one of emulatorbin or architecture fields must be present
After:
# virsh domcapabilities
<domainCapabilities>
<path>/usr/bin/qemu-system-x86_64</path>
<domain>kvm</domain>
<machine>pc-i440fx-2.1</machine>
<arch>x86_64</arch>
<vcpu max='255'/>
</domainCapabilities>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This patch adds back the virDomainDef typedef into domain_conf and
makes all the numatune_conf functions independent of any virDomainDef
definitions.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1122205
Although the edits were changing in-memory XML, it was not flushed
to disk; so unless some other action changes XML, a libvirtd restart
would lose the changed information.
* src/conf/domain_conf.c (virDomainObjSetMetadata): Add parameter,
to save live status across restarts.
(virDomainSaveXML): Allow for test driver.
* src/conf/domain_conf.h (virDomainObjSetMetadata): Adjust
signature.
* src/bhyve/bhyve_driver.c (bhyveDomainSetMetadata): Adjust caller.
* src/lxc/lxc_driver.c (lxcDomainSetMetadata): Likewise.
* src/qemu/qemu_driver.c (qemuDomainSetMetadata): Likewise.
* src/test/test_driver.c (testDomainSetMetadata): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
Before:
virsh # dominfo chx3
State: shut off
Max memory: 92160 KiB
Used memory: 92160 KiB
After:
virsh # dominfo container1
State: shut off
Max memory: 92160 KiB
Used memory: 0 KiB
Similar to qemu cases.
Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com>
Convert the target snapshot state selector to a switch statement
enumerating all possible values. This points out a few mistakes in the
original selector.
The logic of the code is preserved until later patches.
As with the local SCSI passthrough devicesm qemu can't support snapshots
on those as the block ops are handled by the device. This is also true
for iSCSI backing of the disk. Remove the check for the local block
device and just forbid snapshot when the disk is of type 'lun'.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1073368
There were numerous places where numatune configuration (and thus
domain config as well) was changed in different ways. On some
places this even resulted in persistent domain definition not to be
stable (it would change with daemon's restart).
In order to uniformly change how numatune config is dealt with, all
the internals are now accessible directly only in numatune_conf.c and
outside this file accessors must be used.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Since there was already public virDomainNumatune*, I changed the
private virNumaTune to match the same, so all the uses are unified and
public API is kept:
s/vir\(Domain\)\?Numa[tT]une/virDomainNumatune/g
then shrunk long lines, and mainly functions, that were created after
that:
sed -i 's/virDomainNumatuneMemPlacementMode/virDomainNumatunePlacement/g'
And to cope with the enum name, I haad to change the constants as
well:
s/VIR_NUMA_TUNE_MEM_PLACEMENT_MODE/VIR_DOMAIN_NUMATUNE_PLACEMENT/g
Last thing I did was at least a little shortening of already long
name:
s/virDomainNumatuneDef/virDomainNumatune/g
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1119173 documents that
commit eaba79d was flawed in the implementation of the
VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC flag when it comes to completing
a blockcopy. Basically, the qemu pivot action is async (the QMP
command returns immediately, but the user must wait for the
BLOCK_JOB_COMPLETE event to know that all I/O related to the job
has finally been flushed), but the libvirt command was documented
as synchronous by default. As active block commit will also be
using this code, it is worth fixing now.
* src/qemu/qemu_driver.c (qemuDomainBlockJobImpl): Don't skip wait
loop after pivot.
Signed-off-by: Eric Blake <eblake@redhat.com>
Rename linuxDomainInterfaceStats to virNetInterfaceStats in order
to allow adding platform specific implementations without
making consumer worrying about specific implementation to be used.
Also, rename util/virstatslinux.c to util/virstats.c so placing
other platform specific implementations into this file don't
look unexpected from the file name.
4cc1f1a01f introduced a crash when doing a
block copy as virStorageSourceInitChainElement was called on
"disk->mirror" that is still NULL at that point instead of "mirror"
which temporarily holds the mirror source struct until it's fully
initialized. This resulted into a crash as a NULL was dereferenced.
Reported by: Shanzi Yu <shyu@redhat.com>
Commit id '3ea661de' refactored the code to use the 'disk->src->path'
instead of getting the path from virDomainDiskGetSource(). The one
call to qemuOpenFile() didn't use the disk source path, rather it used
the path as passed from the caller (in this case 'vda') - this caused
a failure with the virt-test/tp-libvirt as follows:
$ virsh domblkinfo virt-tests-vm1 vda
error: cannot stat file '/home/virt-test/shared/data/images/jeos-20-64.qcow2': Bad file descriptor
$
The default graphics channel mode is 'any', so as to defaultMode attribute.
If defaultMode and channel mode are all the default value 'any',
qemuConnectDomainXMLToNative will set TLSPort.
But in qemuBuildGraphicsSPICECommandLine, if spice_tls is not enabled, libvirtd
will report an error to tell the user that spice TLS is disabled in qemu.conf.
So qemuConnectDomainXMLToNative should check spice_tls is enabled,
then decide to allocate an tlsPort number to this graphics.
If user specified defaultMode is 'secure', qemuConnectDomainXMLToNative
could allocate tlsPort, and then let qemuBuildGraphicsSPICECommandLine reports
the spice_tls disabled error.
The related bug is:
https://bugzilla.redhat.com/show_bug.cgi?id=1113868
Signed-off-by: Jincheng Miao <jmiao@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Now that cgroups/security driver/locking driver support labelling of
individual images and tolerate network storage we don't have to refrain
from passing all image files to it. This allows removing the checking
code as we already make sure that the snapshot function won't be called
with unsupported options.
Now that security, cgroup and locking APIs support working on individual
images and we track the backing chain security info on a per-image basis
we can finally kill swapping the disk source in virDomainDiskDef and use
the virStorageSource directly.
Until now we were changing information about the disk source via
multiple steps of copying data. Now that we changed to a pointer to
store the disk source we might use it to change the approach to track
the data.
Additionally this will allow proper tracking of the backing chain.
When pivoting to a new disk source after a block commit (and possibly
after a soon-to-be-added active block commit) we changed just a few
fields to the new target. In case we'd copy a network disk to a local
file we'd not change the type properly.
To avoid such problems, switch to tracking of the source via changing of
the complete source struct to the one tracking the mirroring info.
Use the source struct and the corresponding function so that we can
avoid using the path separately. Now that
qemuDomainPrepareDiskChainElementPath isn't use anywhere, we can safely
remove it.
Additionally, the removal fixes a misaligned comment as the removed
function was added under a comment for a different function.
In the future we might need to track state of individual images. Move
the readonly and shared flags to the virStorageSource struct so that we
can keep them in a per-image basis.
Now that we are able to select images from the backing chain via indexed
access we should also convert possible network sources to
qemu-compatible strings before passing them to qemu.
Now that we are able to select images from the backing chain via indexed
access we should also convert possible network sources to
qemu-compatible strings before passing them to qemu.
The qemu block info function relied on working with local storage. Break
this assumption by adding support for remote volumes. Unfortunately we
still need to take a hybrid approach as some of the operations require a
filedescriptor.
Previously you'd get:
$ virsh domblkinfo gl vda
error: cannot stat file '/img10': Bad file descriptor
Now you get some stats:
$ virsh domblkinfo gl vda
Capacity: 10485760
Allocation: 197120
Physical: 197120
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1110198
For the regular dump operation we migrate the VM to a file. This won't
work when the VM has passthrough devices assigned. Rather than reporting
a cryptic error from qemu run our check whether it can be migrated.
This does not influence the memory-only dump that is allowed with
passthrough devices.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=874418
To allow changing the name that is recorded in the top of the current
image chain used in a block pull/rebase operation, we need to specify
the backing name to qemu. This is done via the "backing-file" attribute
to the block-stream commad.
To allow changing the name that is recorded in the overlay of the TOP
image used in a block commit operation, we need to specify the backing
name to qemu. This is done via the "backing-file" attribute to the
block-commit command.
Use the probing functionality added in the last patch to turn on
a capability bit when active commit is present, and gate active
commit on that capability.
For my own reference: the difference between BLOCKJOB_SYNC and
BLOCKJOB_ASYNC is whether qemu generated an event at the
conclusion of blockpull; basically, RHEL 6.2 was the only release
of qemu that has the sync semantics and lacks the event. RHEL
6.3 added blockcopy, but also picked up on the upstream style
of qemu generating events. As no one is likely to backport
active commit to RHEL 6.2, it's safe for blockcommit to always
require async blockjob support.
Modifying qemucapabilitiestest is painful; the .replies files would
be so much easier if they had comments correlating which command
generated the given reply. Maybe I'll fix that up later...
* src/qemu/qemu_capabilities.h (QEMU_CAPS_ACTIVE_COMMIT): New
capability.
* src/qemu/qemu_driver.c (qemuDomainBlockCommit): Use the new bit
* src/qemu/qemu_capabilities.c (virQEMUCaps): Name the new bit.
(virQEMUCapsProbeQMPCommands): Set it.
* tests/qemucapabilitiesdata/caps_1.3.1-1.replies: Update.
* tests/qemucapabilitiesdata/caps_1.4.2-1.replies: Likewise.
* tests/qemucapabilitiesdata/caps_1.5.3-1.replies: Likewise.
* tests/qemucapabilitiesdata/caps_1.6.0-1.replies: Likewise.
* tests/qemucapabilitiesdata/caps_1.6.50-1.replies: Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
So far only information on disks and host devices are exposed in the
capabilities XML. Well, at least something. Even a new test is
introduced. The qemu capabilities are stolen from already existing
qemucapabilities test. There's one tricky point though. Functions that
checks host's KVM and VFIO capabilities, are impossible to mock
currently. So in the test, we are setting the capabilities by hand.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Replace:
if (virBufferError(&buf)) {
virBufferFreeAndReset(&buf);
virReportOOMError();
...
}
with:
if (virBufferCheckError(&buf) < 0)
...
This should not be a functional change (unless some callers
misused the virBuffer APIs - a different error would be reported
then)
So far, we only report an error if formatting the siblings bitmap
in NUMA topology fails.
Be consistent and always report error in virCapabilitiesFormatXML.
We have the following matrix of possible arguments handled by the logic
statement touched by this patch:
| flags & _REUSE_EXT | !(flags & _REUSE_EXT)
-------+--------------------+----------------------
format| (1) | (2)
-------+--------------------+----------------------
!format| (3) | (4)
-------+--------------------+----------------------
In cases 1 and 2 the user provided a format, in cases 3 and 4 not. The
user requests to use a pre-existing image in 1 and 3 and libvirt will
create a new image in 2 and 4.
The difference between cases 3 and 4 is that for 3 the format is probed
from the user-provided image, whereas in 4 we just use the existing disk
format.
The current code would treat cases 1,3 and 4 correctly but in case 2 the
format provided by the user would be ignored.
The particular piece of code was broken in commit 35c7701c64
but since it was introduced a few commits before that it was never
released as working.
Commit 55bbb011b9 introduced a regression
where we forgot to save the persistent domain configuration after an
external snapshot. This would make libvirt forget the snapshots and
effectively revert to the previous state in the following scenario:
1) Start VM
2) Take snapshot
3) Destroy VM
4) Restart libvirtd
Also fix spurious blank line added by patch mentioned above.
When creating a new disk mirror the new struct is stored in a separate
variable until everything went well. The removed hunk would actually
remove existing mirror information for example when the api would be run
if a mirror still exists.
I'm going to add functions that will deal with individual image files
rather than whole disks. Rename the security function to make room for
the new one.
The new VIR_CONNECT_COMPARE_CPU_FAIL_INCOMPATIBLE flag for
virConnectCompareCPU can be used to get an error
(VIR_ERR_CPU_INCOMPATIBLE) describing the incompatibility instead of the
usual VIR_CPU_COMPARE_INCOMPATIBLE return code.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
When CPU comparison APIs return VIR_CPU_COMPARE_INCOMPATIBLE, the caller
has no clue why the CPU is considered incompatible with host CPU. And in
some cases, it would be nice to be able to get such info in a client
rather than having to look in logs.
To achieve this, the APIs can be told to return VIR_ERR_CPU_INCOMPATIBLE
error for incompatible CPUs and the reason will be described in the
associated error message.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
As we are doing with the enum structures, a cleanup in "src/qemu/"
directory was done now. All the enums that were defined in the
header files were converted to typedefs in this directory. This
patch includes all the adjustments to remove conflicts when you do
this kind of change. "Enum-to-typedef"'s conversions were made in
"src/qemu/qemu_{capabilities, domain, migration, hotplug}.h".
Signed-off-by: Julio Faracco <jcfaracco@gmail.com>
I'm going to add functions that will deal with individual image files
rather than whole disks. Rename the security function to make room for
the new one.
For future work we want to get info for not only the free memory
but overall memory size too. That's why the function must have
new signature too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When the block job event was first added, it was for block pull,
where the active layer of the disk remains the same name. It was
also in a day where we only cared about local files, and so we
always had a canonical absolute file name. But two things have
changed since then: we now have network disks, where determining
a single absolute string does not really make sense; and we have
two-phase jobs (copy and active commit) where the name of the
active layer changes between the first event (ready, on the old
name) and second (complete, on the pivoted name).
Adam Litke reported that having an unstable string between events
makes life harder for clients. Furthermore, all of our API that
operate on a particular disk of a domain accept multiple strings:
not only the absolute name of the active layer, but also the
destination device name (such as 'vda'). As this latter name is
stable, even for network sources, it serves as a better string
to supply in block job events.
But backwards-compatibility demands that we should not change the
name handed to users unless they explicitly request it. Therefore,
this patch adds a new event, BLOCK_JOB_2 (alas, I couldn't think of
any nicer name - but at least Migrate2 and Migrate3 are precedent
for a number suffix). We must double up on emitting both old-style
and new-style events according to what clients have registered for
(see also how IOError and IOErrorReason emits double events, but
there the difference was a larger struct rather than changed
meaning of one of the struct members).
Unfortunately, adding a new event isn't something that can easily
be broken into pieces, so the commit is rather large.
* include/libvirt/libvirt.h.in (virDomainEventID): Add a new id
for VIR_DOMAIN_EVENT_ID_BLOCK_JOB_2.
(virConnectDomainEventBlockJobCallback): Document new semantics.
* src/conf/domain_event.c (_virDomainEventBlockJob): Rename field,
to ensure we catch all clients.
(virDomainEventBlockJobNew): Add parameter.
(virDomainEventBlockJobDispose)
(virDomainEventBlockJobNewFromObj)
(virDomainEventBlockJobNewFromDom)
(virDomainEventDispatchDefaultFunc): Adjust clients.
(virDomainEventBlockJob2NewFromObj)
(virDomainEventBlockJob2NewFromDom): New functions.
* src/conf/domain_event.h: Add new prototypes.
* src/libvirt_private.syms (domain_event.h): Export new functions.
* src/qemu/qemu_driver.c (qemuDomainBlockJobImpl): Generate two
different events.
* src/qemu/qemu_process.c (qemuProcessHandleBlockJob): Likewise.
* src/remote/remote_protocol.x
(remote_domain_event_block_job_2_msg): New struct.
(REMOTE_PROC_DOMAIN_EVENT_BLOCK_JOB_2): New RPC.
* src/remote/remote_driver.c
(remoteDomainBuildEventBlockJob2): New handler.
(remoteEvents): Register new event.
* daemon/remote.c (remoteRelayDomainEventBlockJob2): New handler.
(domainEventCallbacks): Register new event.
* tools/virsh-domain.c (vshEventCallbacks): Likewise.
(vshEventBlockJobPrint): Adjust client.
* src/remote_protocol-structs: Regenerate.
Signed-off-by: Eric Blake <eblake@redhat.com>
The block commit code looks for an explicit base file relative
to the discovered top file; so for a chain of:
base <- snap1 <- snap2 <- snap3
and a command of:
virsh blockcommit $dom vda --base snap2 --top snap1
we got a sane message (here from libvirt 1.0.5):
error: invalid argument: could not find base 'snap2' below 'snap1' in chain for 'vda'
Meanwhile, recent refactoring has slightly reduced the quality of the
libvirt error messages, by losing the phrase 'below xyz':
error: invalid argument: could not find image 'snap2' in chain for 'snap3'
But we had a one-off, where we were not excluding the top file
itself in searching for the base; thankfully qemu still reports
the error, but the quality is worse:
virsh blockcommit $dom vda --base snap2 --top snap2
error: internal error unable to execute QEMU command 'block-commit': Base '/snap2' not found
Fix the one-off in blockcommit by changing the semantics of name
lookup - if a starting point is specified, then the result must
be below that point, rather than including that point. The only
other call to chain lookup was blockpull code, which was already
forcing the lookup to omit the active layer and only needs a
tweak to use the new semantics.
This also fixes the bug exposed in the testsuite, where when doing
a lookup pinned to an intermediate point in the chain, we were
unable to return the name of the parent also in the chain.
* src/util/virstoragefile.c (virStorageFileChainLookup): Change
semantics for non-NULL startFrom.
* src/qemu/qemu_driver.c (qemuDomainBlockJobImpl): Adjust caller,
to keep existing semantics.
* tests/virstoragetest.c (mymain): Adjust to expose new semantics.
Signed-off-by: Eric Blake <eblake@redhat.com>
For block devices used as snapshot source the new snapshot code would
set the reuse flag. This inhibits to take snapshot without specially
preparing the block image before taking the snapshot.
Fortunately this is not a regression as only the new way of specifying
snapshot source is affected.
For the followin snapshot XML:
<domainsnapshot>
<disks>
<disk name='vda' type='block'>
<driver type='qcow2'/>
<source dev="/dev/andariel/testsnap" />
</disk>
</disks>
</domainsnapshot>
You'd get:
error: internal error: unable to execute QEMU command 'transaction': Image is not in qcow2 format
After this patch the snapshot is created successfully.
A future patch will add two-phase block commit jobs; as the
mechanism for managing them is similar to managing a block copy
job, existing errors should be made generic enough to occur
for either job type.
* src/conf/domain_conf.c (virDomainHasDiskMirror): Update
comment.
* src/qemu/qemu_driver.c (qemuDomainDefineXML)
(qemuDomainSnapshotCreateXML, qemuDomainRevertToSnapshot)
(qemuDomainBlockJobImpl, qemuDomainBlockCopy): Update error
message.
* src/qemu/qemu_hotplug.c (qemuDomainDetachDiskDevice): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
Commit f586965 accidentally changed the semantics of the
virDomainBlockCommit command; where it previously looked for
an explicit top argument from the top of the chain, it now
starts from the backing file of the top. Of course, until
we allow active commits, the only difference it makes is in
the quality of the error message, but with code for active
commit coming soon, we need to support an explicit mention
of the active layer.
* src/qemu/qemu_driver.c (qemuDomainBlockCommit): Start looking
from top of chain.
Signed-off-by: Eric Blake <eblake@redhat.com>
When saving domain with relabel=no, the file that gets created must have the
context set anyway. That way restore can be successful without the need of
relabelling the file.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
Now that qemu 2.0 allows commit of the active layer, people are
attempting to use virsh blockcommit and getting into a stuck
state, because libvirt is unprepared to handle the two-phase
commit required by qemu.
Stepping back a bit, there are two valid semantics for a
commit operation:
1. Maintain a 'golden' base, and a transient overlay. Make
changes in the overlay, and if everything appears to work,
commit those changes into the base, but still keep the overlay
for the next round of changes; repeat the cycle as desired.
2. Create an external snapshot, then back up the stable state
in the backing file. Once the backup is complete, commit the
overlay back into the base, and delete the temporary snapshot.
Since qemu doesn't know up front which of the two styles is
preferred, a block commit of the active layer merely gets
the job into a synchronized state, and sends an event; then
the user must either cancel (case 1) or complete (case 2),
where qemu then sends a second event that actually ends the
job. However, until commit e6bcbcd, libvirt was blindly
assuming the semantics that apply to a commit of an
intermediate image, where there is only one sane conclusion
(the job automatically ends with fewer elements in the chain);
and getting stuck because it wasn't prepared for qemu to enter
a second phase of the job.
This patch adds a flag to the libvirt API that a user MUST
supply in order to acknowledge that they will be using two-phase
semantics. It might be possible to have a mode where if the
flag is omitted, we automatically do the case 2 semantics on
the user's behalf; but before that happens, I must do additional
patches to track the fact that we are doing an active commit
in the domain XML. Later patches will add support of the flag,
and once 2-phase semantics are working, we can then decide
whether to relax things to allow an omitted flag to cause an
automatic pivot.
* include/libvirt/libvirt.h.in (VIR_DOMAIN_BLOCK_COMMIT_ACTIVE)
(VIR_DOMAIN_BLOCK_JOB_TYPE_ACTIVE_COMMIT): New enums.
* src/libvirt.c (virDomainBlockCommit): Document two-phase job
when committing active layer, through new flag.
(virDomainBlockJobAbort): Document that pivot also occurs after
active commit.
* tools/virsh-domain.c (vshDomainBlockJob): Cover new job.
* src/qemu/qemu_driver.c (qemuDomainBlockCommit): Explicitly
reject active copy; later patches will add it in.
Signed-off-by: Eric Blake <eblake@redhat.com>
The current implementation of 'virsh blockcopy' (virDomainBlockRebase)
is limited to copying to a local file name. But future patches want
to extend it to also copy to network disks. This patch converts over
to a virStorageSourcePtr, although it should have no semantic change
visible to the user, in anticipation of those future patches being
able to use more fields for non-file destinations.
* src/conf/domain_conf.h (_virDomainDiskDef): Change type of
mirror information.
* src/conf/domain_conf.c (virDomainDiskDefParseXML): Localize
mirror parsing into new object.
(virDomainDiskDefFormat): Adjust clients.
* src/qemu/qemu_domain.c (qemuDomainDeviceDefPostParse):
Likewise.
* src/qemu/qemu_driver.c (qemuDomainBlockPivot)
(qemuDomainBlockJobImpl, qemuDomainBlockCopy): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
As part of the work on backing chains, I'm finding that it would
be easier to directly manipulate chains of pointers (adding a
snapshot merely adjusts pointers to form the correct list) rather
than copy data from one struct to another. This patch converts
domain disk source to be a pointer.
In this patch, the pointer is ALWAYS allocated (thanks in part to
the previous patch forwarding all disk def allocation through a
common point), and all other changse are just mechanical fallout of
the new type; there should be no functional change. It is possible
that we may want to leave the pointer NULL for a cdrom with no
medium in a later patch, but as that requires a closer audit of the
source to ensure we don't fault on a null dereference, I didn't do
it here.
* src/conf/domain_conf.h (_virDomainDiskDef): Change type of src.
* src/conf/domain_conf.c: Adjust all clients.
* src/security/security_selinux.c: Likewise.
* src/qemu/qemu_domain.c: Likewise.
* src/qemu/qemu_command.c: Likewise.
* src/qemu/qemu_conf.c: Likewise.
* src/qemu/qemu_process.c: Likewise.
* src/qemu/qemu_migration.c: Likewise.
* src/qemu/qemu_driver.c: Likewise.
* src/lxc/lxc_driver.c: Likewise.
* src/lxc/lxc_controller.c: Likewise.
* tests/securityselinuxlabeltest.c: Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
As part of the work on backing chains, I'm finding that it would
be easier to directly manipulate chains of pointers (adding a
snapshot merely adjusts pointers to form the correct list) rather
than copy data from one struct to another. This patch converts
snapshot source to be a pointer.
In this patch, the pointer is ALWAYS allocated (any code that
increases ndisks now also allocates a source pointer for each
new disk), and all other changes are just mechanical fallout of
the new type; there should be no functional change. It is
possible that we may want to leave the pointer NULL for internal
snapshots in a later patch, but as that requires a closer audit
of the source to ensure we don't fault on a null dereference, I
didn't do it here.
* src/conf/snapshot_conf.h (_virDomainSnapshotDiskDef): Change
type of src.
* src/conf/snapshot_conf.c: Adjust all clients.
* src/qemu/qemu_conf.c: Likewise.
* src/qemu/qemu_driver.c: Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
This simplifies the usage in {libxl,qemu}DomainGetNumaParameters
and it's needed for consistent error reporting in virBitmapFormat.
Also remove the forgotten ATTRIBUTE_NONNULL marker.
Currently, we don not acquire any job when removing a device after
DEVICE_DELETED event was received from QEMU. This means that if there is
another API running at the time DEVICE_DELETED is delivered and the API
acquired a job, we may happily change the definition of the domain the
API is working with whenever it unlocks the domain object (e.g., to talk
with its monitor). That said, we have to acquire a job before finishing
device removal to make things safe. However, doing so in the main event
loop would cause a deadlock so we need to move most of the event handler
into a separate thread.
Another good reason for both acquiring a job and handling the event in a
separate thread is that we currently remove a device backend immediately
after removing its frontend while we should only remove the backend once
we already received DEVICE_DELETED event. That is, we will have to talk
to QEMU monitor from the event handler.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Currently we don't support mixed (external + internal) snapshots. The
code detecting the snapshot type didn't make sure that the memory image
was consistent with the snapshot type leading into strange error
message:
$ virsh snapshot-create-as --domain VM --diskspec vda,snapshot=internal --memspec snapshot=external,file=/tmp/blah
error: internal error: unexpected code path
Fix the mixed detection code to detect this kind of mistake:
$ virsh snapshot-create-as --domain VM --diskspec vda,snapshot=internal --memspec snapshot=external,file=/tmp/blah
error: unsupported configuration: mixing internal and external targets for a snapshot is not yet supported
A internal snapshot of a active VM with the memory snapshot disabled
explicitly would actually still take the memory snapshot. Reject it
explicitly.
Before:
$ virsh snapshot-create-as --domain VM --diskspec vda,snapshot=internal --memspec snapshot=no
Domain snapshot 1401353155 created
After:
$ virsh snapshot-create-as --domain VM --diskspec vda,snapshot=internal --memspec snapshot=no
error: Operation not supported: internal snapshot of a running VM must include the memory state
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1083345
Even successful start of a VM from a managed save image would spam the
logs with the following message:
Unable to restore from managed state [path]. Maybe the file is
corrupted?
Re-arrange the logic to output the warning only when the image is
corrupted.
The flaw was introduced in commit cfc28c66.
Add argument to return backing file format of a file probed by
virStorageFileGetMetadataFromFD so that it can be used in place of
virStorageFileGetMetadataFromBuf.
qemu 2.0 added the ability to commit the active layer, but slightly
differently than what libvirt had been anticipating in its
implementation of the virDomainBlockCommit call. As a result, if
you attempt to do a 'virsh blockcommit $dom vda', qemu gets into a
state where it is waiting on libvirt to end the job, while libvirt
is waiting on qemu to end the job, and the guest is effectively
hung with regards to further commands for that block device.
I have patches coming down the pipeline that will add full support
for blockcommit of the active layer when coupled with qemu 2.0 or
later; but they depend on Peter's improvements to block job handling
and form enough of a new feature that they are not ready for
inclusion in the 1.2.5 release. So for now, just reject the
attempt, rather than letting the user get stuck. This is no worse
than the behavior of qemu 1.7 rejecting the job.
* src/qemu/qemu_driver.c (qemuDomainBlockCommit): Reject active
commit.
Signed-off-by: Eric Blake <eblake@redhat.com>
Currently the protocol type with index 0 was NBD which made it hard to
distinguish whether the protocol type was actually assigned. Add a new
protocol type with index 0 to distinguish it explicitly.
Refactor the function to accept a virStorageSourcePtr instead of just
the path, add a check to run it only on local storage and fix callers
(possibly by using a newly introduced wrapper that wraps a path in the
virStorageSource struct for legacy code)
When doing an external checkpoint of a VM with no disk selected we'd
return failure but not set error code. This was a result of ret not
being set to 0 during walking of the disk array.
Rework early failure checking and set the error code to success before
iterating the array of disks so that we return success if no disks are
snapshotted.
Fixes the following symptom (or without --diskspec for diskless VMs)
$ virsh snapshot-create-as snapshot-test --memspec /tmp/asdf --diskspec hda,snapshot=no
error: An error occurred, but the cause is unknown
If neither disks nor memory are selected for snapshot we'd record
metadata in case of external snapshot and do a disk snapshot in case of
external disk snapshot. Forbid this as it doesn't make much sense.
Commit d5c86278 was incomplete; other functions also triggered
compiler warnings about collisions in the use of 'sync'.
* src/qemu/qemu_driver.c (qemuDomainSetTime): Fix another client.
* tools/virsh-domain-monitor.c (cmdDomTime): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
This partially reverts commits b279e52f7 and ea18f8b2.
It turns out our code base is full of:
if ((struct.member = virBlahFromString(str)) < 0)
goto error;
Meanwhile, the C standard says it is up to the compiler whether
an enum is signed or unsigned when all of its declared values
happen to be positive. In my testing (Fedora 20, gcc 4.8.2),
the compiler picked signed, and nothing changed. But others
testing with gcc 4.7 got compiler warnings, because it picked
the enum to be unsigned, but no unsigned value is less than 0.
Even worse:
if ((struct.member = virBlahFromString(str)) <= 0)
goto error;
is silently compiled without warning, but incorrectly treats -1
from a bad parse as a large positive number with no warning; and
without the compiler's help to find these instances, it is a
nightmare to maintain correctly. We could force signed enums
with a dummy negative declaration in each enum, or cast the
result of virBlahFromString back to int after assigning to an
enum value, or use a temporary int for collecting results from
virBlahFromString, but those actions are all uglier than what we
were trying to cure by directly using enum types for struct
values in the first place. It's better off to just live with int
members, and use 'switch ((virFoo) struct.member)' where we want
the compiler to help, than to track down all the conversions from
string to enum and ensure they don't suffer from type problems.
* src/util/virstorageencryption.h: Revert back to int declarations
with comment about enum usage.
* src/util/virstoragefile.h: Likewise.
* src/conf/domain_conf.c: Restore back to casts in switches.
* src/qemu/qemu_driver.c: Likewise.
* src/qemu/qemu_command.c: Add cast rather than revert.
Signed-off-by: Eric Blake <eblake@redhat.com>
For internal structs, we might as well be type-safe and let the
compiler help us with less typing required on our part (getting
rid of casts is always nice). In trying to use enums directly,
I noticed two problems in virstoragefile.h that can't be fixed
without more invasive refactoring: virStorageSource.format is
used as more of a union of multiple enums in storage volume
code (so it has to remain an int), and virStorageSourcePoolDef
refers to pooltype whose enum is declared in src/conf, but where
src/util can't pull in headers from src/conf.
* src/util/virstoragefile.h (virStorageNetHostDef)
(virStorageSourcePoolDef, virStorageSource): Use enums instead of
int for fields of internal types.
* src/qemu/qemu_command.c (qemuParseCommandLine): Cover all values.
* src/conf/domain_conf.c (virDomainDiskSourceParse)
(virDomainDiskSourceFormat): Simplify clients.
* src/qemu/qemu_driver.c
(qemuDomainSnapshotCreateSingleDiskActive)
(qemuDomainSnapshotPrepareDiskExternalBackingInactive)
(qemuDomainSnapshotPrepareDiskExternalOverlayActive)
(qemuDomainSnapshotPrepareDiskInternal): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
One caveat though, qemu-ga is expecting time and returning time
in nanoseconds. With all the buffering and propagation delay, the
time is already wrong once it gets to the qemu-ga, but there's
nothing we can do about it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
If the compression program for external snapshot memory image isn't
found we exitted the function without terminating the domain job. This
caused the domain to be unusable.
The problem was introduced in commit 7df5093f.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1097503
https://bugzilla.redhat.com/show_bug.cgi?id=1002813
If qemuDomainBlockResize() is passed a size not on a KiB boundary - that
is passed a size based in bytes (VIR_DOMAIN_BLOCK_RESIZE_BYTES), then
depending on the source format (qcow2 or qed), the value passed must
be on a sector (or 512 byte) boundary. Since other libvirt code quietly
adjusts the capacity values, then do so here as well.
With this patch, virDomainFSFreeze will pass the mountpoints argument
to qemu guest agent. For example,
virDomainFSFreeze(dom, {"/mnt/vol1", "/mnt/vol2"}, 2, 0)
will issue qemu guest agent command:
{"execute":"guest-fsfreeze-freeze",
"arguments":{"mountpoints":["/mnt/vol1","/mnt/vol2"]}}
Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
Acked-by: Daniel P. Berrange <berrange@redhat.com>
Use qemuDomainSnapshotFSFreeze() and qemuDomainSnapshotFSFThaw() which are
already implemented for snapshot quiescing.
Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
"Freezed" is not an English word.
* src/lxc/lxc_driver.c (lxcFreezeContainer): Fix typo.
* src/qemu/qemu_driver.c (qemuDomainSnapshotFSFreeze): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1002813
If qemuDomainBlockResize() is passed a size not on a KiB boundary - that
is passed a size based in bytes (VIR_DOMAIN_BLOCK_RESIZE_BYTES), then
depending on the source format (qcow2 or qed), the value passed must
be on a sector (or 512 byte) boundary. Since other libvirt code quietly
adjusts the capacity values, then do so here as well - of course ensuring
that adjustment still fits.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Adds 'quiesced' status into qemuDomainObjPrivate that tracks whether
FSFreeze is requested in the domain.
It modifies error code from qemuDomainSnapshotFSFreeze and
qemuDomainSnapshotFSThaw, so that a caller can know whether the command is
actually sent to the guest agent. If the error is caused before sending a
freeze command, a counterpart thaw command shouldn't be sent either, not to
confuse fsfreeze status tracking.
Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
In "src/util/" there are many enumeration (enum) declarations.
Sometimes, it's better using a typedef for variable types,
function types and other usages. Other enumeration will be
changed to typedef's in the future.
Signed-off-by: Julio Faracco <jcfaracco@gmail.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
To avoid memory leak of the "backingStoreRaw" field when reparsing
backing chains a new function is being introduced by this patch that
shall be used to clear backing store information.
The memory leak was introduced in commit 8823272d41.
Convert all remaining clients of readdir to use the new
interface, so that we can ensure (unlikely) errors while
reading a directory are reported.
* src/openvz/openvz_conf.c (openvzAssignUUIDs): Use new
interface.
* src/parallels/parallels_storage.c (parallelsFindVolumes)
(parallelsFindVmVolumes): Report readdir failures.
* src/qemu/qemu_driver.c (qemuDomainSnapshotLoad): Ignore readdir
failures.
* src/secret/secret_driver.c (loadSecrets): Likewise.
* src/qemu/qemu_hostdev.c
(qemuHostdevHostSupportsPassthroughVFIO): Report readdir failures.
* src/xen/xen_inotify.c (xenInotifyOpen): Likewise.
* src/xen/xm_internal.c (xenXMConfigCacheRefresh): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
When a snapshot operation finishes we have to recheck the backing chain
of all disks involved in the snapshot. And we need to do that even if
the operation failed because some of the disks might have changed if
QEMU did not support transactions.
Commit c4206d7 fixed the overflow for running domains. However, we need
a similar check when setting migration speed on inactive domains.
At first look, it may seem the check in c4206d7 is now redundant but
qemuDomainMigrateSetMaxSpeed is not the only caller of
qemuMonitorSetMigrationSpeed so we need to check the bandwidth in both
places.
https://bugzilla.redhat.com/show_bug.cgi?id=1083483
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Each backing store of a given disk is associated with a unique index
(which is also formatted in domain XML) for easier addressing of any
particular backing store. With this patch, any backing store can be
addressed by its disk target and the index. For example, "vdc[4]"
addresses the backing store with index equal to 4 of the disk identified
by "vdc" target. Such shorthand can be used in any API in place for a
backing file path:
virsh blockcommit domain vda --base vda[3] --top vda[2]
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
virStorageFileChainLookup is able to give use virStorageSourcePtr which
contains the pointer to its canonical path. Let's use a more general
virStorageSourcePtr instead of just canonical path.
Former base_canon maps to baseSource->path.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
virStorageFileChainLookup is able to give use virStorageSourcePtr which
contains the pointer to its canonical path. There's no need for the
caller to store both of them.
Former top_meta maps to topSource and top_canon maps to topSource->path.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
To avoid having the root of a backing chain present twice in the list we
need to invert the working of virStorageFileGetMetadataRecurse.
Until now the recursive worker created a new backing chain element from
the name and other information passed as arguments. This required us to
pass the data of the parent in a deconstructed way and the worker
created a new entry for the parent.
This patch converts this function so that it just fills in metadata
about the parent and creates a backing chain element from those. This
removes the duplication of the first element.
To avoid breaking the test suite, virstoragetest now calls a wrapper
that creates the parent structure explicitly and pre-fills it with the
test data with same function signature as previously used.
Switch over to storing of the backing chain as a recursive
virStorageSource structure.
This is a string based move. Currently the first element will be present
twice in the backing chain as currently the retrieval function stores
the parent in the newly detected chain. This will be fixed later.
Remove the obsolete field replaced by data in "path".
The testsuite requires tweaking as the name of the backing file is now
stored one layer deeper in the backing chain linked list.
Remove the pointer from def->cputune.vcpupin after unplugging
the CPU and also free the bitmap contained in the structure
by calling virDomainVcpuPinDel instead of VIR_FREE.
Introduced by commit 0df1a79.
This makes virDomainLookupVcpuPin redundant.
https://bugzilla.redhat.com/show_bug.cgi?id=1088165
The original chain lookup code had to pass in the starting name,
because it was not available in the chain. But now that we have
added fields to the struct, this parameter is redundant.
* src/util/virstoragefile.h (virStorageFileChainLookup): Alter
signature.
* src/util/virstoragefile.c (virStorageFileChainLookup): Adjust
handling of top of chain.
* src/qemu/qemu_driver.c (qemuDomainBlockCommit): Adjust caller.
* tests/virstoragetest.c (testStorageLookup, mymain): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
The chain lookup function was inconsistent on whether it left
a message in the log when looking up a name that is not found
on the chain (leaving a message for OOM or if name was
relative but not part of the chain), and could litter the log
even when successful (when name was relative but deep in the
chain, use of virFindBackingFile early in the chain would complain
about a file not found). It's easier to make the function
consistently emit a message exactly once on failure, and to let
all callers rely on the clean semantics.
* src/util/virstoragefile.c (virStorageFileChainLookup): Always
report error on failure. Simplify relative lookups.
* src/qemu/qemu_driver.c (qemuDomainBlockCommit): Avoid
overwriting error.
Signed-off-by: Eric Blake <eblake@redhat.com>
Currently, virCgroupGetPercpuStats is only used by the LXC driver,
filling out the CPUTIME stats. qemuDomainGetPercpuStats does this
and also filles out VCPUTIME stats.
Extend virCgroupGetPercpuStats to also report VCPUTIME stats if
nvcpupids is non-zero. In the LXC driver, we don't have cpupids.
In the QEMU driver, there is at least one cpupid for a running domain,
so the behavior shouldn't change for QEMU either.
Also rename getSumVcpuPercpuStats to virCgroupGetPercpuVcpuSum.
Refactor the function to avoid multiple wrappers splitting identical
fields from the now common metadata struct.
The refactor is done by folding in the wrapper used for disk sources
which allows us to lookup secrets via the secret driver. This may allow
using stored secrets for snapshot disk images too in the future.
Now that we store all metadata about a storage image in a
virStorageSource struct let's use it also to store information needed by
the storage driver to access and do operations on the files.
Right now, virStorageFileMetadata tracks bool backingStoreIsFile
for whether the backing string specified in metadata can be
resolved as a file (covering both block and regular file
resources) or is treated as a network protocol. But when
merging this struct with virStorageSource, it will be easier
to just actually track which type of resource it is, as well
as have a reserved value for the case where the resource type
is unknown (or had an error during probing).
* src/util/virstoragefile.h (virStorageType): Add a placeholder
value, swap order to match similar public enum.
* src/util/virstoragefile.c (virStorage): Update string mapping.
* src/conf/domain_conf.c (virDomainDiskSourceParse)
(virDomainDiskDefParseXML, virDomainDiskDefFormat)
(virDomainDiskSourceFormat): Adjust clients.
* src/conf/snapshot_conf.c (virDomainSnapshotDiskDefParseXML):
Likewise.
* src/qemu/qemu_driver.c
(qemuDomainSnapshotPrepareDiskExternalBackingInactive)
(qemuDomainSnapshotPrepareDiskExternalOverlayActive)
(qemuDomainSnapshotPrepareDiskExternalOverlayInactive)
(qemuDomainSnapshotPrepareDiskInternal)
(qemuDomainSnapshotCreateSingleDiskActive): Likewise.
* src/qemu/qemu_command.c (qemuGetDriveSourceString): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
Every caller checked the return value and logged an error
- one if no device with the specified MAC was found,
other if there were multiple devices matching the MAC address
(except for qemuDomainUpdateDeviceConfig which logged the same
message in both cases).
Move the error reporting into virDomainNetFindIdx, since in both cases,
we couldn't find one single match - it's just the error messages that
differ.
Now that we have a common struct, it's time to start using it!
Since external snapshots make a longer backing chain, it is
only natural to use the same struct for the file created by
the snapshot as what we use for <domain> disks.
* src/conf/snapshot_conf.h (_virDomainSnapshotDiskDef): Use common
struct instead of open-coded duplicate fields.
* src/conf/snapshot_conf.c (virDomainSnapshotDiskDefClear)
(virDomainSnapshotDiskDefParseXML, virDomainSnapshotAlignDisks)
(virDomainSnapshotDiskDefFormat)
(virDomainSnapshotDiskGetActualType): Adjust clients.
* src/qemu/qemu_conf.c (qemuTranslateSnapshotDiskSourcePool):
Likewise.
* src/qemu/qemu_driver.c (qemuDomainSnapshotDiskGetSourceString)
(qemuDomainSnapshotCreateInactiveExternal)
(qemuDomainSnapshotPrepareDiskExternalOverlayActive)
(qemuDomainSnapshotPrepareDiskExternal)
(qemuDomainSnapshotPrepare)
(qemuDomainSnapshotCreateSingleDiskActive): Likewise.
* src/storage/storage_driver.c
(virStorageFileInitFromSnapshotDef): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
The code in virstoragefile.c is getting more complex as I
consolidate backing chain handling code. But for the setuid
virt-login-shell, we don't need to crawl backing chains. It's
easier to audit things for setuid security if there are fewer
files involved, so this patch moves the one function that
virFileOpen() was actually relying on to also live in virfile.c.
* src/util/virstoragefile.c (virStorageFileIsSharedFS)
(virStorageFileIsSharedFSType): Move...
* src/util/virfile.c (virFileIsSharedFS, virFileIsSharedFSType):
...to here, and rename.
(virFileOpenAs): Update caller.
* src/security/security_selinux.c
(virSecuritySELinuxSetFileconHelper)
(virSecuritySELinuxSetSecurityAllLabel)
(virSecuritySELinuxRestoreSecurityImageLabelInt): Likewise.
* src/security/security_dac.c
(virSecurityDACRestoreSecurityImageLabelInt): Likewise.
* src/qemu/qemu_driver.c (qemuOpenFileAs): Likewise.
* src/qemu/qemu_migration.c (qemuMigrationIsSafe): Likewise.
* src/util/virstoragefile.h: Adjust declarations.
* src/util/virfile.h: Likewise.
* src/libvirt_private.syms (virfile.h, virstoragefile.h): Move
symbols as appropriate.
Signed-off-by: Eric Blake <eblake@redhat.com>
In all other drivers we are doing so. Moreover, we don't want to parse
runtime information in attach (even if the attach is meant as live)
because we are generating the runtime info ourselves. We can't trust
users they supply sane values anyway.
==1140== 9 bytes in 1 blocks are definitely lost in loss record 72 of 1,151
==1140== at 0x4A06C2B: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==1140== by 0x623C758: xmlStrndup (in /usr/lib64/libxml2.so.2.9.1)
==1140== by 0x50FD763: virXMLPropString (virxml.c:483)
==1140== by 0x510F8B7: virDomainDeviceInfoParseXML (domain_conf.c:3685)
==1140== by 0x511ACFD: virDomainChrDefParseXML (domain_conf.c:7535)
==1140== by 0x5121D13: virDomainDeviceDefParse (domain_conf.c:9918)
==1140== by 0x13AE6313: qemuDomainAttachDeviceFlags (qemu_driver.c:6926)
==1140== by 0x13AE65FA: qemuDomainAttachDevice (qemu_driver.c:7005)
==1140== by 0x51C77DA: virDomainAttachDevice (libvirt.c:10231)
==1140== by 0x127FDD: remoteDispatchDomainAttachDevice (remote_dispatch.h:2404)
==1140== by 0x127EC5: remoteDispatchDomainAttachDeviceHelper (remote_dispatch.h:2382)
==1140== by 0x5241F81: virNetServerProgramDispatchCall (virnetserverprogram.c:437)
When doing live attach, we are passing the inactive definition anyway
since we are passing the result of virDomainDeviceDefCopy() which does
inactive copy by default.
Moreover, we are doing the same mistake in qemuhotplugtest.
Just a side note - it makes perfect sense to parse the runtime info
like alias in qemuDomainDetachDevice and qemuDomainUpdateDeviceFlags()
as in some cases the only difference to distinguish two devices can be
just their alias.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Currently, the Linux kernel treats values of '0' and '1' as
the minimum of 2. Values larger than the maximum are changed
to the maximum.
Re-reading the shares value after setting it reflects this in
the live domain XML.
Currently, <cputune><shares>0</shares></cputune> is treated
as if it were not specified.
Treat is as a valid value if it was explicitly specified
and write it to the cgroups.
qemuDomainSetSchedulerParametersFlags() calls virQEMUDriverGetConfig() twice
and makes the reference counter leak. This removes redundant call.
Problem introduced in commit 45ad1ad
Signed-off-by: Eric Blake <eblake@redhat.com>
It's finally time to start tracking disk backing chains in
<domain> XML. The first step is to start refactoring code
so that we have an object more convenient for representing
each host source resource in the context of a single guest
<disk>. Ultimately, I plan to move the new type into src/util
where it can be reused by virStorageFile, but to make the
transition easier to review, this patch just creates the
new type then fixes everything until it compiles again.
* src/conf/domain_conf.h (_virDomainDiskDef): Split...
(_virDomainDiskSourceDef): ...to new struct.
(virDomainDiskAuthClear): Use new type.
* src/conf/domain_conf.c (virDomainDiskDefFree): Split...
(virDomainDiskSourceDefClear): ...to new function.
(virDomainDiskGetType, virDomainDiskSetType)
(virDomainDiskGetSource, virDomainDiskSetSource)
(virDomainDiskGetDriver, virDomainDiskSetDriver)
(virDomainDiskGetFormat, virDomainDiskSetFormat)
(virDomainDiskAuthClear, virDomainDiskGetActualType)
(virDomainDiskDefParseXML, virDomainDiskSourceDefFormat)
(virDomainDiskDefFormat, virDomainDiskDefForeachPath)
(virDomainDiskDefGetSecurityLabelDef)
(virDomainDiskSourceIsBlockType): Adjust all users.
* src/lxc/lxc_controller.c (virLXCControllerSetupDisk):
Likewise.
* src/lxc/lxc_driver.c (lxcDomainAttachDeviceMknodHelper):
Likewise.
* src/qemu/qemu_command.c (qemuAddRBDHost, qemuParseRBDString)
(qemuParseDriveURIString, qemuParseGlusterString)
(qemuParseISCSIString, qemuParseNBDString)
(qemuDomainDiskGetSourceString, qemuBuildDriveStr)
(qemuBuildCommandLine, qemuParseCommandLineDisk)
(qemuParseCommandLine): Likewise.
* src/qemu/qemu_conf.c (qemuCheckSharedDevice)
(qemuAddISCSIPoolSourceHost, qemuTranslateDiskSourcePool):
Likewise.
* src/qemu/qemu_driver.c (qemuDomainUpdateDeviceConfig)
(qemuDomainPrepareDiskChainElement)
(qemuDomainSnapshotCreateInactiveExternal)
(qemuDomainSnapshotPrepareDiskExternalBackingInactive)
(qemuDomainSnapshotPrepareDiskInternal)
(qemuDomainSnapshotPrepare)
(qemuDomainSnapshotCreateSingleDiskActive)
(qemuDomainSnapshotUndoSingleDiskActive)
(qemuDomainBlockPivot, qemuDomainBlockJobImpl)
(qemuDomainBlockCopy, qemuDomainBlockCommit): Likewise.
* src/qemu/qemu_migration.c (qemuMigrationIsSafe): Likewise.
* src/qemu/qemu_process.c (qemuProcessGetVolumeQcowPassphrase)
(qemuProcessInitPasswords): Likewise.
* src/security/security_selinux.c
(virSecuritySELinuxSetSecurityFileLabel): Likewise.
* src/storage/storage_driver.c (virStorageFileInitFromDiskDef):
Likewise.
* tests/securityselinuxlabeltest.c (testSELinuxLoadDef):
Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
When checking compatibility of a device with a domain definition, we
should know what we're going to do with the device. Because we may need
to check for different things when we're attaching a new device versus
detaching an existing device.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
A device needs to be checked for compatibility with the domain
definition it corresponds to. Specifically, for VIR_DOMAIN_AFFECT_CONFIG
case we should check against persistent def rather than active def.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Wire up all the pieces to send arbitrary qemu events to a
client using libvirt-qemu.so. If the extra bookkeeping of
generating event objects even when no one is listening turns
out to be noticeable, we can try to further optimize things
by adding a counter for how many connections are using events,
and only dump events when the counter is non-zero; but for
now, I didn't think it was worth the code complexity.
* src/qemu/qemu_driver.c
(qemuConnectDomainQemuMonitorEventRegister)
(qemuConnectDomainQemuMonitorEventDeregister): New functions.
* src/qemu/qemu_monitor.h (qemuMonitorEmitEvent): New prototype.
(qemuMonitorDomainEventCallback): New typedef.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONIOProcessEvent):
Report events.
* src/qemu/qemu_monitor.c (qemuMonitorEmitEvent): New function, to
pass events through.
* src/qemu/qemu_process.c (qemuProcessHandleEvent): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
Any source file which calls the logging APIs now needs
to have a VIR_LOG_INIT("source.name") declaration at
the start of the file. This provides a static variable
of the virLogSource type.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
We allow translation from no_bandwidth to has_bandwidth for a vnic.
However, going in the opposite direction is not implemented. It's not
limitation of the API rather than internal implementation. The problem
is, we correctly detect that user hasn't specified any outbound (say
he wants to clear out outbound). However, this gets overwritten by
current vnic outbound settings. Then, virNetDevBandwidthSet doesn't
change anything. We need to stop overwriting the outbound if users
don't want us to. Same applies for inbound.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Coverity found an issue in lxc_driver and uml_driver that we don't
check the return value of register functions.
I've also updated all other places and unify the way we check the
return value.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Right now we are parsing the XML as though it's live, which for example
will choke on hardcoded XML like:
<seclabel type='dynamic' model='selinux' relabel='yes'/>
Erroring with:
$ sudo virsh domxml-to-native qemu-argv f
error: XML error: security label is missing
All drivers are fixed, but only qemu was tested.
Change any method names with Usb, Pci or Scsi to use
USB, PCI and SCSI since they are abbreviations.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
While investigating https://bugzilla.redhat.com/show_bug.cgi?id=1061827
I noticed that we pass user input unscathed for block-pull, but
always pass a canonical absolute name through for block-commit.
[Note that we probably _ought_ to validate that the user's request
for block-pull actually matches the backing chain, the way we already
do for block-commit - but that's a separate issue. Further note that
the ability to pass user input through unscathed allows backdoors
such as specifying a backing image that is a network URI such as
a gluster disk, instead of forcing things to the local file system;
which is an area still under active investigation on whether libvirt
needs to behave differently for network disks.]
Since qemu may write the name that the user passed in as the backing
file, a user may have a reason to want a relative file name passed
through to qemu, and always munging things to absolute prevents that.
Put another way, if you have the backing chain:
[A] <- [B(back=./A)] <- [C(back=./B)]
and commit B into A (virsh blockcommit $dom vda --base A --top B),
the metadata of C will have to be re-written. But should it be
rewritten as [C(back=./A)] or as [C(back=/path/to/A)]? Still up in
the air is whether qemu's decision should be based on whether B
and/or C had relative paths, or on whether the --base and/or
--top arguments to the command were relative paths; but if we always
pass a canonical name, we've prevented the spelling of the command
arguments from being part of the hueristics that qemu uses.
I also audited the code, and verified that we never call
qemuMonitorBlockCommit() with a NULL base, either before or after
the change to qemu_driver.c.
* src/qemu/qemu_driver.c (qemuDomainBlockCommit): Preserve user's
spelling, since absolute vs. relative matters to qemu.
* src/qemu/qemu_monitor.h (qemuMonitorBlockCommit): Base is never
null.
* src/qemu/qemu_monitor.c (qemuMonitorBlockCommit): Likewise.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONBlockCommit):
Likewise.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONBlockCommit):
Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
The qemu_bridge_filter.c file had some helpers for calling
the ebtablesXXX functions todo bridge filtering. The only
thing these helpers did was to overwrite the original error
message from the ebtables code. For added fun, the callers
of these helpers overwrote the errors yet again. For even
more fun, one of the helpers called another helper and
overwrite its errors too.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Extracting capabilities from QEMU takes a notable amount of time
when all QEMU binaries are installed. Each system emulator
needs about 200-300ms multiplied by 26 binaries == ~5-8 seconds.
This change causes the QEMU driver to save an XML file containing
the content of the virQEMUCaps object instance in the cache
dir eg /var/cache/libvirt/qemu/capabilities/$SHA256(binarypath).xml
or $HOME/.cache/libvirt/qemu/cache/capabilities/$SHA256(binarypath).xml
We attempt to load this and only if it fails, do we fallback to
probing the QEMU binary. The ctime of the QEMU binary and libvirtd
are stored in the cached file and its data discarded if either
of them change.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This fixes a possible double free. In virNetworkAssignDef() if
virBitmapNew() fails, then virNetworkObjFree(network) is called.
However, with network->def pointing to actual @def. So if caller
frees @def again, ...
Moreover, this fixes one possible memory leak too. In
virInterfaceAssignDef() if appending to the list of interfaces
fails, we ought to call virInterfaceObjFree() instead of bare
VIR_FREE().
Although, in order to do that some array size variables needs
to be turned into size_t rather than int.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When domain is started with setting that cannot be done, i.e. those
that require cgroups, there is no error reported and it succeeds
without any message whatsoever.
When setting with API, virsh, an error is reported, but only due to
the fact that no cgroups are mounted (priv->cgroup == NULL).
Given the above it seems reasonable to reject such unsupported
settings.
This patch effectively changes the error message from:
$ virsh -c qemu:///session schedinfo dummy
Scheduler : Unknown
error: Requested operation is not valid: cgroup CPU controller is not mounted
to:
$ virsh -c qemu:///session schedinfo dummy
Scheduler : Unknown
error: Operation not supported: CPU tuning is not available in session mode
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1023366
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1071264
Reverting of external snapshots is not supported currently. The check
that is present doesn't properly check for all aspects that make a
snapshot external. Use virDomainSnapshotIsExternal() to do the check.
As of 0bd2ccdec an empty disk path for virDomainBlockStats (or the one
with Flags) is allowed meaning "get me overall summarized statistics".
However, running 'virsh domblkstat $dom' throws a misleading error:
# ./tools/virsh domblkstat dom
error: Failed to get block stats dom
error: invalid argument: invalid path:
while after this commit
# virsh domblkstat dom
error: Operation not supported: summary statistics are not supported yet
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1038363
If a domain has a different maximum for persistent and live maxmem
or max vcpus, then it is possible to hit cases where libvirt
refuses to adjust the current values or gets halfway through
the adjustment before failing. Better is to determine up front
if the change is possible for all requested flags.
Based on an idea by Geoff Franks.
* src/qemu/qemu_driver.c (qemuDomainSetMemoryFlags): Compute
correct maximum if both live and config are being set.
(qemuDomainSetVcpusFlags): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
Add support for gluster backed images as sources for snapshots in the
qemu driver. This will also simplify adding further network backed
volumes as sources for snapshot in case qemu will support them.
Use the new storage driver APIs to delete snapshot backing files in case
of failure instead of directly relying on "unlink". This will help us in
the future when we will be adding network based storage without local
representation in the host.
All the data for getting the actual type is present in the snapshot
config. There is no need to have this function private to the qemu
driver and it will be re-used later in other parts of libvirt
All the data for getting the actual type is present in the domain
config. There is no need to have this function private to the qemu
driver and it will be re-used later in other parts of libvirt
On some platforms like IBM PowerNV the NUMA node numbers can be
non-sequential. For eg. numactl --hardware o/p from such a machine looks
as given below
node distances:
node 0 1 16 17
0: 10 40 40 40
1: 40 10 40 40
16: 40 40 10 40
17: 40 40 40 10
The NUMA nodes are 0,1,16,17
Libvirt uses sequential index as NUMA node numbers and this can
result in crash or incorrect results.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
Signed-off-by: Pradipta Kr. Banerjee <bpradip@in.ibm.com>
The code took into account only the global permissions. The domains now
support per-vm DAC labels and per-image DAC labels. Use the most
specific label available.
This commit allows to attach/detach a <filesystem> device in qemu. For
this purpose I'm introducing two new functions: virDomainFSInsert() and
virDomainFSRemove() and adding necessary code in the qemu driver. It
compares filesystems based on their "destination" folder. So if two
filesystems share the same destination, they are considered equal and
the qemu driver would reject the insertion.
Signed-off-by: Matthieu Coudron <mattator@gmail.com>
When attempting a blockcommit from the top layer, the base argument
passed is NULL. This will be dereferenced when attempting a commit with
an empty image chain. Output the real volume path instead:
virsh blockcommit --verbose --path vda --domain DOMNAME --wait
error: invalid argument: top '/path/somefile' in chain for 'vda' has no backing file
instead of:
error: invalid argument: top '(null)' in chain for 'vda' has no backing file
https://bugzilla.redhat.com/show_bug.cgi?id=1058839
Commit f9f56340 for CVE-2014-0028 almost had the right idea - we
need to check the ACL rules to filter which events to send. But
it overlooked one thing: the event dispatch queue is running in
the main loop thread, and therefore does not normally have a
current virIdentityPtr. But filter checks can be based on current
identity, so when libvirtd.conf contains access_drivers=["polkit"],
we ended up rejecting access for EVERY event due to failure to
look up the current identity, even if it should have been allowed.
Furthermore, even for events that are triggered by API calls, it
is important to remember that the point of events is that they can
be copied across multiple connections, which may have separate
identities and permissions. So even if events were dispatched
from a context where we have an identity, we must change to the
correct identity of the connection that will be receiving the
event, rather than basing a decision on the context that triggered
the event, when deciding whether to filter an event to a
particular connection.
If there were an easy way to get from virConnectPtr to the
appropriate virIdentityPtr, then object_event.c could adjust the
identity prior to checking whether to dispatch an event. But
setting up that back-reference is a bit invasive. Instead, it
is easier to delay the filtering check until lower down the
stack, at the point where we have direct access to the RPC
client object that owns an identity. As such, this patch ends
up reverting a large portion of the framework of commit f9f56340.
We also have to teach 'make check' to special-case the fact that
the event registration filtering is done at the point of dispatch,
rather than the point of registration. Note that even though we
don't actually use virConnectDomainEventRegisterCheckACL (because
the RegisterAny variant is sufficient), we still generate the
function for the purposes of documenting that the filtering
takes place.
Also note that I did not entirely delete the notion of a filter
from object_event.c; I still plan on using that for my upcoming
patch series for qemu monitor events in libvirt-qemu.so. In
other words, while this patch changes ACL filtering to live in
remote.c and therefore we have no current client of the filtering
in object_event.c, the notion of filtering in object_event.c is
still useful down the road.
* src/check-aclrules.pl: Exempt event registration from having to
pass checkACL filter down call stack.
* daemon/remote.c (remoteRelayDomainEventCheckACL)
(remoteRelayNetworkEventCheckACL): New functions.
(remoteRelay*Event*): Use new functions.
* src/conf/domain_event.h (virDomainEventStateRegister)
(virDomainEventStateRegisterID): Drop unused parameter.
* src/conf/network_event.h (virNetworkEventStateRegisterID):
Likewise.
* src/conf/domain_event.c (virDomainEventFilter): Delete unused
function.
* src/conf/network_event.c (virNetworkEventFilter): Likewise.
* src/libxl/libxl_driver.c: Adjust caller.
* src/lxc/lxc_driver.c: Likewise.
* src/network/bridge_driver.c: Likewise.
* src/qemu/qemu_driver.c: Likewise.
* src/remote/remote_driver.c: Likewise.
* src/test/test_driver.c: Likewise.
* src/uml/uml_driver.c: Likewise.
* src/vbox/vbox_tmpl.c: Likewise.
* src/xen/xen_driver.c: Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
The NWFilter code has as a deadlock race condition between
the virNWFilter{Define,Undefine} APIs and starting of guest
VMs due to mis-matched lock ordering.
In the virNWFilter{Define,Undefine} codepaths the lock ordering
is
1. nwfilter driver lock
2. virt driver lock
3. nwfilter update lock
4. domain object lock
In the VM guest startup paths the lock ordering is
1. virt driver lock
2. domain object lock
3. nwfilter update lock
As can be seen the domain object and nwfilter update locks are
not acquired in a consistent order.
The fix used is to push the nwfilter update lock upto the top
level resulting in a lock ordering for virNWFilter{Define,Undefine}
of
1. nwfilter driver lock
2. nwfilter update lock
3. virt driver lock
4. domain object lock
and VM start using
1. nwfilter update lock
2. virt driver lock
3. domain object lock
This has the effect of serializing VM startup once again, even if
no nwfilters are applied to the guest. There is also the possibility
of deadlock due to a call graph loop via virNWFilterInstantiate
and virNWFilterInstantiateFilterLate.
These two problems mean the lock must be turned into a read/write
lock instead of a plain mutex at the same time. The lock is used to
serialize changes to the "driver->nwfilters" hash, so the write lock
only needs to be held by the define/undefine methods. All other
methods can rely on a read lock which allows good concurrency.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add support for specifying various types when doing snapshots. This will
later allow to do snapshots on network backed volumes. Disks of type
'volume' are not supported by snapshots (yet).
Also amend the test suite to check parsing of the various new disk
types that can now be specified.
Currently the qemuDomainGetBlockInfo will return allocation == physical
for most backing stores. For a qcow2 block backed device it's possible
to return the highest lv extent allocated from qemu for an active guest.
That is a value where allocation != physical and one would hope be less.
However, if the guest is not running, then the code falls back to returning
allocation == physical. This turns out to be problematic for rhev which
monitors the size of the backing store. During a migration, before the
VM has been started on the target and while it is deemed inactive on the
source, there's a small window of time where the allocation is returned
as physical triggering the code to extend the file unnecessarily.
Since rhev uses transient domains and this is edge condition for a transient
domain, rather than returning good status and allocation == physical when
this "window of opportunity" exists, this patch will check for a transient
(or non persistent) domain and return a failure to the caller rather than
returning the defaults. For a persistent domain, the defaults will be
returned. The description for the virDomainGetBlockInfo has been updated
to describe the phenomena.
the array params is allocated by VIR_ALLOC_N in
remoteDispatchDomainGetCPUStats. it had been set
to zero. No need to reset it to zero again, and
this reset here is incorrect too, nparams * ncpus
is the array length not the size of params array.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
I noticed that we allow virDomainGetVcpusFlags even for read-only
connections, but that with a flag, it can require guest agent
interaction. It is feasible that a malicious guest could
intentionally abuse the replies it sends over the guest agent
connection to possibly trigger a bug in libvirt's JSON parser,
or withhold an answer so as to prevent the use of the agent
in a later command such as a shutdown request. Although we
don't know of any such exploits now (and therefore don't mind
posting this patch publicly without trying to get a CVE assigned),
it is better to err on the side of caution and explicitly require
full access to any domain where the API requires guest interaction
to operate correctly.
I audited all commands that are marked as conditionally using a
guest agent. Note that at least virDomainFSTrim is documented
as needing a guest agent, but that such use is unconditional
depending on the hypervisor (so the existing domain:fs_trim ACL
should be sufficient there, rather than also requirng domain:write).
But when designing future APIs, such as the plans for obtaining
a domain's IP addresses, we should copy the approach of this patch
in making interaction with the guest be specified via a flag, and
use that flag to also require stricter access checks.
* src/libvirt.c (virDomainGetVcpusFlags): Forbid guest interaction
on read-only connection.
(virDomainShutdownFlags, virDomainReboot): Improve docs on agent
interaction.
* src/remote/remote_protocol.x
(REMOTE_PROC_DOMAIN_SNAPSHOT_CREATE_XML)
(REMOTE_PROC_DOMAIN_SET_VCPUS_FLAGS)
(REMOTE_PROC_DOMAIN_GET_VCPUS_FLAGS, REMOTE_PROC_DOMAIN_REBOOT)
(REMOTE_PROC_DOMAIN_SHUTDOWN_FLAGS): Require domain:write for any
conditional use of a guest agent.
* src/xen/xen_driver.c: Fix clients.
* src/libxl/libxl_driver.c: Likewise.
* src/uml/uml_driver.c: Likewise.
* src/qemu/qemu_driver.c: Likewise.
* src/lxc/lxc_driver.c: Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1034993
SCSI passthrough disks (<disk .. device="lun">) can't be used as backing
for snapshots. Currently with upstream qemu the vm crashes on such
attempt.
This patch adds a early check to catch an attempt to do such a snapshot
and rejects it right away. qemu will fix the issue but this will let us
control the error message.
We shouldn't access the domain definition while we are in the monitor
section as the domain is unlocked. Additionally after we exit from the
monitor we need to check if the VM is still alive. Not doing so resulted
in a crash if qemu exits while attempting to do an external VM snapshot.
https://bugzilla.redhat.com/show_bug.cgi?id=1046919
If none (KVM, VFIO) of the supported PCI passthrough methods is known to
work on a host, it's better to fail right away with a nice error message
rather than letting attachment fail with a more cryptic message such as
Failed to bind PCI device '0000:07:05.0' to vfio-pci: No such device
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
With this patch, user can setup throttle blkio cgroup
through virsh for qemu domain.
Signed-off-by: Guan Qiang <hzguanqiang@corp.netease.com>
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
The public virConnectRef and virConnectClose API are just thin
wrappers around virObjectRef/virObjectRef, with added object
validation and an error reset. Within our backend drivers, use
of the object validation is just an inefficiency since we always
pass valid objects. More important to think about is what
happens with the error reset; our uses of virConnectRef happened
to be safe (since we hadn't encountered any earlier errors), but
in several cases the use of virConnectClose could lose a real
error.
Ideally, we should also avoid calling virConnectOpen() from
within backend drivers - but that is a known situation that
needs much more design work.
* src/qemu/qemu_process.c (qemuProcessReconnectHelper)
(qemuProcessReconnect): Avoid nested public API call.
* src/qemu/qemu_driver.c (qemuAutostartDomains)
(qemuStateInitialize, qemuStateStop): Likewise.
* src/qemu/qemu_migration.c (doPeer2PeerMigrate): Likewise.
* src/storage/storage_driver.c (storageDriverAutostart):
Likewise.
* src/uml/uml_driver.c (umlAutostartConfigs): Likewise.
* src/lxc/lxc_process.c (virLXCProcessAutostartAll): Likewise.
(virLXCProcessReboot): Likewise, and avoid leaking conn on error.
Signed-off-by: Eric Blake <eblake@redhat.com>
Ever since ACL filtering was added in commit 7639736 (v1.1.1), a
user could still use event registration to obtain access to a
domain that they could not normally access via virDomainLookup*
or virConnectListAllDomains and friends. We already have the
framework in the RPC generator for creating the filter, and
previous cleanup patches got us to the point that we can now
wire the filter through the entire object event stack.
Furthermore, whether or not domain:getattr is honored, use of
global events is a form of obtaining a list of networks, which
is covered by connect:search_domains added in a93cd08 (v1.1.0).
Ideally, we'd have a way to enforce connect:search_domains when
doing global registrations while omitting that check on a
per-domain registration. But this patch just unconditionally
requires connect:search_domains, even when no list could be
obtained, based on the following observations:
1. Administrators are unlikely to grant domain:getattr for one
or all domains while still denying connect:search_domains - a
user that is able to manage domains will want to be able to
manage them efficiently, but efficient management includes being
able to list the domains they can access. The idea of denying
connect:search_domains while still granting access to individual
domains is therefore not adding any real security, but just
serves as a layer of obscurity to annoy the end user.
2. In the current implementation, domain events are filtered
on the client; the server has no idea if a domain filter was
requested, and must therefore assume that all domain event
requests are global. Even if we fix the RPC protocol to
allow for server-side filtering for newer client/server combos,
making the connect:serach_domains ACL check conditional on
whether the domain argument was NULL won't benefit older clients.
Therefore, we choose to document that connect:search_domains
is a pre-requisite to any domain event management.
Network events need the same treatment, with the obvious
change of using connect:search_networks and network:getattr.
* src/access/viraccessperm.h
(VIR_ACCESS_PERM_CONNECT_SEARCH_DOMAINS)
(VIR_ACCESS_PERM_CONNECT_SEARCH_NETWORKS): Document additional
effect of the permission.
* src/conf/domain_event.h (virDomainEventStateRegister)
(virDomainEventStateRegisterID): Add new parameter.
* src/conf/network_event.h (virNetworkEventStateRegisterID):
Likewise.
* src/conf/object_event_private.h (virObjectEventStateRegisterID):
Likewise.
* src/conf/object_event.c (_virObjectEventCallback): Track a filter.
(virObjectEventDispatchMatchCallback): Use filter.
(virObjectEventCallbackListAddID): Register filter.
* src/conf/domain_event.c (virDomainEventFilter): New function.
(virDomainEventStateRegister, virDomainEventStateRegisterID):
Adjust callers.
* src/conf/network_event.c (virNetworkEventFilter): New function.
(virNetworkEventStateRegisterID): Adjust caller.
* src/remote/remote_protocol.x
(REMOTE_PROC_CONNECT_DOMAIN_EVENT_REGISTER)
(REMOTE_PROC_CONNECT_DOMAIN_EVENT_REGISTER_ANY)
(REMOTE_PROC_CONNECT_NETWORK_EVENT_REGISTER_ANY): Generate a
filter, and require connect:search_domains instead of weaker
connect:read.
* src/test/test_driver.c (testConnectDomainEventRegister)
(testConnectDomainEventRegisterAny)
(testConnectNetworkEventRegisterAny): Update callers.
* src/remote/remote_driver.c (remoteConnectDomainEventRegister)
(remoteConnectDomainEventRegisterAny): Likewise.
* src/xen/xen_driver.c (xenUnifiedConnectDomainEventRegister)
(xenUnifiedConnectDomainEventRegisterAny): Likewise.
* src/vbox/vbox_tmpl.c (vboxDomainGetXMLDesc): Likewise.
* src/libxl/libxl_driver.c (libxlConnectDomainEventRegister)
(libxlConnectDomainEventRegisterAny): Likewise.
* src/qemu/qemu_driver.c (qemuConnectDomainEventRegister)
(qemuConnectDomainEventRegisterAny): Likewise.
* src/uml/uml_driver.c (umlConnectDomainEventRegister)
(umlConnectDomainEventRegisterAny): Likewise.
* src/network/bridge_driver.c
(networkConnectNetworkEventRegisterAny): Likewise.
* src/lxc/lxc_driver.c (lxcConnectDomainEventRegister)
(lxcConnectDomainEventRegisterAny): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
CVE-2013-6458
Generally, every API that is going to begin a job should do that before
fetching data from vm->def. However, qemuDomainGetBlockInfo does not
know whether it will have to start a job or not before checking vm->def.
To avoid using disk alias that might have been freed while we were
waiting for a job, we use its copy. In case the disk was removed in the
meantime, we will fail with "cannot find statistics for device '...'"
error message.
CVE-2013-6458
https://bugzilla.redhat.com/show_bug.cgi?id=1043069
When virDomainDetachDeviceFlags is called concurrently to
virDomainBlockStats: libvirtd may crash because qemuDomainBlockStats
finds a disk in vm->def before getting a job on a domain and uses the
disk pointer after getting the job. However, the domain in unlocked
while waiting on a job condition and thus data behind the disk pointer
may disappear. This happens when thread 1 runs
virDomainDetachDeviceFlags and enters monitor to actually remove the
disk. Then another thread starts running virDomainBlockStats, finds the
disk in vm->def, and while it's waiting on the job condition (owned by
the first thread), the first thread finishes the disk removal. When the
second thread gets the job, the memory pointed to be the disk pointer is
already gone.
That said, every API that is going to begin a job should do that before
fetching data from vm->def.
https://bugzilla.redhat.com/show_bug.cgi?id=1047234
Add a range check for supported numa memory placement modes provided by
the user before setting them in the domain definition. Without the check
the user is able to provide a (yet) unknown mode which is then stored in
the domain definition. This potentially causes a NULL dereference when
the defintion is formatted into the XML.
To reproduce run:
virsh numatune DOMNAME --mode 6 --nodeset 0
The XML will then contain:
<numatune>
<memory mode='(null)' nodeset='0'/>
</numatune>
With this fix, the command fails:
error: Unable to change numa parameters
error: invalid argument: unsupported numa_mode: '6'
Add whitespace to separate logical code blocks, reformat error messages
and clean up code flow.
This patch changes error handling in some cases where the the loop would
be continued to jump to cleanup instead and error out rather than modify
the domain any further.
For dead domains that have no memtune limits, we return 0 instead of
"unlimited", this patch fixes it to return PARAM_UNLIMITED.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1035108
When attempting to enable more vCPUs in the guest than is currently
enabled in the guest but less than the maximum count for the VM we
currently reported an unhelpful message:
error: internal error: guest agent reports less cpu than requested
This patch changes it to:
error: invalid argument: requested vcpu count is greater than the count
of enabled vcpus in the domain: 3 > 2
Ever since the subcpusets(vcpu,emulator) were introduced, the parent
cpuset cannot be modified to remove the nodes that are in use by the
subcpusets.
The fix is to break the memory node modification into three steps:
1. assign new nodes into the parent,
2. change the nodes in the child nodes,
3. remove the old nodes on the parent node.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1009880
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This patch resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=1035188
Commit f094aaac48 changed the PCI device assignment in qemu domains
to default to using VFIO rather than legacy KVM device assignment
(when VFIO is available). It didn't change which driver was used by
default for virNodeDeviceDetachFlags(), though, so that API (and the
virsh nodedev-detach command) was still binding to the pci-stub
driver, used by legacy KVM assignment, by default.
This patch publicizes (only within the qemu module, though, so no
additions to the symbol exports are needed) the functions that check
for presence of KVM and VFIO device assignment, then uses those
functions to decide what to do when no driver is specified for
virNodeDeviceDetachFlags(); if the vfio driver is loaded, the device
will be bound to vfio-pci, or if legacy KVM assignment is supported on
this system, the device will be bound to pci-stub; if neither method
is available, the detach will fail.
Currently the snapshot code did not check if it actually supports
snapshots on various disk backends for domains. To avoid future problems
add checkers that whitelist the supported configurations.
When doing an internal snapshot on a VM with sheepdog or RBD disks we
would not set a flag to mark the domain is using internal snapshots and
might end up creating a mixed snapshot. Move the setting of the variable
to avoid this problem.
The virsh command 'domxml-to-native' (virConnectDomainXMLToNative())
converts all network devices to "type='ethernet'" in order to make it
more likely that the generated command could be run directly from a
shell (other libvirt network device types end up referencing file
descriptors for tap devices assumed to have been created by libvirt,
which can't be done in this case).
During this conversion, all of the netdev parameters are cleared out,
then specific items are filled in after changing the type. The MAC
address was not one of these preserved items, and the result was that
mac addresses in the generated commandlines were always
00:00:00:00:00:00.
This patch saves the mac address before the conversion, then
repopulates it afterwards, so the proper mac addresses show up in the
commandline.
Signed-off-by: Bing Bu Cao <mars@linux.vnet.ibm.com>
Signed-off-by: Laine Stump <laine@laine.org>
For attach/detach of controller devices, we rename the functions to
remove 'PCI' from their title. The actual separation of PCI-specific
operations will be handled in the next patch.
Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Most of our code base uses space after comma but not before;
fix the remaining uses before adding a syntax check.
* src/qemu/qemu_cgroup.c: Consistently use commas.
* src/qemu/qemu_command.c: Likewise.
* src/qemu/qemu_conf.c: Likewise.
* src/qemu/qemu_driver.c: Likewise.
* src/qemu/qemu_monitor.c: Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
If the managedsave image is corrupted, e.g. the XML part is, we fail to
parse it and throw an error, e.g.:
error: Failed to start domain jms8
error: XML error: missing security model when using multiple labels
This is okay, as we can't really start the machine and avoid undefined
qemu behaviour. On the other hand, the error message doesn't give a
clue to users what should they do. The consensus here would be to thrown
a warning to logs saying "Hey, you've got a corrupted file".
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The following sequence
1. Define a persistent QMEU guest
2. Start the QEMU guest
3. Stop libvirtd
4. Kill the QEMU process
5. Start libvirtd
6. List persistent guests
At the last step, the previously running persistent guest
will be missing. This is because of a race condition in the
QEMU driver startup code. It does
1. Load all VM state files
2. Spawn thread to reconnect to each VM
3. Load all VM config files
Only at the end of step 3, does the 'virDomainObjPtr' get
marked as "persistent". There is therefore a window where
the thread reconnecting to the VM will remove the persistent
VM from the list.
The easy fix is to simply switch the order of steps 2 & 3.
In addition to this though, we must only attempt to reconnect
to a VM which had a non-zero PID loaded from its state file.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Most of the usage of getuid()/getgid() is in cases where we are
considering what privileges we have. As such the code should be
using the effective IDs, not real IDs.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1019053
When we migrate vms concurrently, there's a chance that libvirtd on
destination assigns the same port for different migrations, which will
lead to migration failure during prepare phase on destination. So we use
virPortAllocator here to solve the problem.
Signed-off-by: Wang Yufei <james.wangyufei@huawei.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
'const fooPtr' is the same as 'foo * const' (the pointer won't
change, but it's contents can). But in general, if an interface
is trying to be const-correct, it should be using 'const foo *'
(the pointer is to data that can't be changed).
Fix up offenders in src/qemu.
* src/qemu/qemu_bridge_filter.h (networkAllowMacOnPort)
(networkDisallowMacOnPort): Use intended type.
* src/qemu/qemu_bridge_filter.c (networkAllowMacOnPort)
(networkDisallowMacOnPort): Likewise.
* src/qemu/qemu_command.c (qemuBuildTPMBackendStr)
(qemuBuildTPMDevStr, qemuBuildCpuArgStr)
(qemuBuildObsoleteAccelArg, qemuBuildMachineArgStr)
(qemuBuildSmpArgStr, qemuBuildNumaArgStr): Likewise.
* src/qemu/qemu_conf.c (qemuSharedDeviceEntryCopy): Likewise.
* src/qemu/qemu_driver.c (qemuDomainSaveImageStartVM): Likewise.
* src/qemu/qemu_hostdev.c
(qemuDomainHostdevNetConfigVirtPortProfile): Likewise.
* src/qemu/qemu_monitor_json.c
(qemuMonitorJSONAttachCharDevCommand): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
The regular save image code has the support to compress images using a
specified algorithm. This was not implemented for external checkpoints
although it shares most of the backend code.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1017227
The regular save image code has the support to compress images using a
specified algorithm. This was not implemented for managed save although
it shares most of the backend code.
This configuration knob is there to override default listen address for
-incoming for all qemu domains.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=994364
Whenever we check for ABI stability, we have new xml (e.g. provided by
user, or obtained from snapshot, whatever) which we compare to old xml
and see if ABI won't break. However, if the new xml was produced via
virDomainGetXMLDesc(..., VIR_DOMAIN_XML_MIGRATABLE) it lacks some
devices, e.g. 'pci-root' controller. Hence, the ABI stability check
fails even though it is stable. Moreover, we can't simply fix
virDomainDefCheckABIStability because removing the correct devices is
task for the driver. For instance, qemu driver wants to remove the usb
controller too, while LXC driver doesn't. That's why we need special
qemu wrapper over virDomainDefCheckABIStability which removes the
correct devices from domain XML, produces MIGRATABLE xml and calls the
check ABI stability function.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The virConnectPtr is passed around loads of nwfilter code in
order to provide it as a parameter to the callback registered
by the virt drivers. None of the virt drivers use this param
though, so it serves no purpose.
Avoiding the need to pass a virConnectPtr means that the
nwfilterStateReload method no longer needs to open a bogus
QEMU driver connection. This addresses a race condition that
can lead to a crash on startup.
The nwfilter driver starts before the QEMU driver and registers
some callbacks with DBus to detect firewalld reload. If the
firewalld reload happens while the QEMU driver is still starting
up though, the nwfilterStateReload method will open a connection
to the partially initialized QEMU driver and cause a crash.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1011330 (case A)
While activeScsiHostdevs and webSocketPorts were allocated in
qemuStateInitialize, they were not freed in qemuStateCleanup.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The return value of virDomainControllerFind >=0 means that
the specific controller was found.
But some functions invoke it and treat 0 as not found.
This patch fix these incorrect invocation.
Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com>
virDomainSetBlockIoTuneEnsureACL was incorrectly called after we already
started a job. As a result of this, the job was not cleaned up when an
access driver had forbidden the action.
If the ABI compatibility check with the "migratable" user XML is
successful, we would leak the originally parsed XML from the user that
would not be used in this case.
Reported by Ján Tomko.
The function implemented common behavior that can be reused for other
hypervisor drivers that use the virDomainObj data structures. Factor out
the core into a separate helper func.
The function implemented common behavior that can be reused for other
hypervisor drivers that use the virDomainObj data structures. Factor out
the core into a separate helper func.
In the original implementation of external checkpoints I've mistakenly
used the live definition to be stored in the save image. The normal
approach is to use the "migratable" definition. This was discovered when
commit 07966f6a8b changed the behavior to
use a converted XML from the user to do the compatibility check to fix
problem when using the regular machine saving.
As the previous patch added a compatibility layer, we can now change the
type of the XML in the image.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1008340
External checkpoints have a bug in the implementation where they use the
normal definition instead of the "migratable" one. This causes errors
when the snapshot is being reverted using the workaround method via
qemuDomainRestoreFlags() with a custom XML. This issue was introduced
when commit 07966f6a8b changed the code to
compare "migratable" XMLs from the user as we should have used
migratable in the image too.
This patch adds a compatibility layer, so that fixing the snapshot code
won't make existing snapshots fail to load.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1008340
When reverting a live internal snapshot with a live guest the ABI
compatiblity check was comparing a "migratable" definition with a normal
one. This resulted in the check failing with:
revert requires force: Target device address type none does not match source pci
This patch generates a "migratable" definition from the actual one to
check against the definition from the snapshot to avoid this problem.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1006886
Osier Yang pointed out that ever since commit 31cb030, the
signature of qemuDomainObjEndJob was changed to return a bool.
While comparison against 0 or > 0 still gives the right results,
it looks fishy; we also had one place that was comparing < 0
which is effectively dead code.
* src/qemu/qemu_migration.c (qemuMigrationPrepareAny): Fix dead
code bug.
(qemuMigrationBegin): Use more canonical form of bool check.
* src/qemu/qemu_driver.c (qemuAutostartDomain)
(qemuDomainCreateXML, qemuDomainSuspend, qemuDomainResume)
(qemuDomainShutdownFlags, qemuDomainReboot, qemuDomainReset)
(qemuDomainDestroyFlags, qemuDomainSetMemoryFlags)
(qemuDomainSetMemoryStatsPeriod, qemuDomainInjectNMI)
(qemuDomainSendKey, qemuDomainGetInfo, qemuDomainScreenshot)
(qemuDomainSetVcpusFlags, qemuDomainGetVcpusFlags)
(qemuDomainRestoreFlags, qemuDomainGetXMLDesc)
(qemuDomainCreateWithFlags, qemuDomainAttachDeviceFlags)
(qemuDomainUpdateDeviceFlags, qemuDomainDetachDeviceFlags)
(qemuDomainBlockResize, qemuDomainBlockStats)
(qemuDomainBlockStatsFlags, qemuDomainMemoryStats)
(qemuDomainMemoryPeek, qemuDomainGetBlockInfo)
(qemuDomainAbortJob, qemuDomainMigrateSetMaxDowntime)
(qemuDomainMigrateGetCompressionCache)
(qemuDomainMigrateSetCompressionCache)
(qemuDomainMigrateSetMaxSpeed)
(qemuDomainSnapshotCreateActiveInternal)
(qemuDomainRevertToSnapshot, qemuDomainSnapshotDelete)
(qemuDomainQemuMonitorCommand, qemuDomainQemuAttach)
(qemuDomainBlockJobImpl, qemuDomainBlockCopy)
(qemuDomainBlockCommit, qemuDomainOpenGraphics)
(qemuDomainGetBlockIoTune, qemuDomainGetDiskErrors)
(qemuDomainPMSuspendForDuration, qemuDomainPMWakeup)
(qemuDomainQemuAgentCommand, qemuDomainFSTrim): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
Failure to attach to a domain during 'virsh qemu-attach' left
the list of domains in an odd state:
$ virsh qemu-attach 4176
error: An error occurred, but the cause is unknown
$ virsh list --all
Id Name State
----------------------------------------------------
2 foo shut off
$ virsh qemu-attach 4176
error: Requested operation is not valid: domain is already active as 'foo'
$ virsh undefine foo
error: Failed to undefine domain foo
error: Requested operation is not valid: cannot undefine transient domain
$ virsh shutdown foo
error: Failed to shutdown domain foo
error: invalid argument: monitor must not be NULL
It all stems from leaving the list of domains unmodified on
the initial failure; we should follow the lead of createXML
which removes vm on failure (the actual initial failure still
needs to be fixed in a later patch, but at least this patch
gets us to the point where we aren't getting stuck with an
unremovable "shut off" transient domain).
While investigating, I also found a leak in qemuDomainCreateXML;
the two functions should behave similarly. Note that there are
still two unusual paths: if dom is not allocated, the user will
see an OOM error even though the vm remains registered (but oom
errors already indicate tricky cleanup); and if the vm starts
and then quits again all before the job ends, it is possible
to return a non-NULL dom even though the dom will no longer be
useful for anything (but this at least lets the user know their
short-lived vm ran).
* src/qemu/qemu_driver.c (qemuDomainCreateXML): Don't leak vm on
failure to obtain job.
(qemuDomainQemuAttach): Match cleanup of qemuDomainCreateXML.
Signed-off-by: Eric Blake <eblake@redhat.com>
The VIR_FREE() macro will cast away any const-ness. This masked a
number of places where we passed a 'const char *' string to
VIR_FREE. Fortunately in all of these cases, the variable was not
in fact const data, but a heap allocated string. Fix all the
variable declarations to reflect this.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=999352
Since commit v1.0.5-56-g449e6b1 (Pull parsing of migration xml up into
QEMU driver APIs) any attempt to rename a domain during migration fails
with the following error message:
internal error Incoming cookie data had unexpected name DOM vs DOM2
This is because migration cookies always use the original domain name
and the mentioned commit failed to propagate the name back to
qemuMigrationPrepareAny.
When cpu hotplug fails without reporting an error, we would fail the
command but update the count of vCPUs anyways.
Commit 761fc48136 fixed the case when CPU
hot-unplug failed silently, but forgot to fix up the value in this case.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1000357
The virDomainOpenGraphics method accepts a UNIX socket FD from
the client app. It must set the label on this FD otherwise QEMU
will be prevented from receiving it with recvmsg.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Re-arrange the code so that the returned bitmap is always initialized to
NULL even on early failures and return an error message as some callers
are already expecting it. Fix up the rest not to shadow the error.
Currently the virConnectBaselineCPU API does not expose the CPU features
that are part of the CPU's model. This patch adds a new flag,
VIR_CONNECT_BASELINE_CPU_EXPAND_FEATURES, that causes the API to explicitly
list all features that are part of that model.
Signed-off-by: Don Dugger <donald.d.dugger@intel.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Function qemuOpenFile() haven't had any idea about seclabels applied
to VMs only, so in case the seclabel differed from the "user:group"
from configuration, there might have been issues with opening files.
Make qemuOpenFile() VM-aware, but only optionally, passing NULL
argument means skipping VM seclabel info completely.
However, all current qemuOpenFile() calls look like they should use VM
seclabel info in case there is any, so convert these calls as well.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=869053
Due to a goto statement missed when refactoring in 2771f8b74c
when acquiring of a domain job failed the error path was not taken. This
resulted into a crash afterwards as an extra reference was removed from a
domain object leading to it being freed. An attempt to list the domains
leaded to a crash of the daemon afterwards.
https://bugzilla.redhat.com/show_bug.cgi?id=928672
Convert the remaining methods in vircgroup.c to report errors
instead of returning errno values.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
In case libvirtd is asked to unplug a device but the device is actually
unplugged later when libvirtd is not running, we need to detect that and
remove such device when libvirtd starts again and reconnects to running
domains.
Implement the new API that will handle setting the balloon driver statistics
collection period in order to enable or disable the collection dynamically.
If users haven't configured guest agent then qemuAgentCommand() will
dereference a NULL 'mon' pointer, which causes crash of libvirtd when
using agent based cpu (un)plug.
With the patch, when the qemu-ga service isn't running in the guest,
a expected error "error: Guest agent is not responding: Guest agent
not available for now" will be raised, and the error "error: argument
unsupported: QEMU guest agent is not configured" is raised when the
guest hasn't configured guest agent.
GDB backtrace:
(gdb) bt
#0 virNetServerFatalSignal (sig=11, siginfo=<value optimized out>, context=<value optimized out>) at rpc/virnetserver.c:326
#1 <signal handler called>
#2 qemuAgentCommand (mon=0x0, cmd=0x7f39300017b0, reply=0x7f394b090910, seconds=-2) at qemu/qemu_agent.c:975
#3 0x00007f39429507f6 in qemuAgentGetVCPUs (mon=0x0, info=0x7f394b0909b8) at qemu/qemu_agent.c:1475
#4 0x00007f39429d9857 in qemuDomainGetVcpusFlags (dom=<value optimized out>, flags=9) at qemu/qemu_driver.c:4849
#5 0x00007f3957dffd8d in virDomainGetVcpusFlags (domain=0x7f39300009c0, flags=8) at libvirt.c:9843
How to reproduce?
# To start a guest without guest agent configuration
# then run the following cmdline
# virsh vcpucount foobar --guest
error: End of file while reading data: Input/output error
error: One or more references were leaked after disconnect from the hypervisor
error: Failed to reconnect to the hypervisor
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=984821
Signed-off-by: Alex Jia <ajia@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
There are two levels on which a device may be hotplugged: config
and live. The config level requires just an insert or remove from
internal domain definition structure, which is exactly what this
patch does. There is currently no implementation for a chardev
update action, as there's not much to be updated. But more
importantly, the only thing that can be updated is path or socket
address by which chardevs are distinguished. So the update action
is currently not supported.
With current code, error reporting for unsupported devices for hot plug,
unplug and update is total mess. The VIR_ERR_CONFIG_UNSUPPORTED error
code is reported instead of VIR_ERR_OPERATION_UNSUPPORTED. Moreover, the
error messages are not helping to find the root cause (lack of
implementation).
Convert the type of loop iterators named 'i', 'j', k',
'ii', 'jj', 'kk', to be 'size_t' instead of 'int' or
'unsigned int', also santizing 'ii', 'jj', 'kk' to use
the normal 'i', 'j', 'k' naming
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add monitor callback API domainGuestPanic, that implements
'destroy', 'restart' and 'preserve' events of the 'on_crash'
in the XML when domain crashed.
After abf75aea24 the compiler screams:
qemu/qemu_driver.c: In function 'qemuNodeDeviceDetachFlags':
qemu/qemu_driver.c:10693:9: error: 'domain' may be used uninitialized in this function [-Werror=maybe-uninitialized]
pci = virPCIDeviceNew(domain, bus, slot, function);
^
qemu/qemu_driver.c:10693:9: error: 'bus' may be used uninitialized in this function [-Werror=maybe-uninitialized]
qemu/qemu_driver.c:10693:9: error: 'slot' may be used uninitialized in this function [-Werror=maybe-uninitialized]
qemu/qemu_driver.c:10693:9: error: 'function' may be used uninitialized in this function [-Werror=maybe-uninitialized]
Since the other functions qemuNodeDeviceReAttach and qemuNodeDeviceReset
looks exactly the same, I've initialized the variables there as well.
However, I am still wondering why those functions don't matter to gcc
while the first one does.
The driver arg to virPCIDeviceDetach is no longer used (the name of the stub driver is now set in the virPCIDevice object, and virPCIDeviceDetach retrieves it from there). Remove it.
virPCIDeviceDetach would previously sometimes consume the input device
object (to put it on the inactive list) and sometimes not. Avoiding
memory leaks required checking beforehand to see if the device was
already on the list, and freeing the device object in the caller only
if there wasn't already an identical object on the inactive list.
This patch makes it consistent - virPCIDeviceDetach will *never*
consume the input virPCIDevice object; if it needs to put one on the
inactive list, it will create a copy and put *that* on the list. This
way the caller knows that it is always their responsibility to free
the device object they created.
Previously stubDriver was always set from a string literal, so it was
okay to use a const char * that wasn't freed when the virPCIDevice was
freed. This will not be the case in the near future, so it is now a
char* that is allocated in virPCIDeviceSetStubDriver() and freed
during virPCIDeviceFree().
As a consequence of the cgroup layout changes from commit '632f78ca', the
qemuDomainGetSchedulerParameters[Flags]()' and qemuGetSchedulerType() APIs
failed to return data for a non running domain. This can be seen through
a 'virsh schedinfo <domain>' command which returns:
Scheduler : Unknown
error: Requested operation is not valid: cgroup CPU controller is not mounted
Prior to that change a non running domain would return:
Scheduler : posix
cpu_shares : 0
vcpu_period : 0
vcpu_quota : 0
emulator_period: 0
emulator_quota : 0
This patch will restore the capability to return configuration only data
for a non running domain regardless of whether cgroups are available.
Paolo Bonzini pointed out that it's actually possible to migrate a qemu
instance that was paused due to I/O error and it will be able to work on
the destination if the storage is accessible.
This patch introduces flag VIR_MIGRATE_ABORT_ON_ERROR that cancels the
migration in case an I/O error happens while it's being performed and
allows migration without this flag. This flag can be possibly used for
other error reasons that may be introduced in the future.
Convert input XML to migratable before using it in
qemuDomainSaveImageOpen.
XML in the save image is migratable, i.e. doesn't contain implicit
controllers. If these controllers were in a non-default order in the
input XML, the ABI check would fail. Removing and re-adding these
controllers fixes it.
https://bugzilla.redhat.com/show_bug.cgi?id=834196
This patch fixes changes done in commit 29c1e913e4
that was pushed without implementing review feedback.
The flag introduced by the patch is changed to VIR_DOMAIN_VCPU_GUEST and
documentation makes the difference between regular hotplug and this new
functionality more explicit.
The virsh options that enable the use of the new flag are changed to
"--guest" and the documentation is fixed too.
Currently, there's a path to use the ncpuinfo variable uninitialized,
which leads to a compiler warning:
qemu/qemu_driver.c: In function 'qemuDomainGetVcpusFlags':
qemu/qemu_driver.c:4573:9: error: 'ncpuinfo' may be used
uninitialized in this function [-Werror=maybe-uninitialized]
for (i = 0; i < ncpuinfo; i++) {
^
The work was done at the time of snapshot xmlstring parsing
if (offline && def->memory &&
def->memory != VIR_DOMAIN_SNAPSHOT_LOCATION_NONE) {
virReportError(...);
}
If snapshot creation failed for example due to invalid use of the
"REUSE_EXTERNAL" flag, libvirt killed access to the original image file
instead of the new image file. On machines with selinux this kills the
whole VM as the selinux context is enforced immediately.
* qemu_driver.c:qemuDomainSnapshotUndoSingleDiskActive():
- Kill access to the new image file instead of the old one.
Partially resolves: https://bugzilla.redhat.com/show_bug.cgi?id=906639
This is a recurring problem for cygwin :)
For example, see commit 23a4df88.
qemu/qemu_driver.c: In function 'qemuStateInitialize':
qemu/qemu_driver.c:691:13: error: format '%d' expects type 'int', but argument 8 has type 'uid_t' [-Wformat]
* src/qemu/qemu_driver.c (qemuStateInitialize): Add casts.
* daemon/remote.c (remoteDispatchAuthList): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
Function qemuDomainSetBlockIoTune() was checking QEMU capabilities
even when !(flags & VIR_DOMAIN_AFFECT_LIVE) and the domain was
shutoff, resulting in the following problem:
virsh # domstate asdf; blkdeviotune asdf vda --write-bytes-sec 100
shut off
error: Unable to change block I/O throttle
error: unsupported configuration: block I/O throttling not supported with this QEMU binary
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=965016
Since 0d70656afd, it starts to access the sysfs files to build
the qemu command line (by virSCSIDeviceGetSgName, which is to find
out the scsi generic device name by adpater🚌target:unit), there
is no way to work around, qemu wants to see the scsi generic device
like "/dev/sg6" anyway.
And there might be other places which need to access sysfs files
when building qemu command line in future.
Instead of increasing the arguments of qemuBuildCommandLine, this
introduces a new callback for qemuBuildCommandLine, and thus tests
can register their own callbacks for sysfs test input files accessing.
* src/qemu/qemu_command.h: (New callback struct
qemuBuildCommandLineCallbacks;
extern buildCommandLineCallbacks)
* src/qemu/qemu_command.c: (wire up the callback struct)
* src/qemu/qemu_driver.c: (Use the new syntax of qemuBuildCommandLine)
* src/qemu/qemu_hotplug.c: Likewise
* src/qemu/qemu_process.c: Likewise
* tests/testutilsqemu.[ch]: (Helper testSCSIDeviceGetSgName;
callback struct testCallbacks;)
* tests/qemuxml2argvtest.c: (Use testCallbacks)
* src/tests/qemuxmlnstest.c: (Like above)
Commit 632f78c introduced a regression which causes schedinfo being
unable to set some parameters. When migrating to priv->cgroup there
was missing variable left out and due to passed NULL to underlying
function, the setting failed.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=963592
This changes the helpers qemu{Add,Remove}SharedDisk into
qemu{Add,Remove}SharedDevice, as most of the code in the helpers
can be reused for scsi host device.
To track the shared scsi host device, first it finds out the
device path (e.g. /dev/s[dr]*) which is mapped to the sg device,
and use device ID of the found device path (/dev/s[dr]*) as the
hash key. This is because of the device ID is not unique between
between /dev/s[dr]* and /dev/sg*, e.g.
% sg_map
/dev/sg0 /dev/sda
/dev/sg1 /dev/sr0
% ls -l /dev/sda
brw-rw----. 1 root disk 8, 0 May 2 19:26 /dev/sda
%ls -l /dev/sg0
crw-rw----. 1 root disk 21, 0 May 2 19:26 /dev/sg0
"Shared disk" is not only the thing we should care about after "scsi
hostdev" is introduced. A same scsi device can be used as "disk" for
one domain, and as "scsi hostdev" for another domain at the same time.
That's why this patch renames qemu_driver->sharedDisks. Related functions
and structs are also renamed.
Adding a VNC WebSocket support for QEMU driver. This functionality is
in upstream qemu from commit described as v1.3.0-982-g7536ee4, so the
capability is being recognized based on QEMU version for now.
Although virtio-scsi supports SCSI PR (Persistent Reservations),
the device on host may do not support it. To avoid losing data,
Just like PCI and USB pass through devices, only one live guest
is allowed per SCSI host pass through device."
Signed-off-by: Han Cheng <hanc.fnst@cn.fujitsu.com>
It is possible to build a kernel without swap cgroup controls
present. This causes a fatal error when querying memory
parameters. Treat missing swap controls as meaning "unlimited".
The fatal error remains if the user tries to actually change
the limit.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=851411https://bugzilla.redhat.com/show_bug.cgi?id=955500
The first problem was that virFileOpenAs was returning fd (-1) in one
of the error cases rather than ret (-errno), so the caller thought
that the error was EPERM rather than ENOENT.
The second problem was that some log messages in the general purpose
qemuOpenFile() function would always say "Failed to create" even if
the caller hadn't included O_CREAT (i.e. they were trying to open an
existing file).
This fixes virFileOpenAs to jump down to the error return (which
returns ret instead of fd) in the previously mentioned incorrect
failure case of virFileOpenAs(), removes all error logging from
virFileOpenAs() (since the callers report it), and modifies
qemuOpenFile to appropriately use "open" or "create" in its log
messages.
NB: I seriously considered removing logging from all callers of
virFileOpenAs(), but there is at least one case where the caller
doesn't want virFileOpenAs() to log any errors, because it's just
going to try again (qemuOpenFile()). We can't simply make a silent
variation of virFileOpenAs() though, because qemuOpenFile() can't make
the decision about whether or not it wants to retry until after
virFileOpenAs() has already returned an error code.
Likewise, I also considered changing virFileOpenAs() to return -1 with
errno set on return, and may still do that, but only as a separate
patch, as it obscures the intent of this patch too much.
The LXC, QEMU, and LibXL drivers have all merged their handling of
the attach/update/modify device APIs into one large
'xxxxDomainModifyDeviceFlags'
which then does a 'switch()' based on the actual API being invoked.
While this saves some lines of code, it is not really all that
significant in the context of the driver API impls as a whole.
This merger of the handling of different APIs creates pain when
wanting to automated analysis of the code and do things which
are specific to individual APIs. The slight duplication of code
from unmerged the API impls, is preferrable to allow for easier
automated analysis.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently the parsing of XML is pushed down into the various
migration helper APIs. This makes it difficult to insert the
correct access control checks, since one helper API services
many public APIs. Pull the parsing of XML up to the top level
of the QEMU driver APIs
The individual hypervisor drivers were directly referencing
APIs in virnodesuspend.c in their virDriverPtr struct. Separate
these methods, so there is always a wrapper in the hypervisor
driver. This allows the unused virConnectPtr args to be removed
from the virnodesuspend.c file. Again this will ensure that
ACL checks will only be performed on invocations that are
directly associated with public API usage.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The individual hypervisor drivers were directly referencing
APIs in src/nodeinfo.c in their virDriverPtr struct. Separate
these methods, so there is always a wrapper in the hypervisor
driver. This allows the unused virConnectPtr args to be
removed from the nodeinfo.c file. Again this will ensure that
ACL checks will only be performed on invocations that are
directly associated with public API usage.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently the virGetHostname() API has a bogus virConnectPtr
parameter. This is because virtualization drivers directly
reference this API in their virDriverPtr tables, tieing its
API design to the public virConnectGetHostname API design.
This also causes problems for access control checks since
these must only be done for invocations from the public
API, not internal invocation.
Remove the bogus virConnectPtr parameter, and make each
hypervisor driver provide a dedicated function for the
driver API impl. This will allow access control checks
to be easily inserted later.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When attempting to generate the native command line from an XML file
that uses graphics port auto allocation, the generated commandline
wouldn't be valid.
This patch adds fake autoallocation of ports as done when starting the
actual machine.
The source code base needs to be adapted as well. Some files
include virutil.h just for the string related functions (here,
the include is substituted to match the new file), some include
virutil.h without any need (here, the include is removed), and
some require both.
virPCIDeviceReattach and virPCIDeviceUnbindFromStub (called by
virPCIDeviceReattach) had previously required the name of the stub
driver as input. This is unnecessary, because the name of the driver
the device is currently bound to can be found by looking at the link:
/sys/bus/pci/dddd:bb:ss.ff/driver
Instead of requiring that the name of the expected stub driver name
and only unbinding if that one name is matched, we no longer take a
driver name in the arglist for either of these
functions. virPCIDeviceUnbindFromStub just compares the name of the
currently bound driver to a list of "well known" stubs (right now
contains "pci-stub" and "vfio-pci" for qemu, and "pciback" for xen),
and only performs the unbind if it's one of those devices.
This allows virsh nodedevice-reattach to work properly across a
libvirtd restart, and fixes a couple of cases where we were
erroneously still hard-coding "pci-stub" as the drive name.
For some unknown reason, virPCIDeviceReattach had been calling
modprobe on the stub driver prior to unbinding the device. This was
problematic because we no longer know the name of the stub driver in
that function. However, it is pointless to probe for the stub driver
at that time anyway - because the device is bound to the stub driver,
we are guaranteed that it is already loaded, and so that call to
modprobe has been removed.
The differences from virNodeDeviceDettach are very minor:
1) Check that the flags are 0.
2) Set the virPCIDevice's stubDriver according to the driverName that
is passed in.
3) Call virPCIDeviceDetach with a NULL stubDriver, indicating it
should get the name of the stub driver from the virPCIDevice
object.
Ensure that all drivers implementing public APIs use a
naming convention for their implementation that matches
the public API name.
eg for the public API virDomainCreate make sure QEMU
uses qemuDomainCreate and not qemuDomainStart
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Ensure that the driver struct field names match the public
API names. For an API virXXXX we must have a driver struct
field xXXXX. ie strip the leading 'vir' and lowercase any
leading uppercase letters.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Refactoring done in 19c6ad9ac7 didn't
correctly take into account the order cgroup limit modification needs to
be done in. This resulted into errors when decreasing the limits.
The operations need to take place in this order:
decrease hard limit
change swap hard limit
or
change swap hard limit
increase hard limit
This patch also fixes the check if the hard_limit is less than
swap_hard_limit to print better error messages. For this purpose I
introduced a helper function virCompareLimitUlong to compare limit
values where value of 0 is equal to unlimited. Additionally the check is
now applied also when the user does not provide all of the tunables
through the API and in that case the currently set values are used.
This patch resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=950478
Wrong use of the parentheses causes "rc" always having a boolean value,
either "1" or "0", and thus we can't get the detailed error message
when it fails:
Before (I only have 1 node):
% virsh numatune f18 --nodeset 12
error: Unable to change numa parameters
error: unable to set numa tunable: Unknown error -1
After:
virsh numatune f18 --nodeset 12
error: Unable to change numa parameters
error: unable to set numa tunable: Invalid argument
Detected by a simple Shell script:
for i in $(git ls-files -- '*.[ch]'); do
awk 'BEGIN {
fail=0
}
/# *include.*\.h/{
match($0, /["<][^">]*[">]/)
arr[substr($0, RSTART+1, RLENGTH-2)]++
}
END {
for (key in arr) {
if (arr[key] > 1) {
fail=1
printf("%d %s\n", arr[key], key)
}
}
if (fail == 1)
exit 1
}' $i
if test $? != 0; then
echo "Duplicate header(s) in $i"
fi
done;
A later patch will add the syntax-check to avoid duplicate
headers.
Rename all the virCgroupForXXX methods to use the form
virCgroupNewXXX since they are all constructors. Also
make sure the output parameter is the last one in the
list, and annotate all pointers as non-null. Fix up
all callers, and make sure they use true/false not 0/1
for the boolean parameters
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Instead of calling virCgroupForDomain every time we need
the virCgrouPtr instance, just do it once at Vm startup
and cache a reference to the object in qemuDomainObjPrivatePtr
until shutdown of the VM. Removing the virCgroupPtr from
the QEMU driver state also means we don't have stale mount
info, if someone mounts the cgroups filesystem after libvirtd
has been started
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Report the errors as:
Domain not found: no domain with matching uuid '41414141-4141-4141-4141-414141414141' (crashtest)
instead of:
Domain not found: no domain with matching uuid '41414141-4141-4141-4141-414141414141'
Use the helper to lookup the domain object in the remaining places.
This patch also fixes error reporting when the domain was not found in several
functions that were printing the raw UUID buffer instead of the formatted
string. The offending functions were:
qemuDomainGetInterfaceParameters
qemuDomainSetInterfaceParameters
qemuGetSchedulerParametersFlags
qemuSetSchedulerParametersFlags
qemuDomainGetNumaParameters
qemuDomainSetNumaParameters
qemuDomainGetMemoryParameters
qemuDomainSetMemoryParameters
qemuDomainGetBlkioParameters
qemuDomainSetBlkioParameters
qemuDomainGetCPUStats
To support "shareable" for volume type disk, we have to translate
the source before trying to add the shared disk entry. To achieve
the goal, this moves the helper qemuTranslateDiskSourcePool into
src/qemu/qemu_conf.c, and introduce an internal only member (voltype)
for struct _virDomainDiskSourcePoolDef, to record the underlying
volume type for use when building the drive string.
Later patch will support "shareable" volume type disk.
For both "live" and "config" changes of vcpupin and emulatorpin, an
all clear bitmap doesn't make sense, and it can just cause corruptions.
E.g (similar for emulatorpin).
% virsh vcpupin hame 0 8,^8 --config
% virsh vcpupin hame
VCPU: CPU Affinity
----------------------------------
0:
1: 0-63
2: 0-63
3: 0-63
% virsh dumpxml hame | grep cpuset
<vcpupin vcpu='0' cpuset=''/>
% virsh start hame
error: Failed to start domain hame
error: An error occurred, but the cause is unknown
When setting processor count for a domain using the API libvirt enforced
a maximum processor count, while it isn't enforced when taking the XML path.
This patch removes the check to match the XML.
Currently when getting an instance of virCgroupPtr we will
create the path in all cgroup controllers. Only at the virt
driver layer are we attempting to filter controllers. This
is bad because the mere act of creating the dirs in the
controllers can have a functional impact on the kernel,
particularly for performance.
Update the virCgroupForDriver() method to accept a bitmask
of controllers to use. Only create dirs in the controllers
that are requested. When creating cgroups for domains,
respect the active controller list from the parent cgroup
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This patch removes the defaultDiskDriverName from the virCaps
structure. This particular default value is used only in the qemu driver
so this patch uses the recently added callback to fill the driver name
if it's needed instead of propagating it through virCaps.
This patch adds instrumentation that will allow hypervisor drivers to
fill and validate domain and device definitions after parsed by the XML
parser.
With this patch, after the XML is parsed, a callback to the driver is
issued requesting to fill and validate driver specific details of the
configuration. This allows to use sensible defaults and checks on a per
driver basis at the time the XML is parsed.
Two callback pointers are stored in the new virDomainXMLConf object:
* virDomainDeviceDefPostParseCallback (devicesPostParseCallback)
- called for a single device parsed and for every single device in a
domain config. A virDomainDeviceDefPtr is passed along with the
domain definition and virCaps.
* virDomainDefPostParseCallback, (domainPostParseCallback)
- A callback that is meant to process the domain config after it's
parsed. A virDomainDefPtr is passed along with virCaps.
Both types of callbacks support arbitrary opaque data passed for the
callback functions.
Errors may be reported in those callbacks resulting in a XML parsing
failure.
This patch is the result of running:
for i in $(git ls-files | grep -v html | grep -v \.po$ ); do
sed -i -e "s/virDomainXMLConf/virDomainXMLOption/g" -e "s/xmlconf/xmlopt/g" $i
done
and a few manual tweaks.
Mimic the fix done in 02b9097274 to fix crash by
accessing an already freed structure. Also copy the explaining comment why the
pointer can't be accessed any more.
The VIR_ERR_NO_SUPPORT error code is reserved for cases where an
API is not implemented in a driver. It definitely should not be
used when an API execution fails due to unsupported operation.
We should record the new disk src in the shared disk table for
updating disk (CD-ROM or Floppy) API. Fortunately, we only allow
to update the disk source now, otherwise we might also want to
set the unpriv_sgio setting.
Intend to reduce the redundant code,use virNumaSetupMemoryPolicy
to replace virLXCControllerSetupNUMAPolicy and
qemuProcessInitNumaMemoryPolicy.
This patch also moves the numa related codes to the
file virnuma.c and virnuma.h
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
We didn't yet expose the virtio device attach and detach functionality
for s390 domains as the device hotplug was very limited with the old
virtio-s390 bus. With the CCW bus there's full hotplug support for
virtio devices in QEMU, so we are adding this to libvirt too.
Since the virtio hotplug isn't limited to PCI anymore, we change the
function names from xxxPCIyyy to xxxVirtioyyy, where we handle all
three virtio bus types.
Signed-off-by: J.B. Joret <jb@linux.vnet.ibm.com>
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
With our recent patch (1715c83b5f) we thrive to get the correct
number of maximal VCPUs. However, we are using a constant from
linux/kvm.h which may be not defined in every distro. Hence, we
should guard usage of the constant with ifdef preprocessor
directive. This was introduced in kernel:
commit 8c3ba334f8588e1d5099f8602cf01897720e0eca
Author: Sasha Levin <levinsasha928@gmail.com>
Date: Mon Jul 18 17:17:15 2011 +0300
KVM: x86: Raise the hard VCPU count limit
The patch raises the hard limit of VCPU count to 254.
This will allow developers to easily work on scalability
and will allow users to test high VCPU setups easily without
patching the kernel.
To prevent possible issues with current setups, KVM_CAP_NR_VCPUS
now returns the recommended VCPU limit (which is still 64) - this
should be a safe value for everybody, while a new KVM_CAP_MAX_VCPUS
returns the hard limit which is now 254.
$ git desc 8c3ba334f
v3.1-rc7-48-g8c3ba33
The virCaps structure gathered a ton of irrelevant data over time that.
The original reason is that it was propagated to the XML parser
functions.
This patch aims to create a new data structure virDomainXMLConf that
will contain immutable data that are used by the XML parser. This will
allow two things we need:
1) Get rid of the stuff from virCaps
2) Allow us to add callbacks to check and add driver specific stuff
after domain XML is parsed.
This first attempt removes pointers to private data allocation functions
to this new structure and update all callers and function that require
them.
When there are two concurrent threads, we may dereference a NULL
pointer, even though it has been checked before:
1. Thread1: starts executing qemuDomainBlockStatsFlags() with nparams != 0.
It finds given disk and successfully pass check for disk->info.alias
not being NULL.
2. Thread2: starts executing qemuDomainDetachDeviceFlags() on the very same
disk as Thread1 is working on.
3. Thread1: gets to qemuDomainObjBeginJob() where it sets a job on a
domain.
4. Thread2: also tries to set a job. However, we are not guaranteed which
thread wins. So assume it's Thread2 who can continue.
5. Thread2: does the actual detach and frees disk->info.alias
6. Thread2: quits the job
7. Thread1: now successfully acquires the job, and accesses a NULL pointer.
19c6ad9a (qemu: Refactor qemuDomainSetMemoryParameters) introduced
a new macro, VIR_GET_LIMIT_PARAMETER(PARAM, VALUE). But if statement
in the macro is not correct and so set_XXXX flags are set to false
in the wrong. As a result, libvirt ignores all memtune parameters.
This patch fixes the conditional expression to work correctly.
Signed-off-by: Satoru Moriya <satoru.moriya@hds.com>
Currently, qemuDomainShutdownFlags() chooses the agent method of
shutdown whenever the agent is configured. However, this
assumption is not enough as the guest agent may be unresponsive
at the moment. So unless guest agent method has been explicitly
requested, we should fall back to the ACPI method.
The unitialized local variable qemuVersion can cause an random value
to be returned for the hypervisor version, observable with virsh version.
Introduced by commit b46f7f4a0b
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
This change tried to fix a crash with changing CDROM media but
failed to actually do so
commit d0172d2b1b
Author: Osier Yang <jyang@redhat.com>
Date: Tue Feb 19 20:27:45 2013 +0800
qemu: Remove the shared disk entry if the operation is ejecting or updating
It was still accessing disk->src, when the entire 'disk' object
has been free'd already. Even if it weren't free'd, accessing
the 'src' value of virDomainDiskDef is not allowed without
first validating disk->type is file or block. Just remove the
broken code entirely.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The new TypedParam helper APIs allow to simplify this function
significantly.
This patch integrates the fix in 75e5bec97b
by correctly ordering the setting functions instead of reordering the
parameters.
To avoid having to hold the qemu driver lock while iterating through
close callbacks and calling them. This fixes a real deadlock when a
domain which is being migrated from another host gets autodestoyed as a
result of broken connection to the other host.
The max value of number of cpus to compute(id) should not
be equal or greater than max cpu number.
The bug ocurrs when id value is equal to max cpu number which
leads to the off-by-one error in the following for loop.
# virsh cpu-stats guest --start 1
error: Failed to virDomainGetCPUStats()
error: internal error cpuacct parse error
Currently, if lzop decompression binary produces a warning, it
doesn't exit with zero status but 2 instead. Terrifying, but
true. However, warnings may be ignored using '--ignore-warn'
command line argument. Moreover, in which case, the exit status
will be zero.
For both AttachDevice and UpdateDevice APIs, if the disk device
is 'cdrom' or 'floppy', the operations could be ejecting, updating,
and inserting. For either ejecting or updating, the shared disk
entry of the original disk src has to be removed, because it's
not useful anymore.
And since the original disk def will be changed, new disk def passed
as argument will be free'ed in qemuDomainChangeEjectableMedia, so
we need to copy the orignal disk def before
qemuDomainChangeEjectableMedia, to use it for qemuRemoveSharedDisk.
The disk def could be free'ed by qemuDomainChangeEjectableMedia,
which can thus cause crash if we reference the disk pointer. On
the other hand, we have to remove the added shared disk entry from
the table on error codepath.
The hash entry is changed from "ref" to {ref, @domains}. With this, the
caller can simply call qemuRemoveSharedDisk, without afraid of removing
the entry belongs to other domains. qemuProcessStart will obviously
benifit from it on error codepath (which calls qemuProcessStop to do
the cleanup).
Based on moving various checking into qemuAddSharedDisk, this
avoids the caller using it in wrong ways. Also this adds two
new checking for qemuCheckSharedDisk (disk device not 'lun'
and kernel doesn't support unpriv_sgio simply returns 0).