Detected by ccc-analyzer, reported by Alex Jia.
qemuProcessStart always calls qemuProcessWaitForMonitor with a
non-negative position, but qemuProcessAttach always calls with -1.
In the latter case, there is no log file we can scrape, so we
also should not be trying to scrape the logs if the qemu process
died at the very end.
* src/qemu/qemu_process.c (qemuProcessWaitForMonitor): Don't try
to read from log in qemuProcessAttach case.
Value stored to 'ret' is never read, so remove this dead assignment.
* src/qemu/qemu_monitor_text.c: kill dead assignment.
Signed-off-by: Alex Jia <ajia@redhat.com>
Value stored to 'ret' is never read, in fact, 'cleanup' section will
directly return -1 when function is fail, so remove this dead assignment.
* src/qemu/qemu_process.c: kill dead assignment.
Signed-off-by: Alex Jia <ajia@redhat.com>
Quite a few leaks detected by coverity. For chr, the leaks were
close enough to the allocations to plug in place; for disk, the
leaks were separated from the allocation by enough other lines with
intermediate failure cases that I refactored the cleanup instead.
* src/qemu/qemu_command.c (qemuParseCommandLine): Plug leaks.
Warning detected by Coverity. No need for the NULL check, and
removing it silences the warning without any semantic change.
* src/qemu/qemu_migration.c (qemuMigrationFinish): All entries to
endjob had non-NULL vm.
Coverity detected that 5 of 6 callers of virJSONValueArrayGet checked
for a NULL return; and that by not checking we risk a null deref
during an error. The error is unlikely since the prior call to
virJSONValueArraySize would probably have already caught any botched
JSON array parse, but better safe than sorry.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONGetBlockJobInfo):
Check for NULL.
(qemuMonitorJSONExtractPtyPaths): Fix typo.
Revert 6a1f5f568f. Now that libvirt_iohelper takes fds by
inheritance rather than by open() (commit 1eb66479), there is
no longer a race where the parent can unlink() a file prior to
the iohelper open()ing the same file. From there, it makes
more sense to have the callers both create and unlink, rather
than the caller create and the stream unlink, since the latter
was only needed when iohelper had to do the unlink.
* src/fdstream.h (virFDStreamOpenFile, virFDStreamCreateFile):
Callers are responsible for deletion.
* src/fdstream.c (virFDStreamOpenFileInternal): Don't leak created
file on failure.
(virFDStreamOpenFile, virFDStreamCreateFile): Drop parameter.
* src/lxc/lxc_driver.c (lxcDomainOpenConsole): Update callers.
* src/qemu/qemu_driver.c (qemuDomainScreenshot)
(qemuDomainOpenConsole): Likewise.
* src/storage/storage_driver.c (storageVolumeDownload)
(storageVolumeUpload): Likewise.
* src/uml/uml_driver.c (umlDomainOpenConsole): Likewise.
* src/vbox/vbox_tmpl.c (vboxDomainScreenshot): Likewise.
* src/xen/xen_driver.c (xenUnifiedDomainOpenConsole): Likewise.
The previous qemu patch could end up calling unlink(tmp) before
tmp was the name of a valid file (unlinking a fileXXXXXX template
instead), or calling unlink(tmp) twice on success (once here,
and once at the end of the stream). Meanwhile, vbox also suffered
from the same leaked tmp file bug.
* src/qemu/qemu_driver.c (qemuDomainScreenshot): Don't unlink on
success, or on invalid name.
* src/vbox/vbox_tmpl.c (vboxDomainScreenshot): Don't leak temp file.
Currently, we attempt to run sync job and async job at the same time. It
means that the monitor commands for two jobs can be run in any order.
In the function qemuDomainObjEnterMonitorInternal():
if (priv->job.active == QEMU_JOB_NONE && priv->job.asyncJob) {
if (qemuDomainObjBeginNestedJob(driver, obj) < 0)
We check whether the caller is an async job by priv->job.active and
priv->job.asynJob. But when an async job is running, and a sync job is
also running at the time of the check, then priv->job.active is not
QEMU_JOB_NONE. So we cannot check whether the caller is an async job
in the function qemuDomainObjEnterMonitorInternal(), and must instead
put the burden on the caller to tell us when an async command wants
to do a nested job.
Once the burden is on the caller, then only async monitor enters need
to worry about whether the VM is still running; for sync monitor enter,
the internal return is always 0, so lots of ignore_value can be dropped.
* src/qemu/THREADS.txt: Reflect new rules.
* src/qemu/qemu_domain.h (qemuDomainObjEnterMonitorAsync): New
prototype.
* src/qemu/qemu_process.h (qemuProcessStartCPUs)
(qemuProcessStopCPUs): Add parameter.
* src/qemu/qemu_migration.h (qemuMigrationToFile): Likewise.
(qemuMigrationWaitForCompletion): Make static.
* src/qemu/qemu_domain.c (qemuDomainObjEnterMonitorInternal): Add
parameter.
(qemuDomainObjEnterMonitorAsync): New function.
(qemuDomainObjEnterMonitor, qemuDomainObjEnterMonitorWithDriver):
Update callers.
* src/qemu/qemu_driver.c (qemuDomainSaveInternal)
(qemudDomainCoreDump, doCoreDump, processWatchdogEvent)
(qemudDomainSuspend, qemudDomainResume, qemuDomainSaveImageStartVM)
(qemuDomainSnapshotCreateActive, qemuDomainRevertToSnapshot):
Likewise.
* src/qemu/qemu_process.c (qemuProcessStopCPUs)
(qemuProcessFakeReboot, qemuProcessRecoverMigration)
(qemuProcessRecoverJob, qemuProcessStart): Likewise.
* src/qemu/qemu_migration.c (qemuMigrationToFile)
(qemuMigrationWaitForCompletion, qemuMigrationUpdateJobStatus)
(qemuMigrationJobStart, qemuDomainMigrateGraphicsRelocate)
(doNativeMigrate, doTunnelMigrate, qemuMigrationPerformJob)
(qemuMigrationPerformPhase, qemuMigrationFinish)
(qemuMigrationConfirm): Likewise.
* src/qemu/qemu_hotplug.c: Drop unneeded ignore_value.
whether or not previous return value is -1, the following codes will be
executed for a inactive guest in src/qemu/qemu_driver.c:
ret = virDomainSaveConfig(driver->configDir, persistentDef);
and if everything is okay, 'ret' is assigned to 0, the previous 'ret'
will be overwritten, this patch will fix this issue.
* src/qemu/qemu_driver.c: avoid return value is overwritten when give a argument
in out of blkio weight range for a inactive guest.
* how to reproduce?
% virsh blkiotune ${guestname} --weight 10
% echo $?
Note: guest must be inactive, argument 10 in out of blkio weight range,
and can get a error information by checking libvirtd.log, however,
virsh hasn't raised any error information, and return value is 0.
https://bugzilla.redhat.com/show_bug.cgi?id=726304
Signed-off-by: Alex Jia <ajia@redhat.com>
whether or not previous return value is -1, the following codes will be
executed for a inactive guest in qemuDomainSetMemoryParameters:
ret = virDomainSaveConfig(driver->configDir, persistentDef);
and if everything is okay, 'ret' is assigned to 0, the previous 'ret'
will be overwritten, this patch will fix this issue.
* src/qemu/qemu_driver.c: avoid return value is overwritten when set
min_guarante value to a inactive guest.
* how to reproduce?
% virsh memtune ${guestname} --min_guarante 1024
% echo $?
Note: guest must be inactive, in fact, 'min_guarante' hasn't been implemented
in memory tunable, and I can get the error when check actual libvirtd.log,
however, virsh hasn't raised any error information, and return value is 0.
Signed-off-by: Alex Jia <ajia@redhat.com>
Introduced by f9a837da73, the condition is not changed after
the else clause is removed. So now it quit with "domain is not
running" when the domain is running. However, when the domain is
not running, it reports "no job is active".
How to reproduce:
1)
% virsh start $domain
% virsh domjobabort $domain
error: Requested operation is not valid: domain is not running
2)
% virsh destroy $domain
% virsh domjobabort $domain
error: Requested operation is not valid: no job is active on the domain
3)
% virsh save $domain /tmp/$domain.save
Before above commands finished, try to abort job in another terminal
% virsh domabortjob $domain
error: Requested operation is not valid: domain is not running
Using a macro ensures that all the code is looking for the same
prefix.
* src/conf/domain_conf.h (VIR_NET_GENERATED_PREFIX): New macro.
* src/conf/domain_conf.c (virDomainNetDefParseXML): Use it.
* src/uml/uml_conf.c (umlConnectTapDevice): Likewise.
* src/qemu/qemu_command.c (qemuNetworkIfaceConnect): Likewise.
Suggested by Laine Stump.
The goal here is that save-image-dumpxml fed back to
save-image-define should not change the save file; anywhere that
this is not the case is probably a bug in domain_conf.c.
* src/qemu/qemu_driver.c (qemuDomainSaveImageGetXMLDesc)
(qemuDomainSaveImageDefineXML): New functions.
(qemuDomainSaveImageOpen): Add parameter.
(qemuDomainRestoreFlags, qemuDomainObjRestore): Adjust clients.
With this, it is possible to update the path to a disk backing
image on either the save or restore action, without having to
binary edit the XML embedded in the state file.
This also modifies virDomainSave to output a smaller xml (only
the inactive xml, which is all the more virDomainRestore parses),
while still guaranteeing padding for most typical abi-compatible
xml replacements, necessary so that the next patch for
virDomainSaveImageDefineXML will not cause unnecessary
modifications to the save image file.
* src/qemu/qemu_driver.c (qemuDomainSaveInternal): Add parameter,
only use inactive state, and guarantee padding.
(qemuDomainSaveImageOpen): Add parameter.
(qemuDomainSaveFlags, qemuDomainManagedSave)
(qemuDomainRestoreFlags, qemuDomainObjRestore): Update callers.
The domain XML now understands the <listen> subelement of its
<graphics> element (including when listen type='network'), and the
network driver has an internal API that will turn a network name into
an IP address, so the final logical step is to put the glue into the
qemu driver so that when it is starting up a domain, if it finds
<listen type='network' network='xyz'/> in the XML, it will call the
network driver to get an IPv4 address associated with network xyz, and
tell qemu to listen for vnc (or spice) on that address rather than the
default address (localhost).
The motivation for this is that a large installation may want the
guests' VNC servers listening on physical interfaces rather than
localhost, so that users can connect directly from the outside; this
requires sending qemu the appropriate IP address to listen on. But
this address will of course be different for each host, and if a guest
might be migrated around from one host to another, it's important that
the guest's config not have any information embedded in it that is
specific to one particular host. <listen type='network.../> can solve
this problem in the following manner:
1) on each host, define a libvirt network of the same name,
associated with the interface on that host that should be used
for listening (for example, a simple macvtap network: <forward
mode='bridge' dev='eth0'/>, or host bridge network: <forward
mode='bridge'/> <bridge name='br0'/>
2) in the <graphics> element of each guest's domain xml, tell vnc to
listen on the network name used in step 1:
<graphics type='vnc' port='5922'>
<listen type='network'network='example-net'/>
</graphics>
(all the above also applies for graphics type='spice').
Once it's plugged in, the <listen> element will be an optional
replacement for the "listen" attribute that graphics elements already
have. If the <listen> element is type='address', it will have an
attribute called 'address' which will contain an IP address or dns
name that the guest's display server should listen on. If, however,
type='network', the <listen> element should have an attribute called
'network' that will be set to the name of a network configuration to
get the IP address from.
* docs/schemas/domain.rng: updated to allow the <listen> element
* docs/formatdomain.html.in: document the <listen> element and its
attributes.
* src/conf/domain_conf.[hc]:
1) The domain parser, formatter, and data structure are modified to
support 0 or more <listen> subelements to each <graphics>
element. The old style "legacy" listen attribute is also still
accepted, and will be stored internally just as if it were a
separate <listen> element. On output (i.e. format), the address
attribute of the first <listen> element of type 'address' will be
duplicated in the legacy "listen" attribute of the <graphic>
element.
2) The "listenAddr" attribute has been removed from the unions in
virDomainGRaphicsDef for graphics types vnc, rdp, and spice.
This attribute is now in the <listen> subelement (aka
virDomainGraphicsListenDef)
3) Helper functions were written to provide simple access
(both Get and Set) to the listen elements and their attributes.
* src/libvirt_private.syms: export the listen helper functions
* src/qemu/qemu_command.c, src/qemu/qemu_hotplug.c,
src/qemu/qemu_migration.c, src/vbox/vbox_tmpl.c,
src/vmx/vmx.c, src/xenxs/xen_sxpr.c, src/xenxs/xen_xm.c
Modify all these files to use the listen helper functions rather
than directly referencing the (now missing) listenAddr
attribute. There can be multiple <listen> elements to a single
<graphics>, but the drivers all currently only support one, so all
replacements of direct access with a helper function indicate index
"0".
* tests/* - only 3 of these are new files added explicitly to test the
new <listen> element. All the others have been modified to reflect
the fact that any legacy "listen" attributes passed in to the domain
parse will be saved in a <listen> element (i.e. one of the
virDomainGraphicsListenDefs), and during the domain format function,
both the <listen> element as well as the legacy attributes will be
output.
qemuMigrationUpdateJobStatus (called in a loop by migration
and save tasks) uses qemuDomainObjEnterMonitorWithDriver;
however, that function ended up starting a nested job without
releasing the driver.
Since no one else is making nested calls, we can inline the
internal functions to properly track driver_locked.
* src/qemu/qemu_domain.h (qemuDomainObjBeginNestedJob)
(qemuDomainObjBeginNestedJobWithDriver)
(qemuDomainObjEndNestedJob): Drop unused prototypes.
* src/qemu/qemu_domain.c (qemuDomainObjEnterMonitorInternal):
Reflect driver lock to nested job.
(qemuDomainObjBeginNestedJob)
(qemuDomainObjBeginNestedJobWithDriver)
(qemuDomainObjEndNestedJob): Drop unused functions.
As written in virStorageFileGetMetadataFromFD decription, caller
must free metadata after use. Qemu driver miss this and therefore
leak metadata which can grow to huge mem leak if somebody query
for blockInfo a lot.
The error in getCompressionType will never be reported, change
the errors codes into warning (VIR_WARN("%s", _(foo)); doesn't break
syntax-check rule), and also improve the docs in qemu.conf to tell
user the truth.
Make MIGRATION_OUT use the new helper methods.
This also introduces new protection to migration v3 process: the
migration job is held from Begin to Confirm to avoid changes to a domain
during migration (esp. between Begin and Perform phases). This change is
automatically applied to p2p and tunneled migrations. For normal
migration, this requires support from a client. In other words, if an
old (pre 0.9.4) client starts normal migration of a domain, the domain
will not be protected against changes between Begin and Perform steps.
Without this, a configure built by autoconf 2.59 was broken when
trying to detect which compiler warning flags were supported.
* .gnulib: Update to latest, for warnings.m4 fix.
* bootstrap.conf: Add fclose explicitly, to match recent gnulib
implicit dependency changes.
* src/qemu/qemu_conf.c (includes): Drop unused include.
* src/uml/uml_conf.c (include): Likewise.
Reported by Daniel P. Berrange.
Every DomainNetDef has a bandwidth, as does every portgroup.
Whenever a DomainNetDef of type NETWORK is about to be used, a call is
made to networkAllocateActualDevice(). This function chooses the "best"
bandwidth object and places it in the DomainActualNetDef.
From that point on, whenever some code needs to use the bandwidth data
for the interface, it's retrieved with virDomainNetGetActualBandwidth(),
which will always return the "best" info as determined in the
previous step.
The cpu bandwidth is applied at the vcpu group level. We should apply it
at the vm group level too, because the vm may do heavy I/O, and it will affect
the other vm.
We apply cpu bandwidth at the vcpu and the vm group level, so we must ensure
that max(child_quota) <= parent_quota when we modify cpu bandwidth.
Now that virDomainSetVcpusFlags knows about VIR_DOMAIN_AFFECT_CURRENT,
so should virDomainGetVcpusFlags.
Unfortunately, the virsh counterpart 'virsh vcpucount' has already
commandeered --current for a different meaning, so teaching virsh
to expose this in the next patch will require a bit of care.
* src/libvirt.c (virDomainGetVcpusFlags): Allow
VIR_DOMAIN_AFFECT_CURRENT.
* src/libxl/libxl_driver.c (libxlDomainGetVcpusFlags): Likewise.
* src/qemu/qemu_driver.c (qemudDomainGetVcpusFlags): Likewise.
* src/test/test_driver.c (testDomainGetVcpusFlags): Likewise.
* src/xen/xen_driver.c (xenUnifiedDomainGetVcpusFlags): Likewise.
Although most functions in libvirt return 0 on success and < 0 on
failure, there are a few functions lingering around that return errno
(a positive value) on failure, and sometimes code calling those
functions incorrectly assumes the <0 standard. I noticed one of these
the other day when auditing networkStartDhcpDaemon after Guido Gunther
found a place where success was improperly returned on failure (that
patch has been acked and is pending a push). The problem was that it
expected the return value from virFileReadPid to be < 0 on failure,
but it was actually positive (it was also neglected to set the return
code in this case, similar to the bug found by Guido).
This all led to the fact that *all* of the virFile*Pid functions in
util.c are returning errno on failure. This patch remedies that
problem by changing them all to return -errno on failure, and makes
any necessary changes to callers of the functions. (In the meantime, I
also properly set the return code on failure of virFileReadPid in
networkStartDhcpDaemon).
In the XML file we now have
<cputune>
<shares>1024</shares>
<period>90000</period>
<quota>0</quota>
</cputune>
But the schedinfo parameter are being named
cpu_shares: 1024
cfs_period: 90000
cfs_quota: 0
The period/quota is per-vcpu value, so these new tunables should be named
'vcpu_period' and 'vcpu_quota'.
When an operation started by virDomainBlockPull completes (either with
success or with failure), raise an event to indicate the final status.
This API allow users to avoid polling on virDomainGetBlockJobInfo if
they would prefer to use an event mechanism.
* daemon/remote.c: Dispatch events to client
* include/libvirt/libvirt.h.in: Define event ID and callback signature
* src/conf/domain_event.c, src/conf/domain_event.h,
src/libvirt_private.syms: Extend API to handle the new event
* src/qemu/qemu_driver.c: Connect to the QEMU monitor event
for block_stream completion and emit a libvirt block pull event
* src/remote/remote_driver.c: Receive and dispatch events to application
* src/remote/remote_protocol.x: Wire protocol definition for the event
* src/remote_protocol-structs: structure definitions for protocol verification
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c: Watch for BLOCK_STREAM_COMPLETED event
from QEMU monitor
The virDomainBlockPull* family of commands are enabled by the
following HMP/QMP commands: 'block_stream', 'block_job_cancel',
'info block-jobs' / 'query-block-jobs', and 'block_job_set_speed'.
* src/qemu/qemu_driver.c src/qemu/qemu_monitor_text.[ch]: implement disk
streaming by using the proper qemu monitor commands.
* src/qemu/qemu_monitor_json.[ch]: implement commands using the qmp monitor
When auto-dumping a domain on crash events, or autostarting a domain
with managed save state, let the user configure whether to imply
the bypass cache flag.
* src/qemu/qemu.conf (auto_dump_bypass_cache, auto_start_bypass_cache):
Document new variables.
* src/qemu/libvirtd_qemu.aug (vnc_entry): Let augeas parse them.
* src/qemu/qemu_conf.h (qemud_driver): Store new preferences.
* src/qemu/qemu_conf.c (qemudLoadDriverConfig): Parse them.
* src/qemu/qemu_driver.c (processWatchdogEvent, qemuAutostartDomain):
Honor them.
Wire together the previous patches to support file system cache
bypass during API save/restore requests in qemu.
* src/qemu/qemu_driver.c (qemuDomainSaveInternal, doCoreDump)
(qemudDomainObjStart, qemuDomainSaveImageOpen, qemuDomainObjRestore)
(qemuDomainObjStart): Add parameter.
(qemuDomainSaveFlags, qemuDomainManagedSave, qemudDomainCoreDump)
(processWatchdogEvent, qemudDomainStartWithFlags, qemuAutostartDomain)
(qemuDomainRestoreFlags): Update callers.
For all hypervisors that support save and restore, the new API
now performs the same functions as the old.
VBox is excluded from this list, because its existing domainsave
is broken (there is no corresponding domainrestore, and there
is no control over the filename used in the save). A later
patch should change vbox to use its implementation for
managedsave, and teach start to use managedsave results.
* src/libxl/libxl_driver.c (libxlDomainSave): Move guts...
(libxlDomainSaveFlags): ...to new function.
(libxlDomainRestore): Move guts...
(libxlDomainRestoreFlags): ...to new function.
* src/test/test_driver.c (testDomainSave, testDomainSaveFlags)
(testDomainRestore, testDomainRestoreFlags): Likewise.
* src/xen/xen_driver.c (xenUnifiedDomainSave)
(xenUnifiedDomainSaveFlags, xenUnifiedDomainRestore)
(xenUnifiedDomainRestoreFlags): Likewise.
* src/qemu/qemu_driver.c (qemudDomainSave, qemudDomainRestore):
Rename and move guts.
(qemuDomainSave, qemuDomainSaveFlags, qemuDomainRestore)
(qemuDomainRestoreFlags): ...here.
(qemudDomainSaveFlag): Rename...
(qemuDomainSaveInternal): ...to this, and update callers.
The network driver needs to assign physical devices for use by modes
that use macvtap, keeping track of which physical devices are in use
(and how many instances, when the devices can be shared). Three calls
are added:
networkAllocateActualDevice - finds a physical device for use by the
domain, and sets up the virDomainActualNetDef accordingly.
networkNotifyActualDevice - assumes that the domain was already
running, but libvirtd was restarted, and needs to be notified by each
already-running domain about what interfaces they are using.
networkReleaseActualDevice - decrements the usage count of the
allocated physical device, and frees the virDomainActualNetDef to
avoid later accidentally using the device.
bridge_driver.[hc] - the new APIs. When WITH_NETWORK is false, these
functions are all #defined to be "0" in the .h file (effectively
becoming a NOP) to prevent link errors.
qemu_(command|driver|hotplug|process).c - add calls to the above APIs
in the appropriate places.
tests/Makefile.am - we need to include libvirt_driver_network.la
whenever libvirt_driver_qemu.la is linked, to avoid unreferenced
symbols (in functions that are never called by the test
programs...)
This is the one function outside of domain_conf.c that plays around
with (even modifying) the internals of the virDomainNetDef, and thus
can't be fixed up simply by replacing direct accesses to the fields of
the struct with the GetActual*() access functions.
In this case, we need to check if the defined type is "network", and
if it is *then* check the actual type; if the actual type is "bridge",
then we can at least put the bridgename in a place where it can be
used; otherwise (if type isn't "bridge"), we behave exactly as we used
to - just null out *everything*.
The qemu driver accesses fields in the virDomainNetDef directly, but
with the advent of the virDomainActualNetDef, some pieces of
information may be found in a different place (the ActualNetDef) if
the network connection is of type='network' and that network is of
forward type='bridge|private|vepa|passthrough'. The previous patch
added functions to mask this difference from callers - they hide the
decision making process and just pick the value from the proper place.
This patch uses those functions in the qemu driver as a first step in
making qemu work with the new network types. At this point, the
virDomainActualNetDef is guaranteed always NULL, so the GetActualX()
function will return exactly what the def->X that's being replaced
would have returned (ie bisecting is not compromised).
There is one place (in qemu_driver.c) where the internal details of
the NetDef are directly manipulated by the code, so the GetActual
functions cannot be used there without extra additional code; that
file will be treated in a separate patch.
The virtPortProfile in the domain interface struct is now a separately
allocated object *pointed to by* (rather than contained in) the main
virDomainNetDef object. This is done to make it easier to figure out
when a virtualPortProfile has/hasn't been specified in a particular
config.
Otherwise, an ABI mismatch gives error messages attributing the target
xml string as current, and the current domain state as the new xml.
* src/qemu/qemu_migration.c (qemuMigrationBegin): Use correct
argument order.
Since libvirt is multi-threaded, we should use FD_CLOEXEC as much
as possible in the parent, and only relax fds to inherited after
forking, to avoid leaking an fd created in one thread to a fork
run in another thread. This gets us closer to that ideal, by
making virCommand automatically clear FD_CLOEXEC on fds intended
for the child, as well as avoiding a window of time with non-cloexec
pipes created for capturing output.
* src/util/command.c (virExecWithHook): Use CLOEXEC in parent. In
child, guarantee that all fds to pass to child are inheritable.
(getDevNull): Use CLOEXEC.
(prepareStdFd): New helper function.
(virCommandRun, virCommandRequireHandshake): Use pipe2.
* src/qemu/qemu_command.c (qemuBuildCommandLine): Simplify caller.
This patch implements cfs_period and cfs_quota's modification.
We can use the command 'virsh schedinfo' to query or modify cfs_period and
cfs_quota.
If you query period or quota from config file, the value 0 means it does not set
in the config file.
If you set period or quota to config file, the value 0 means that delete current
setting from config file.
If you modify period or quota while vm is running, the value 0 means that use
current value.
Starting/ending jobs when closing the connection may reset any
error which was reported earlier in p2p migration. We must
save the original error before doing so. This means we can also
just call virConnectClose as normal, instead of virUnrefConnect
* src/qemu/qemu_migration.c: Preserve errors in p2p migration
There were two API in driver.c that were silently masking flags
bits prior to calling out to the drivers, and several others
that were explicitly masking flags bits. This is not
forward-compatible - if we ever have that many flags in the
future, then talking to an old server that masks out the
flags would be indistinguishable from talking to a new server
that can honor the flag. In general, libvirt.c should forward
_all_ flags on to drivers, and only the drivers should reject
unknown flags.
In the case of virDrvSecretGetValue, the solution is to separate
the internal driver callback function to have two parameters
instead of one, with only one parameter affected by the public
API. In the case of virDomainGetXMLDesc, it turns out that
no one was ever mixing VIR_DOMAIN_XML_INTERNAL_STATUS with
the dumpxml path in the first place; that internal flag was
only used in saving and restoring state files, which happened
to be in functions internal to a single file, so there is no
mixing of the internal flag with a public flags argument.
Additionally, virDomainMemoryStats passed a flags argument
over RPC, but not to the driver.
* src/driver.h (VIR_DOMAIN_XML_FLAGS_MASK)
(VIR_SECRET_GET_VALUE_FLAGS_MASK): Delete.
(virDrvSecretGetValue): Separate out internal flags.
(virDrvDomainMemoryStats): Provide missing flags argument.
* src/driver.c (verify): Drop unused check.
* src/conf/domain_conf.h (virDomainObjParseFile): Delete
declaration.
(virDomainXMLInternalFlags): Move...
* src/conf/domain_conf.c: ...here. Delete redundant include.
(virDomainObjParseFile): Make static.
* src/libvirt.c (virDomainGetXMLDesc, virSecretGetValue): Update
clients.
(virDomainMemoryPeek, virInterfaceGetXMLDesc)
(virDomainMemoryStats, virDomainBlockPeek, virNetworkGetXMLDesc)
(virStoragePoolGetXMLDesc, virStorageVolGetXMLDesc)
(virNodeNumOfDevices, virNodeListDevices, virNWFilterGetXMLDesc):
Don't mask unknown flags.
* src/interface/netcf_driver.c (interfaceGetXMLDesc): Reject
unknown flags.
* src/secret/secret_driver.c (secretGetValue): Update clients.
* src/remote/remote_driver.c (remoteSecretGetValue)
(remoteDomainMemoryStats): Likewise.
* src/qemu/qemu_process.c (qemuProcessGetVolumeQcowPassphrase):
Likewise.
* src/qemu/qemu_driver.c (qemudDomainMemoryStats): Likewise.
* daemon/remote.c (remoteDispatchDomainMemoryStats): Likewise.
The regression is introduced by Commit da1eba6b, the new
codes with this commit doesn't reset "ret" to "-1" when
it fails on parsing the device XML (live device attachment)
This patch changes the codes to reset the "ret" and "-1",
and also changes the codes so that it don't modify "ret"
for condition checking.
How to reproduce:
% cat test.xml
<disk type='oops' device='disk'>
<driver name='qemu' type='raw'/>
<source file='/var/lib/libvirt/images/test.img'/>
<target dev='vda' bus='virtio'/>
</disk>
% virsh attach-device $domain test.xml
Device attached successfully
The device attachment failed actually with error "unknown disk type 'oops'",
however, it reports success.
Commit f548480b broke migration v3 on qemu, because the driver
passed flags on through to qemu_migration even though
qemu_migration wasn't using those flags.
* src/qemu/qemu_migration.h (QEMU_MIGRATION_FLAGS): New define.
* src/qemu/qemu_driver.c: Simplify all migration callbacks.
* src/qemu/qemu_migration.c (qemuMigrationConfirm): Fix regression.
The previous patches only cleaned up ATTRIBUTE_UNUSED flags cases;
auditing the drivers found other places where flags was being used
but not validated. In particular, domainGetXMLDesc had issues with
clients accepting a different set of flags than the common
virDomainDefFormat helper function.
* src/conf/domain_conf.c (virDomainDefFormat): Add common flag check.
* src/uml/uml_driver.c (umlDomainAttachDeviceFlags)
(umlDomainDetachDeviceFlags): Reject unknown
flags.
* src/vbox/vbox_tmpl.c (vboxDomainGetXMLDesc)
(vboxDomainAttachDeviceFlags)
(vboxDomainDetachDeviceFlags): Likewise.
* src/qemu/qemu_driver.c (qemudDomainMemoryPeek): Likewise.
(qemuDomainGetXMLDesc): Document common flag handling.
* src/libxl/libxl_driver.c (libxlDomainGetXMLDesc): Likewise.
* src/lxc/lxc_driver.c (lxcDomainGetXMLDesc): Likewise.
* src/openvz/openvz_driver.c (openvzDomainGetXMLDesc): Likewise.
* src/phyp/phyp_driver.c (phypDomainGetXMLDesc): Likewise.
* src/test/test_driver.c (testDomainGetXMLDesc): Likewise.
* src/vmware/vmware_driver.c (vmwareDomainGetXMLDesc): Likewise.
* src/xenapi/xenapi_driver.c (xenapiDomainGetXMLDesc): Likewise.
This patch extends qemudDomainSetVcpusFlags() function to support
VIR_DOMAIN_AFFECT_CURRENT flag.
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
When qemuMonitorCloseFileHandle is called in error path, we need to
preserve the original error since a possible further error when running
closefd monitor command is not very useful to users.
When creating new qemu process we saved domain status XML only after the
process was fully setup and running. In case libvirtd was killed before
the whole process finished, once libvirtd started again it didn't know
anything about the new process and we end up with an orphaned qemu
process. Let's save the domain status XML as soon as we know the PID so
that libvirtd can kill the process on restart.
The compiler might optimize based on our declaration that something
is unused. Putting that declaration in the header risks getting
out of sync with the actual implementation, so it belongs better
only in the .c files. We were mostly compliant, and a new syntax
check will help us in the future.
* cfg.mk (sc_avoid_attribute_unused_in_header): New syntax check.
* src/nodeinfo.h (nodeGetCPUStats, nodeGetMemoryStats): Delete
attribute already present in .c file.
* src/qemu/qemu_domain.h (qemuDomainEventFlush): Likewise.
* src/util/virterror_internal.h (virReportErrorHelper): Parameters
are actually used by .c file.
* src/xenxs/xen_sxpr.h (xenFormatSxprDisk): Adjust prototype.
* src/xenxs/xen_sxpr.c (xenFormatSxprDisk): Delete unused argument.
(xenFormatSxpr): Adjust caller.
* src/xen/xend_internal.c (xenDaemonAttachDeviceFlags)
(xenDaemonUpdateDeviceFlags): Likewise.
Suggested by Daniel Veillard.
Continuation of commit 313ac7fd, and enforce things with a syntax
check.
Technically, virNetServerClientCalculateHandleMode is not printing
a mode_t, but rather a collection of VIR_EVENT_HANDLE_* bits;
however, these bits are < 8, so there is no different in the
output, and that was the easiest way to silence the new syntax check.
* cfg.mk (sc_flags_debug): New syntax check.
(exclude_file_name_regexp--sc_flags_debug): Add exemptions.
* src/fdstream.c (virFDStreamOpenFileInternal): Print flags in
hex, mode_t in octal.
* src/libvirt-qemu.c (virDomainQemuMonitorCommand)
(virDomainQemuAttach): Likewise.
* src/locking/lock_driver_nop.c (virLockManagerNopInit): Likewise.
* src/locking/lock_driver_sanlock.c (virLockManagerSanlockInit):
Likewise.
* src/locking/lock_manager.c: Likewise.
* src/qemu/qemu_migration.c: Likewise.
* src/qemu/qemu_monitor.c: Likewise.
* src/rpc/virnetserverclient.c
(virNetServerClientCalculateHandleMode): Print mode with %o.
When monitor is entered with qemuDomainObjEnterMonitorWithDriver, the
correct method for leaving and unlocking the monitor is
qemuDomainObjExitMonitorWithDriver.
Most of the code in these two functions is supposed to be identical but
currently it isn't (which is natural since the code is duplicated).
Let's move common parts of these functions into qemuMigrationPrepareAny.
This also fixes qemuMigrationPrepareTunnel which didn't store received
lockState in the domain object.
Asynchronous jobs may take long time to finish and may consist of
several phases which we need to now about to help with recovery/rollback
after libvirtd restarts.
Query commands are safe to be called during long running jobs (such as
migration). This patch makes them all work without the need to
special-case every single one of them.
The patch introduces new job.asyncCond condition and associated
job.asyncJob which are dedicated to asynchronous (from qemu monitor
point of view) jobs that can take arbitrarily long time to finish while
qemu monitor is still usable for other commands.
The existing job.active (and job.cond condition) is used all other
synchronous jobs (including the commands run during async job).
Locking schema is changed to use these two conditions. While asyncJob is
active, only allowed set of synchronous jobs is allowed (the set can be
different according to a particular asyncJob) so any method that
communicates to qemu monitor needs to check if it is allowed to be
executed during current asyncJob (if any). Once the check passes, the
method needs to normally acquire job.cond to ensure no other command is
running. Since domain object lock is released during that time, asyncJob
could have been started in the meantime so the method needs to recheck
the first condition. Then, normal jobs set job.active and asynchronous
jobs set job.asyncJob and optionally change the list of allowed job
groups.
Since asynchronous jobs only set job.asyncJob, other allowed commands
can still be run when domain object is unlocked (when communicating to
remote libvirtd or sleeping). To protect its own internal synchronous
commands, the asynchronous job needs to start a special nested job
before entering qemu monitor. The nested job doesn't check asyncJob, it
only acquires job.cond and sets job.active to block other jobs.
EnterMonitor and ExitMonitor methods are very similar to their
*WithDriver variants; consolidate them into EnterMonitorInternal and
ExitMonitorInternal to avoid (mainly future) code duplication.
The LXC and UML drivers can both make use of auditing. Move
the qemu_audit.{c,h} files to src/conf/domain_audit.{c,h}
* src/conf/domain_audit.c: Rename from src/qemu/qemu_audit.c
* src/conf/domain_audit.h: Rename from src/qemu/qemu_audit.h
* src/Makefile.am: Remove qemu_audit.{c,h}, add domain_audit.{c,h}
* src/qemu/qemu_audit.h, src/qemu/qemu_cgroup.c,
src/qemu/qemu_command.c, src/qemu/qemu_driver.c,
src/qemu/qemu_hotplug.c, src/qemu/qemu_migration.c,
src/qemu/qemu_process.c: Update for changed audit API names
Given a PID, the QEMU driver reads /proc/$PID/cmdline and
/proc/$PID/environ to get the configuration. This is fed
into the ARGV->XML convertor to build an XML configuration
for the process.
/proc/$PID/exe is resolved to identify the full command
binary path
After checking for name/uuid uniqueness, an attempt is
made to connect to the monitor socket. If successful
then 'info status' and 'info kvm' are issued to determine
whether the CPUs are running and if KVM is enabled.
* src/qemu/qemu_driver.c: Implement virDomainQemuAttach
* src/qemu/qemu_process.h, src/qemu/qemu_process.c: Add
qemuProcessAttach to connect to the monitor of an
existing QEMU process
When attaching to an external QEMU process, it is neccessary
to check if the process is using KVM or not. This can be done
using a monitor command
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h: Add
API for checking if KVM is enabled
To enable attaching to externally launched QEMU, we need
to be able to reverse engineer a guest XML config based
on the argv for a PID in /proc
* src/qemu/qemu_command.c, src/qemu/qemu_command.h: Add
qemuParseCommandLinePid which extracts QEMU config from
argv in /proc, given a PID number
When converting QEMU argv into a virDomainDefPtr, also extract
the pidfile, monitor character device config and the monitor
mode.
* src/qemu/qemu_command.c, src/qemu/qemu_command.h: Extract
pidfile & monitor config from QEMU argv
* src/qemu/qemu_driver.c, tests/qemuargv2xmltest.c: Add extra
params when calling qemuParseCommandLineString
Avoid re-formatting the pidfile path everytime we need it. Create
it once when starting the guest, and preserve it until the guest
is shutdown.
* src/libvirt_private.syms, src/util/util.c,
src/util/util.h: Add virFileReadPidPath
* src/qemu/qemu_domain.h: Add pidfile field
* src/qemu/qemu_process.c: Store pidfile path in qemuDomainObjPrivate
The drivers were accepting domain configs without checking if those
were actually meant for them. For example the LXC driver happily
accepts configs with type QEMU.
Add a check for the expected domain types to the virDomainDefParse*
functions.
If virDomainSaveConfig() failed, we will return NULL to source,
and the vm is still available to restart during confirm() step in
v3 protocol. So we should kill it off in qemuMigrationFinish().
In v2 protocol, we should not set vm to NULL, because we hold
a reference of vm and should unrefernce it.
This patch creates new <bios> element which, at this time has only the
attribute useserial='yes|no'. This attribute allow users to use
Serial Graphics Adapter and see BIOS messages from the very first moment
domain boots up. Therefore, users can choose boot medium, set PXE, etc.
This option accepts 3 values:
-keep, to keep current client connected (Spice+VNC)
-disconnect, to disconnect client (Spice)
-fail, to fail setting password if there is a client connected (Spice)
Add libvirt support for MicroBlaze architecture as a QEMU target. Based on mips/mipsel pattern.
Signed-off-by: John Williams <john.williams@petalogix.com>
Some callers expected virFileMakePath to set errno, some expected
it to return an errno value. Unify this to return 0 on success and
-1 on error. Set errno to report detailed error information.
Also optimize virFileMakePath if stat fails with an errno different
from ENOENT.
add a new API pciDeviceReAttachInit() in pci.c to initialize state values for nodedev reattach
Initialize three state value of device driver to 1. This is just for a new call to
qemudNodeDeviceReAttach()
Although most functions with flags check to verify no application is
passing in flag bits that are currently undefined, for some reason
this function wasn't.
virFileMakePath returns an errno value on error, that will never
be negative. An virFileMakePath error would have been ignored here,
instead of being reported correctly.
Add a new attribute to the <seclabel> XML to allow resource
relabelling to be enabled with static label usage.
<seclabel model='selinux' type='static' relabel='yes'>
<label>system_u:system_r:svirt_t:s0:c392,c662</label>
</seclabel>
* docs/schemas/domain.rng: Add relabel attribute
* src/conf/domain_conf.c, src/conf/domain_conf.h: Parse
the 'relabel' attribute
* src/qemu/qemu_process.c: Unconditionally clear out the
'imagelabel' attribute
* src/security/security_apparmor.c: Skip based on 'relabel'
attribute instead of label type
* src/security/security_selinux.c: Skip based on 'relabel'
attribute instead of label type and fill in <imagelabel>
attribute if relabel is enabled.
Normally the dynamic labelling mode will always use a base
label of 'svirt_t' for VMs. Introduce a <baselabel> field
in the <seclabel> XML to allow this base label to be changed
eg
<seclabel type='dynamic' model='selinux'>
<baselabel>system_u:object_r:virt_t:s0</baselabel>
</seclabel>
* docs/schemas/domain.rng: Add <baselabel>
* src/conf/domain_conf.c, src/conf/domain_conf.h: Parsing
of base label
* src/qemu/qemu_process.c: Don't reset 'model' attribute if
a base label is specified
* src/security/security_apparmor.c: Refuse to support base label
* src/security/security_selinux.c: Use 'baselabel' when generating
label, if available
Detected by Coverity. qemuDomainEventQueue requires a non-NULL
pointer; most callers silently drop the event if we encountered
and OOM situation trying to create the event.
* src/qemu/qemu_migration.c (qemuMigrationFinish): Check for OOM.
Coverity warns if the majority of callers check a function for
errors, but a few don't; but in qemu_audit and qemu_domain, the
choice to not check for failures was safe. In qemu_command, the
failure to generate a uuid can only occur on a bad pointer.
* src/qemu/qemu_audit.c (qemuAuditCgroup): Ignore failure to get
cgroup controller.
* src/qemu/qemu_domain.c (qemuDomainObjEnterMonitor)
(qemuDomainObjEnterMonitorWithDriver): Ignore failure to get
timestamp.
* src/qemu/qemu_command.c (qemuParseCommandLine): Check for error.
The qemudDomainSaveFlag method will call EndJob on the 'vm'
object it is passed in. This can result in the 'vm' object
being free'd if the last reference is removed. Thus no caller
of 'qemudDomainSaveFlag' must *ever* reference 'vm' again
upon return.
Unfortunately qemudDomainSave and qemuDomainManagedSave
both call 'virDomainObjUnlock', which can result in a
crash. This is non-deterministic since it involves a race
with the monitor I/O thread.
Fix this by making qemudDomainSaveFlag responsible for
calling virDomainObjUnlock instead.
* src/qemu/qemu_driver.c: Fix potential use after free
when saving guests
The 'char control[CMSG_SPACE(sizeof(int))];' was not being
wiped, so could potentially contain uninitialized bytes.
While this was harmless in this case, it caused complaints
from valgrind
* src/qemu/qemu_monitor.c: memset 'control' variable
in qemuMonitorIOWriteWithFD
The event handler functions do not free the virJSONValuePtr
object. Every event received from a VM thus caused a memory
leak
* src/qemu/qemu_monitor_json.c: Fix leak of event object
The 'function' field in the PCI address was not correctly
initialized, so it was building the wrong address address
string and so not removing all functions from the in use
list.
* src/qemu/qemu_command.c: Fix initialization of PCI function
If we pass VIR_DOMAIN_AFFECT_LIVE | VIR_DOMAIN_AFFECT_CONFIG to
qemuGetSchedulerParametersFlags() or *nparams is less than 1,
we will unlock qemu_driver without locking it. It's very dangerous.
We should lock qemu_driver after calling virCheckFlags().
Although we create a temporary file, it is owned by root:root and have
rights 0600. In case qemu does not run under root, it is unable to write
to that file and thus we transfer 0B sized file.
Allow a 'configFile' parameter to be passed into the lock
drivers to provide configuration. Wire up the QEMU driver
to pass in file names '/etc/libvirt/qemu-$NAME.conf
eg qemu-sanlock.conf
* src/locking/lock_driver.h, src/locking/lock_driver_nop.c,
src/locking/lock_driver_sanlock.c, src/locking/lock_manager.c,
src/locking/lock_manager.h: Add configFile parameter
* src/qemu/qemu_conf.c: Pass in configuration file path to
lock driver plugins
The libvirt sanlock plugin is intentionally leaking a file
descriptor to QEMU. To enable QEMU to use this FD under
SELinux, it must be labelled correctly. We dont want to use
the svirt_image_t for this, since QEMU must not be allowed
to actually use the FD. So instead we label it with svirt_t
using virSecurityManagerSetProcessFDLabel
* src/locking/domain_lock.c, src/locking/domain_lock.h,
src/locking/lock_driver.h, src/locking/lock_driver_nop.c,
src/locking/lock_driver_sanlock.c, src/locking/lock_manager.c,
src/locking/lock_manager.h: Optionally pass an FD back to
the hypervisor for security driver labelling
* src/qemu/qemu_process.c: label the lock manager plugin
FD with the process label
The virSecurityManagerSetFDLabel method is used to label
file descriptors associated with disk images. There will
shortly be a need to label other file descriptors in a
different way. So the current name is ambiguous. Rename
the method to virSecurityManagerSetImageFDLabel to clarify
its purpose
* src/libvirt_private.syms,
src/qemu/qemu_migration.c, src/qemu/qemu_process.c,
src/security/security_apparmor.c, src/security/security_dac.c,
src/security/security_driver.h, src/security/security_manager.c,
src/security/security_manager.h, src/security/security_selinux.c,
src/security/security_stack.c: s/FDLabel/ImageFDLabel/
We already have a public virDomainPinVcpu, which implies that
Pin and Vcpu are treated as separate words. Unreleased commit
e261987c introduced virDomainGetVcpupinInfo as the first public
API that used Vcpupin, although we had prior internal uses of
that spelling. For consistency, change the spelling to be two
words everywhere, regardless of whether pin comes first or last.
* daemon/remote.c: Treat vcpu and pin as separate words.
* include/libvirt/libvirt.h.in: Likewise.
* src/conf/domain_conf.c: Likewise.
* src/conf/domain_conf.h: Likewise.
* src/driver.h: Likewise.
* src/libvirt.c: Likewise.
* src/libvirt_private.syms: Likewise.
* src/libvirt_public.syms: Likewise.
* src/libxl/libxl_driver.c: Likewise.
* src/qemu/qemu_driver.c: Likewise.
* src/remote/remote_driver.c: Likewise.
* src/xen/xend_internal.c: Likewise.
* tools/virsh.c: Likewise.
* src/remote/remote_protocol.x: Likewise.
* src/remote_protocol-structs: Likewise.
Suggested by Matthias Bolte.
This patch implements the code to address the new API (virDomainGetVcpupinInfo)
in the qemu driver.
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Use NUMA's older nodemask_t (fixed-size map) rather than the newer
'struct bitmask' (variable-size) in order to still compile on RHEL 5,
with its numactl-devel-0.9.8.
* src/qemu/qemu_process.c [HAVE_NUMA]: Prefer back-compat mode.
(qemuProcessInitNumaMemoryPolicy): Use older nodemask_t.
The qemuMigrationPrepareDirect/PrepareTunnel methods accidentally
set the domain job to QEMU_JOB_MIGRATION_OUT when it should have
been QEMU_JOB_MIGRATION_IN. This didn't have any ill-effect, but
it is none-the-less wrong.
* src/qemu/qemu_migration.c: Fix job type
The code emitting taint warnings was mistakenly thinking
that guests run from the QEMU session driver were tainted
for having high privileges. This is of course nonsense
since the session driver is always unprivileged
* src/qemu/qemu_domain.c: Don't warn for high privileges in
non-privileged QEMU
If an application is using libvirt + KVM as a piece of its
internal infrastructure to perform a specific task, it can
be desirable to guarentee the VM dies when the virConnectPtr
disconnects from libvirtd. This ensures the app can't leak
any VMs it was using. Adding VIR_DOMAIN_START_AUTOKILL as
a flag when starting guests enables this to be done.
* include/libvirt/libvirt.h.in: All VIR_DOMAIN_START_AUTOKILL
* src/qemu/qemu_driver.c: Support automatic killing of guests
upon connection close
* tools/virsh.c: Add --autokill flag to 'start' and 'create'
commands
Migration is a multi-step process
1. Begin(src)
2. Prepare(dst)
3. Perform(src)
4. Finish(dst)
5. Confirm(src)
At step 2, a QEMU process is lauched in the destination to
accept the incoming migration. Occasionally the process
that is controlling the migration workflow aborts, and fails
to call step 4, Finish. This leaves a QEMU process running
on the target (albeit with paused CPUs). Unfortunately because
step 2 actives a job on the QEMU process, it is unkillable by
normal means.
By registering the VM for autokill against the src virConnectPtr
in step 2, we can ensure that the guest is forcefully killed off
if the connection is closed without step 4 being invoked
* src/qemu/qemu_migration.c: Register autokill in PrepareDirect
and PrepareTunnel. Unregister autokill on successful run
of Finish
* src/qemu/qemu_process.c: Unregister autokill when stopping a
process
Sometimes it is useful to be able to automatically destroy a guest when
a connection is closed. For example, kill an incoming migration if
the client managing the migration dies. This introduces a map between
guest 'uuid' strings and virConnectPtr objects. When a connection is
closed, any associated guests are killed off.
* src/qemu/qemu_conf.h: Add autokill hash table to qemu driver
* src/qemu/qemu_process.c, src/qemu/qemu_process.h: Add APIs
for performing autokill of guests associated with a connection
* src/qemu/qemu_driver.c: Initialize autodestroy map
For controlled shutdown we issue a 'system_powerdown' command
to the QEMU monitor. This triggers an ACPI event which (most)
guest OS wire up to a controlled shutdown. There is no equiv
ACPI event to trigger a controlled reboot. This patch attempts
to fake a reboot.
- In qemuDomainObjPrivatePtr we have a bool fakeReboot
flag.
- The virDomainReboot method sets this flag and then
triggers a normal 'system_powerdown'.
- The QEMU process is started with '-no-shutdown'
so that the guest CPUs pause when it powers off the
guest
- When we receive the 'POWEROFF' event from QEMU JSON
monitor if fakeReboot is not set we invoke the
qemuProcessKill command and shutdown continues
normally
- If fakeReboot was set, we spawn a background thread
which issues 'system_reset' to perform a warm reboot
of the guest hardware. Then it issues 'cont' to
start the CPUs again
* src/qemu/qemu_command.c: Add -no-shutdown flag if
we have JSON support
* src/qemu/qemu_domain.h: Add 'fakeReboot' flag to
qemuDomainObjPrivate struct
* src/qemu/qemu_driver.c: Fake reboot using the
system_powerdown command if JSON support is available
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h: Add
binding for system_reset command
* src/qemu/qemu_process.c: Reset the guest & start CPUs if
fakeReboot is set
GCC complained about a C99 for-loop declaration outside of C99 mode
when compiling on RHEL 5.
* src/qemu/qemu_driver.c (qemudDomainPinVcpuFlags): Avoid C99 for
loop, since gcc 4.1.2 hates it.
For virtio disks and interfaces, qemu allows users to enable or disable
ioeventfd feature. This means, qemu can execute domain code, while
another thread waits for I/O event. Basically, in some cases it is win,
in some loss. This feature is available via 'ioeventfd' attribute in disk
and interface <driver> element. It accepts 'on' and 'off'. Leaving this
attribute out defaults to hypervisor decision.
The following patch addresses the problem that when a PASSTHROUGH
mode DIRECT NIC connection is made the MAC address of the NIC is
not automatically set and reset to the configured VM MAC and
back again.
The attached patch fixes this problem by setting and resetting the MAC
while remembering the previous setting while the VM is running.
This also works if libvirtd is restarted while the VM is running.
the patch passes make syntax-check
Since we virEventRegisterDefaultImpl is now a public API, callers need
a way to invoke the default registered Handle and Timeout functions. We
already have general functions for these internally, so promote
them to the public API.
v2:
Actually add APIs to libvirt.h
Pinning to all physical cpus means resetting, hence it is preferable to
delete vcpupin setting of XML.
This patch changes qemu driver to delete vcpupin setting by invoking
virDomainVcpupinDel API when pinning the specified virtual cpu to
all host physical cpus.
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Implemented as setting NUMA policy between fork and exec as a hook,
using libnuma. Only support memory tuning on domain process currently.
For the nodemask out of range, will report soft warning instead of
hard error in libvirt layer. (Kernel will be silent as long as one
of set bit in the nodemask is valid on the host. E.g. For a host
has two NUMA nodes, kernel will be silent for nodemask "01010101").
So, soft warning is the only thing libvirt can do, as one might want
to specify the numa policy prior to a node that doesn't exist yet,
however, it may come as hotplug soon.
If the 'mac_filter' configuration parameter is enabled, and there
is a failure to enable filtering, no error is reported back to
the caller. Also fix some bogus whitespace indentation for
hugetlbfs_mount
* src/qemu/qemu_conf.c: Add missing error reporting
Prefer bootindex=N option for -device over the old way -boot ORDER
possibly accompanied with boot=on option for -drive. This gives us full
control over which device will actually be used for booting guest OS.
Moreover, if qemu doesn't support boot=on, this is the only way to boot
of certain disks in some configurations (such as virtio disks when used
together IDE disks) without transforming domain XML to use per device
boot elements.
When an operation started by virDomainBlockPullAll completes (either with
success or with failure), raise an event to indicate the final status. This
allows an API user to avoid polling on virDomainBlockPullInfo if they would
prefer to use the event mechanism.
* daemon/remote.c: Dispatch events to client
* include/libvirt/libvirt.h.in: Define event ID and callback signature
* src/conf/domain_event.c, src/conf/domain_event.h,
src/libvirt_private.syms: Extend API to handle the new event
* src/qemu/qemu_driver.c: Connect to the QEMU monitor event
for block_stream completion and emit a libvirt block pull event
* src/remote/remote_driver.c: Receive and dispatch events to application
* src/remote/remote_protocol.x: Wire protocol definition for the event
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c: Watch for BLOCK_STREAM_COMPLETED event
from QEMU monitor
Signed-off-by: Adam Litke <agl@us.ibm.com>
The virDomainBlockPull* family of commands are enabled by the
'block_stream' and 'info block_stream' qemu monitor commands.
* src/qemu/qemu_driver.c src/qemu/qemu_monitor_text.[ch]: implement disk
streaming by using the stream and info stream text monitor commands
* src/qemu/qemu_monitor_json.[ch]: implement commands using the qmp monitor
Signed-off-by: Adam Litke <agl@us.ibm.com>
Acked-by: Daniel P. Berrange <berrange@redhat.com>
From a security pov copy and paste between the guest and the client is not
always desirable. So we need to be able to enable/disable this. The best place
to do this from an administration pov is on the hypervisor, so the qemu cmdline
is getting a spice disable-copy-paste option, see bug 693645. Example qemu
invocation:
qemu -spice port=5932,disable-ticketing,disable-copy-paste
https://bugzilla.redhat.com/show_bug.cgi?id=693661
If qemu supports -chardev, our char frontend aliases are ex. 'charserial0'
not just 'serial0'. Typically we don't use this code path because the
pty's are scraped from stdout.
Qemu once supported following memory stats which will returned by
"query_balloon":
stat_put(dict, "actual", actual);
stat_put(dict, "mem_swapped_in", dev->stats[VIRTIO_BALLOON_S_SWAP_IN]);
stat_put(dict, "mem_swapped_out", dev->stats[VIRTIO_BALLOON_S_SWAP_OUT]);
stat_put(dict, "major_page_faults", dev->stats[VIRTIO_BALLOON_S_MAJFLT]);
stat_put(dict, "minor_page_faults", dev->stats[VIRTIO_BALLOON_S_MINFLT]);
stat_put(dict, "free_mem", dev->stats[VIRTIO_BALLOON_S_MEMFREE]);
stat_put(dict, "total_mem", dev->stats[VIRTIO_BALLOON_S_MEMTOT]);
But it later disabled all the stats except "actual" by commit
07b0403dfc2b2ac179ae5b48105096cc2d03375a.
libvirt doesn't parse "actual", so user will always see a empty result
with "virsh dommemstat $domain". Even qemu haven't disabled the stats,
we should support parsing "actual".
There is the case where cpu affinites for vcpu of qemu doesn't work
correctly. For example, if only one vcpupin setting entry is provided
and its setting is not for vcpu0, it doesn't work.
# virsh dumpxml VM
...
<vcpu>4</vcpu>
<cputune>
<vcpupin vcpu='3' cpuset='9-11'/>
</cputune>
...
# virsh start VM
Domain VM started
# virsh vcpuinfo VM
VCPU: 0
CPU: 31
State: running
CPU time: 2.5s
CPU Affinity: yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy
VCPU: 1
CPU: 12
State: running
CPU time: 0.9s
CPU Affinity: yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy
VCPU: 2
CPU: 30
State: running
CPU time: 1.5s
CPU Affinity: yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy
VCPU: 3
CPU: 13
State: running
CPU time: 1.7s
CPU Affinity: yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy
This patch fixes this problem.
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
This patch deprecates following enums:
VIR_DOMAIN_MEM_CURRENT
VIR_DOMAIN_MEM_LIVE
VIR_DOMAIN_MEM_CONFIG
VIR_DOMAIN_VCPU_LIVE
VIR_DOMAIN_VCPU_CONFIG
VIR_DOMAIN_DEVICE_MODIFY_CURRENT
VIR_DOMAIN_DEVICE_MODIFY_LIVE
VIR_DOMAIN_DEVICE_MODIFY_CONFIG
And modify internal codes to use virDomainModificationImpact.
The below patch decreases the response time of libvirt to errors reported by Qemu upon startup by checking whether the qemu process is still alive while polling for the local socket to show up.
This patch also introduces a special handling of signal for the Win32 part of virKillProcess.
If qemu supports multi function PCI device, the format of the PCI address passed
to qemu is "bus=pci.0,multifunction=on,addr=slot.function".
If qemu does not support multi function PCI device, the format of the PCI address
passed to qemu is "bus=pci.0,addr=slot".
Hot pluging/unpluging multi PCI device is not supported now. So the function
of hotplugged PCI device must be 0. When we hot unplug it, we should set release
all functions in the slot.
We save all used PCI address in the hash table. The key is generated by domain,
bus and slot now. We will support multi function PCI device, so the key should
be generated by domain, bus, slot and function.
We do not support to hot unplug multi function PCI device now. If the device is
one function of multi function PCI device, we shoul not allow to hot unplugg
it.
Detected by Coverity. All existing callers happen to be in
range, so this isn't too serious.
* src/qemu/qemu_cgroup.c (qemuCgroupControllerActive): Check
bounds before dereference.
When peer-2-peer migration was invoked by a client supporting
v3, but where the target server only supported v2, we'd not
correctly shutdown the guest.
* src/qemu/qemu_migration.c: Ensure guest is shutdown in
v2 peer 2 peer migration
The v2 migration protocol doesn't use cookies, so we should not
be raising an error if the cookie parameters are NULL.
* src/qemu/qemu_migration.c: Don't raise error if cookie is NULL
The error code for virKillProcess is returned in the errno variable
not the return value. THis mistake caused the logs to be filled with
errors when shutting down QEMU processes
* src/qemu/qemu_process.c: Fix process kill check.
This commit is safe precisely because there has been no release
for any of the enum values being deleted (they were added post-0.9.1).
After the 0.9.2 release, we can then take advantage of
virDomainModificationImpact in more places.
* include/libvirt/libvirt.h.in (virDomainModificationImpact): New
enum.
(virDomainSchedParameterFlags, virMemoryParamFlags): Delete, since
these were never released, and the new enum works fine here.
* src/libvirt.c (virDomainGetMemoryParameters)
(virDomainSetMemoryParameters)
(virDomainGetSchedulerParametersFlags)
(virDomainSetSchedulerParametersFlags): Update documentation.
* src/qemu/qemu_driver.c (qemuDomainSetMemoryParameters)
(qemuDomainGetMemoryParameters, qemuSetSchedulerParametersFlags)
(qemuSetSchedulerParameters, qemuGetSchedulerParametersFlags)
(qemuGetSchedulerParameters): Adjust clients.
* tools/virsh.c (cmdSchedinfo, cmdMemtune): Likewise.
Based on ideas by Daniel Veillard and Hu Tao.
Detected by Coverity. This leaked a cpumap on every iteration
of the loop. Leak introduced in commit 1cc4d02 (v0.9.0).
* src/qemu/qemu_process.c (qemuProcessSetVcpuAffinites): Plug
leak, and hoist allocation outside loop.
In v3 migration, once migration is completed, the VM needs
to be left in a paused state until after Finish3 has been
executed on the target. Only then will the VM be killed
off. When using non-JSON QEMU monitor though, we don't
receive any 'STOP' event from QEMU, so we need to manually
set our state offline & thus release lock manager leases.
It doesn't hurt to run this on the JSON case too, just in
case the event gets lost somehow
* src/qemu/qemu_migration.c: Explicitly set VM state to
paused when migration completes
The change 18c2a59206 caused
some regressions in behaviour of virDomainBlockStats
and virDomainBlockInfo in the QEMU driver.
The virDomainBlockInfo API stopped working for inactive
guests if querying a block device.
The virDomainBlockStats API did not promptly report
an error if the guest was not running in some cases.
* src/qemu/qemu_driver.c: Fix inactive guest handling
in BlockStats/Info APIs
The qemuAuditDisk calls in disk hotunplug operations were being
passed 'ret >= 0', but the code which sets ret to 0 was not yet
executed, and the error path had already jumped to the 'cleanup'
label. This meant hotunplug failures were never audited, and
hotunplug success was audited as a failure
* src/qemu/qemu_hotplug.c: Fix auditing of hotunplug
Commit 4454a9efc7 introduced bad
behaviour on the VIR_EVENT_HANDLE_ERROR condition. This condition
is only hit when an invalid FD is used in poll() (typically due
to a double-close bug). The QEMU monitor code was treating this
condition as non-fatal, and thus libvirt would poll() in a fast
loop forever burning 100% CPU. VIR_EVENT_HANDLE_ERROR must be
handled in the same way as VIR_EVENT_HANDLE_HANGUP, killing the
QEMU instance.
* src/qemu/qemu_monitor.c: Treat VIR_EVENT_HANDLE_ERROR as EOF
* src/conf/domain_conf.c, src/conf/domain_conf.h: APIs for
inserting/finding/removing virDomainLeaseDefPtr instances
* src/qemu/qemu_driver.c: Wire up hotplug/unplug for leases
* src/qemu/qemu_hotplug.h, src/qemu/qemu_hotplug.c: Support
for hotplug and unplug of leases
Some lock managers associate state with leases, allowing a process
to temporarily release its leases, and re-acquire them later, safe
in the knowledge that no other process has acquired + released the
leases in between.
This is already used between suspend/resume operations, and must
also be used across migration. This passes the lockstate in the
migration cookie. If the lock manager uses lockstate, then it
becomes compulsory to use the migration v3 protocol to get the
cookie support.
* src/qemu/qemu_driver.c: Validate that migration v2 protocol is
not used if lock manager needs state transfer
* src/qemu/qemu_migration.c: Transfer lock state in migration
cookie XML
The QEMU integrates with the lock manager instructure in a number
of key places
* During startup, a lock is acquired in between the fork & exec
* During startup, the libvirtd process acquires a lock before
setting file labelling
* During shutdown, the libvirtd process acquires a lock
before restoring file labelling
* During hotplug, unplug & media change the libvirtd process
holds a lock while setting/restoring labels
The main content lock is only ever held by the QEMU child process,
or libvirtd during VM shutdown. The rest of the operations only
require libvirtd to hold the metadata locks, relying on the active
QEMU still holding the content lock.
* src/qemu/qemu_conf.c, src/qemu/qemu_conf.h,
src/qemu/libvirtd_qemu.aug, src/qemu/test_libvirtd_qemu.aug:
Add config parameter for configuring lock managers
* src/qemu/qemu_driver.c: Add calls to the lock manager
Update the qemuDomainMigrateBegin method so that it accepts
an optional incoming XML document. This will be validated
for ABI compatibility against the current domain config,
and if this check passes, will be passed back out for use
by the qemuDomainMigratePrepare method on the target
* src/qemu/qemu_domain.c, src/qemu/qemu_domain.h,
src/qemu/qemu_migration.c: Allow custom XML to be passed
Currently the QEMU monitor I/O handler code uses errno values
to report errors. This results in a sub-optimal error messages
on certain conditions, in particular when parsing JSON strings
malformed data simply results in 'EINVAL'.
This changes the code to use the standard libvirt error reporting
APIs. The virError is stored against the qemuMonitorPtr struct,
and when a monitor API is run, any existing stored error is copied
into that thread's error local
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_text.c: Use
virError APIs for all monitor I/O handling code
Currently whenever there is any failure with parsing the monitor,
this is treated in the same was as end-of-file (ie QEMU quit).
The domain is terminated, if not already dead.
With this change, failures in parsing the monitor stream do not
result in the death of QEMU. The guest continues running unchanged,
but all further use of the monitor will be disabled.
The VMM_FAILURE event will be emitted, and the mgmt application
can decide when to kill/restart the guest to re-gain control
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h: Run a
different callback for monitor EOF vs error conditions.
* src/qemu/qemu_process.c: Emit VMM_FAILURE event when monitor
fails
* src/qemu/qemu_driver.c (qemuGetSchedulerParameters): Move
guts...
(qemuGetSchedulerParametersFlags): ...to new callback, and honor
flags more accurately.
This patch allows to modify interfaces of domain(qemu)
* src/conf/domain_conf.c src/conf/domain_conf.h src/libvirt_private.syms:
(virDomainNetInsert) : Insert a network device to domain definition.
(virDomainNetIndexByMac) : Returns an index of net device in array.
(virDomainNetRemoveByMac): Remove a NIC of passed MAC address.
* src/qemu/qemu_driver.c
(qemuDomainAttachDeviceConfig): add codes for NIC.
(qemuDomainDetachDeviceConfig): add codes for NIC.
Originally most of libvirt domain-specific calls were blocking
during a migration.
A new mechanism to allow specific calls (blkstat/blkinfo) to be
executed in such condition has been implemented.
In the long term it'd be desirable to get a more general
solution to mark further APIs as migration safe, without needing
special case code.
* src/qemu/qemu_migration.c: add some additional job signal
flags for doing blkstat/blkinfo during a migration
* src/qemu/qemu_domain.c: add a condition variable that can be
used to efficiently wait for the migration code to clear the
signal flag
* src/qemu/qemu_driver.c: execute blkstat/blkinfo using the
job signal flags during migration
When modifying the disk devices of a live domain and the domain
configuration, the function qemuDomainAttachDeviceConfig
first sets dev->data->disk to NULL. Later qemuDomainAttachDeviceLive
accesses dev->data.disk and causes a segfault.
* src/qemu/qemu_driver.c: fix qemuDomainModifyDeviceFlags() accordingly
http://lists.gnu.org/archive/html/qemu-devel/2011-05/threads.html#02162
Currently, qemu silently clips any JSON integer in the range
0x8000000000000000 - 0xffffffffffffffff (all numbers in this range
will be clipped to 0x7fffffffffffffff == LLONG_MAX).
To avoid this, pass these as signed 64 bit integers in the QMP
request.
The current virDomainMigrateFinish3 method signature attempts to
distinguish two types of errors, by allowing return with ret== 0,
but ddomain == NULL, to indicate a failure to start the guest.
This is flawed, because when ret == 0, there is no way for the
virErrorPtr details to be sent back to the client.
Change the signature of virDomainMigrateFinish3 so it simply
returns a virDomainPtr, in the same way as virDomainMigrateFinish2
The disk locking code will protect against the only possible
failure mode this doesn't account for (loosing conenctivity to
libvirtd after Finish3 starts the CPUs, but before the client
sees the reply for Finish3).
* src/driver.h, src/libvirt.c, src/libvirt_internal.h: Change
virDomainMigrateFinish3 to return a virDomainPtr instead of int
* src/remote/remote_driver.c, src/remote/remote_protocol.x,
daemon/remote.c, src/qemu/qemu_driver.c, src/qemu/qemu_migration.c:
Update for API change
When doing migration, if an error occurs in Perform, it must not
be overwritten during Finish/Confirm steps. If an error occurs
in Finish, it must not be overwritten in Confirm.
Previous commit a9d12c2444 added
code to qemudDomainMigrateFinish2 to preserve the error. This
is not the right place, because it is not applicable in non-p2p
migration. The src/libvirt.c virDomainMigrateV2/3 methods need
code to preserve errors for non-p2p migration, while the
doPeer2PeerMigrate2 and doPeer2PeerMigrate3 methods contain
code to preverse errors for p2p migration.
Remove the bogus error preservation from qemudDomainMigrateFinish2
and qemudDomainMigrateFinish3.
Fix virDomainMigrateV3 and doPeer2PeerMigrate3 so that they
preserve any error hit during the Finish3 step, before invoking
Confirm3.
Finally if qemuMigrationFinish fails to resume the CPUs, it must
preserve the error before tearing down the VM, so that VM cleanup
doesn't overwrite it.
* src/libvirt.c: Preserve error before invoking Confirm3
* src/qemu/qemu_driver.c: Remove bogus error preservation
code in qemudDomainMigrateFinish2/qemudDomainMigrateFinish3
* src/qemu/qemu_migration.c: Preserve error before invoking Confirm3
and after resume fails in qemuMigrationFinish.
* src/libvirt.c: Add further debug lines in helper APIs for
migration
* src/qemu/qemu_migration.c: Add debug lines for all internal
migration API parameters
Even when failing to start CPUs, the finish method was returning
a success result. Fix this so that the QEMU process is killed
off when finish fails under v3 protocol. Also rename the
killOnFinish boolean to 'v3proto' to make it clearer that this
is a tunable based on the migration protocol version
* src/qemu/qemu_driver.c: Update for API change
* src/qemu/qemu_migration.c, src/qemu/qemu_migration.h: Kill
VM in qemuMigrationFinish if failing to start CPUs
The SPICE seamless migration process requires data to be passed
back from the target host, to the source host via a cookie.
The cookie includes the target host's hostname, but this was not
stored, merely validated. This patch explicitly records the
remote hostname after parsing the cookie, and uses it when
initiating the SPICE migration
* qemu/qemu_migration.c: Fix SPICE seamless migration hostname
Before running perform in peer-2-peer migration, the current
guest state must be recorded, so that non-live migration can
currently unpause a running guest on completion.
* src/qemu/qemu_migration.c: Move check for offline guest
to fix non-live migration
The virDomainMigratePerform3 currently has a single URI parameter
whose meaning varies. It is either
- A QEMU migration URI (normal migration)
- A libvirtd connection URI (peer2peer migration)
Unfortunately when using peer2peer migration, without also
using tunnelled migration, it is possible that both URIs are
required.
This adds a second URI parameter to the virDomainMigratePerform3
method, to cope with this scenario. Each parameter how has a fixed
meaning.
NB, there is no way to actually take advantage of this yet,
since virDomainMigrate/virDomainMigrateToURI do not have any
way to provide the 2 separate URIs
* daemon/remote.c, src/remote/remote_driver.c,
src/remote/remote_protocol.x, src/remote_protocol-structs: Add
the second URI parameter to perform3 message
* src/driver.h, src/libvirt.c, src/libvirt_internal.h: Add
the second URI parameter to Perform3 method
* src/libvirt_internal.h, src/qemu/qemu_migration.c,
src/qemu/qemu_migration.h: Update to handle URIs correctly
This extends the v3 migration protocol such that the
virDomainMigrateBegin3 and virDomainMigratePerform3
methods accept an application supplied XML config for
the target VM.
If the 'xmlin' parameter is NULL, then Begin3 uses the
current guest XML as normal. A driver implementing the
Begin3 method should either reject all non-NULL 'xmlin'
parameters, or strictly validate that the app supplied
XML does not change guest ABI.
The Perform3 method also needed the xmlin parameter to
cope with the Peer2Peer migration sequence.
NB it is not yet possible to use this capability since
neither of the public virDomainMigrate/virDomainMigrateToURI
methods have a way to pass in XML.
* daemon/remote.c, src/remote/remote_driver.c,
src/remote/remote_protocol.x, src/remote_protocol-structs:
Add 'remote_string xmlin' parameter to begin3/perform3
RPC messages
* src/libvirt.c, src/driver.h, src/libvirt_internal.h: Add
'const char *xmlin' parameter to Begin3/Perform3 methods
* src/qemu/qemu_driver.c, src/qemu/qemu_migration.c,
src/qemu/qemu_migration.h: Pass xmlin parameter around
migration methods
Saving domain to previously created file changes also its ownership.
This is certainly not what users want if some conditions are met:
it is a regular, local file and dynamic_ownership is off.
NB: the enum that uses the string vnet-host (now changed to vhost-net)
is used in XML, but fortunately that hasn't been in an official
release yet, so it can still be fixed.
Since -vnc uses ':' to separate the address from the port, raw
IPv6 addresses need to be escaped like [addr]:port
* src/qemu/qemu_command.c: Escape raw IPv6 addresses with []
* tests/qemuxml2argvdata/qemuxml2argv-graphics-vnc.args,
tests/qemuxml2argvdata/qemuxml2argv-graphics-vnc.xml: Tweak
to test Ipv6 escaping
* docs/schemas/domain.rng: Allow Ipv6 addresses, or hostnames
in <graphics> listen attributes
The qemuMigrationConfirm method shouldn't deal with final VM
cleanup, since it can be called from the peer2peer migration,
which expects to still use the 'vm' object afterwards.
Push the cleanup code out of qemuMigrationConfirm, into its
caller, qemuDomainMigrateConfirm3
* src/qemu/qemu_driver.c: Add VM cleanup code to
qemuDomainMigrateConfirm3
* src/qemu/qemu_migration.c, src/qemu/qemu_migration.h: Remove
job handling cleanup from qemuMigrationConfirm
To allow new mandatory migration cookie data to be introduced,
add support for checking supported feature flags when parsing
migration cookie.
* src/qemu/qemu_migration.c: Feature flag checking in migration
cookie parsing
This adds a streaming-video=filter|all|off attribute. It is used to change
the behavior of video stream detection in spice, the default is filter (the
default for libvirt is not to specify it - the actual default is defined in
libspice-server.so).
Usage:
<graphics type='spice' autoport='yes'>
<streaming mode='off'/>
</graphics>
Tested with the above and with tests/qemuxml2argvtest.
Signed-off-by: Alon Levy <alevy@redhat.com>
This was discussed in:
https://www.redhat.com/archives/libvir-list/2011-May/msg01370.html
The capabilities code only sets the flag to allow use of vhost-net if
kvm is detected (set if the help string contains "(qemu-kvm-" or
"(kvm-"), but actually vhost-net is available in some qemu builds that
don't have kvm in their name, so just checking for ",vhost=" is enough.
Otherwise qemu is unable to write to it, with the error:
libvir: QEMU error : internal error unable to execute QEMU command 'memsave': Could not open '/var/cache/libvirt/qemu/qemu.mem.RRNvLv'
The v2 migration protocol had a limit on cookie length that was
too small to be useful for QEMU. Avoid generating cookies with
v2 protocol, so that old libvirtd can still reliably migrate a
guest to new libvirtd uses v2 protocol.
* src/qemu/qemu_driver.c: Avoid migration cookies with v2
migration
When generating a cookie for a guest with no data, the
QEMU_MIGRATION_COOKIE_GRAPHICS flag was set even if no
graphics data was added. Avoid setting the flag unless
it was needed, also add a safety check for mig->graphics
being non-NULL
* src/qemu/qemu_migration.c: Avoid cookie crash for guest
with no graphics
Improve invalid argument checks in the size query case. The drivers already
relied on this unchecked behavior.
Relax the implementation of virDomainGet(Memory|Blkio)MemoryParameters
in the drivers and allow to pass more memory than necessary for all
parameters.
params and nparams are essential and cannot be NULL. Check this in
libvirt.c and remove redundant checks from the drivers (e.g. xend).
Instead of enforcing that nparams must point to exact same value as
returned by virDomainGetSchedulerType relax this to a lower bound
check. This is what some drivers (e.g. xen hypervisor and esx)
already did. Other drivers (e.g. xend) didn't check nparams at all
and assumed that there is enough space in params.
Unify the behavior in all drivers to a lower bound check and update
nparams to the number of valid values in params on success.
By running the doTunnelSendAll code in a separate thread, the
main thread can do qemuMigrationWaitForCompletion as with
normal migration. This in turn ensures that job signals work
correctly and that progress monitoring can be done
* src/qemu/qemu_migration.c: Run tunnelled migration in
separate thread
Cancelling the QEMU migration may cause QEMU to flush pending
data on the migration socket. This may in turn block QEMU if
nothing reads from the other end of the socket. Closing the
socket before cancelling QEMU migration avoids this possible
deadlock.
* src/qemu/qemu_migration.c: Close sockets before cancelling
migration on failure
The 'nbytes' variable was not re-initialized to the
buffer size on each iteration of the tunnelled migration
loop. While saferead() will ensure a full read, except
on EOF, it is clearer to use the real buffer size
* src/qemu/qemu_migration.c: Always read full buffer of data
The qemuMigrationWaitForCompletion method contains a loop which
repeatedly queries QEMU to check migration progress, and also
processes job signals (pause, setspeed, setbandwidth, cancel).
The tunnelled migration loop does not currently support this
functionality, but should. Refactor the code to allow it to
be used with tunnelled migration.
Implement the v3 migration protocol, which has two extra
steps, 'begin' on the source host and 'confirm' on the
source host. All other methods also gain both input and
output cookies to allow bi-directional data passing at
all stages.
The QEMU peer2peer migration method gains another impl
to provide the v3 migration. This finally allows migration
cookies to work with tunnelled migration, which is required
for Spice seamless migration & the lock manager transfer
* src/qemu/qemu_driver.c: Wire up migrate v3 APIs
* src/qemu/qemu_migration.c, src/qemu/qemu_migration.h: Add
begin & confirm methods, and peer2peer impl of v3
Merge the doNonTunnelMigrate2 and doTunnelMigrate2 methods
into one doPeer2PeerMigrate2 method, since they are substantially
the same. With the introduction of v3 migration, this will be
even more important, to avoid massive code duplication.
* src/qemu/qemu_migration.c: Merge tunnel & non-tunnel migration
To facilitate the introduction of the v3 migration protocol,
the doTunnelMigrate method is refactored into two pieces. One
piece is intended to mirror the flow of virDomainMigrateVersion2,
while the other is the helper for setting up sockets and processing
the data.
Previously socket setup would be done before the 'prepare' step,
so errors could be dealt with immediately, avoiding need to shut
off the destination QEMU. In the new split, socket setup is done
after the 'prepare' step. This is not a serious problem, since
the control flow already requires calling 'finish' to tear down
the destination QEMU upon several errors.
* src/qemu/qemu_migration.c:
Use the graphics information from the QEMU migration cookie to
issue a 'client_migrate_info' monitor command to QEMU. This causes
the SPICE client to automatically reconnect to the target host
when migration completes
* src/qemu/qemu_migration.c: Set data for SPICE client relocation
before starting migration on src
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h: Add
new qemuMonitorGraphicsRelocate() command
Extend the QEMU migration cookie structure to allow information
about the destination host graphics setup to be passed by to
the source host. This will enable seamless migration of any
connected graphics clients
* src/qemu/qemu_migration.c: Add graphics info to migration
cookies
* daemon/libvirtd.c: Always initialize gnutls to enable
x509 cert parsing in QEMU
The migration protocol has support for a 'cookie' parameter which
is an opaque array of bytes as far as libvirt is concerned. Drivers
may use this for passing around arbitrary extra data they might
need during migration. The QEMU driver needs to do a few things:
- Pass hostname/uuid to allow strict protection against localhost
migration attempts
- Pass SPICE/VNC server port from the target back to the source to
allow seamless relocation of client sessions
- Pass lock driver state from source to destination
This patch introduces the basic glue for handling cookies
but only includes the host/guest UUID & name.
* src/libvirt_private.syms: Export virXMLParseStrHelper
* src/qemu/qemu_migration.c, src/qemu/qemu_migration.h: Parsing
and formatting of migration cookies
* src/qemu/qemu_driver.c: Pass in cookie parameters where possible
* src/remote/remote_protocol.h, src/remote/remote_protocol.x: Change
cookie max length to 16384 bytes
The qemuMigrationPrepareTunnel method should not unlock the
qemu driver, since that is the caller's job.
* src/qemu/qemu_migration.c: Fix qemuMigrationPrepareTunnel
unlocking of QEMU driver
Change all the driver struct initializers to use the
C99 style, leaving out unused fields. This will make
it possible to add new APIs without changing every
driver. eg change:
qemudDomainResume, /* domainResume */
qemudDomainShutdown, /* domainShutdown */
NULL, /* domainReboot */
qemudDomainDestroy, /* domainDestroy */
to
.domainResume = qemudDomainResume,
.domainShutdown = qemudDomainShutdown,
.domainDestroy = qemudDomainDestroy,
And get rid of any existing C99 style initializersr which
set NULL, eg change
.listPools = vboxStorageListPools,
.numOfDefinedPools = NULL,
.listDefinedPools = NULL,
.findPoolSources = NULL,
.poolLookupByName = vboxStoragePoolLookupByName,
to
.listPools = vboxStorageListPools,
.poolLookupByName = vboxStoragePoolLookupByName,
Fix some driver names:
s/virDrvCPUCompare/virDrvCompareCPU/
s/virDrvCPUBaseline/virDrvBaselineCPU/
s/virDrvQemuDomainMonitorCommand/virDrvDomainQemuMonitorCommand/
s/virDrvSecretNumOfSecrets/virDrvNumOfSecrets/
s/virDrvSecretListSecrets/virDrvListSecrets/
And some driver struct field names:
s/getFreeMemory/nodeGetFreeMemory/
Only in drivers which use virDomainObj, drivers that query hypervisor
for domain status need to be updated separately in case their hypervisor
supports this functionality.
The reason is also saved into domain state XML so if a domain is not
running (i.e., no state XML exists) the reason will be lost by libvirtd
restart. I think this is an acceptable limitation.
This is needed if we want to transfer a temporary file. If the
transfer is done with iohelper, we might run into a race condition,
where we unlink() file before iohelper is executed.
* src/fdstream.c, src/fdstream.h,
src/util/iohelper.c: Add new option
* src/lxc/lxc_driver.c, src/qemu/qemu_driver.c,
src/storage/storage_driver.c, src/uml/uml_driver.c,
src/xen/xen_driver.c: Expand existing function calls
We were 31/73 on whether to translate; since less than 50% translated
and since VIR_INFO is less than VIR_WARN which also doesn't translate,
this makes sense.
* cfg.mk (sc_prohibit_gettext_markup): Add VIR_INFO, since it
falls between WARN and DEBUG.
* daemon/libvirtd.c (qemudDispatchSignalEvent, remoteCheckAccess)
(qemudDispatchServer): Adjust offenders.
* daemon/remote.c (remoteDispatchAuthPolkit): Likewise.
* src/network/bridge_driver.c (networkReloadIptablesRules)
(networkStartNetworkDaemon, networkShutdownNetworkDaemon)
(networkCreate, networkDefine, networkUndefine): Likewise.
* src/qemu/qemu_driver.c (qemudDomainDefine)
(qemudDomainUndefine): Likewise.
* src/storage/storage_driver.c (storagePoolCreate)
(storagePoolDefine, storagePoolUndefine, storagePoolStart)
(storagePoolDestroy, storagePoolDelete, storageVolumeCreateXML)
(storageVolumeCreateXMLFrom, storageVolumeDelete): Likewise.
* src/util/bridge.c (brProbeVnetHdr): Likewise.
* po/POTFILES.in: Drop src/util/bridge.c.
These VIR_XXXX0 APIs make us confused, use the non-0-suffix APIs instead.
How do these coversions works? The magic is using the gcc extension of ##.
When __VA_ARGS__ is empty, "##" will swallow the "," in "fmt," to
avoid compile error.
example: origin after CPP
high_level_api("%d", a_int) low_level_api("%d", a_int)
high_level_api("a string") low_level_api("a string")
About 400 conversions.
8 special conversions:
VIR_XXXX0("") -> VIR_XXXX("msg") (avoid empty format) 2 conversions
VIR_XXXX0(string_literal_with_%) -> VIR_XXXX(%->%%) 0 conversions
VIR_XXXX0(non_string_literal) -> VIR_XXXX("%s", non_string_literal)
(for security) 6 conversions
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
No syntactic effect; this merely silences some clang warnings.
* src/libxl/libxl_driver.c (libxlDomainSetVcpusFlags): Drop
redundant ret=0 statement.
* src/qemu/qemu_monitor_text.c (qemuMonitorTextDriveDel):
Likewise.
Introduce a virProcessKill function that can be safely called
even when the job mutex is held. This allows virDomainDestroy
to kill any VM even if it is asleep in a monitor job. The PID
will die and the thread asleep on the monitor will then wake
up releasing the job mutex.
* src/qemu/qemu_driver.c: Kill process before using qemuProcessStop
to ensure job is released
* src/qemu/qemu_process.c: Add virProcessKill for killing off
QEMU processes