Leak detected by Coverity, and introduced in commit 93ab585.
Reported by Alex Jia.
* src/qemu/qemu_driver.c (qemuDomainSetBlkioParameters): Free
devices array on error.
https://bugzilla.redhat.com/show_bug.cgi?id=770520
We had two nested loops both trying to use 'i' as the iteration
variable, which can result in an infinite loop when the inner
loop interferes with the outer loop. Introduced in commit 93ab585.
* src/qemu/qemu_driver.c (qemuDomainSetBlkioParameters): Don't
reuse iteration variable across two loops.
The lifetime of the virDomainEventState object is tied to
the lifetime of the driver, which in stateless drivers is
tied to the lifetime of the virConnectPtr.
If we add & remove a timer when allocating/freeing the
virDomainEventState object, we can get a situation where
the timer still triggers once after virDomainEventState
has been freed. The timeout callback can't keep a ref
on the event state though, since that would be a circular
reference.
The trick is to only register the timer when a callback
is registered with the event state & remove the timer
when the callback is unregistered.
The demo for the bug is to run
while true ; do date ; ../tools/virsh -q -c test:///default 'shutdown test; undefine test; dominfo test' ; done
prior to this fix, it will frequently hang and / or
crash, or corrupt memory
Currently all drivers using domain events need to provide a callback
for handling a timer to dispatch events in a clean stack. There is
no technical reason for dispatch to go via driver specific code. It
could trivially be dispatched directly from the domain event code,
thus removing tedious boilerplate code from all drivers
Also fix the libxl & xen drivers to pass 'true' when creating the
virDomainEventState, since they run inside the daemon & thus always
expect events to be present.
* src/conf/domain_event.c, src/conf/domain_event.h: Internalize
dispatch of events from timer callback
* src/libxl/libxl_driver.c, src/lxc/lxc_driver.c,
src/qemu/qemu_domain.c, src/qemu/qemu_driver.c,
src/remote/remote_driver.c, src/test/test_driver.c,
src/uml/uml_driver.c, src/vbox/vbox_tmpl.c,
src/xen/xen_driver.c: Remove all timer dispatch functions
When registering a callback for a particular event some callers
need to know how many callbacks already exist for that event.
While it is possible to ask for a count, this is not free from
race conditions when threaded. Thus the API for registering
callbacks should return the count of callbacks. Also rename
virDomainEventStateDeregisterAny to virDomainEventStateDeregisterID
* src/conf/domain_event.c, src/conf/domain_event.h,
src/libvirt_private.syms: Return count of callbacks when
registering callbacks
* src/libxl/libxl_driver.c, src/libxl/libxl_driver.c,
src/qemu/qemu_driver.c, src/remote/remote_driver.c,
src/remote/remote_driver.c, src/uml/uml_driver.c,
src/vbox/vbox_tmpl.c, src/xen/xen_driver.c: Update
for change in APIs
If managed save fails at the right point in time, then the save
image can end up with 0 bytes in length (no valid header), and
our attempts in commit 55d88def to detect and skip invalid save
files missed this case.
* src/qemu/qemu_driver.c (qemuDomainSaveImageOpen): Also unlink
empty file as corrupt. Reported by Dennis Householder.
This chunk of code below repeated in several functions, factor it into
a helper method virDomainLiveConfigHelperMethod to eliminate duplicated code
based on Eric and Adam's suggestion. I have tested it for all the
relevant APIs changed.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Lei Li <lilei@linux.vnet.ibm.com>
When destroying a domain qemuDomainDestroy kills its qemu process and
starts a new job, which means it unlocks the domain object and locks it
again after some time. Although the object is usually unlocked for a
pretty short time, chances are another thread processing an EOF event on
qemu monitor is able to lock the object first and does all the cleanup
by itself. This leads to wrong shutoff reason and lifecycle event detail
and virDomainDestroy API incorrectly reporting failure to destroy an
inactive domain.
Reported by Charlie Smurthwaite.
Currently qemuDomainAssignPCIAddresses() is called to assign addresses
to PCI devices.
We need to do something similar for devices with spapr-vio addresses.
So create one place where address assignment will be done, that is
qemuDomainAssignAddresses().
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Fix a logic error, the initial value of ret = -1, if just set --config,
it will goto endjob directly without doing its really job here.
Signed-off-by: Lei Li <lilei@linux.vnet.ibm.com>
filter 0-device-weight when:
- getting blkio parameters with --config
- starting up a domain
When testing with blkio, I found these issues:
(dom is down)
virsh blkiotune dom --device-weights /dev/sda,300,/dev/sdb,500
virsh blkiotune dom --device-weights /dev/sda,300,/dev/sdb,0
virsh blkiotune dom
weight : 800
device_weight : /dev/sda,200,/dev/sdb,0
# issue 1: shows 0 device weight of /dev/sdb that may confuse user
(continued)
virsh start dom
# issue 2: If /dev/sdb doesn't exist, libvirt refuses to bring the
# dom up because it wants to set the device weight to 0 of a
# non-existing device. Since 0 means no weight-limit, we really don't
# have to set it.
Prior to this patch, for a running dom, the commands:
$ virsh blkiotune dom --device-weights /dev/sda,502,/dev/sdb,498
$ virsh blkiotune dom --device-weights /dev/sda,503
$ virsh blkiotune dom
weight : 500
device_weight : /dev/sda,503
claim that /dev/sdb no longer has a non-default weight, but
directly querying cgroups says otherwise:
$ cat /cgroup/blkio/libvirt/qemu/dom/blkio.weight_device
8:0 503
8:16 498
After this patch, an explicit 0 is required to remove a device path
from the XML, and omitting a device path that was previously
specified leaves that device path untouched in the XML, to match
cgroups behavior.
* src/qemu/qemu_driver.c (parseBlkioWeightDeviceStr): Rename...
(qemuDomainParseDeviceWeightStr): ...and use correct type.
(qemuDomainSetBlkioParameters): After parsing string, modify
rather than replacing existing table.
* tools/virsh.pod (blkiotune): Tweak wording.
Implement the block I/O throttle setting and getting support to qemu
driver.
Signed-off-by: Lei Li <lilei@linux.vnet.ibm.com>
Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
The virTimestamp and virTimeMs functions in src/util/util.h
duplicate functionality from virtime.h, in a non-async signal
safe manner. Remove them, and convert all code over to the new
APIs.
* src/util/util.c, src/util/util.h: Delete virTimeMs and virTimestamp
* src/lxc/lxc_driver.c, src/qemu/qemu_domain.c,
src/qemu/qemu_driver.c, src/qemu/qemu_migration.c,
src/qemu/qemu_process.c, src/util/event_poll.c: Convert to use
virtime APIs
Without this, 'virsh blkiotune --live --config --weight=n'
only affected live.
* src/qemu/qemu_driver.c (qemuDomainSetBlkioParameters): Allow
setting both configurations at once.
After the previous patch, there are now some redundant checks.
* src/qemu/qemu_driver.c (qemudDomainGetVcpuPinInfo)
(qemuGetSchedulerParametersFlags): Drop checks now guaranteed by
libvirt.c.
* src/lxc/lxc_driver.c (lxcGetSchedulerParametersFlags):
Likewise.
It requires the domain is running, otherwise fails. Resize to a lower
size is supported, but should be used with extreme caution.
In order to prohibit the "size" overflowing after multiplied by
1024. We do checking in the codes. For QMP mode, the default units
is Bytes, the passed size needs to be multiplied by 1024, however,
for HMP mode, the default units is "Megabytes", the passed "size"
needs to be divided by 1024 then.
Add the core functions that implement the functionality of the API.
Suspend is done by using an asynchronous mechanism so that we can return
the status to the caller before the host gets suspended. This asynchronous
operation is achieved by suspending the host in a separate thread of
execution. However, returning the status to the caller is only best-effort,
but not guaranteed.
To resume the host, an RTC alarm is set up (based on how long we want to
suspend) before suspending the host. When this alarm fires, the host
gets woken up.
Suspend-to-RAM operation on a host running Linux can take upto more than 20
seconds, depending on the load of the system. (Freezing of tasks, an operation
preceding any suspend operation, is given up after a 20 second timeout).
And Suspend-to-Disk can take even more time, considering the time required
for compaction, creating the memory image and writing it to disk etc.
So, we do not allow the user to specify a suspend duration of less than 60
seconds, to be on the safer side, since we don't want to prematurely declare
failure when we only had to wait for some more time.
Generally, functions which return malloc'd strings should be typed
as 'char *', not 'const char *', to make it obvious that the caller
is responsible to free things. free(const char *) fails to compile,
and although we have a cast embedded in VIR_FREE to work around poor
code that frees const char *, it's better to not rely on that hack.
* src/qemu/qemu_driver.c (qemuDiskPathToAlias): Change return type.
(qemuDomainBlockJobImpl): Update caller.
Commit 89b6284f made it possible to pass either a source name or
the target device to most API demanding a disk designation, but
forgot to update the documentation. It also failed to update
virDomainBlockStats to take both forms. This patch fixes both the
documentation and the remaining function.
Xen continues to use just device shorthand (that is, I did not
implement path lookup there, since xen does not track a domain_conf
to quickly tie a path back to the device shorthand).
* src/libvirt.c (virDomainBlockStats, virDomainBlockStatsFlags)
(virDomainGetBlockInfo, virDomainBlockPeek)
(virDomainBlockJobAbort, virDomainGetBlockJobInfo)
(virDomainBlockJobSetSpeed, virDomainBlockPull): Document
acceptable disk naming conventions.
* src/qemu/qemu_driver.c (qemuDomainBlockStats)
(qemuDomainBlockStatsFlags): Allow lookup by source name.
* src/test/test_driver.c (testDomainBlockStats): Likewise.
Rename the macvtap.c file to virnetdevmacvlan.c to reflect its
functionality. Move the port profile association code out into
virnetdevvportprofile.c. Make the APIs available unconditionally
to callers
* src/util/macvtap.h: rename to src/util/virnetdevmacvlan.h,
* src/util/macvtap.c: rename to src/util/virnetdevmacvlan.c
* src/util/virnetdevvportprofile.c, src/util/virnetdevvportprofile.h:
Pull in vport association code
* src/Makefile.am, src/conf/domain_conf.h, src/qemu/qemu_conf.c,
src/qemu/qemu_conf.h, src/qemu/qemu_driver.c: Update include
paths & remove conditional compilation
In preparation for code re-organization, rename the Macvtap
management APIs to have the following patterns
virNetDevMacVLanXXXXX - macvlan/macvtap interface management
virNetDevVPortProfileXXXX - virtual port profile management
* src/util/macvtap.c, src/util/macvtap.h: Rename APIs
* src/conf/domain_conf.c, src/network/bridge_driver.c,
src/qemu/qemu_command.c, src/qemu/qemu_command.h,
src/qemu/qemu_driver.c, src/qemu/qemu_hotplug.c,
src/qemu/qemu_migration.c, src/qemu/qemu_process.c,
src/qemu/qemu_process.h: Update for renamed APIs
Qemu will be the first driver to make use of a typed string in the
next round of additions. Separate out the trivial addition.
* src/qemu/qemu_driver.c (qemudSupportsFeature): Advertise feature.
(qemuDomainGetBlkioParameters, qemuDomainGetMemoryParameters)
(qemuGetSchedulerParametersFlags, qemudDomainBlockStatsFlags):
Allow typed strings flag where trivially supported.
The bridge management APIs in src/util/bridge.c require a brControl
object to be passed around. This holds the file descriptor for the
control socket. This extra object complicates use of the API for
only a minor efficiency gain, which is in turn entirely offset by
the need to fork/exec the brctl command for STP configuration.
This patch removes the 'brControl' object entirely, instead opening
the control socket & closing it again within the scope of each method.
The parameter names for the APIs are also made to consistently use
'brname' for bridge device name, and 'ifname' for an interface
device name. Finally annotations are added for non-NULL parameters
and return check validation
* src/util/bridge.c, src/util/bridge.h: Remove brControl object
and update API parameter names & annotations.
* src/lxc/lxc_driver.c, src/network/bridge_driver.c,
src/uml/uml_conf.h, src/uml/uml_conf.c, src/uml/uml_driver.c,
src/qemu/qemu_command.c, src/qemu/qemu_conf.h,
src/qemu/qemu_driver.c: Remove reference to 'brControl' object
While Xen only has a single paravirt console, UML, and
QEMU both support multiple paravirt consoles. The LXC
driver can also be trivially made to support multiple
consoles. This patch extends the XML to allow multiple
<console> elements in the XML. It also makes the UML
and QEMU drivers support this config.
* src/conf/domain_conf.c, src/conf/domain_conf.h: Allow
multiple <console> devices
* src/lxc/lxc_driver.c, src/xen/xen_driver.c,
src/xenxs/xen_sxpr.c, src/xenxs/xen_xm.c: Update for
internal API changes
* src/security/security_selinux.c, src/security/virt-aa-helper.c:
Only label consoles that aren't a copy of the serial device
* src/qemu/qemu_command.c, src/qemu/qemu_driver.c,
src/qemu/qemu_process.c, src/uml/uml_conf.c,
src/uml/uml_driver.c: Support multiple console devices
* tests/qemuxml2xmltest.c, tests/qemuxml2argvtest.c: Extra
tests for multiple virtio consoles. Set QEMU_CAPS_CHARDEV
for all console /channel tests
* tests/qemuxml2argvdata/qemuxml2argv-channel-virtio-auto.args,
tests/qemuxml2argvdata/qemuxml2argv-channel-virtio.args
tests/qemuxml2argvdata/qemuxml2argv-console-virtio.args: Update
for correct chardev syntax
* tests/qemuxml2argvdata/qemuxml2argv-console-virtio-many.args,
tests/qemuxml2argvdata/qemuxml2argv-console-virtio-many.xml: New
test file
Document the parameter names that will be used by
virDomain{Get,Set}SchedulerParameters{,Flags}, rather than
hard-coding those names in each driver, to match what is
done with memory, blkio, and blockstats parameters.
* include/libvirt/libvirt.h.in (VIR_DOMAIN_SCHEDULER_CPU_SHARES)
(VIR_DOMAIN_SCHEDULER_VCPU_PERIOD)
(VIR_DOMAIN_SCHEDULER_VCPU_QUOTA, VIR_DOMAIN_SCHEDULER_WEIGHT)
(VIR_DOMAIN_SCHEDULER_CAP, VIR_DOMAIN_SCHEDULER_RESERVATION)
(VIR_DOMAIN_SCHEDULER_LIMIT, VIR_DOMAIN_SCHEDULER_SHARES): New
field name macros.
* src/qemu/qemu_driver.c (qemuSetSchedulerParametersFlags)
(qemuGetSchedulerParametersFlags): Use new defines.
* src/test/test_driver.c (testDomainGetSchedulerParamsFlags)
(testDomainSetSchedulerParamsFlags): Likewise.
* src/xen/xen_hypervisor.c (xenHypervisorGetSchedulerParameters)
(xenHypervisorSetSchedulerParameters): Likewise.
* src/xen/xend_internal.c (xenDaemonGetSchedulerParameters)
(xenDaemonSetSchedulerParameters): Likewise.
* src/lxc/lxc_driver.c (lxcSetSchedulerParametersFlags)
(lxcGetSchedulerParametersFlags): Likewise.
* src/esx/esx_driver.c (esxDomainGetSchedulerParametersFlags)
(esxDomainSetSchedulerParametersFlags): Likewise.
* src/libxl/libxl_driver.c (libxlDomainGetSchedulerParametersFlags)
(libxlDomainSetSchedulerParametersFlags): Likewise.
Since all virTypedParameter APIs allow us to return the number
of slots we actually populated, we should allow the user to
call with nparams too small (without overrunning their array)
or too large (ignoring the tail of the array that we can't fill),
rather than requiring that they get things exactly right.
Making this change will make it easier for a future patch to
introduce VIR_TYPED_PARAM_STRING, with filtering in libvirt.c
rather than in every single driver, since users already have
to be prepared for *nparams to be smaller on exit than on entry.
* src/qemu/qemu_driver.c (qemuDomainGetBlkioParameters)
(qemuDomainGetMemoryParameters): Allow variable nparams on entry.
(qemuGetSchedulerParametersFlags): Drop redundant check.
(qemudDomainBlockStats, qemudDomainBlockStatsFlags): Rename...
(qemuDomainBlockStats, qemuDomainBlockStatsFlags): ...to this.
Don't return unavailable stats.
The qemu RBD driver needs access to the conn in order to get the secret
needed for connecting to the ceph cluster.
Signed-off-by: Sage Weil <sage@newdream.net>
The QEMU monitor command 'add_client' can be used to connect to
a VNC or SPICE graphics display. This allows for implementation
of the virDomainOpenGraphics API
* src/qemu/qemu_driver.c: Implement virDomainOpenGraphics
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h:
Add binding for 'add_client' command
Rather than making all clients of monitor commands that are JSON-only
check whether yajl support was compiled in, it is simpler to just
avoid setting the capability bit up front if we can't use the capability.
* src/qemu/qemu_capabilities.c (qemuCapsComputeCmdFlags): Only set
capability bit if we also have yajl library to use it.
* src/qemu/qemu_driver.c (qemuDomainReboot): Drop #ifdefs.
* src/qemu/qemu_process.c (qemuProcessStart): Likewise.
* tests/qemuhelptest.c (testHelpStrParsing): Pass test even
without yajl.
* tests/qemuxml2argvtest.c (mymain): Simplify use of json flag.
* tests/qemuxml2argvdata/qemuxml2argv-disk-drive-error-*.args:
Update expected results to match.
This patch is rather cosmetic as it only moves device alias
assignation from command line construction just before that.
However, it is needed in connotation of previous and next patch.
The improvements to virBuffer, along with a paradigm shift to pass
the original buffer through rather than creating a second buffer,
allow us to shave off quite a few lines of code.
* src/util/sysinfo.h (virSysinfoFormat): Alter signature.
* src/util/sysinfo.c (virSysinfoFormat, virSysinfoBIOSFormat)
(virSysinfoSystemFormat, virSysinfoProcessorFormat)
(virSysinfoMemoryFormat): Change indentation parameter.
* src/conf/domain_conf.c (virDomainSysinfoDefFormat): Adjust
caller.
* src/qemu/qemu_driver.c (qemuGetSysinfo): Likewise.
There is a little difference between the output of domxml-to-native and the actual commandline.
No matter qemu is in control or readline mode, domxml-to-native always converts it to readline mode.
That is because the parameter "monitor_json" for qemuBuildCommandLine() is always set to false
in qemuDomainXMLToNative().
Signed-off-by: tangchen <tangchen@cn.fujitsu.com>
Noticed when testing new libvirt against old qemu that lacked the
snapshot_blkdev HMP command. Libvirt was mistakenly treating the
command as successful, and re-writing the domain XML to use the
just-created 0-byte file, rendering the domain broken on restart.
* src/qemu/qemu_monitor_text.c (qemuMonitorTextDiskSnapshot):
Notice another possible error message.
* src/qemu/qemu_driver.c
(qemuDomainSnapshotCreateSingleDiskActive): Don't keep 0-byte file
on failure.
As this is needed. Although some functions check for domain
being active before obtaining job, we need to check it after,
because obtaining job unlocks domain object, during which
a state of domain can be changed.
This patch extends qemudDomainCoreDump so it supports new VIR_DUMP_RESET
flag. If this flag is set, domain is reset on successful dump. However,
this is needed to be done after we start CPUs.
With the recent refactoring of qemu snapshot relationships, it
is now trivial to filter on leaves.
* src/conf/domain_conf.c (virDomainSnapshotObjListCount)
(virDomainSnapshotObjListCopyNames): Handle new flag.
* src/qemu/qemu_driver.c (qemuDomainSnapshotListNames)
(qemuDomainSnapshotNum, qemuDomainSnapshotListChildrenNames)
(qemuDomainSnapshotNumChildren): Pass new flag through.
The previous optimizations lead to some follow-on cleanups.
* src/conf/domain_conf.c (virDomainSnapshotForEachChild)
(virDomainSnapshotForEachDescendant): Drop dead parameter.
(virDomainSnapshotActOnDescendant)
(virDomainSnapshotObjListNumFrom)
(virDomainSnapshotObjListGetNamesFrom): Update callers.
* src/qemu/qemu_driver.c (qemuDomainSnapshotNumChildren)
(qemuDomainSnapshotListChildrenNames, qemuDomainSnapshotDelete):
Likewise.
* src/conf/domain_conf.h: Update prototypes.
Maintain the parent/child relationships of all qemu snapshots.
* src/qemu/qemu_driver.c (qemuDomainSnapshotLoad): Populate
relationships after loading.
(qemuDomainSnapshotCreateXML): Set relations on creation; tweak
redefinition to reuse existing object.
(qemuDomainSnapshotReparentChildren, qemuDomainSnapshotDelete):
Clear relations on delete.
Not too hard to wire up. The trickiest part is realizing that
listing children of a snapshot cannot use SNAPSHOT_LIST_ROOTS,
and that we overloaded that bit to also mean SNAPSHOT_LIST_DESCENDANTS;
we use that bit to decide which iteration to use, but don't want
the existing counting/listing functions to see that bit.
* src/conf/domain_conf.h (virDomainSnapshotObjListNumFrom)
(virDomainSnapshotObjListGetNamesFrom): New prototypes.
* src/conf/domain_conf.c (virDomainSnapshotObjListNumFrom)
(virDomainSnapshotObjListGetNamesFrom): New functions.
* src/libvirt_private.syms (domain_conf.h): Export them.
* src/qemu/qemu_driver.c (qemuDomainSnapshotNumChildren)
(qemuDomainSnapshotListChildrenNames): New functions.
If parsing qemu command line fails (e.g. because of non-existing
process number supplied), we jump to cleanup label where we free
pidfile. Therefore it needs to be initialized. Otherwise we free
random pointer.
Implements the documentation for snapshot revert vs. force.
Part of the patch tightens existing behavior (previously, reverting
to an old snapshot without <domain> was blindly attempted, now it
requires force), while part of it relaxes behavior (previously, it
was not possible to revert an active domain to an ABI-incompatible
active snapshot, now force allows this transition).
* src/qemu/qemu_driver.c (qemuDomainRevertToSnapshot): Check for
risky situations, and allow force to get past them.
Qemu driver tries to update balloon data in virDomainGetInfo and if it
can't do so because there is another monitor job running, it just
reports what's known in domain def. However, if there was no job running
but getting the data from qemu fails, we would fail the whole API. This
doesn't make sense. Let's make the failure nonfatal.
Currently, qemuDomainGetXMLDesc and qemudDomainGetInfo check for
outstanding synchronous job before (eventual) monitor entering.
However, there can be already async job set, e.g. migration.
If the daemon is restarted so we reconnect to monitor, cdrom media
can be ejected. In that case we don't want to show it in domain xml,
or require it on migration destination.
To check for disk status use 'info block' monitor command.
First hypervisor implementation of the new API.
Allows 'virsh snapshot-list --tree' to be more efficient.
* src/qemu/qemu_driver.c (qemuDomainSnapshotGetParent): New
function.
Commit 282fe1f0 documented that transient domains will auto-delete
any snapshot metadata when the last reference to the domain is
removed, and that management apps are in charge of grabbing any
snapshot metadata prior to that point. However, this was not
actually implemented for qemu until now.
* src/qemu/qemu_driver.c (qemudDomainCreate)
(qemuDomainDestroyFlags, qemuDomainSaveInternal)
(qemudDomainCoreDump, qemuDomainRestoreFlags, qemudDomainDefine)
(qemuDomainUndefineFlags, qemuDomainMigrateConfirm3)
(qemuDomainRevertToSnapshot): Clean up snapshot metadata.
* src/qemu/qemu_migration.c (qemuMigrationPrepareAny)
(qemuMigrationPerformJob, qemuMigrationPerformPhase)
(qemuMigrationFinish): Likewise.
* src/qemu/qemu_process.c (qemuProcessHandleMonitorEOF)
(qemuProcessReconnect, qemuProcessReconnectHelper)
(qemuProcessAutoDestroyDom): Likewise.
This patch is mostly code motion - moving some functions out
of qemu_driver and into qemu_domain so they can be reused by
multiple qemu_* files (since qemu_driver.h must not grow).
It also adds a new helper function, qemuDomainRemoveInactive,
which will be used in the next patch.
* src/qemu/qemu_domain.h (qemuFindQemuImgBinary)
(qemuDomainSnapshotWriteMetadata, qemuDomainSnapshotForEachQcow2)
(qemuDomainSnapshotDiscard, qemuDomainSnapshotDiscardAll)
(qemuDomainRemoveInactive): New prototypes.
(struct qemu_snap_remove): New struct.
* src/qemu/qemu_domain.c (qemuDomainRemoveInactive)
(qemuDomainSnapshotDiscardAllMetadata): New functions.
(qemuFindQemuImgBinary, qemuDomainSnapshotWriteMetadata)
(qemuDomainSnapshotForEachQcow2, qemuDomainSnapshotDiscard)
(qemuDomainSnapshotDiscardAll): Move here...
* src/qemu/qemu_driver.c (qemuFindQemuImgBinary)
(qemuDomainSnapshotWriteMetadata, qemuDomainSnapshotForEachQcow2)
(qemuDomainSnapshotDiscard, qemuDomainSnapshotDiscardAll): ...from
here.
(qemuDomainUndefineFlags): Update caller.
* src/conf/domain_conf.c (virDomainRemoveInactive): Doc fixes.
Commit 19f8c98 introduced VIR_DOMAIN_UNDEFINE_SNAPSHOTS_METADATA,
with the intent that omitting the flag makes undefine fail, and
including the flag deletes metadata. But it used the wrong logic.
Also, hoist the transient domain sooner, so that we don't
accidentally remove metadata of a transient domain.
* src/qemu/qemu_driver.c (qemuDomainUndefineFlags): Check correct
flag value.
The commit that prevents disk corruption on domain shutdown
(96fc478417) causes regression with QEMU
0.14.* and 0.15.* because of a regression bug in QEMU that was fixed
only recently in QEMU git. The affected versions of QEMU do not quit on
SIGTERM if started with -no-shutdown, which we use to implement fake
reboot. Since -no-shutdown tells QEMU not to quit automatically on guest
shutdown, domains started using the affected QEMU cannot be shutdown
properly and stay in a paused state.
This patch disables fake reboot feature on such QEMU by not using
-no-shutdown, which makes shutdown work as expected. However,
virDomainReboot will not work in this case and it will report "Requested
operation is not valid: Reboot is not supported with this QEMU binary".
Virsh man page lists driver types to be used with attach-device
command, but does not specify that those are usable only with the XEN
Hypervisor.
This patch adds statement, that those options specified are applicable
only on the Xen hypervisor and adds option usable with qemu emulator.
This patch also changes type of error returned by QEMU driver if the
user specifies incompatible driver type from VIR_ERR_INTERNAL_ERROR to
VIR_ERR_CONFIG_UNSUPPORTED.
For all types of disks other than qcow2, we were requesting that
SELinux labeling visit the new file as if it were qcow2, which
means labeling would try to find the backing files of an empty file.
And for a pre-existing qcow2 disk, we were passing NULL, which meant
that labelling tried to probe the file type (and if probing is
disabled, per the default qemu.conf, this made snapshots fail).
What we really want is to make SELinux labeling visit the new
file as raw; it will later be converted to qcow2 if qemu successfully
made the snapshot.
* src/qemu/qemu_driver.c
(qemuDomainSnapshotCreateSingleDiskActive): Force SELinux labeling
to avoid probe of new file.
For external snapshots to be useful on persistent domains, we must
alter the persistent definition alongside the running definition.
Thanks to the possibility of disk hotplug as well as of edits that
only affect the persistent xml, we can't assume that vm->def and
vm->newDef have the same disk at the same index, so we can only
update the persistent copy if the device destination matches up.
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateDiskActive)
(qemuDomainSnapshotCreateSingleDiskActive): Also affect newDef, if
present.
Ever since we introduced fake reboot, we call qemuProcessKill as a
reaction to SHUTDOWN event. Unfortunately, qemu doesn't guarantee it
flushed all internal buffers before sending SHUTDOWN, in which case
killing the process forcibly may result in (virtual) disk corruption.
By sending just SIGTERM without SIGKILL we give qemu time to to flush
all buffers and exit. Once qemu exits, we will see an EOF on monitor
connection and tear down the domain. In case qemu ignores SIGTERM or
just hangs there, the process stays running but that's not any different
from a possible hang anytime during the shutdown process so I think it's
just fine.
Also qemu (since 0.14 until it's fixed) has a bug in SIGTERM processing
which causes it not to exit but instead send new SHUTDOWN event and keep
waiting. I think the best we can do is to ignore duplicate SHUTDOWN
events to avoid a SHUTDOWN-SIGTERM loop and leave the domain in paused
state.
Now that migration speed is stored in qemuDomainObjPrivate structure,
save the new value when invoking qemuDomainMigrateSetMaxSpeed().
Allow setting migration speed on inactive domain too.
Regression introduced in commit 3881a470, due to an improper rebase
of a cleanup written beforehand but only applied after a rebased of
a refactoring that created a new function in commit 25fb3ef.
Also avoids passing NULL to printf %s.
* src/qemu/qemu_driver.c: In qemuDomainSnapshotForEachQcow2()
it free up the memory of qemu_driver->qemuImgBinary in the
cleanup tag which leads to the garbage value of qemuImgBinary
in qemu_driver struct and libvirtd crash when running
"virsh snapshot-create" command a second time.
Signed-off-by: Eric Blake <eblake@redhat.com>
Regression introduced in commit 89b6284fd, due to an incorrect
conversion to the new means of converting disk names back to
the correct object.
* src/qemu/qemu_driver.c (qemuDomainGetBlockInfo): Avoid NULL deref.
This patch enables modifying network device configuration using the
virUpdateDeviceFlags API method. Matching of devices is accomplished
using MAC addresses.
While updating live configuration of a running domain, the user is
allowed only to change link state of the interface. Additional
modifications may be added later. For now the code checks for
unsupported changes and thereafter changes the link state, if
applicable.
When updating persistent configuration of guest's network interface the
whole configuration (except for the MAC address) may be modified and
is stored for the next startup.
* src/qemu/qemu_driver.c - Add dispatching of virUpdateDevice for
network devices update (live/config)
* src/qemu/qemu_hotplug.c - add setting of initial link state on live
device addition
- add function to change network device
configuration. By now it supports only
changing of link state
* src/qemu/qemu_hotplug.h - Headers to above functions
* src/qemu/qemu_process.c - set link states before virtual machine
start. Qemu does not support setting of
this on the command line.
The mainly changes are:
1) Update qemuMonitorGetBlockStatsInfo and it's children (Text/JSON)
functions to return the value of new latency fields.
2) Add new function qemuMonitorGetBlockStatsParamsNumber, which is
to count how many parameters the underlying QEMU supports.
3) Update virDomainBlockStats in src/qemu/qemu_driver.c to be
compatible with the changes by 1).
If libvirt daemon gets restarted and there is (at least) one
unresponsive qemu, the startup procedure hangs up. This patch creates
one thread per vm in which we try to reconnect to monitor. Therefore,
blocking in one thread will not affect other APIs.
This patch annotates APIs with low or high priority.
In low set MUST be all APIs which might eventually access monitor
(and thus block indefinitely). Other APIs may be marked as high
priority. However, some must be (e.g. domainDestroy).
For high priority calls (HPC), there are some high priority workers
(HPW) created in the pool. HPW can execute only HPC, although normal
worker can process any call regardless priority. Therefore, only those
APIs which are guaranteed to end in reasonable small amount of time
can be marked as HPC.
The size of this HPC pool is static, because HPC are expected to end
quickly, therefore jobs assigned to this pool will be served quickly.
It can be configured in libvirtd.conf via prio_workers variable.
Default is set to 5.
To mark API with low or high priority, append priority:{low|high} to
it's comment in src/remote/remote_protocol.x. This is similar to
autogen|skipgen. If not marked, the generator assumes low as default.
With this, it is now possible to create external snapshots even
when SELinux is enforcing, and to protect the new file with a
lock manager.
* src/qemu/qemu_driver.c
(qemuDomainSnapshotCreateSingleDiskActive): Create and register
new file with proper permissions and locks.
(qemuDomainSnapshotCreateDiskActive): Update caller.
Lots of earlier patches led up to this point - the qemu snapshot_blkdev
monitor command can now be controlled by libvirt! Well, insofar as
SELinux doesn't prevent qemu from open(O_CREAT) on the files. There's
still some followup work before things work with SELinux enforcing,
but this patch is big enough to post now.
There's still room for other improvements, too (for example, taking a
disk snapshot of an inactive domain, by using qemu-img for both internal
and external snapshots; wiring up delete and revert control, including
additional flags from my RFC; supporting active QED disk snapshots;
supporting per-storage-volume snapshots such as LVM or btrfs snapshots;
etc.). But this patch is the one that proves the new XML works!
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateXML): Wire in
active disk snapshots.
(qemuDomainSnapshotDiskPrepare)
(qemuDomainSnapshotCreateDiskActive)
(qemuDomainSnapshotCreateSingleDiskActive): New functions.
My RFC for snapshot support [1] proposes several rules for when it is
safe to delete or revert to an external snapshot, predicated on
the existence of new API flags. These will be incrementally added
in future patches, but until then, blindly mishandling a disk
snapshot risks corrupting internal state, so it is better to
outright reject the attempts until the other pieces are in place,
thus incrementally relaxing the restrictions added in this patch.
[1] https://www.redhat.com/archives/libvir-list/2011-August/msg00361.html
* src/qemu/qemu_driver.c (qemuDomainSnapshotCountExternal): New
function.
(qemuDomainUndefineFlags, qemuDomainSnapshotDelete): Use it to add
safety valve.
(qemuDomainRevertToSnapshot, qemuDomainSnapshotCreateXML): Add safety
valve.
Prior to this patch, <domainsnapshot>/<disks> was ignored. This
changes it to be an error unless an explicit disk snapshot is
requested (a future patch may relax things if it turns out to
be useful to have a <disks> specification alongside a system
checkpoint).
* include/libvirt/libvirt.h.in
(VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY): New flag.
* src/libvirt.c (virDomainSnapshotCreateXML): Document it.
* src/esx/esx_driver.c (esxDomainSnapshotCreateXML): Disk
snapshots not supported yet.
* src/vbox/vbox_tmpl.c (vboxDomainSnapshotCreateXML): Likewise.
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateXML): Likewise.
I got confused when 'virsh domblkinfo dom disk' required the
path to a disk (which can be ambiguous, since a single file
can back multiple disks), rather than the unambiguous target
device name that I was using in disk snapshots. So, in true
developer fashion, I went for the best of both worlds - all
interfaces that operate on a disk (aka block) now accept
either the target name or the unambiguous path to the backing
file used by the disk.
* src/conf/domain_conf.h (virDomainDiskIndexByName): Add
parameter.
(virDomainDiskPathByName): New prototype.
* src/libvirt_private.syms (domain_conf.h): Export it.
* src/conf/domain_conf.c (virDomainDiskIndexByName): Also allow
searching by path, and decide whether ambiguity is okay.
(virDomainDiskPathByName): New function.
(virDomainDiskRemoveByName, virDomainSnapshotAlignDisks): Update
callers.
* src/qemu/qemu_driver.c (qemudDomainBlockPeek)
(qemuDomainAttachDeviceConfig, qemuDomainUpdateDeviceConfig)
(qemuDomainGetBlockInfo, qemuDiskPathToAlias): Likewise.
* src/qemu/qemu_process.c (qemuProcessFindDomainDiskByPath):
Likewise.
* src/libxl/libxl_driver.c (libxlDomainAttachDeviceDiskLive)
(libxlDomainDetachDeviceDiskLive, libxlDomainAttachDeviceConfig)
(libxlDomainUpdateDeviceConfig): Likewise.
* src/uml/uml_driver.c (umlDomainBlockPeek): Likewise.
* src/xen/xend_internal.c (xenDaemonDomainBlockPeek): Likewise.
* docs/formatsnapshot.html.in: Update documentation.
* tools/virsh.pod (domblkstat, domblkinfo): Likewise.
* docs/schemas/domaincommon.rng (diskTarget): Tighten pattern on
disk targets.
* docs/schemas/domainsnapshot.rng (disksnapshot): Update to match.
* tests/domainsnapshotxml2xmlin/disk_snapshot.xml: Update test.
Since a snapshot is fully recoverable, it is useful to have a
snapshot as a means of hibernating a guest, then reverting to
the snapshot to wake the guest up. This mode of usage is
similar to 'virsh save/virsh restore', except that virsh
save uses an external file while virsh snapshot keeps the
vm state internal to a qcow2 file. However, it only works on
persistent domains.
In the usage pattern of snapshot/revert for hibernating a guest,
there is no need to keep the guest running between the two points
in time, especially since that would generate runtime state that
would just be discarded. Add a flag to make it possible to
stop the domain after the snapshot has completed.
* include/libvirt/libvirt.h.in (VIR_DOMAIN_SNAPSHOT_CREATE_HALT):
New flag.
* src/libvirt.c (virDomainSnapshotCreateXML): Document it.
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateXML)
(qemuDomainSnapshotCreateActive): Implement it.
Reverting to a state prior to an external snapshot risks
corrupting any other branches in the snapshot hierarchy that
were using the snapshot as a read-only backing file. So
disk snapshot code will default to preventing reverting to
a snapshot that has any children, meaning that deleting just
the children of a snapshot becomes a useful operation in
preparing that snapshot for being a future reversion target.
The code for the new flag is simple - it's one less deletion,
plus a tweak to keep the current snapshot correct.
* include/libvirt/libvirt.h.in
(VIR_DOMAIN_SNAPSHOT_DELETE_CHILDREN_ONLY): New flag.
* src/libvirt.c (virDomainSnapshotDelete): Document it, and
enforce mutual exclusion.
* src/qemu/qemu_driver.c (qemuDomainSnapshotDelete): Implement
it.
When reverting to a snapshot, the inactive domain configuration
has to be rolled back to what it was at the time of the snapshot.
Additionally, if the VM is active and the snapshot was active,
this now adds a failure if the two configurations are ABI
incompatible, rather than risking qemu confusion.
A future patch will add a VIR_DOMAIN_SNAPSHOT_FORCE flag, which
will be required for two risky code paths - reverting to an
older snapshot that lacked full domain information, and reverting
from running to a live snapshot that requires starting a new qemu
process. Any reverting that stops a running vm is also a form
of data loss (discarding the current running state to go back in
time), but as that is what reversion usually implies, it is
probably not worth requiring a force flag.
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateXML): Copy out
domain.
(qemuDomainSnapshotCreateXML, qemuDomainRevertToSnapshot): Perform
ABI compatibility checks.
Just like VM saved state images (virsh save), snapshots MUST
track the inactive domain xml to detect any ABI incompatibilities.
The indentation is not perfect, but functionality comes before form.
Later patches will actually supply a full domain; for now, this
wires up the storage to support one, but doesn't ever generate one
in dumpxml output.
Happily, libvirt.c was already rejecting use of VIR_DOMAIN_XML_SECURE
from read-only connections, even though before this patch, there was
no information to be secured by the use of that flag.
And while we're at it, mark the libvirt snapshot metadata files
as internal-use only.
* src/libvirt.c (virDomainSnapshotGetXMLDesc): Document flag.
* src/conf/domain_conf.h (_virDomainSnapshotDef): Add member.
(virDomainSnapshotDefParseString, virDomainSnapshotDefFormat):
Update signature.
* src/conf/domain_conf.c (virDomainSnapshotDefFree): Clean up.
(virDomainSnapshotDefParseString): Optionally parse domain.
(virDomainSnapshotDefFormat): Output full domain.
* src/esx/esx_driver.c (esxDomainSnapshotCreateXML)
(esxDomainSnapshotGetXMLDesc): Update callers.
* src/vbox/vbox_tmpl.c (vboxDomainSnapshotCreateXML)
(vboxDomainSnapshotGetXMLDesc): Likewise.
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateXML)
(qemuDomainSnapshotLoad, qemuDomainSnapshotGetXMLDesc)
(qemuDomainSnapshotWriteMetadata): Likewise.
* docs/formatsnapshot.html.in: Rework doc example.
Based on a patch by Philipp Hahn.
A nice benefit of deleting all snapshots at undefine time is that
you don't have to do any reparenting or subtree identification - since
everything goes, this is an O(n) process, whereas using multiple
virDomainSnapshotDelete calls would be O(n^2) or worse. But it is
only doable for snapshot metadata, where we are in control of the
data being deleted; for the actual snapshots, there's too much
likelihood of something going wrong, and requiring even more API
calls to figure out what failed in the meantime, so callers are
better off deleting the snapshot data themselves one snapshot at
a time where they can deal with failures as they happen.
* src/qemu/qemu_driver.c (qemuDomainUndefineFlags): Honor new flags.
As more clients start to want to know this information, doing
a PATH stat walk and malloc for every client adds up.
We are only caching the location, not the capabilities, so even
if qemu-img is updated in the meantime, it will still probably
live in the same location. So there is no need to worry about
clearing this particular cache.
* src/qemu/qemu_conf.h (qemud_driver): Add member.
* src/qemu/qemu_driver.c (qemudShutdown): Cleanup.
(qemuFindQemuImgBinary): Add an argument, and cache result.
(qemuDomainSnapshotForEachQcow2, qemuDomainSnapshotDiscard)
(qemuDomainSnapshotCreateInactive, qemuDomainSnapshotRevertInactive)
(qemuDomainSnapshotCreateXML, qemuDomainRevertToSnapshot): Update
callers.
Just as leaving managed save metadata behind can cause problems
when creating a new domain that happens to collide with the name
of the just-deleted domain, the same is true of leaving any
snapshot metadata behind. For safety sake, extend the semantic
change of commit b26a9fa9 to also cover snapshot metadata as a
reason to reject undefining an inactive domain. A future patch
will make sure that shutdown of a transient domain automatically
deletes snapshot metadata (whether by destroy, shutdown, or
guest-initiated action). Management apps of transient domains
should take care to capture xml of snapshots, if it is necessary
to recreate the snapshot metadata on a later transient domain
with the same name and uuid.
This also documents a new flag that hypervisors can choose to
support as a shortcut for taking care of the metadata as part of
the undefine process; however, nontrivial driver support for these
flags will be deferred to future patches.
Note that ESX and VBox can never be transient; therefore, they
do not have to worry about automatic cleanup after shutdown
(the persistent domain still remains); likewise they never
store snapshot metadata, so the undefine flag is trivial.
The nontrivial work remaining is thus in the qemu driver.
* include/libvirt/libvirt.h.in
(VIR_DOMAIN_UNDEFINE_SNAPSHOTS_METADATA): New flag.
* src/libvirt.c (virDomainUndefine, virDomainUndefineFlags):
Document new limitations and flag.
* src/esx/esx_driver.c (esxDomainUndefineFlags): Trivial
implementation.
* src/vbox/vbox_tmpl.c (vboxDomainUndefineFlags): Likewise.
* src/qemu/qemu_driver.c (qemuDomainUndefineFlags): Enforce
the limitations.
Redefining a qemu snapshot requires a bit of a tweak to the common
snapshot parsing code, but the end result is quite nice.
Be careful that redefinitions do not introduce circular parent
chains. Also, we don't want to allow conversion between online
and offline existing snapshots. We could probably do some more
validation for snapshots that don't already exist to make sure
they are even feasible, by parsing qemu-img output, but that
can come later.
* src/conf/domain_conf.h (virDomainSnapshotParseFlags): New
internal flags.
* src/conf/domain_conf.c (virDomainSnapshotDefParseString): Alter
signature to take internal flags.
* src/esx/esx_driver.c (esxDomainSnapshotCreateXML): Update caller.
* src/vbox/vbox_tmpl.c (vboxDomainSnapshotCreateXML): Likewise.
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateXML): Support
new public flags.
Supporting NO_METADATA on snapshot creation is interesting - we must
still return a valid opaque snapshot object, but the user can't get
anything out of it (unless we add a virDomainSnapshotGetName()),
since it is no longer registered with the domain.
Also, virsh now tries to query for secure xml, in anticipation of
when we store <domain> xml inside <domainsnapshot>; for now, we
can trivially support it, since we have nothing secure.
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateXML): Support
new flag.
(qemuDomainSnapshotGetXMLDesc): Trivially support VIR_DOMAIN_XML_SECURE.
To make it easier to know when undefine will fail because of existing
snapshot metadata, we need to know how many snapshots have metadata.
Also, it is handy to filter the list of snapshots to just those that
have no parents; document that flag now, but implement it in later patches.
* include/libvirt/libvirt.h.in (VIR_DOMAIN_SNAPSHOT_LIST_ROOTS)
(VIR_DOMAIN_SNAPSHOT_LIST_METADATA): New flags.
* src/libvirt.c (virDomainSnapshotNum)
(virDomainSnapshotListNames): Document them.
* src/esx/esx_driver.c (esxDomainSnapshotNum)
(esxDomainSnapshotListNames): Implement trivial flag.
* src/vbox/vbox_tmpl.c (vboxDomainSnapshotNum)
(vboxDomainSnapshotListNames): Likewise.
* src/qemu/qemu_driver.c (qemuDomainSnapshotNum)
(qemuDomainSnapshotListNames): Likewise.
Adding this was trivial compared to the previous patch for fixing
qemu snapshot deletion in the first place.
* src/qemu/qemu_driver.c (qemuDomainSnapshotDiscard): Add
parameter.
(qemuDomainSnapshotDiscardDescendant, qemuDomainSnapshotDelete):
Update callers.
Similar to the last patch in isolating the filtering from the
client actions, so that clients don't have to reinvent the
filtering.
* src/conf/domain_conf.h (virDomainSnapshotForEachChild): New
prototype.
* src/libvirt_private.syms (domain_conf.h): Export it.
* src/conf/domain_conf.c (virDomainSnapshotActOnChild)
(virDomainSnapshotForEachChild): New functions.
(virDomainSnapshotCountChildren): Delete.
(virDomainSnapshotHasChildren): Simplify.
* src/qemu/qemu_driver.c (qemuDomainSnapshotReparentChildren)
(qemuDomainSnapshotDelete): Likewise.
Deleting a snapshot and all its descendants had problems with
tracking the current snapshot. The deletion does not necessarily
proceed in depth-first order, so a parent could be deleted
before a child, wreaking havoc on passing the notion of the
current snapshot to the parent. Furthermore, even if traversal
were depth-first, doing multiple file writes to pass current up
the chain one snapshot at a time is wasteful, comparing to a
single update to the current snapshot at the end of the algorithm.
* src/qemu/qemu_driver.c (snap_remove): Add field.
(qemuDomainSnapshotDiscard): Add parameter.
(qemuDomainSnapshotDiscardDescendant): Adjust accordingly.
(qemuDomainSnapshotDelete): Properly reset current.
This one's nasty. Ever since we fixed virHashForEach to prevent
nested hash iterations for safety reasons (commit fba550f6),
virDomainSnapshotDelete with VIR_DOMAIN_SNAPSHOT_DELETE_CHILDREN
has been broken for qemu: it deletes children, while leaving
grandchildren intact but pointing to a no-longer-present parent.
But even before then, the code would often appear to succeed to
clean up grandchildren, but risked memory corruption if you have
a large and deep hierarchy of snapshots.
For acting on just children, a single virHashForEach is sufficient.
But for acting on an entire subtree, it requires iteration; and
since we declared recursion as invalid, we have to switch to a
while loop. Doing this correctly requires quite a bit of overhaul,
so I added a new helper function to isolate the algorithm from the
actions, so that callers do not have to reinvent the iteration.
Note that this _still_ does not handle CHILDREN correctly if one
of the children is the current snapshot; that will be next.
* src/conf/domain_conf.h (_virDomainSnapshotDef): Add mark.
(virDomainSnapshotForEachDescendant): New prototype.
* src/libvirt_private.syms (domain_conf.h): Export it.
* src/conf/domain_conf.c (virDomainSnapshotMarkDescendant)
(virDomainSnapshotActOnDescendant)
(virDomainSnapshotForEachDescendant): New functions.
* src/qemu/qemu_driver.c (qemuDomainSnapshotDiscardChildren):
Replace...
(qemuDomainSnapshotDiscardDescenent): ...with callback that
doesn't nest hash traversal.
(qemuDomainSnapshotDelete): Use new function.
For a system checkpoint of a running or paused domain, it's fairly
easy to honor new flags for altering which state to use after the
revert. For an inactive snapshot, the revert has to be done while
there is no qemu process, so do back-to-back transitions; this also
lets us revert to inactive snapshots even for transient domains.
* src/qemu/qemu_driver.c (qemuDomainRevertToSnapshot): Support new
flags.
Commit 5e47785 broke reverts to offline system checkpoint snapshots
with older qemu, since there is no longer any code path to use
qemu -loadvm on next boot. Meanwhile, reverts to offline system
checkpoints have been broken for newer qemu, both before and
after that commit, since -loadvm no longer works to revert to
disk state without accompanying vm state. Fix both of these by
using qemu-img to revert disk state.
Meanwhile, consolidate the (now 3) clients of a qemu-img iteration
over all disks of a VM into one function, so that any future
algorithmic fixes to the FIXMEs in that function after partial
loop iterations are dealt with at once. That does mean that this
patch doesn't handle partial reverts very well, but we're not
making the situation any worse in this patch.
* src/qemu/qemu_driver.c (qemuDomainRevertToSnapshot): Use
qemu-img rather than 'qemu -loadvm' to revert to offline snapshot.
(qemuDomainSnapshotRevertInactive): New helper.
(qemuDomainSnapshotCreateInactive): Factor guts...
(qemuDomainSnapshotForEachQcow2): ...into new helper.
(qemuDomainSnapshotDiscard): Use it.
If you take a checkpoint snapshot of a running domain, then pause
qemu, then restore the snapshot, the result should be a running
domain, but the code was leaving things paused. Furthermore, if
you take a checkpoint of a paused domain, then run, then restore,
there was a brief but non-deterministic window of time where the
domain was running rather than paused. Fix both of these
discrepancies by always pausing before restoring.
Also, check that the VM is active every time lock is dropped
between two monitor calls.
Finally, straighten out the events that get emitted on each
transition.
* src/qemu/qemu_driver.c (qemuDomainRevertToSnapshot): Always
pause before reversion, and improve events.
Implement the new running/paused overrides for saved state management.
Unfortunately, for virDomainSaveImageDefineXML, the saved state
updates are write-only - I don't know of any way to expose a way
to query the current run/pause setting of an existing save image
file to the user without adding a new API or modifying the domain
xml of virDomainSaveImageGetXMLDesc to include a new element to
reflect the state bit encoded into the save image. However, I
don't think this is a show-stopper, since the API is designed to
leave the state bit alone unless an explicit flag is used to
change it.
* src/qemu/qemu_driver.c (qemuDomainSaveInternal)
(qemuDomainSaveImageOpen): Adjust signature.
(qemuDomainSaveFlags, qemuDomainManagedSave)
(qemuDomainRestoreFlags, qemuDomainSaveImageGetXMLDesc)
(qemuDomainSaveImageDefineXML, qemuDomainObjRestore): Adjust
callers.
There are two classes of management apps that track events - one
that only cares about on/off (and only needs to track EVENT_STARTED
and EVENT_STOPPED), and one that cares about paused/running (also
tracks EVENT_SUSPENDED/EVENT_RESUMED). To keep both classes happy,
any transition that can go from inactive to paused must emit two
back-to-back events - one for started and one for suspended (since
later resuming of the domain will only send RESUMED, but the first
class isn't tracking that).
This also fixes a bug where virDomainCreateWithFlags with the
VIR_DOMAIN_START_PAUSED flag failed to start paused when restoring
from a managed save image.
* include/libvirt/libvirt.h.in (VIR_DOMAIN_EVENT_SUSPENDED_RESTORED)
(VIR_DOMAIN_EVENT_SUSPENDED_FROM_SNAPSHOT)
(VIR_DOMAIN_EVENT_RESUMED_FROM_SNAPSHOT): New sub-events.
* src/qemu/qemu_driver.c (qemuDomainRevertToSnapshot): Use them.
(qemuDomainSaveImageStartVM): Likewise, and add parameter.
(qemudDomainCreate, qemuDomainObjStart): Send suspended event when
starting paused.
(qemuDomainObjRestore): Add parameter.
(qemuDomainObjStart, qemuDomainRestoreFlags): Update callers.
* examples/domain-events/events-c/event-test.c
(eventDetailToString): Map new detail strings.
Commit 6766ff10 introduced a corner case bug with snapshot creation:
if a snapshot is created, but then we hit OOM while trying to
create the return value of the function, then we have polluted the
internal directory with the snapshot metadata with no way to clean
it up from the running libvirtd.
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateXML): Don't
write metadata file on OOM condition.
Several users have reported problems with 'virsh start' failing because
it was encountering a managed save situation where the managed save file
was incomplete. Be more robust to this by using two different magic
numbers, so that newer libvirt can gracefully handle an incomplete file
differently than a complete one, while older libvirt will at least fail
up front rather than trying to load only to have qemu fail at the end.
Managed save is a convenience - it exists to preserve as much state
as possible; if the state was not preserved, it is reasonable to just
log that fact, then proceed with a fresh boot. On the other hand,
user saves are under user control, so we must fail, but by making
the failure message distinct, the user can better decide how to handle
the situation of an incomplete save file.
* src/qemu/qemu_driver.c (QEMUD_SAVE_PARTIAL): New define.
(qemuDomainSaveInternal): Use it to mark incomplete images.
(qemuDomainSaveImageOpen, qemuDomainObjRestore): Add parameter
that controls what to do with partial images.
(qemuDomainRestoreFlags, qemuDomainSaveImageGetXMLDesc)
(qemuDomainSaveImageDefineXML, qemuDomainObjStart): Update callers.
Based on an initial idea by Osier Yang.
In a SELinux or root-squashing NFS environment, libvirt has to go
through some hoops to create a new file that qemu can then open()
by name. Snapshots are a case where we want to guarantee an empty
file that qemu can open; also, reopening a save file to convert it
from being marked partial to complete requires a reopen to avoid
O_DIRECT headaches. Refactor some existing code to make it easier
to reuse in later patches.
* src/qemu/qemu_migration.h (qemuMigrationToFile): Drop parameter.
* src/qemu/qemu_migration.c (qemuMigrationToFile): Let cgroup do
the stat, rather than asking caller to do it and pass info down.
* src/qemu/qemu_driver.c (qemuOpenFile): New function, pulled from...
(qemuDomainSaveInternal): ...here.
(doCoreDump, qemuDomainSaveImageOpen): Use it here as well.
The libvirt BlockPull API supports the use of an initial bandwidth limit but the
qemu block_stream API does not. To get the desired behavior we use the two APIs
strung together: first BlockPull, then BlockJobSetSpeed. We can do this at the
driver level to avoid duplicated code in each monitor path.
Signed-off-by: Adam Litke <agl@us.ibm.com>
There is no reason to forbid pausing an autodestroy domain
(not to mention that 'virsh start --paused --autodestroy'
succeeds in creating a paused autodestroy domain).
Meanwhile, qemu was failing to enforce the API documentation that
autodestroy domains cannot be saved. And while the original
documentation only mentioned save/restore, snapshots are another
form of saving that are close enough in semantics as to make no
sense on one-shot domains.
* src/qemu/qemu_driver.c (qemudDomainSuspend): Drop bogus check.
(qemuDomainSaveInternal, qemuDomainSnapshotCreateXML): Forbid
saves of autodestroy domains.
* src/libvirt.c (virDomainCreateWithFlags, virDomainCreateXML):
Document snapshot interaction.
There have been several instances of people having problems with
a broken managed save file, and not aware that they could use
'virsh managedsave-remove dom' to fix things. Making it possible
to do this as part of starting a domain makes the same functionality
easier to find, and one less API call.
* include/libvirt/libvirt.h.in (VIR_DOMAIN_START_FORCE_BOOT): New
flag.
* src/libvirt.c (virDomainCreateWithFlags): Document it.
* src/qemu/qemu_driver.c (qemuDomainObjStart): Alter signature.
(qemuAutostartDomain, qemuDomainStartWithFlags): Update callers.
* tools/virsh.c (cmdStart): Expose it in virsh.
* tools/virsh.pod (start): Document it.
The QEMU 'sendkey' command expects keys to be encoded in the same
way as the RFB extended keycode set. Specifically it wants extended
keys to have the high bit of the first byte set, while the Linux
XT KBD driver codeset uses the low bit of the second byte. To deal
with this we introduce a new keymap 'RFB' and use that in the QEMU
driver
* include/libvirt/libvirt.h.in: Add VIR_KEYCODE_SET_RFB
* src/qemu/qemu_driver.c: Use RFB keycode set instead of XT KBD
* src/util/virkeycode-mapgen.py: Auto-generate the RFB keycode
set from the XT KBD set
* src/util/virkeycode.c: Add RFB keycode entry to table. Add a
verify check on cardinality of the codeOffset table
Audit all changes to the qemu vm->current_snapshot, and make them
update the saved xml file for both the previous and the new
snapshot, so that there is always at most one snapshot with
<active>1</active> in the xml, and that snapshot is used as the
current snapshot even across libvirtd restarts.
This patch does not fix the case of virDomainSnapshotDelete(,CHILDREN)
where one of the children is the current snapshot; that will be later.
* src/conf/domain_conf.h (_virDomainSnapshotDef): Alter member
type and name.
* src/conf/domain_conf.c (virDomainSnapshotDefParseString)
(virDomainSnapshotDefFormat): Update clients.
* docs/schemas/domainsnapshot.rng: Tighten rng.
* src/qemu/qemu_driver.c (qemuDomainSnapshotLoad): Reload current
snapshot.
(qemuDomainSnapshotCreateXML, qemuDomainRevertToSnapshot)
(qemuDomainSnapshotDiscard): Track current snapshot.
Changing the current vm, and writing that change to the file
system, all before a new qemu starts, is risky; it's hard to
roll back if starting the new qemu fails for some reason.
Instead of abusing vm->current_snapshot and making the command
line generator decide whether the current snapshot warrants
using -loadvm, it is better to just directly pass a snapshot all
the way through the call chain if it is to be loaded.
This frees up the last use of snapshot->def->active for qemu's
use, so the next patch can repurpose that field for tracking
which snapshot is current.
* src/qemu/qemu_command.c (qemuBuildCommandLine): Don't use active
field of snapshot.
* src/qemu/qemu_process.c (qemuProcessStart): Add a parameter.
* src/qemu/qemu_process.h (qemuProcessStart): Update prototype.
* src/qemu/qemu_migration.c (qemuMigrationPrepareAny): Update
callers.
* src/qemu/qemu_driver.c (qemudDomainCreate)
(qemuDomainSaveImageStartVM, qemuDomainObjStart)
(qemuDomainRevertToSnapshot): Likewise.
(qemuDomainSnapshotSetCurrentActive)
(qemuDomainSnapshotSetCurrentInactive): Delete unused functions.
https://bugzilla.redhat.com/show_bug.cgi?id=727709
mentions that if qemu fails to create the snapshot (such as what
happens on Fedora 15 qemu, which has qmp but where savevm is only
in hmp, and where libvirt is old enough to not try the hmp fallback),
then 'virsh snapshot-list dom' will show a garbage snapshot entry,
and the libvirt internal directory for storing snapshot metadata
will have a bogus file.
This fixes the fallout bug of polluting the snapshot-list with
garbage on failure (the root cause of the F15 bug of not having
fallback to hmp has already been fixed in newer libvirt releases).
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateXML): Allocate
memory before making snapshot, and cleanup on failure. Don't
dereference NULL if transient domain exited during snapshot creation.
Our logic throws off analyzer tools:
ptr var = NULL;
if (flags == 0) flags = live ? _LIVE : _CONFIG;
if (flags & _LIVE) do stuff
if (flags & _CONFIG) var = non-null;
if (flags & _LIVE) do more stuff
else if (flags & _CONFIG) use var
the tools keep thinking that var can still be NULL in the last
if clause, adding the hint shuts them up.
* src/qemu/qemu_driver.c (qemuDomainSetBlkioParameters): Add a
static analysis hint.
Transient domains reject attempts to set autostart, and using
virDomainCreate to restart a domain only works on persistent
domains. Therefore, managed save makes no sense on transient
domains, and should be rejected up front rather than creating
an otherwise unrecoverable managed save file.
Besides, transient domains imply that a lot more management is
being done by the upper layer; this includes the assumption
that the upper layer is okay managing the saved state file
created by virDomainSave, and does not need to use managed save.
* src/libvirt.c: Document that transient domains are incompatible
with managed save.
* src/qemu/qemu_driver.c (qemuDomainManagedSave): Enforce it.
* src/libxl/libxl_driver.c (libxlDomainManagedSave): Likewise.
I noticed some inconsistent use of 'else'.
* src/qemu/qemu_driver.c (qemuCPUCompare)
(qemuDomainSnapshotCreateXML, qemuDomainRevertToSnapshot)
(qemuDomainSnapshotDiscard): Match coding conventions.
If a snapshot with the name already exists, virDomainSnapshotAssignDef()
just returns NULL, in which case the snapshot definition is leaked.
Currently this leak is not a big problem, since qemuDomainSnapshotLoad()
is only called once during initial startup of libvirtd.
Signed-off-by: Philipp Hahn <hahn@univention.de>
Revert 6a1f5f568f. Now that libvirt_iohelper takes fds by
inheritance rather than by open() (commit 1eb66479), there is
no longer a race where the parent can unlink() a file prior to
the iohelper open()ing the same file. From there, it makes
more sense to have the callers both create and unlink, rather
than the caller create and the stream unlink, since the latter
was only needed when iohelper had to do the unlink.
* src/fdstream.h (virFDStreamOpenFile, virFDStreamCreateFile):
Callers are responsible for deletion.
* src/fdstream.c (virFDStreamOpenFileInternal): Don't leak created
file on failure.
(virFDStreamOpenFile, virFDStreamCreateFile): Drop parameter.
* src/lxc/lxc_driver.c (lxcDomainOpenConsole): Update callers.
* src/qemu/qemu_driver.c (qemuDomainScreenshot)
(qemuDomainOpenConsole): Likewise.
* src/storage/storage_driver.c (storageVolumeDownload)
(storageVolumeUpload): Likewise.
* src/uml/uml_driver.c (umlDomainOpenConsole): Likewise.
* src/vbox/vbox_tmpl.c (vboxDomainScreenshot): Likewise.
* src/xen/xen_driver.c (xenUnifiedDomainOpenConsole): Likewise.
The previous qemu patch could end up calling unlink(tmp) before
tmp was the name of a valid file (unlinking a fileXXXXXX template
instead), or calling unlink(tmp) twice on success (once here,
and once at the end of the stream). Meanwhile, vbox also suffered
from the same leaked tmp file bug.
* src/qemu/qemu_driver.c (qemuDomainScreenshot): Don't unlink on
success, or on invalid name.
* src/vbox/vbox_tmpl.c (vboxDomainScreenshot): Don't leak temp file.
Currently, we attempt to run sync job and async job at the same time. It
means that the monitor commands for two jobs can be run in any order.
In the function qemuDomainObjEnterMonitorInternal():
if (priv->job.active == QEMU_JOB_NONE && priv->job.asyncJob) {
if (qemuDomainObjBeginNestedJob(driver, obj) < 0)
We check whether the caller is an async job by priv->job.active and
priv->job.asynJob. But when an async job is running, and a sync job is
also running at the time of the check, then priv->job.active is not
QEMU_JOB_NONE. So we cannot check whether the caller is an async job
in the function qemuDomainObjEnterMonitorInternal(), and must instead
put the burden on the caller to tell us when an async command wants
to do a nested job.
Once the burden is on the caller, then only async monitor enters need
to worry about whether the VM is still running; for sync monitor enter,
the internal return is always 0, so lots of ignore_value can be dropped.
* src/qemu/THREADS.txt: Reflect new rules.
* src/qemu/qemu_domain.h (qemuDomainObjEnterMonitorAsync): New
prototype.
* src/qemu/qemu_process.h (qemuProcessStartCPUs)
(qemuProcessStopCPUs): Add parameter.
* src/qemu/qemu_migration.h (qemuMigrationToFile): Likewise.
(qemuMigrationWaitForCompletion): Make static.
* src/qemu/qemu_domain.c (qemuDomainObjEnterMonitorInternal): Add
parameter.
(qemuDomainObjEnterMonitorAsync): New function.
(qemuDomainObjEnterMonitor, qemuDomainObjEnterMonitorWithDriver):
Update callers.
* src/qemu/qemu_driver.c (qemuDomainSaveInternal)
(qemudDomainCoreDump, doCoreDump, processWatchdogEvent)
(qemudDomainSuspend, qemudDomainResume, qemuDomainSaveImageStartVM)
(qemuDomainSnapshotCreateActive, qemuDomainRevertToSnapshot):
Likewise.
* src/qemu/qemu_process.c (qemuProcessStopCPUs)
(qemuProcessFakeReboot, qemuProcessRecoverMigration)
(qemuProcessRecoverJob, qemuProcessStart): Likewise.
* src/qemu/qemu_migration.c (qemuMigrationToFile)
(qemuMigrationWaitForCompletion, qemuMigrationUpdateJobStatus)
(qemuMigrationJobStart, qemuDomainMigrateGraphicsRelocate)
(doNativeMigrate, doTunnelMigrate, qemuMigrationPerformJob)
(qemuMigrationPerformPhase, qemuMigrationFinish)
(qemuMigrationConfirm): Likewise.
* src/qemu/qemu_hotplug.c: Drop unneeded ignore_value.
whether or not previous return value is -1, the following codes will be
executed for a inactive guest in src/qemu/qemu_driver.c:
ret = virDomainSaveConfig(driver->configDir, persistentDef);
and if everything is okay, 'ret' is assigned to 0, the previous 'ret'
will be overwritten, this patch will fix this issue.
* src/qemu/qemu_driver.c: avoid return value is overwritten when give a argument
in out of blkio weight range for a inactive guest.
* how to reproduce?
% virsh blkiotune ${guestname} --weight 10
% echo $?
Note: guest must be inactive, argument 10 in out of blkio weight range,
and can get a error information by checking libvirtd.log, however,
virsh hasn't raised any error information, and return value is 0.
https://bugzilla.redhat.com/show_bug.cgi?id=726304
Signed-off-by: Alex Jia <ajia@redhat.com>
whether or not previous return value is -1, the following codes will be
executed for a inactive guest in qemuDomainSetMemoryParameters:
ret = virDomainSaveConfig(driver->configDir, persistentDef);
and if everything is okay, 'ret' is assigned to 0, the previous 'ret'
will be overwritten, this patch will fix this issue.
* src/qemu/qemu_driver.c: avoid return value is overwritten when set
min_guarante value to a inactive guest.
* how to reproduce?
% virsh memtune ${guestname} --min_guarante 1024
% echo $?
Note: guest must be inactive, in fact, 'min_guarante' hasn't been implemented
in memory tunable, and I can get the error when check actual libvirtd.log,
however, virsh hasn't raised any error information, and return value is 0.
Signed-off-by: Alex Jia <ajia@redhat.com>
Introduced by f9a837da73, the condition is not changed after
the else clause is removed. So now it quit with "domain is not
running" when the domain is running. However, when the domain is
not running, it reports "no job is active".
How to reproduce:
1)
% virsh start $domain
% virsh domjobabort $domain
error: Requested operation is not valid: domain is not running
2)
% virsh destroy $domain
% virsh domjobabort $domain
error: Requested operation is not valid: no job is active on the domain
3)
% virsh save $domain /tmp/$domain.save
Before above commands finished, try to abort job in another terminal
% virsh domabortjob $domain
error: Requested operation is not valid: domain is not running
The goal here is that save-image-dumpxml fed back to
save-image-define should not change the save file; anywhere that
this is not the case is probably a bug in domain_conf.c.
* src/qemu/qemu_driver.c (qemuDomainSaveImageGetXMLDesc)
(qemuDomainSaveImageDefineXML): New functions.
(qemuDomainSaveImageOpen): Add parameter.
(qemuDomainRestoreFlags, qemuDomainObjRestore): Adjust clients.
With this, it is possible to update the path to a disk backing
image on either the save or restore action, without having to
binary edit the XML embedded in the state file.
This also modifies virDomainSave to output a smaller xml (only
the inactive xml, which is all the more virDomainRestore parses),
while still guaranteeing padding for most typical abi-compatible
xml replacements, necessary so that the next patch for
virDomainSaveImageDefineXML will not cause unnecessary
modifications to the save image file.
* src/qemu/qemu_driver.c (qemuDomainSaveInternal): Add parameter,
only use inactive state, and guarantee padding.
(qemuDomainSaveImageOpen): Add parameter.
(qemuDomainSaveFlags, qemuDomainManagedSave)
(qemuDomainRestoreFlags, qemuDomainObjRestore): Update callers.
As written in virStorageFileGetMetadataFromFD decription, caller
must free metadata after use. Qemu driver miss this and therefore
leak metadata which can grow to huge mem leak if somebody query
for blockInfo a lot.
The error in getCompressionType will never be reported, change
the errors codes into warning (VIR_WARN("%s", _(foo)); doesn't break
syntax-check rule), and also improve the docs in qemu.conf to tell
user the truth.
Make MIGRATION_OUT use the new helper methods.
This also introduces new protection to migration v3 process: the
migration job is held from Begin to Confirm to avoid changes to a domain
during migration (esp. between Begin and Perform phases). This change is
automatically applied to p2p and tunneled migrations. For normal
migration, this requires support from a client. In other words, if an
old (pre 0.9.4) client starts normal migration of a domain, the domain
will not be protected against changes between Begin and Perform steps.
The cpu bandwidth is applied at the vcpu group level. We should apply it
at the vm group level too, because the vm may do heavy I/O, and it will affect
the other vm.
We apply cpu bandwidth at the vcpu and the vm group level, so we must ensure
that max(child_quota) <= parent_quota when we modify cpu bandwidth.
Now that virDomainSetVcpusFlags knows about VIR_DOMAIN_AFFECT_CURRENT,
so should virDomainGetVcpusFlags.
Unfortunately, the virsh counterpart 'virsh vcpucount' has already
commandeered --current for a different meaning, so teaching virsh
to expose this in the next patch will require a bit of care.
* src/libvirt.c (virDomainGetVcpusFlags): Allow
VIR_DOMAIN_AFFECT_CURRENT.
* src/libxl/libxl_driver.c (libxlDomainGetVcpusFlags): Likewise.
* src/qemu/qemu_driver.c (qemudDomainGetVcpusFlags): Likewise.
* src/test/test_driver.c (testDomainGetVcpusFlags): Likewise.
* src/xen/xen_driver.c (xenUnifiedDomainGetVcpusFlags): Likewise.
In the XML file we now have
<cputune>
<shares>1024</shares>
<period>90000</period>
<quota>0</quota>
</cputune>
But the schedinfo parameter are being named
cpu_shares: 1024
cfs_period: 90000
cfs_quota: 0
The period/quota is per-vcpu value, so these new tunables should be named
'vcpu_period' and 'vcpu_quota'.
The virDomainBlockPull* family of commands are enabled by the
following HMP/QMP commands: 'block_stream', 'block_job_cancel',
'info block-jobs' / 'query-block-jobs', and 'block_job_set_speed'.
* src/qemu/qemu_driver.c src/qemu/qemu_monitor_text.[ch]: implement disk
streaming by using the proper qemu monitor commands.
* src/qemu/qemu_monitor_json.[ch]: implement commands using the qmp monitor
When auto-dumping a domain on crash events, or autostarting a domain
with managed save state, let the user configure whether to imply
the bypass cache flag.
* src/qemu/qemu.conf (auto_dump_bypass_cache, auto_start_bypass_cache):
Document new variables.
* src/qemu/libvirtd_qemu.aug (vnc_entry): Let augeas parse them.
* src/qemu/qemu_conf.h (qemud_driver): Store new preferences.
* src/qemu/qemu_conf.c (qemudLoadDriverConfig): Parse them.
* src/qemu/qemu_driver.c (processWatchdogEvent, qemuAutostartDomain):
Honor them.
Wire together the previous patches to support file system cache
bypass during API save/restore requests in qemu.
* src/qemu/qemu_driver.c (qemuDomainSaveInternal, doCoreDump)
(qemudDomainObjStart, qemuDomainSaveImageOpen, qemuDomainObjRestore)
(qemuDomainObjStart): Add parameter.
(qemuDomainSaveFlags, qemuDomainManagedSave, qemudDomainCoreDump)
(processWatchdogEvent, qemudDomainStartWithFlags, qemuAutostartDomain)
(qemuDomainRestoreFlags): Update callers.
For all hypervisors that support save and restore, the new API
now performs the same functions as the old.
VBox is excluded from this list, because its existing domainsave
is broken (there is no corresponding domainrestore, and there
is no control over the filename used in the save). A later
patch should change vbox to use its implementation for
managedsave, and teach start to use managedsave results.
* src/libxl/libxl_driver.c (libxlDomainSave): Move guts...
(libxlDomainSaveFlags): ...to new function.
(libxlDomainRestore): Move guts...
(libxlDomainRestoreFlags): ...to new function.
* src/test/test_driver.c (testDomainSave, testDomainSaveFlags)
(testDomainRestore, testDomainRestoreFlags): Likewise.
* src/xen/xen_driver.c (xenUnifiedDomainSave)
(xenUnifiedDomainSaveFlags, xenUnifiedDomainRestore)
(xenUnifiedDomainRestoreFlags): Likewise.
* src/qemu/qemu_driver.c (qemudDomainSave, qemudDomainRestore):
Rename and move guts.
(qemuDomainSave, qemuDomainSaveFlags, qemuDomainRestore)
(qemuDomainRestoreFlags): ...here.
(qemudDomainSaveFlag): Rename...
(qemuDomainSaveInternal): ...to this, and update callers.
This is the one function outside of domain_conf.c that plays around
with (even modifying) the internals of the virDomainNetDef, and thus
can't be fixed up simply by replacing direct accesses to the fields of
the struct with the GetActual*() access functions.
In this case, we need to check if the defined type is "network", and
if it is *then* check the actual type; if the actual type is "bridge",
then we can at least put the bridgename in a place where it can be
used; otherwise (if type isn't "bridge"), we behave exactly as we used
to - just null out *everything*.
This patch implements cfs_period and cfs_quota's modification.
We can use the command 'virsh schedinfo' to query or modify cfs_period and
cfs_quota.
If you query period or quota from config file, the value 0 means it does not set
in the config file.
If you set period or quota to config file, the value 0 means that delete current
setting from config file.
If you modify period or quota while vm is running, the value 0 means that use
current value.
There were two API in driver.c that were silently masking flags
bits prior to calling out to the drivers, and several others
that were explicitly masking flags bits. This is not
forward-compatible - if we ever have that many flags in the
future, then talking to an old server that masks out the
flags would be indistinguishable from talking to a new server
that can honor the flag. In general, libvirt.c should forward
_all_ flags on to drivers, and only the drivers should reject
unknown flags.
In the case of virDrvSecretGetValue, the solution is to separate
the internal driver callback function to have two parameters
instead of one, with only one parameter affected by the public
API. In the case of virDomainGetXMLDesc, it turns out that
no one was ever mixing VIR_DOMAIN_XML_INTERNAL_STATUS with
the dumpxml path in the first place; that internal flag was
only used in saving and restoring state files, which happened
to be in functions internal to a single file, so there is no
mixing of the internal flag with a public flags argument.
Additionally, virDomainMemoryStats passed a flags argument
over RPC, but not to the driver.
* src/driver.h (VIR_DOMAIN_XML_FLAGS_MASK)
(VIR_SECRET_GET_VALUE_FLAGS_MASK): Delete.
(virDrvSecretGetValue): Separate out internal flags.
(virDrvDomainMemoryStats): Provide missing flags argument.
* src/driver.c (verify): Drop unused check.
* src/conf/domain_conf.h (virDomainObjParseFile): Delete
declaration.
(virDomainXMLInternalFlags): Move...
* src/conf/domain_conf.c: ...here. Delete redundant include.
(virDomainObjParseFile): Make static.
* src/libvirt.c (virDomainGetXMLDesc, virSecretGetValue): Update
clients.
(virDomainMemoryPeek, virInterfaceGetXMLDesc)
(virDomainMemoryStats, virDomainBlockPeek, virNetworkGetXMLDesc)
(virStoragePoolGetXMLDesc, virStorageVolGetXMLDesc)
(virNodeNumOfDevices, virNodeListDevices, virNWFilterGetXMLDesc):
Don't mask unknown flags.
* src/interface/netcf_driver.c (interfaceGetXMLDesc): Reject
unknown flags.
* src/secret/secret_driver.c (secretGetValue): Update clients.
* src/remote/remote_driver.c (remoteSecretGetValue)
(remoteDomainMemoryStats): Likewise.
* src/qemu/qemu_process.c (qemuProcessGetVolumeQcowPassphrase):
Likewise.
* src/qemu/qemu_driver.c (qemudDomainMemoryStats): Likewise.
* daemon/remote.c (remoteDispatchDomainMemoryStats): Likewise.
The regression is introduced by Commit da1eba6b, the new
codes with this commit doesn't reset "ret" to "-1" when
it fails on parsing the device XML (live device attachment)
This patch changes the codes to reset the "ret" and "-1",
and also changes the codes so that it don't modify "ret"
for condition checking.
How to reproduce:
% cat test.xml
<disk type='oops' device='disk'>
<driver name='qemu' type='raw'/>
<source file='/var/lib/libvirt/images/test.img'/>
<target dev='vda' bus='virtio'/>
</disk>
% virsh attach-device $domain test.xml
Device attached successfully
The device attachment failed actually with error "unknown disk type 'oops'",
however, it reports success.
Commit f548480b broke migration v3 on qemu, because the driver
passed flags on through to qemu_migration even though
qemu_migration wasn't using those flags.
* src/qemu/qemu_migration.h (QEMU_MIGRATION_FLAGS): New define.
* src/qemu/qemu_driver.c: Simplify all migration callbacks.
* src/qemu/qemu_migration.c (qemuMigrationConfirm): Fix regression.
The previous patches only cleaned up ATTRIBUTE_UNUSED flags cases;
auditing the drivers found other places where flags was being used
but not validated. In particular, domainGetXMLDesc had issues with
clients accepting a different set of flags than the common
virDomainDefFormat helper function.
* src/conf/domain_conf.c (virDomainDefFormat): Add common flag check.
* src/uml/uml_driver.c (umlDomainAttachDeviceFlags)
(umlDomainDetachDeviceFlags): Reject unknown
flags.
* src/vbox/vbox_tmpl.c (vboxDomainGetXMLDesc)
(vboxDomainAttachDeviceFlags)
(vboxDomainDetachDeviceFlags): Likewise.
* src/qemu/qemu_driver.c (qemudDomainMemoryPeek): Likewise.
(qemuDomainGetXMLDesc): Document common flag handling.
* src/libxl/libxl_driver.c (libxlDomainGetXMLDesc): Likewise.
* src/lxc/lxc_driver.c (lxcDomainGetXMLDesc): Likewise.
* src/openvz/openvz_driver.c (openvzDomainGetXMLDesc): Likewise.
* src/phyp/phyp_driver.c (phypDomainGetXMLDesc): Likewise.
* src/test/test_driver.c (testDomainGetXMLDesc): Likewise.
* src/vmware/vmware_driver.c (vmwareDomainGetXMLDesc): Likewise.
* src/xenapi/xenapi_driver.c (xenapiDomainGetXMLDesc): Likewise.
This patch extends qemudDomainSetVcpusFlags() function to support
VIR_DOMAIN_AFFECT_CURRENT flag.
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Query commands are safe to be called during long running jobs (such as
migration). This patch makes them all work without the need to
special-case every single one of them.
The patch introduces new job.asyncCond condition and associated
job.asyncJob which are dedicated to asynchronous (from qemu monitor
point of view) jobs that can take arbitrarily long time to finish while
qemu monitor is still usable for other commands.
The existing job.active (and job.cond condition) is used all other
synchronous jobs (including the commands run during async job).
Locking schema is changed to use these two conditions. While asyncJob is
active, only allowed set of synchronous jobs is allowed (the set can be
different according to a particular asyncJob) so any method that
communicates to qemu monitor needs to check if it is allowed to be
executed during current asyncJob (if any). Once the check passes, the
method needs to normally acquire job.cond to ensure no other command is
running. Since domain object lock is released during that time, asyncJob
could have been started in the meantime so the method needs to recheck
the first condition. Then, normal jobs set job.active and asynchronous
jobs set job.asyncJob and optionally change the list of allowed job
groups.
Since asynchronous jobs only set job.asyncJob, other allowed commands
can still be run when domain object is unlocked (when communicating to
remote libvirtd or sleeping). To protect its own internal synchronous
commands, the asynchronous job needs to start a special nested job
before entering qemu monitor. The nested job doesn't check asyncJob, it
only acquires job.cond and sets job.active to block other jobs.
The LXC and UML drivers can both make use of auditing. Move
the qemu_audit.{c,h} files to src/conf/domain_audit.{c,h}
* src/conf/domain_audit.c: Rename from src/qemu/qemu_audit.c
* src/conf/domain_audit.h: Rename from src/qemu/qemu_audit.h
* src/Makefile.am: Remove qemu_audit.{c,h}, add domain_audit.{c,h}
* src/qemu/qemu_audit.h, src/qemu/qemu_cgroup.c,
src/qemu/qemu_command.c, src/qemu/qemu_driver.c,
src/qemu/qemu_hotplug.c, src/qemu/qemu_migration.c,
src/qemu/qemu_process.c: Update for changed audit API names
Given a PID, the QEMU driver reads /proc/$PID/cmdline and
/proc/$PID/environ to get the configuration. This is fed
into the ARGV->XML convertor to build an XML configuration
for the process.
/proc/$PID/exe is resolved to identify the full command
binary path
After checking for name/uuid uniqueness, an attempt is
made to connect to the monitor socket. If successful
then 'info status' and 'info kvm' are issued to determine
whether the CPUs are running and if KVM is enabled.
* src/qemu/qemu_driver.c: Implement virDomainQemuAttach
* src/qemu/qemu_process.h, src/qemu/qemu_process.c: Add
qemuProcessAttach to connect to the monitor of an
existing QEMU process
When converting QEMU argv into a virDomainDefPtr, also extract
the pidfile, monitor character device config and the monitor
mode.
* src/qemu/qemu_command.c, src/qemu/qemu_command.h: Extract
pidfile & monitor config from QEMU argv
* src/qemu/qemu_driver.c, tests/qemuargv2xmltest.c: Add extra
params when calling qemuParseCommandLineString
The drivers were accepting domain configs without checking if those
were actually meant for them. For example the LXC driver happily
accepts configs with type QEMU.
Add a check for the expected domain types to the virDomainDefParse*
functions.
Some callers expected virFileMakePath to set errno, some expected
it to return an errno value. Unify this to return 0 on success and
-1 on error. Set errno to report detailed error information.
Also optimize virFileMakePath if stat fails with an errno different
from ENOENT.
add a new API pciDeviceReAttachInit() in pci.c to initialize state values for nodedev reattach
Initialize three state value of device driver to 1. This is just for a new call to
qemudNodeDeviceReAttach()
Although most functions with flags check to verify no application is
passing in flag bits that are currently undefined, for some reason
this function wasn't.
virFileMakePath returns an errno value on error, that will never
be negative. An virFileMakePath error would have been ignored here,
instead of being reported correctly.
The qemudDomainSaveFlag method will call EndJob on the 'vm'
object it is passed in. This can result in the 'vm' object
being free'd if the last reference is removed. Thus no caller
of 'qemudDomainSaveFlag' must *ever* reference 'vm' again
upon return.
Unfortunately qemudDomainSave and qemuDomainManagedSave
both call 'virDomainObjUnlock', which can result in a
crash. This is non-deterministic since it involves a race
with the monitor I/O thread.
Fix this by making qemudDomainSaveFlag responsible for
calling virDomainObjUnlock instead.
* src/qemu/qemu_driver.c: Fix potential use after free
when saving guests
If we pass VIR_DOMAIN_AFFECT_LIVE | VIR_DOMAIN_AFFECT_CONFIG to
qemuGetSchedulerParametersFlags() or *nparams is less than 1,
we will unlock qemu_driver without locking it. It's very dangerous.
We should lock qemu_driver after calling virCheckFlags().
Although we create a temporary file, it is owned by root:root and have
rights 0600. In case qemu does not run under root, it is unable to write
to that file and thus we transfer 0B sized file.
We already have a public virDomainPinVcpu, which implies that
Pin and Vcpu are treated as separate words. Unreleased commit
e261987c introduced virDomainGetVcpupinInfo as the first public
API that used Vcpupin, although we had prior internal uses of
that spelling. For consistency, change the spelling to be two
words everywhere, regardless of whether pin comes first or last.
* daemon/remote.c: Treat vcpu and pin as separate words.
* include/libvirt/libvirt.h.in: Likewise.
* src/conf/domain_conf.c: Likewise.
* src/conf/domain_conf.h: Likewise.
* src/driver.h: Likewise.
* src/libvirt.c: Likewise.
* src/libvirt_private.syms: Likewise.
* src/libvirt_public.syms: Likewise.
* src/libxl/libxl_driver.c: Likewise.
* src/qemu/qemu_driver.c: Likewise.
* src/remote/remote_driver.c: Likewise.
* src/xen/xend_internal.c: Likewise.
* tools/virsh.c: Likewise.
* src/remote/remote_protocol.x: Likewise.
* src/remote_protocol-structs: Likewise.
Suggested by Matthias Bolte.
This patch implements the code to address the new API (virDomainGetVcpupinInfo)
in the qemu driver.
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
If an application is using libvirt + KVM as a piece of its
internal infrastructure to perform a specific task, it can
be desirable to guarentee the VM dies when the virConnectPtr
disconnects from libvirtd. This ensures the app can't leak
any VMs it was using. Adding VIR_DOMAIN_START_AUTOKILL as
a flag when starting guests enables this to be done.
* include/libvirt/libvirt.h.in: All VIR_DOMAIN_START_AUTOKILL
* src/qemu/qemu_driver.c: Support automatic killing of guests
upon connection close
* tools/virsh.c: Add --autokill flag to 'start' and 'create'
commands
Sometimes it is useful to be able to automatically destroy a guest when
a connection is closed. For example, kill an incoming migration if
the client managing the migration dies. This introduces a map between
guest 'uuid' strings and virConnectPtr objects. When a connection is
closed, any associated guests are killed off.
* src/qemu/qemu_conf.h: Add autokill hash table to qemu driver
* src/qemu/qemu_process.c, src/qemu/qemu_process.h: Add APIs
for performing autokill of guests associated with a connection
* src/qemu/qemu_driver.c: Initialize autodestroy map
For controlled shutdown we issue a 'system_powerdown' command
to the QEMU monitor. This triggers an ACPI event which (most)
guest OS wire up to a controlled shutdown. There is no equiv
ACPI event to trigger a controlled reboot. This patch attempts
to fake a reboot.
- In qemuDomainObjPrivatePtr we have a bool fakeReboot
flag.
- The virDomainReboot method sets this flag and then
triggers a normal 'system_powerdown'.
- The QEMU process is started with '-no-shutdown'
so that the guest CPUs pause when it powers off the
guest
- When we receive the 'POWEROFF' event from QEMU JSON
monitor if fakeReboot is not set we invoke the
qemuProcessKill command and shutdown continues
normally
- If fakeReboot was set, we spawn a background thread
which issues 'system_reset' to perform a warm reboot
of the guest hardware. Then it issues 'cont' to
start the CPUs again
* src/qemu/qemu_command.c: Add -no-shutdown flag if
we have JSON support
* src/qemu/qemu_domain.h: Add 'fakeReboot' flag to
qemuDomainObjPrivate struct
* src/qemu/qemu_driver.c: Fake reboot using the
system_powerdown command if JSON support is available
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h: Add
binding for system_reset command
* src/qemu/qemu_process.c: Reset the guest & start CPUs if
fakeReboot is set
GCC complained about a C99 for-loop declaration outside of C99 mode
when compiling on RHEL 5.
* src/qemu/qemu_driver.c (qemudDomainPinVcpuFlags): Avoid C99 for
loop, since gcc 4.1.2 hates it.
Since we virEventRegisterDefaultImpl is now a public API, callers need
a way to invoke the default registered Handle and Timeout functions. We
already have general functions for these internally, so promote
them to the public API.
v2:
Actually add APIs to libvirt.h
Pinning to all physical cpus means resetting, hence it is preferable to
delete vcpupin setting of XML.
This patch changes qemu driver to delete vcpupin setting by invoking
virDomainVcpupinDel API when pinning the specified virtual cpu to
all host physical cpus.
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
The virDomainBlockPull* family of commands are enabled by the
'block_stream' and 'info block_stream' qemu monitor commands.
* src/qemu/qemu_driver.c src/qemu/qemu_monitor_text.[ch]: implement disk
streaming by using the stream and info stream text monitor commands
* src/qemu/qemu_monitor_json.[ch]: implement commands using the qmp monitor
Signed-off-by: Adam Litke <agl@us.ibm.com>
Acked-by: Daniel P. Berrange <berrange@redhat.com>
This patch deprecates following enums:
VIR_DOMAIN_MEM_CURRENT
VIR_DOMAIN_MEM_LIVE
VIR_DOMAIN_MEM_CONFIG
VIR_DOMAIN_VCPU_LIVE
VIR_DOMAIN_VCPU_CONFIG
VIR_DOMAIN_DEVICE_MODIFY_CURRENT
VIR_DOMAIN_DEVICE_MODIFY_LIVE
VIR_DOMAIN_DEVICE_MODIFY_CONFIG
And modify internal codes to use virDomainModificationImpact.
This commit is safe precisely because there has been no release
for any of the enum values being deleted (they were added post-0.9.1).
After the 0.9.2 release, we can then take advantage of
virDomainModificationImpact in more places.
* include/libvirt/libvirt.h.in (virDomainModificationImpact): New
enum.
(virDomainSchedParameterFlags, virMemoryParamFlags): Delete, since
these were never released, and the new enum works fine here.
* src/libvirt.c (virDomainGetMemoryParameters)
(virDomainSetMemoryParameters)
(virDomainGetSchedulerParametersFlags)
(virDomainSetSchedulerParametersFlags): Update documentation.
* src/qemu/qemu_driver.c (qemuDomainSetMemoryParameters)
(qemuDomainGetMemoryParameters, qemuSetSchedulerParametersFlags)
(qemuSetSchedulerParameters, qemuGetSchedulerParametersFlags)
(qemuGetSchedulerParameters): Adjust clients.
* tools/virsh.c (cmdSchedinfo, cmdMemtune): Likewise.
Based on ideas by Daniel Veillard and Hu Tao.
The change 18c2a59206 caused
some regressions in behaviour of virDomainBlockStats
and virDomainBlockInfo in the QEMU driver.
The virDomainBlockInfo API stopped working for inactive
guests if querying a block device.
The virDomainBlockStats API did not promptly report
an error if the guest was not running in some cases.
* src/qemu/qemu_driver.c: Fix inactive guest handling
in BlockStats/Info APIs
* src/conf/domain_conf.c, src/conf/domain_conf.h: APIs for
inserting/finding/removing virDomainLeaseDefPtr instances
* src/qemu/qemu_driver.c: Wire up hotplug/unplug for leases
* src/qemu/qemu_hotplug.h, src/qemu/qemu_hotplug.c: Support
for hotplug and unplug of leases
Some lock managers associate state with leases, allowing a process
to temporarily release its leases, and re-acquire them later, safe
in the knowledge that no other process has acquired + released the
leases in between.
This is already used between suspend/resume operations, and must
also be used across migration. This passes the lockstate in the
migration cookie. If the lock manager uses lockstate, then it
becomes compulsory to use the migration v3 protocol to get the
cookie support.
* src/qemu/qemu_driver.c: Validate that migration v2 protocol is
not used if lock manager needs state transfer
* src/qemu/qemu_migration.c: Transfer lock state in migration
cookie XML
The QEMU integrates with the lock manager instructure in a number
of key places
* During startup, a lock is acquired in between the fork & exec
* During startup, the libvirtd process acquires a lock before
setting file labelling
* During shutdown, the libvirtd process acquires a lock
before restoring file labelling
* During hotplug, unplug & media change the libvirtd process
holds a lock while setting/restoring labels
The main content lock is only ever held by the QEMU child process,
or libvirtd during VM shutdown. The rest of the operations only
require libvirtd to hold the metadata locks, relying on the active
QEMU still holding the content lock.
* src/qemu/qemu_conf.c, src/qemu/qemu_conf.h,
src/qemu/libvirtd_qemu.aug, src/qemu/test_libvirtd_qemu.aug:
Add config parameter for configuring lock managers
* src/qemu/qemu_driver.c: Add calls to the lock manager
* src/qemu/qemu_driver.c (qemuGetSchedulerParameters): Move
guts...
(qemuGetSchedulerParametersFlags): ...to new callback, and honor
flags more accurately.
This patch allows to modify interfaces of domain(qemu)
* src/conf/domain_conf.c src/conf/domain_conf.h src/libvirt_private.syms:
(virDomainNetInsert) : Insert a network device to domain definition.
(virDomainNetIndexByMac) : Returns an index of net device in array.
(virDomainNetRemoveByMac): Remove a NIC of passed MAC address.
* src/qemu/qemu_driver.c
(qemuDomainAttachDeviceConfig): add codes for NIC.
(qemuDomainDetachDeviceConfig): add codes for NIC.
Originally most of libvirt domain-specific calls were blocking
during a migration.
A new mechanism to allow specific calls (blkstat/blkinfo) to be
executed in such condition has been implemented.
In the long term it'd be desirable to get a more general
solution to mark further APIs as migration safe, without needing
special case code.
* src/qemu/qemu_migration.c: add some additional job signal
flags for doing blkstat/blkinfo during a migration
* src/qemu/qemu_domain.c: add a condition variable that can be
used to efficiently wait for the migration code to clear the
signal flag
* src/qemu/qemu_driver.c: execute blkstat/blkinfo using the
job signal flags during migration
When modifying the disk devices of a live domain and the domain
configuration, the function qemuDomainAttachDeviceConfig
first sets dev->data->disk to NULL. Later qemuDomainAttachDeviceLive
accesses dev->data.disk and causes a segfault.
* src/qemu/qemu_driver.c: fix qemuDomainModifyDeviceFlags() accordingly
The current virDomainMigrateFinish3 method signature attempts to
distinguish two types of errors, by allowing return with ret== 0,
but ddomain == NULL, to indicate a failure to start the guest.
This is flawed, because when ret == 0, there is no way for the
virErrorPtr details to be sent back to the client.
Change the signature of virDomainMigrateFinish3 so it simply
returns a virDomainPtr, in the same way as virDomainMigrateFinish2
The disk locking code will protect against the only possible
failure mode this doesn't account for (loosing conenctivity to
libvirtd after Finish3 starts the CPUs, but before the client
sees the reply for Finish3).
* src/driver.h, src/libvirt.c, src/libvirt_internal.h: Change
virDomainMigrateFinish3 to return a virDomainPtr instead of int
* src/remote/remote_driver.c, src/remote/remote_protocol.x,
daemon/remote.c, src/qemu/qemu_driver.c, src/qemu/qemu_migration.c:
Update for API change
When doing migration, if an error occurs in Perform, it must not
be overwritten during Finish/Confirm steps. If an error occurs
in Finish, it must not be overwritten in Confirm.
Previous commit a9d12c2444 added
code to qemudDomainMigrateFinish2 to preserve the error. This
is not the right place, because it is not applicable in non-p2p
migration. The src/libvirt.c virDomainMigrateV2/3 methods need
code to preserve errors for non-p2p migration, while the
doPeer2PeerMigrate2 and doPeer2PeerMigrate3 methods contain
code to preverse errors for p2p migration.
Remove the bogus error preservation from qemudDomainMigrateFinish2
and qemudDomainMigrateFinish3.
Fix virDomainMigrateV3 and doPeer2PeerMigrate3 so that they
preserve any error hit during the Finish3 step, before invoking
Confirm3.
Finally if qemuMigrationFinish fails to resume the CPUs, it must
preserve the error before tearing down the VM, so that VM cleanup
doesn't overwrite it.
* src/libvirt.c: Preserve error before invoking Confirm3
* src/qemu/qemu_driver.c: Remove bogus error preservation
code in qemudDomainMigrateFinish2/qemudDomainMigrateFinish3
* src/qemu/qemu_migration.c: Preserve error before invoking Confirm3
and after resume fails in qemuMigrationFinish.
Even when failing to start CPUs, the finish method was returning
a success result. Fix this so that the QEMU process is killed
off when finish fails under v3 protocol. Also rename the
killOnFinish boolean to 'v3proto' to make it clearer that this
is a tunable based on the migration protocol version
* src/qemu/qemu_driver.c: Update for API change
* src/qemu/qemu_migration.c, src/qemu/qemu_migration.h: Kill
VM in qemuMigrationFinish if failing to start CPUs
The virDomainMigratePerform3 currently has a single URI parameter
whose meaning varies. It is either
- A QEMU migration URI (normal migration)
- A libvirtd connection URI (peer2peer migration)
Unfortunately when using peer2peer migration, without also
using tunnelled migration, it is possible that both URIs are
required.
This adds a second URI parameter to the virDomainMigratePerform3
method, to cope with this scenario. Each parameter how has a fixed
meaning.
NB, there is no way to actually take advantage of this yet,
since virDomainMigrate/virDomainMigrateToURI do not have any
way to provide the 2 separate URIs
* daemon/remote.c, src/remote/remote_driver.c,
src/remote/remote_protocol.x, src/remote_protocol-structs: Add
the second URI parameter to perform3 message
* src/driver.h, src/libvirt.c, src/libvirt_internal.h: Add
the second URI parameter to Perform3 method
* src/libvirt_internal.h, src/qemu/qemu_migration.c,
src/qemu/qemu_migration.h: Update to handle URIs correctly
This extends the v3 migration protocol such that the
virDomainMigrateBegin3 and virDomainMigratePerform3
methods accept an application supplied XML config for
the target VM.
If the 'xmlin' parameter is NULL, then Begin3 uses the
current guest XML as normal. A driver implementing the
Begin3 method should either reject all non-NULL 'xmlin'
parameters, or strictly validate that the app supplied
XML does not change guest ABI.
The Perform3 method also needed the xmlin parameter to
cope with the Peer2Peer migration sequence.
NB it is not yet possible to use this capability since
neither of the public virDomainMigrate/virDomainMigrateToURI
methods have a way to pass in XML.
* daemon/remote.c, src/remote/remote_driver.c,
src/remote/remote_protocol.x, src/remote_protocol-structs:
Add 'remote_string xmlin' parameter to begin3/perform3
RPC messages
* src/libvirt.c, src/driver.h, src/libvirt_internal.h: Add
'const char *xmlin' parameter to Begin3/Perform3 methods
* src/qemu/qemu_driver.c, src/qemu/qemu_migration.c,
src/qemu/qemu_migration.h: Pass xmlin parameter around
migration methods
Saving domain to previously created file changes also its ownership.
This is certainly not what users want if some conditions are met:
it is a regular, local file and dynamic_ownership is off.
The qemuMigrationConfirm method shouldn't deal with final VM
cleanup, since it can be called from the peer2peer migration,
which expects to still use the 'vm' object afterwards.
Push the cleanup code out of qemuMigrationConfirm, into its
caller, qemuDomainMigrateConfirm3
* src/qemu/qemu_driver.c: Add VM cleanup code to
qemuDomainMigrateConfirm3
* src/qemu/qemu_migration.c, src/qemu/qemu_migration.h: Remove
job handling cleanup from qemuMigrationConfirm
Otherwise qemu is unable to write to it, with the error:
libvir: QEMU error : internal error unable to execute QEMU command 'memsave': Could not open '/var/cache/libvirt/qemu/qemu.mem.RRNvLv'
The v2 migration protocol had a limit on cookie length that was
too small to be useful for QEMU. Avoid generating cookies with
v2 protocol, so that old libvirtd can still reliably migrate a
guest to new libvirtd uses v2 protocol.
* src/qemu/qemu_driver.c: Avoid migration cookies with v2
migration
Improve invalid argument checks in the size query case. The drivers already
relied on this unchecked behavior.
Relax the implementation of virDomainGet(Memory|Blkio)MemoryParameters
in the drivers and allow to pass more memory than necessary for all
parameters.
params and nparams are essential and cannot be NULL. Check this in
libvirt.c and remove redundant checks from the drivers (e.g. xend).
Instead of enforcing that nparams must point to exact same value as
returned by virDomainGetSchedulerType relax this to a lower bound
check. This is what some drivers (e.g. xen hypervisor and esx)
already did. Other drivers (e.g. xend) didn't check nparams at all
and assumed that there is enough space in params.
Unify the behavior in all drivers to a lower bound check and update
nparams to the number of valid values in params on success.
Implement the v3 migration protocol, which has two extra
steps, 'begin' on the source host and 'confirm' on the
source host. All other methods also gain both input and
output cookies to allow bi-directional data passing at
all stages.
The QEMU peer2peer migration method gains another impl
to provide the v3 migration. This finally allows migration
cookies to work with tunnelled migration, which is required
for Spice seamless migration & the lock manager transfer
* src/qemu/qemu_driver.c: Wire up migrate v3 APIs
* src/qemu/qemu_migration.c, src/qemu/qemu_migration.h: Add
begin & confirm methods, and peer2peer impl of v3
The migration protocol has support for a 'cookie' parameter which
is an opaque array of bytes as far as libvirt is concerned. Drivers
may use this for passing around arbitrary extra data they might
need during migration. The QEMU driver needs to do a few things:
- Pass hostname/uuid to allow strict protection against localhost
migration attempts
- Pass SPICE/VNC server port from the target back to the source to
allow seamless relocation of client sessions
- Pass lock driver state from source to destination
This patch introduces the basic glue for handling cookies
but only includes the host/guest UUID & name.
* src/libvirt_private.syms: Export virXMLParseStrHelper
* src/qemu/qemu_migration.c, src/qemu/qemu_migration.h: Parsing
and formatting of migration cookies
* src/qemu/qemu_driver.c: Pass in cookie parameters where possible
* src/remote/remote_protocol.h, src/remote/remote_protocol.x: Change
cookie max length to 16384 bytes
Change all the driver struct initializers to use the
C99 style, leaving out unused fields. This will make
it possible to add new APIs without changing every
driver. eg change:
qemudDomainResume, /* domainResume */
qemudDomainShutdown, /* domainShutdown */
NULL, /* domainReboot */
qemudDomainDestroy, /* domainDestroy */
to
.domainResume = qemudDomainResume,
.domainShutdown = qemudDomainShutdown,
.domainDestroy = qemudDomainDestroy,
And get rid of any existing C99 style initializersr which
set NULL, eg change
.listPools = vboxStorageListPools,
.numOfDefinedPools = NULL,
.listDefinedPools = NULL,
.findPoolSources = NULL,
.poolLookupByName = vboxStoragePoolLookupByName,
to
.listPools = vboxStorageListPools,
.poolLookupByName = vboxStoragePoolLookupByName,
Fix some driver names:
s/virDrvCPUCompare/virDrvCompareCPU/
s/virDrvCPUBaseline/virDrvBaselineCPU/
s/virDrvQemuDomainMonitorCommand/virDrvDomainQemuMonitorCommand/
s/virDrvSecretNumOfSecrets/virDrvNumOfSecrets/
s/virDrvSecretListSecrets/virDrvListSecrets/
And some driver struct field names:
s/getFreeMemory/nodeGetFreeMemory/
Only in drivers which use virDomainObj, drivers that query hypervisor
for domain status need to be updated separately in case their hypervisor
supports this functionality.
The reason is also saved into domain state XML so if a domain is not
running (i.e., no state XML exists) the reason will be lost by libvirtd
restart. I think this is an acceptable limitation.
This is needed if we want to transfer a temporary file. If the
transfer is done with iohelper, we might run into a race condition,
where we unlink() file before iohelper is executed.
* src/fdstream.c, src/fdstream.h,
src/util/iohelper.c: Add new option
* src/lxc/lxc_driver.c, src/qemu/qemu_driver.c,
src/storage/storage_driver.c, src/uml/uml_driver.c,
src/xen/xen_driver.c: Expand existing function calls
We were 31/73 on whether to translate; since less than 50% translated
and since VIR_INFO is less than VIR_WARN which also doesn't translate,
this makes sense.
* cfg.mk (sc_prohibit_gettext_markup): Add VIR_INFO, since it
falls between WARN and DEBUG.
* daemon/libvirtd.c (qemudDispatchSignalEvent, remoteCheckAccess)
(qemudDispatchServer): Adjust offenders.
* daemon/remote.c (remoteDispatchAuthPolkit): Likewise.
* src/network/bridge_driver.c (networkReloadIptablesRules)
(networkStartNetworkDaemon, networkShutdownNetworkDaemon)
(networkCreate, networkDefine, networkUndefine): Likewise.
* src/qemu/qemu_driver.c (qemudDomainDefine)
(qemudDomainUndefine): Likewise.
* src/storage/storage_driver.c (storagePoolCreate)
(storagePoolDefine, storagePoolUndefine, storagePoolStart)
(storagePoolDestroy, storagePoolDelete, storageVolumeCreateXML)
(storageVolumeCreateXMLFrom, storageVolumeDelete): Likewise.
* src/util/bridge.c (brProbeVnetHdr): Likewise.
* po/POTFILES.in: Drop src/util/bridge.c.
These VIR_XXXX0 APIs make us confused, use the non-0-suffix APIs instead.
How do these coversions works? The magic is using the gcc extension of ##.
When __VA_ARGS__ is empty, "##" will swallow the "," in "fmt," to
avoid compile error.
example: origin after CPP
high_level_api("%d", a_int) low_level_api("%d", a_int)
high_level_api("a string") low_level_api("a string")
About 400 conversions.
8 special conversions:
VIR_XXXX0("") -> VIR_XXXX("msg") (avoid empty format) 2 conversions
VIR_XXXX0(string_literal_with_%) -> VIR_XXXX(%->%%) 0 conversions
VIR_XXXX0(non_string_literal) -> VIR_XXXX("%s", non_string_literal)
(for security) 6 conversions
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Introduce a virProcessKill function that can be safely called
even when the job mutex is held. This allows virDomainDestroy
to kill any VM even if it is asleep in a monitor job. The PID
will die and the thread asleep on the monitor will then wake
up releasing the job mutex.
* src/qemu/qemu_driver.c: Kill process before using qemuProcessStop
to ensure job is released
* src/qemu/qemu_process.c: Add virProcessKill for killing off
QEMU processes
This matches the public API and helps to get rid of some special
case code in the remote generator.
Rename driver API functions and XDR protocol structs.
No functional change included outside of the remote generator.
As well as taint warnings going to the main libvirt log,
add taint warnings to the per-domain logfile
Domain id=3 is tainted: high-privileges
Domain id=3 is tainted: disk-probing
Domain id=3 is tainted: shell-scripts
Domain id=3 is tainted: custom-monitor
* src/qemu/qemu_domain.c, src/qemu/qemu_domain.h: Enhance
qemuDomainTaint to also log to the domain logfile
* src/qemu/qemu_driver.c: Pass -1 for logFD to taint methods to
auto-append to logfile
* src/qemu/qemu_process.c: Pass open logFD at startup for taint
methods
Wire up logging of VM tainting to the QEMU driver
- If running QEMU as root user/group or without capabilities
being cleared
- If passing custom QEMU command line args
- If issuing custom QEMU monitor commands
- If using a network interface config with an associated
shell script
- If using a disk config relying on format probing
The warnings, per-VM appear in the main libvirtd logs
11:56:17.571: 10832: warning : qemuDomainObjTaint:712 : Domain id=1 name='l2' uuid=c7a3edbd-edaf-9455-926a-d65c16db1802 is tainted: high-privileges
11:56:17.571: 10832: warning : qemuDomainObjTaint:712 : Domain id=1 name='l2' uuid=c7a3edbd-edaf-9455-926a-d65c16db1802 is tainted: disk-probing
The taint flags are reset when the VM is stopped.
* src/qemu/qemu_domain.c, src/qemu/qemu_domain.h: Helper APIs
for logging taint warnings
* src/qemu/qemu_driver.c: Log tainting with custom QEMU monitor
commands and disk/net hotplug with unsupported configs
* src/qemu/qemu_process.c: Log tainting at startup based on
unsupported configs