In a SELinux or root-squashing NFS environment, libvirt has to go
through some hoops to create a new file that qemu can then open()
by name. Snapshots are a case where we want to guarantee an empty
file that qemu can open; also, converting a save file from being
marked partial to complete requires a reopen to avoid
O_DIRECT headaches. Refactor some existing code to make it easier
to reuse in later patches.
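For reference, a minimal sketch of the kind of hoop-jumping involved,
using plain POSIX calls; libvirt's actual helper is more elaborate (among
other things it hands the opened fd back to the parent):

    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>
    #include <fcntl.h>
    #include <errno.h>

    /* Create an empty file that qemu can later open() by name.  On a
     * root-squashing NFS export, root's open() fails with EACCES, so fork
     * a child, drop to the qemu user, and create the file there instead. */
    static int
    create_for_qemu(const char *path, uid_t uid, gid_t gid)
    {
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
        if (fd >= 0) {
            close(fd);
            return 0;
        }
        if (errno != EACCES)
            return -1;
        pid_t child = fork();
        if (child == 0) {
            if (setgid(gid) < 0 || setuid(uid) < 0)
                _exit(1);
            fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
            _exit(fd < 0 ? 1 : 0);
        }
        int status;
        if (child < 0 || waitpid(child, &status, 0) < 0)
            return -1;
        return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
    }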
* src/qemu/qemu_migration.h (qemuMigrationToFile): Drop parameter.
* src/qemu/qemu_migration.c (qemuMigrationToFile): Let cgroup do
the stat, rather than asking caller to do it and pass info down.
* src/qemu/qemu_driver.c (qemuOpenFile): New function, pulled from...
(qemuDomainSaveInternal): ...here.
(doCoreDump, qemuDomainSaveImageOpen): Use it here as well.
The libvirt BlockPull API supports the use of an initial bandwidth limit but the
qemu block_stream API does not. To get the desired behavior we use the two APIs
strung together: first BlockPull, then BlockJobSetSpeed. We can do this at the
driver level to avoid duplicated code in each monitor path.
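A hedged illustration from the caller's side (public API only; the
monitor-level calls happen inside the driver):

    #include <libvirt/libvirt.h>

    /* Start a pull capped at 10 MiB/s; the qemu driver issues block_stream
     * first and then block_job_set_speed to apply the limit. */
    static int
    pull_with_limit(virDomainPtr dom, const char *disk)
    {
        return virDomainBlockPull(dom, disk, 10, 0);
    }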
Signed-off-by: Adam Litke <agl@us.ibm.com>
There is no reason to forbid pausing an autodestroy domain
(not to mention that 'virsh start --paused --autodestroy'
succeeds in creating a paused autodestroy domain).
Meanwhile, qemu was failing to enforce the API documentation that
autodestroy domains cannot be saved. And while the original
documentation only mentioned save/restore, snapshots are another
form of saving that is close enough in semantics to make no
sense on one-shot domains.
* src/qemu/qemu_driver.c (qemudDomainSuspend): Drop bogus check.
(qemuDomainSaveInternal, qemuDomainSnapshotCreateXML): Forbid
saves of autodestroy domains.
* src/libvirt.c (virDomainCreateWithFlags, virDomainCreateXML):
Document snapshot interaction.
There have been several instances of people having problems with
a broken managed save file, and not aware that they could use
'virsh managedsave-remove dom' to fix things. Making it possible
to do this as part of starting a domain makes the same functionality
easier to find, and saves an API call.
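A hedged illustration of the new flag from the API side (virsh exposes a
matching option, documented below):

    #include <libvirt/libvirt.h>

    /* Discard any (possibly corrupt) managed save image and boot fresh
     * instead of restoring from it. */
    static int
    fresh_boot(virDomainPtr dom)
    {
        return virDomainCreateWithFlags(dom, VIR_DOMAIN_START_FORCE_BOOT);
    }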
* include/libvirt/libvirt.h.in (VIR_DOMAIN_START_FORCE_BOOT): New
flag.
* src/libvirt.c (virDomainCreateWithFlags): Document it.
* src/qemu/qemu_driver.c (qemuDomainObjStart): Alter signature.
(qemuAutostartDomain, qemuDomainStartWithFlags): Update callers.
* tools/virsh.c (cmdStart): Expose it in virsh.
* tools/virsh.pod (start): Document it.
The QEMU 'sendkey' command expects keys to be encoded in the same
way as the RFB extended keycode set. Specifically it wants extended
keys to have the high bit of the first byte set, while the Linux
XT KBD driver codeset uses the low bit of the second byte. To deal
with this we introduce a new keymap 'RFB' and use that in the QEMU
driver.
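A hedged sketch of the translation this implies (the generated table in
virkeycode does the real work):

    /* XT KBD marks an extended key in the low bit of the second byte
     * (0x1xx); RFB wants the high bit of the first byte set (0x8x). */
    static int
    xt_kbd_to_rfb(int keycode)
    {
        if (keycode & 0x100)
            return 0x80 | (keycode & 0x7f);
        return keycode;
    }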
* include/libvirt/libvirt.h.in: Add VIR_KEYCODE_SET_RFB
* src/qemu/qemu_driver.c: Use RFB keycode set instead of XT KBD
* src/util/virkeycode-mapgen.py: Auto-generate the RFB keycode
set from the XT KBD set
* src/util/virkeycode.c: Add RFB keycode entry to table. Add a
verify check on cardinality of the codeOffset table
Audit all changes to the qemu vm->current_snapshot, and make them
update the saved xml file for both the previous and the new
snapshot, so that there is always at most one snapshot with
<active>1</active> in the xml, and that snapshot is used as the
current snapshot even across libvirtd restarts.
This patch does not fix the case of virDomainSnapshotDelete(,CHILDREN)
where one of the children is the current snapshot; that will be fixed later.
* src/conf/domain_conf.h (_virDomainSnapshotDef): Alter member
type and name.
* src/conf/domain_conf.c (virDomainSnapshotDefParseString)
(virDomainSnapshotDefFormat): Update clients.
* docs/schemas/domainsnapshot.rng: Tighten rng.
* src/qemu/qemu_driver.c (qemuDomainSnapshotLoad): Reload current
snapshot.
(qemuDomainSnapshotCreateXML, qemuDomainRevertToSnapshot)
(qemuDomainSnapshotDiscard): Track current snapshot.
Changing the current vm, and writing that change to the file
system, all before a new qemu starts, is risky; it's hard to
roll back if starting the new qemu fails for some reason.
Instead of abusing vm->current_snapshot and making the command
line generator decide whether the current snapshot warrants
using -loadvm, it is better to just directly pass a snapshot all
the way through the call chain if it is to be loaded.
This frees up the last use of snapshot->def->active for qemu's
use, so the next patch can repurpose that field for tracking
which snapshot is current.
* src/qemu/qemu_command.c (qemuBuildCommandLine): Don't use active
field of snapshot.
* src/qemu/qemu_process.c (qemuProcessStart): Add a parameter.
* src/qemu/qemu_process.h (qemuProcessStart): Update prototype.
* src/qemu/qemu_migration.c (qemuMigrationPrepareAny): Update
callers.
* src/qemu/qemu_driver.c (qemudDomainCreate)
(qemuDomainSaveImageStartVM, qemuDomainObjStart)
(qemuDomainRevertToSnapshot): Likewise.
(qemuDomainSnapshotSetCurrentActive)
(qemuDomainSnapshotSetCurrentInactive): Delete unused functions.
https://bugzilla.redhat.com/show_bug.cgi?id=727709
mentions that if qemu fails to create the snapshot (such as what
happens on Fedora 15 qemu, which has qmp but where savevm is only
in hmp, and where libvirt is old enough to not try the hmp fallback),
then 'virsh snapshot-list dom' will show a garbage snapshot entry,
and the libvirt internal directory for storing snapshot metadata
will have a bogus file.
This fixes the fallout bug of polluting the snapshot-list with
garbage on failure (the root cause of the F15 bug of not having
fallback to hmp has already been fixed in newer libvirt releases).
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateXML): Allocate
memory before making snapshot, and cleanup on failure. Don't
dereference NULL if transient domain exited during snapshot creation.
Our logic throws off analyzer tools:
ptr var = NULL;
if (flags == 0) flags = live ? _LIVE : _CONFIG;
if (flags & _LIVE) do stuff
if (flags & _CONFIG) var = non-null;
if (flags & _LIVE) do more stuff
else if (flags & _CONFIG) use var
the tools keep thinking that var can still be NULL in the last
if clause; adding the hint shuts them up.
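A self-contained version of the pattern with the hint spelled out (the
sa_assert() name mirrors libvirt's analyzer-hint macro; the stand-in
definition here is an assumption):

    #include <assert.h>
    #define sa_assert(expr) assert(expr)  /* stand-in for the real hint macro */

    enum { F_LIVE = 1, F_CONFIG = 2 };

    static int
    demo(unsigned int flags, int live, int *config_val)
    {
        int *var = NULL;
        if (flags == 0)
            flags = live ? F_LIVE : F_CONFIG;
        if (flags & F_LIVE) {
            /* do stuff */
        }
        if (flags & F_CONFIG)
            var = config_val;        /* callers guarantee this is non-NULL */
        if (flags & F_LIVE) {
            /* do more stuff */
        } else if (flags & F_CONFIG) {
            sa_assert(var);          /* the hint: var cannot be NULL here */
            return *var;
        }
        return 0;
    }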
* src/qemu/qemu_driver.c (qemuDomainSetBlkioParameters): Add a
static analysis hint.
Transient domains reject attempts to set autostart, and using
virDomainCreate to restart a domain only works on persistent
domains. Therefore, managed save makes no sense on transient
domains, and should be rejected up front rather than creating
an otherwise unrecoverable managed save file.
Besides, transient domains imply that a lot more management is
being done by the upper layer; this includes the assumption
that the upper layer is okay managing the saved state file
created by virDomainSave, and does not need to use managed save.
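From the management-application side this policy looks like the following
hedged sketch (public API only; the helper name is made up):

    #include <libvirt/libvirt.h>

    /* Managed save only makes sense for persistent domains; an app that
     * runs transient domains keeps control of the save file itself. */
    static int
    save_for_later(virDomainPtr dom, const char *app_owned_path)
    {
        if (virDomainIsPersistent(dom) == 1)
            return virDomainManagedSave(dom, 0);
        return virDomainSave(dom, app_owned_path);
    }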
* src/libvirt.c: Document that transient domains are incompatible
with managed save.
* src/qemu/qemu_driver.c (qemuDomainManagedSave): Enforce it.
* src/libxl/libxl_driver.c (libxlDomainManagedSave): Likewise.
I noticed some inconsistent use of 'else'.
* src/qemu/qemu_driver.c (qemuCPUCompare)
(qemuDomainSnapshotCreateXML, qemuDomainRevertToSnapshot)
(qemuDomainSnapshotDiscard): Match coding conventions.
If a snapshot with the same name already exists, virDomainSnapshotAssignDef()
just returns NULL, in which case the snapshot definition is leaked.
Currently this leak is not a big problem, since qemuDomainSnapshotLoad()
is only called once during initial startup of libvirtd.
Signed-off-by: Philipp Hahn <hahn@univention.de>
Revert 6a1f5f568f8. Now that libvirt_iohelper takes fds by
inheritance rather than by open() (commit 1eb66479), there is
no longer a race where the parent can unlink() a file prior to
the iohelper open()ing the same file. From there, it makes
more sense to have the callers both create and unlink, rather
than the callers create and the stream unlink, since the latter
was only needed when iohelper had to do the unlink.
* src/fdstream.h (virFDStreamOpenFile, virFDStreamCreateFile):
Callers are responsible for deletion.
* src/fdstream.c (virFDStreamOpenFileInternal): Don't leak created
file on failure.
(virFDStreamOpenFile, virFDStreamCreateFile): Drop parameter.
* src/lxc/lxc_driver.c (lxcDomainOpenConsole): Update callers.
* src/qemu/qemu_driver.c (qemuDomainScreenshot)
(qemuDomainOpenConsole): Likewise.
* src/storage/storage_driver.c (storageVolumeDownload)
(storageVolumeUpload): Likewise.
* src/uml/uml_driver.c (umlDomainOpenConsole): Likewise.
* src/vbox/vbox_tmpl.c (vboxDomainScreenshot): Likewise.
* src/xen/xen_driver.c (xenUnifiedDomainOpenConsole): Likewise.
The previous qemu patch could end up calling unlink(tmp) before
tmp was the name of a valid file (unlinking a fileXXXXXX template
instead), or calling unlink(tmp) twice on success (once here,
and once at the end of the stream). Meanwhile, vbox also suffered
from the same leaked tmp file bug.
* src/qemu/qemu_driver.c (qemuDomainScreenshot): Don't unlink on
success, or on invalid name.
* src/vbox/vbox_tmpl.c (vboxDomainScreenshot): Don't leak temp file.
Currently, we attempt to run a sync job and an async job at the same
time, which means that the monitor commands for the two jobs can be run
in any order.
In the function qemuDomainObjEnterMonitorInternal():
if (priv->job.active == QEMU_JOB_NONE && priv->job.asyncJob) {
    if (qemuDomainObjBeginNestedJob(driver, obj) < 0)
We check whether the caller is an async job by priv->job.active and
priv->job.asyncJob. But when an async job is running, and a sync job is
also running at the time of the check, then priv->job.active is not
QEMU_JOB_NONE. So we cannot check whether the caller is an async job
in the function qemuDomainObjEnterMonitorInternal(), and must instead
put the burden on the caller to tell us when an async command wants
to do a nested job.
Once the burden is on the caller, then only async monitor enters need
to worry about whether the VM is still running; for sync monitor enter,
the internal return is always 0, so lots of ignore_value can be dropped.
* src/qemu/THREADS.txt: Reflect new rules.
* src/qemu/qemu_domain.h (qemuDomainObjEnterMonitorAsync): New
prototype.
* src/qemu/qemu_process.h (qemuProcessStartCPUs)
(qemuProcessStopCPUs): Add parameter.
* src/qemu/qemu_migration.h (qemuMigrationToFile): Likewise.
(qemuMigrationWaitForCompletion): Make static.
* src/qemu/qemu_domain.c (qemuDomainObjEnterMonitorInternal): Add
parameter.
(qemuDomainObjEnterMonitorAsync): New function.
(qemuDomainObjEnterMonitor, qemuDomainObjEnterMonitorWithDriver):
Update callers.
* src/qemu/qemu_driver.c (qemuDomainSaveInternal)
(qemudDomainCoreDump, doCoreDump, processWatchdogEvent)
(qemudDomainSuspend, qemudDomainResume, qemuDomainSaveImageStartVM)
(qemuDomainSnapshotCreateActive, qemuDomainRevertToSnapshot):
Likewise.
* src/qemu/qemu_process.c (qemuProcessStopCPUs)
(qemuProcessFakeReboot, qemuProcessRecoverMigration)
(qemuProcessRecoverJob, qemuProcessStart): Likewise.
* src/qemu/qemu_migration.c (qemuMigrationToFile)
(qemuMigrationWaitForCompletion, qemuMigrationUpdateJobStatus)
(qemuMigrationJobStart, qemuDomainMigrateGraphicsRelocate)
(doNativeMigrate, doTunnelMigrate, qemuMigrationPerformJob)
(qemuMigrationPerformPhase, qemuMigrationFinish)
(qemuMigrationConfirm): Likewise.
* src/qemu/qemu_hotplug.c: Drop unneeded ignore_value.
Whether or not the previous return value is -1, the following code will
be executed for an inactive guest in src/qemu/qemu_driver.c:
ret = virDomainSaveConfig(driver->configDir, persistentDef);
and if everything is okay, 'ret' is assigned 0, so the previous 'ret'
is overwritten. This patch fixes the issue.
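One way to avoid the clobbering, sketched against the line quoted above
(not necessarily the exact hunk applied):

    /* only record a new failure; never reset an earlier one back to 0 */
    if (virDomainSaveConfig(driver->configDir, persistentDef) < 0)
        ret = -1;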
* src/qemu/qemu_driver.c: avoid overwriting the return value when given
an argument out of the blkio weight range for an inactive guest.
* how to reproduce?
% virsh blkiotune ${guestname} --weight 10
% echo $?
Note: the guest must be inactive, and argument 10 is out of the blkio
weight range. The error can be seen by checking libvirtd.log; however,
virsh does not raise any error, and the return value is 0.
https://bugzilla.redhat.com/show_bug.cgi?id=726304
Signed-off-by: Alex Jia <ajia@redhat.com>
Whether or not the previous return value is -1, the following code will
be executed for an inactive guest in qemuDomainSetMemoryParameters:
ret = virDomainSaveConfig(driver->configDir, persistentDef);
and if everything is okay, 'ret' is assigned 0, so the previous 'ret'
is overwritten. This patch fixes the issue.
* src/qemu/qemu_driver.c: avoid overwriting the return value when setting
the min_guarantee value on an inactive guest.
* how to reproduce?
% virsh memtune ${guestname} --min_guarante 1024
% echo $?
Note: the guest must be inactive. In fact, 'min_guarantee' hasn't been
implemented in the memory tunables, and the error can be seen in
libvirtd.log; however, virsh does not raise any error, and the return
value is 0.
Signed-off-by: Alex Jia <ajia@redhat.com>
Introduced by f9a837da73a11ef, the condition was not changed after
the else clause was removed. So now it quits with "domain is not
running" when the domain is actually running, while when the domain
is not running, it reports "no job is active on the domain".
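A hedged sketch of the intended ordering of the checks (identifiers follow
the qemu driver, but the exact error calls and labels are approximations):

    if (!virDomainObjIsActive(vm)) {
        qemuReportError(VIR_ERR_OPERATION_INVALID, "%s",
                        _("domain is not running"));
        goto cleanup;
    }
    if (!priv->job.asyncJob) {
        qemuReportError(VIR_ERR_OPERATION_INVALID, "%s",
                        _("no job is active on the domain"));
        goto cleanup;
    }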
How to reproduce:
1)
% virsh start $domain
% virsh domjobabort $domain
error: Requested operation is not valid: domain is not running
2)
% virsh destroy $domain
% virsh domjobabort $domain
error: Requested operation is not valid: no job is active on the domain
3)
% virsh save $domain /tmp/$domain.save
Before above commands finished, try to abort job in another terminal
% virsh domjobabort $domain
error: Requested operation is not valid: domain is not running
The goal here is that save-image-dumpxml fed back to
save-image-define should not change the save file; anywhere that
this is not the case is probably a bug in domain_conf.c.
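A hedged illustration of the round trip through the public API (the file
name and helper are made up):

    #include <stdlib.h>
    #include <libvirt/libvirt.h>

    /* Feeding the dumped XML straight back must leave the save file
     * unchanged; editing it first is how a disk backing path gets fixed. */
    static int
    roundtrip(virConnectPtr conn, const char *file)
    {
        char *xml = virDomainSaveImageGetXMLDesc(conn, file, 0);
        int ret = -1;
        if (xml)
            ret = virDomainSaveImageDefineXML(conn, file, xml, 0);
        free(xml);
        return ret;
    }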
* src/qemu/qemu_driver.c (qemuDomainSaveImageGetXMLDesc)
(qemuDomainSaveImageDefineXML): New functions.
(qemuDomainSaveImageOpen): Add parameter.
(qemuDomainRestoreFlags, qemuDomainObjRestore): Adjust clients.
With this, it is possible to update the path to a disk backing
image on either the save or restore action, without having to
binary edit the XML embedded in the state file.
This also modifies virDomainSave to output a smaller xml (only
the inactive xml, which is all that virDomainRestore parses anyway),
while still guaranteeing padding for most typical abi-compatible
xml replacements, necessary so that the next patch for
virDomainSaveImageDefineXML will not cause unnecessary
modifications to the save image file.
* src/qemu/qemu_driver.c (qemuDomainSaveInternal): Add parameter,
only use inactive state, and guarantee padding.
(qemuDomainSaveImageOpen): Add parameter.
(qemuDomainSaveFlags, qemuDomainManagedSave)
(qemuDomainRestoreFlags, qemuDomainObjRestore): Update callers.
As written in the virStorageFileGetMetadataFromFD description, the caller
must free the metadata after use. The qemu driver misses this and therefore
leaks the metadata, which can grow into a huge memory leak if somebody
queries for blockInfo a lot.
The error in getCompressionType will never be reported, so change the
error codes into warnings (VIR_WARN("%s", _(foo)); doesn't break the
syntax-check rule), and also improve the docs in qemu.conf to tell the
user the truth.
Make MIGRATION_OUT use the new helper methods.
This also introduces new protection to the migration v3 process: the
migration job is held from Begin to Confirm to avoid changes to a domain
during migration (esp. between Begin and Perform phases). This change is
automatically applied to p2p and tunneled migrations. For normal
migration, this requires support from a client. In other words, if an
old (pre 0.9.4) client starts normal migration of a domain, the domain
will not be protected against changes between Begin and Perform steps.
The cpu bandwidth is applied at the vcpu group level. We should apply it
at the vm group level too, because the vm may do heavy I/O from threads
outside the vcpu groups, and that would affect the other vms.
We apply cpu bandwidth at the vcpu and the vm group level, so we must ensure
that max(child_quota) <= parent_quota when we modify cpu bandwidth.
Now that virDomainSetVcpusFlags knows about VIR_DOMAIN_AFFECT_CURRENT,
virDomainGetVcpusFlags should too.
Unfortunately, the virsh counterpart 'virsh vcpucount' has already
commandeered --current for a different meaning, so teaching virsh
to expose this in the next patch will require a bit of care.
* src/libvirt.c (virDomainGetVcpusFlags): Allow
VIR_DOMAIN_AFFECT_CURRENT.
* src/libxl/libxl_driver.c (libxlDomainGetVcpusFlags): Likewise.
* src/qemu/qemu_driver.c (qemudDomainGetVcpusFlags): Likewise.
* src/test/test_driver.c (testDomainGetVcpusFlags): Likewise.
* src/xen/xen_driver.c (xenUnifiedDomainGetVcpusFlags): Likewise.
In the XML file we now have
<cputune>
<shares>1024</shares>
<period>90000</period>
<quota>0</quota>
</cputune>
But the schedinfo parameters are named
cpu_shares: 1024
cfs_period: 90000
cfs_quota: 0
The period/quota is a per-vcpu value, so these new tunables should be named
'vcpu_period' and 'vcpu_quota'.
The virDomainBlockPull* family of commands is enabled by the
following HMP/QMP commands: 'block_stream', 'block_job_cancel',
'info block-jobs' / 'query-block-jobs', and 'block_job_set_speed'.
* src/qemu/qemu_driver.c src/qemu/qemu_monitor_text.[ch]: implement disk
streaming by using the proper qemu monitor commands.
* src/qemu/qemu_monitor_json.[ch]: implement commands using the qmp monitor.
When auto-dumping a domain on crash events, or autostarting a domain
with managed save state, let the user configure whether to imply
the bypass cache flag.
* src/qemu/qemu.conf (auto_dump_bypass_cache, auto_start_bypass_cache):
Document new variables.
* src/qemu/libvirtd_qemu.aug (vnc_entry): Let augeas parse them.
* src/qemu/qemu_conf.h (qemud_driver): Store new preferences.
* src/qemu/qemu_conf.c (qemudLoadDriverConfig): Parse them.
* src/qemu/qemu_driver.c (processWatchdogEvent, qemuAutostartDomain):
Honor them.
Wire together the previous patches to support file system cache
bypass during API save/restore requests in qemu.
* src/qemu/qemu_driver.c (qemuDomainSaveInternal, doCoreDump)
(qemudDomainObjStart, qemuDomainSaveImageOpen, qemuDomainObjRestore)
(qemuDomainObjStart): Add parameter.
(qemuDomainSaveFlags, qemuDomainManagedSave, qemudDomainCoreDump)
(processWatchdogEvent, qemudDomainStartWithFlags, qemuAutostartDomain)
(qemuDomainRestoreFlags): Update callers.
For all hypervisors that support save and restore, the new API
now performs the same functions as the old.
VBox is excluded from this list, because its existing domainsave
is broken (there is no corresponding domainrestore, and there
is no control over the filename used in the save). A later
patch should change vbox to use its implementation for
managedsave, and teach start to use managedsave results.
* src/libxl/libxl_driver.c (libxlDomainSave): Move guts...
(libxlDomainSaveFlags): ...to new function.
(libxlDomainRestore): Move guts...
(libxlDomainRestoreFlags): ...to new function.
* src/test/test_driver.c (testDomainSave, testDomainSaveFlags)
(testDomainRestore, testDomainRestoreFlags): Likewise.
* src/xen/xen_driver.c (xenUnifiedDomainSave)
(xenUnifiedDomainSaveFlags, xenUnifiedDomainRestore)
(xenUnifiedDomainRestoreFlags): Likewise.
* src/qemu/qemu_driver.c (qemudDomainSave, qemudDomainRestore):
Rename and move guts.
(qemuDomainSave, qemuDomainSaveFlags, qemuDomainRestore)
(qemuDomainRestoreFlags): ...here.
(qemudDomainSaveFlag): Rename...
(qemuDomainSaveInternal): ...to this, and update callers.
This is the one function outside of domain_conf.c that plays around
with (even modifying) the internals of the virDomainNetDef, and thus
can't be fixed up simply by replacing direct accesses to the fields of
the struct with the GetActual*() access functions.
In this case, we need to check if the defined type is "network", and
if it is *then* check the actual type; if the actual type is "bridge",
then we can at least put the bridgename in a place where it can be
used; otherwise (if type isn't "bridge"), we behave exactly as we used
to - just null out *everything*.
This patch implements modification of cfs_period and cfs_quota.
We can use the command 'virsh schedinfo' to query or modify cfs_period and
cfs_quota.
If you query period or quota from the config file, the value 0 means it is
not set in the config file.
If you set period or quota in the config file, the value 0 means that the
current setting is deleted from the config file.
If you modify period or quota while the vm is running, the value 0 means
that the current value is kept.
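A hedged sketch of driving these tunables through the public API
(parameter names as used by this patch; a separate patch above renames
them to vcpu_period/vcpu_quota, and the exact parameter types here are an
assumption):

    #include <string.h>
    #include <libvirt/libvirt.h>

    /* Set the period to 90000us and the quota to 0; on a running vm a
     * quota of 0 keeps the current value, in the config file it deletes
     * the setting. */
    static int
    set_cfs(virDomainPtr dom)
    {
        virTypedParameter params[2];
        memset(params, 0, sizeof(params));
        strncpy(params[0].field, "cfs_period", VIR_TYPED_PARAM_FIELD_LENGTH - 1);
        params[0].type = VIR_TYPED_PARAM_ULLONG;
        params[0].value.ul = 90000;
        strncpy(params[1].field, "cfs_quota", VIR_TYPED_PARAM_FIELD_LENGTH - 1);
        params[1].type = VIR_TYPED_PARAM_LLONG;
        params[1].value.l = 0;
        return virDomainSetSchedulerParametersFlags(dom, params, 2,
                                                    VIR_DOMAIN_AFFECT_LIVE);
    }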