We are about to turn on support for active block commit. Although
qemu 2.0 was the first version to mostly support it, that version
mis-handles 0-length files, and doesn't have anything available for
easy probing. But qemu 2.1 fixed bugs, and made life simpler by
letting the 'top' argument be optional. Unless someone begs for
active commit with qemu 2.0, for now we are just going to enable
it only by probing for qemu 2.1 behavior (anyone backporting active
commit can also backport the optional argument behavior). This
requires qemu.git commit 7676e2c597000eff3a7233b40cca768b358f9bc9.
Although all our actual uses of block-commit supply arguments for
both base and top, we can omit both arguments and use a bogus
device string to trigger an interesting behavior in qemu. All QMP
commands first do argument validation, failing with GenericError
if a mandatory argument is missing. Once that passes, the code
in the specific command gets to do further checking, and the qemu
developers made sure that if device is the only supplied argument,
then the block-commit code will look up the device first, with a
failure of DeviceNotFound, before attempting any further argument
validation (most other validations fail with GenericError). Thus,
the category of error class can reliably be used to decipher
whether the top argument was optional, which in turn implies a
working active commit. Since we expect our bogus device string to
trigger an error either way, the code is written to return a
distinct return value without spamming the logs.
* src/qemu/qemu_monitor.h (qemuMonitorSupportsActiveCommit): New
prototype.
* src/qemu/qemu_monitor.c (qemuMonitorSupportsActiveCommit):
Implement it.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONBlockCommit):
Allow NULL for top and base, for probing purposes.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONBlockCommit):
Likewise, implementing the probe.
* tests/qemumonitorjsontest.c (mymain): Enable...
(testQemuMonitorJSONqemuMonitorSupportsActiveCommit): ...a new test.
Signed-off-by: Eric Blake <eblake@redhat.com>
Replace:
if (virBufferError(&buf)) {
virBufferFreeAndReset(&buf);
virReportOOMError();
...
}
with:
if (virBufferCheckError(&buf) < 0)
...
This should not be a functional change (unless some callers
misused the virBuffer APIs - a different error would be reported
then)
In "src/conf/domain_conf.h" there are many enum declarations. The
cleanup in this header filer was started, but it wasn't enough and
there are many other files that has enum variables declared. So, the
commit was starting to be big. This commit finish the cleanup in this
header file and in other files that has enum variables, parameters,
or functions declared.
Signed-off-by: Julio Faracco <jcfaracco@gmail.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
When passing migration bandwidth to QEMU, we multiply it by 1024 * 1024
to convert the speed to B/s and the result still needs to fit in
int64_t.
https://bugzilla.redhat.com/show_bug.cgi?id=1083483
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
I almost wrote a hash value free function that just called
VIR_FREE, then realized I couldn't be the first person to
do that. Sure enough, it was worth factoring into a common
helper routine.
* src/util/virhash.h (virHashValueFree): New function.
* src/util/virhash.c (virHashValueFree): Implement it.
* src/util/virobject.h (virObjectFreeHashData): New function.
* src/libvirt_private.syms (virhash.h, virobject.h): Export them.
* src/nwfilter/nwfilter_learnipaddr.c (virNWFilterLearnInit): Use
common function.
* src/qemu/qemu_capabilities.c (virQEMUCapsCacheNew): Likewise.
* src/qemu/qemu_command.c (qemuDomainCCWAddressSetCreate):
Likewise.
* src/qemu/qemu_monitor.c (qemuMonitorGetBlockInfo): Likewise.
* src/qemu/qemu_process.c (qemuProcessWaitForMonitor): Likewise.
* src/util/virclosecallbacks.c (virCloseCallbacksNew): Likewise.
* src/util/virkeyfile.c (virKeyFileParseGroup): Likewise.
* tests/qemumonitorjsontest.c
(testQemuMonitorJSONqemuMonitorJSONGetBlockInfo): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
This patch adds qemuMonitorGetDumpGuestMemoryCapability, which is used to check
whether the specified dump-guest-memory format is supported by qemu.
Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
Busy enterprise workloads hosted on large sized VM's tend to dirty
memory faster than the transfer rate achieved via live guest migration.
Despite some good recent improvements (& using dedicated 10Gig NICs
between hosts) the live migration may NOT converge.
Recently support was added in qemu (version 1.6) to allow a user to
choose if they wish to force convergence of their migration via a
new migration capability : "auto-converge". This feature allows for qemu
to auto-detect lack of convergence and trigger a throttle-down of the
VCPUs.
This patch includes the libvirt support needed to trigger this
feature. (Testing is in progress)
Signed-off-by: Chegu Vinod <chegu_vinod@hp.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=844378
When qemu dies early after connecting to its monitor but before we
actually try to read something from the monitor, we would just fail
domain start with useless message:
"An error occurred, but the cause is unknown"
This is because the real error gets reported in a monitor EOF handler
executing within libvirt's event loop.
The fix is to take any error set in qemuMonitor structure and propagate
it into the thread-local error when qemuMonitorClose is called and no
thread-local error is set.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Wire up all the pieces to send arbitrary qemu events to a
client using libvirt-qemu.so. If the extra bookkeeping of
generating event objects even when no one is listening turns
out to be noticeable, we can try to further optimize things
by adding a counter for how many connections are using events,
and only dump events when the counter is non-zero; but for
now, I didn't think it was worth the code complexity.
* src/qemu/qemu_driver.c
(qemuConnectDomainQemuMonitorEventRegister)
(qemuConnectDomainQemuMonitorEventDeregister): New functions.
* src/qemu/qemu_monitor.h (qemuMonitorEmitEvent): New prototype.
(qemuMonitorDomainEventCallback): New typedef.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONIOProcessEvent):
Report events.
* src/qemu/qemu_monitor.c (qemuMonitorEmitEvent): New function, to
pass events through.
* src/qemu/qemu_process.c (qemuProcessHandleEvent): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
Any source file which calls the logging APIs now needs
to have a VIR_LOG_INIT("source.name") declaration at
the start of the file. This provides a static variable
of the virLogSource type.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The dtrace probe macros rely on the logging API. We can't make
the internal.h header include the virlog.h header though since
that'd be a circular include. Instead simply split the dtrace
probes into their own header file, since there's no compelling
reason for them to be in the main internal.h header.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
While investigating https://bugzilla.redhat.com/show_bug.cgi?id=1061827
I noticed that we pass user input unscathed for block-pull, but
always pass a canonical absolute name through for block-commit.
[Note that we probably _ought_ to validate that the user's request
for block-pull actually matches the backing chain, the way we already
do for block-commit - but that's a separate issue. Further note that
the ability to pass user input through unscathed allows backdoors
such as specifying a backing image that is a network URI such as
a gluster disk, instead of forcing things to the local file system;
which is an area still under active investigation on whether libvirt
needs to behave differently for network disks.]
Since qemu may write the name that the user passed in as the backing
file, a user may have a reason to want a relative file name passed
through to qemu, and always munging things to absolute prevents that.
Put another way, if you have the backing chain:
[A] <- [B(back=./A)] <- [C(back=./B)]
and commit B into A (virsh blockcommit $dom vda --base A --top B),
the metadata of C will have to be re-written. But should it be
rewritten as [C(back=./A)] or as [C(back=/path/to/A)]? Still up in
the air is whether qemu's decision should be based on whether B
and/or C had relative paths, or on whether the --base and/or
--top arguments to the command were relative paths; but if we always
pass a canonical name, we've prevented the spelling of the command
arguments from being part of the hueristics that qemu uses.
I also audited the code, and verified that we never call
qemuMonitorBlockCommit() with a NULL base, either before or after
the change to qemu_driver.c.
* src/qemu/qemu_driver.c (qemuDomainBlockCommit): Preserve user's
spelling, since absolute vs. relative matters to qemu.
* src/qemu/qemu_monitor.h (qemuMonitorBlockCommit): Base is never
null.
* src/qemu/qemu_monitor.c (qemuMonitorBlockCommit): Likewise.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONBlockCommit):
Likewise.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONBlockCommit):
Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
If virDomainMemoryStats was run on a domain with virtio balloon driver
running on an old qemu which supports QMP but does not support qom-list
QMP command, libvirtd would crash. The reason is we did not check if
qemuMonitorJSONGetObjectListPaths failed and moreover we even stored its
result in an unsigned integer type.
There is a number of reported issues when we fail starting a domain.
Turns out that, in some scenarios like high load, 3 second timeout is
not enough for qemu to start up to the phase where the socket is
created. Since there is no downside of waiting longer, raise the
timeout right to 30 seconds.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Recent changes to events (commit 8a29ffcf) resulted in new compile
failures on some targets (such as ARM OMAP5):
conf/domain_event.c: In function 'virDomainEventDispatchDefaultFunc':
conf/domain_event.c:1198:30: error: cast increases required alignment of
target type [-Werror=cast-align]
conf/domain_event.c:1314:34: error: cast increases required alignment of
target type [-Werror=cast-align]
cc1: all warnings being treated as errors
The error is due to alignment; the base class is merely aligned
to the worst of 'int' and 'void*', while the child class must
be aligned to a 'long long'. The solution is to include a
'long long' (and for good measure, a function pointer) in the
base class to ensure correct alignment regardless of what a
child class may add, but to wrap the inclusion in a union so
as to not incur any wasted space. On a typical x86_64 platform,
the base class remains 16 bytes; on i686, the base class remains
12 bytes; and on the impacted ARM platform, the base class grows
from 12 bytes to 16 bytes due to the increase of alignment from
4 to 8 bytes.
Reported by Michele Paolino and others.
* src/util/virobject.h (_virObject): Use a union to ensure that
subclasses never have stricter alignment than the parent.
* src/util/virobject.c (virObjectNew, virObjectUnref)
(virObjectRef): Adjust clients.
* src/libvirt.c (virConnectRef, virDomainRef, virNetworkRef)
(virInterfaceRef, virStoragePoolRef, virStorageVolRef)
(virNodeDeviceRef, virSecretRef, virStreamRef, virNWFilterRef)
(virDomainSnapshotRef): Likewise.
* src/qemu/qemu_monitor.c (qemuMonitorOpenInternal)
(qemuMonitorClose): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
Most of our code base uses space after comma but not before;
fix the remaining uses before adding a syntax check.
* src/qemu/qemu_cgroup.c: Consistently use commas.
* src/qemu/qemu_command.c: Likewise.
* src/qemu/qemu_conf.c: Likewise.
* src/qemu/qemu_driver.c: Likewise.
* src/qemu/qemu_monitor.c: Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
QEMU 1.6.0 introduced new migration status: setup
Libvirt does not expect such string in QMP and refuses to migrate with error
"unexpected migration status in setup"
This patch fixes it.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1018267
The aim of virObject refing and urefing is to tell where the object is
to be used and when is no longer needed. Hence any object shouldn't be
used after it has been unrefed, as we might be the last to hold the
reference. The better way is to call virObjectUnref() *after* the last
object usage. In this specific case, the monitor EOF handler was called
after the qemuMonitorIO called virObjectUnref. Not only that @mon was
disposed (which is not used in the handler anyway) but the @mon->vm
which is causing a SIGSEGV:
2013-11-15 10:17:54.425+0000: 20110: error : qemuMonitorIO:688 : internal error: early end of file from monitor: possible problem:
qemu-kvm: -incoming tcp:01.01.01.0:49152: Failed to bind socket: Cannot assign requested address
Program received signal SIGSEGV, Segmentation fault.
qemuProcessHandleMonitorEOF (mon=<optimized out>, vm=0x7fb728004170) at qemu/qemu_process.c:299
299 if (priv->beingDestroyed) {
(gdb) p *priv
Cannot access memory at address 0x0
(gdb) p vm
$1 = (virDomainObj *) 0x7fb728004170
(gdb) p *vm
$2 = {parent = {parent = {magic = 3735928559, refs = 0, klass = 0xdeadbeef}, lock = {lock = {__data = {__lock = 2, __count = 0, __owner = 20110, __nusers = 1, __kind = 0, __spins = 0, __list = {__prev = 0x0,
__next = 0x0}}, __size = "\002\000\000\000\000\000\000\000\216N\000\000\001", '\000' <repeats 26 times>, __align = 2}}}, pid = 0, state = {state = 0, reason = 0}, autostart = 0, persistent = 0,
updated = 0, def = 0x0, newDef = 0x0, snapshots = 0x0, current_snapshot = 0x0, hasManagedSave = false, privateData = 0x0, privateDataFreeFunc = 0x0, taint = 304}
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The qemu monitor supports retrieval of actual CPUID bits presented to
the guest using QMP monitor. Add APIs to extract these information and
tests for them.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Since the 90139a62 commit the error is copied into mon->lastError but
it's never freed from there.
==31989== 395 bytes in 1 blocks are definitely lost in loss record 877 of 978
==31989== at 0x4A06C2B: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==31989== by 0x7EAF129: strdup (in /lib64/libc-2.15.so)
==31989== by 0x50D586C: virStrdup (virstring.c:554)
==31989== by 0x50976C1: virCopyError (virerror.c:191)
==31989== by 0x5097A35: virCopyLastError (virerror.c:312)
==31989== by 0x114909A9: qemuMonitorIO (qemu_monitor.c:690)
==31989== by 0x509BEDE: virEventPollDispatchHandles (vireventpoll.c:501)
==31989== by 0x509C701: virEventPollRunOnce (vireventpoll.c:648)
==31989== by 0x509A620: virEventRunDefaultImpl (virevent.c:274)
==31989== by 0x520D21C: virNetServerRun (virnetserver.c:1112)
==31989== by 0x11F368: main (libvirtd.c:1513)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
I've noticed a SIGSEGV-ing libvirtd on the destination when the qemu
died too quickly = in Prepare phase. What is happening here is:
1) [Thread 3493] We are in qemuMigrationPrepareAny() and calling
qemuProcessStart() which subsequently calls qemuProcessWaitForMonitor()
and qemuConnectMonitor(). So far so good. The qemuMonitorOpen()
succeeds, however switching monitor to QMP mode fails as qemu died
meanwhile. That is qemuMonitorSetCapabilities() returns -1.
2013-10-08 15:54:10.629+0000: 3493: debug : qemuMonitorSetCapabilities:1356 : mon=0x14a53da0
2013-10-08 15:54:10.630+0000: 3493: debug : qemuMonitorJSONCommandWithFd:262 : Send command '{"execute":"qmp_capabilities","id":"libvirt-1"}' for write with FD -1
2013-10-08 15:54:10.630+0000: 3493: debug : virEventPollUpdateHandle:147 : EVENT_POLL_UPDATE_HANDLE: watch=17 events=13
...
2013-10-08 15:54:10.631+0000: 3493: debug : qemuMonitorSend:956 : QEMU_MONITOR_SEND_MSG: mon=0x14a53da0 msg={"execute":"qmp_capabilities","id":"libvirt-1"}
fd=-1
2013-10-08 15:54:10.631+0000: 3262: debug : virEventPollRunOnce:641 : Poll got 1 event(s)
2) [Thread 3262] The event loop is trying to do the talking to monitor.
However, qemu is dead already, remember?
2013-10-08 15:54:13.436+0000: 3262: error : qemuMonitorIORead:551 : Unable to read from monitor: Connection reset by peer
2013-10-08 15:54:13.516+0000: 3262: debug : virFileClose:90 : Closed fd 25
...
2013-10-08 15:54:13.533+0000: 3493: debug : qemuMonitorSend:968 : Send command resulted in error internal error: early end of file from monitor: possible problem:
3) [Thread 3493] qemuProcessStart() failed. No big deal. Go to the
'endjob' label and subsequently to the 'cleanup'. Since the domain is
not persistent and ret is -1, the qemuDomainRemoveInactive() is called.
This has an (unpleasant) effect of virObjectUnref()-in the @vm object.
Unpleasant because the event loop which is about to trigger EOF callback
still holds a pointer to the @vm (not the reference). See the valgrind
output below.
4) [Thread 3262] So the event loop starts triggering EOF:
2013-10-08 15:54:13.542+0000: 3262: debug : qemuMonitorIO:729 : Triggering EOF callback
2013-10-08 15:54:13.543+0000: 3262: debug : qemuProcessHandleMonitorEOF:294 : Received EOF on 0x14549110 'migt10'
And the monitor is cleaned up. This results in calling
qemuProcessHandleMonitorEOF with the @vm pointer passed. The pointer is
kept in qemuMonitor struct.
==3262== Thread 1:
==3262== Invalid read of size 4
==3262== at 0x77ECCAA: pthread_mutex_lock (in /lib64/libpthread-2.15.so)
==3262== by 0x52FAA06: virMutexLock (virthreadpthread.c:85)
==3262== by 0x52E3891: virObjectLock (virobject.c:320)
==3262== by 0x11626743: qemuProcessHandleMonitorEOF (qemu_process.c:296)
==3262== by 0x11642593: qemuMonitorIO (qemu_monitor.c:730)
==3262== by 0x52BD526: virEventPollDispatchHandles (vireventpoll.c:501)
==3262== by 0x52BDD49: virEventPollRunOnce (vireventpoll.c:648)
==3262== by 0x52BBC68: virEventRunDefaultImpl (virevent.c:274)
==3262== by 0x542D3D9: virNetServerRun (virnetserver.c:1112)
==3262== by 0x11F368: main (libvirtd.c:1513)
==3262== Address 0x14549128 is 24 bytes inside a block of size 136 free'd
==3262== at 0x4C2AF5C: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==3262== by 0x529B1FF: virFree (viralloc.c:580)
==3262== by 0x52E3703: virObjectUnref (virobject.c:270)
==3262== by 0x531557E: virDomainObjListRemove (domain_conf.c:2355)
==3262== by 0x1160E899: qemuDomainRemoveInactive (qemu_domain.c:2061)
==3262== by 0x1163A0C6: qemuMigrationPrepareAny (qemu_migration.c:2450)
==3262== by 0x1163A923: qemuMigrationPrepareDirect (qemu_migration.c:2626)
==3262== by 0x11682D71: qemuDomainMigratePrepare3Params (qemu_driver.c:10309)
==3262== by 0x53B0976: virDomainMigratePrepare3Params (libvirt.c:7266)
==3262== by 0x1502D3: remoteDispatchDomainMigratePrepare3Params (remote.c:4797)
==3262== by 0x12DECA: remoteDispatchDomainMigratePrepare3ParamsHelper (remote_dispatch.h:5741)
==3262== by 0x54322EB: virNetServerProgramDispatchCall (virnetserverprogram.c:435)
The mon->vm is set in qemuMonitorOpenInternal() which is the correct
place to increase @vm ref counter. The correct place to decrease the ref
counter is then qemuMonitorDispose().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Change the monitor error code to add the ability to access the qemu log
file using a file descriptor so that we can dig in it for a more useful
error message. The error is now logged on monitor hangups and overwrites
a possible lesser error. A hangup on the monitor usualy means that qemu
has crashed and there's a significant chance it produced a useful error
message.
The functionality will be latent until the next patch.
Early VM startup errors usually produce a better error message in the
machine log file. Currently we were accessing it only when the process
exited during certain phases of startup. This will help adding a more
comprehensive error extraction for early qemu startup phases.
This patch adds infrastructure to keep a file descriptor for the machine
log file that will be used in case an error happens.
The VIR_DOMAIN_PAUSED_GUEST_PANICKED constant is badly named,
leaking the QEMU event name. Elsewhere in the API we use
'CRASHED' rather than 'PANICKED', and the addition of 'GUEST'
is redundant since all events are guest related.
Thus rename it to VIR_DOMAIN_PAUSED_CRASHED, which matches
with VIR_DOMAIN_RUNNING_CRASHED and VIR_DOMAIN_EVENT_CRASHED.
It was added in commit 14e7e0ae8d
which post-dates v1.1.0, so is safe to rename before 1.1.1
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This patch will add the qemuMonitorJSONGetMemoryStats() to execute a
"guest-stats" on the balloonpath using "get-qom" replacing the former
mechanism which looked through the "query-ballon" returned data for
the fields. The "query-balloon" code only returns 'actual' memory.
Rather than duplicating the existing code, have the JSON API use the
GetBalloonInfo API.
A check in the qemuMonitorGetMemoryStats() will be made to ensure the
balloon driver path has been set. Since the underlying JSON code can
return data not associated with the balloon driver, we don't fail on
a failure to get the balloonpath. Of course since we've made the check,
we can then set the ballooninit flag. Getting the path here is primarily
due to the process reconnect path which doesn't attempt to set the
collection period.
At vm startup and attach attempt to set the balloon driver statistics
collection period based on the value found in the domain xml file. This
is not done at reconnect since it's possible that a collection period
was set on the live guest and making the set period call would reset to
whatever value is stored in the config file.
Setting the stats collection period has a side effect of searching through
the qom-list output for the virtio balloon driver and making sure that it
has the right properties in order to allow setting of a collection period
and eventually fetching of statistics.
The walk through the qom-list is expensive and thus the balloonpath will
be saved in the monitor private structure as well as a flag indicating
that the initialization has already been attempted (in the event that a
path is not found, no sense to keep checking).
This processing model conforms to the qom object model model which
requires setting object properties after device startup. That is, it's
not possible to pass the period along via the startup code as it won't
be recognized.
The function being introduced is responsible for preparing and
executing 'chardev-add' qemu monitor command. Moreover, in case
of PTY chardev, the corresponding pty path is updated.
Convert the type of loop iterators named 'i', 'j', k',
'ii', 'jj', 'kk', to be 'size_t' instead of 'int' or
'unsigned int', also santizing 'ii', 'jj', 'kk' to use
the normal 'i', 'j', 'k' naming
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add monitor callback API domainGuestPanic, that implements
'destroy', 'restart' and 'preserve' events of the 'on_crash'
in the XML when domain crashed.
A bug in Cygwin [1] and poor error messages from gcc [2] lead
to this confusing compilation error:
qemu/qemu_monitor.c:418:9: error: passing argument 2 of 'sendmsg' from incmpatible pointer type
/usr/include/sys/socket.h:42:11: note: expected 'const struct msghdr *' but argument is of type 'struct msghdr *'
[1] http://cygwin.com/ml/cygwin/2013-05/msg00451.html
[2] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57475
* src/qemu/qemu_monitor.c (includes): Include <sys/socket.h>
before <sys/un.h>.
Signed-off-by: Eric Blake <eblake@redhat.com>
In order to learn libvirt multiqueue several things must be done:
1) The '/dev/net/tun' device needs to be opened multiple times with
IFF_MULTI_QUEUE flag passed to ioctl(fd, TUNSETIFF, &ifr);
2) Similarly, '/dev/vhost-net' must be opened as many times as in 1)
in order to keep 1:1 ratio recommended by qemu and kernel folks.
3) The command line construction code needs to switch from 'fd=X' to
'fds=X:Y:...:Z' and from 'vhostfd=X' to 'vhostfds=X:Y:...:Z'.
4) The monitor handling code needs to learn to pass multiple FDs.
Ever since the conversion to using only QMP for probing features
of qemu 1.2 and newer, we have been unable to detect features
that are added only by additional command line options. For
example, we'd like to know if '-machine mem-merge=on' (added
in qemu 1.5) is present. To do this, we will take advantage
of qemu 1.5's query-command-line-parameters QMP call [1].
This patch wires up the framework for probing the command results;
if the QMP command is missing, or if a particular command line
option does not output any parameters (for example, -net uses
a polymorphic parser, which showed up as no parameters as of qemu
1.5), we silently treat that command as having no results.
[1] https://lists.gnu.org/archive/html/qemu-devel/2013-04/msg05180.html
* src/qemu/qemu_monitor.h (qemuMonitorGetOptions)
(qemuMonitorSetOptions)
(qemuMonitorGetCommandLineOptionParameters): New functions.
* src/qemu/qemu_monitor_json.h
(qemuMonitorJSONGetCommandLineOptionParameters): Likewise.
* src/qemu/qemu_monitor.c (_qemuMonitor): Add cache field.
(qemuMonitorDispose): Clean it.
(qemuMonitorGetCommandLineOptionParameters): Implement new function.
* src/qemu/qemu_monitor_json.c
(qemuMonitorJSONGetCommandLineOptionParameters): Likewise.
(testQemuMonitorJSONGetCommandLineParameters): Test it.
Signed-off-by: Eric Blake <eblake@redhat.com>
The source code base needs to be adapted as well. Some files
include virutil.h just for the string related functions (here,
the include is substituted to match the new file), some include
virutil.h without any need (here, the include is removed), and
some require both.
Probe for QEMU's QMP TPM support by querying the lists of
supported TPM models (query-tpm-models) and backend types
(query-tpm-types).
The setting of the capability flags following the strings
returned from the commands above is only provided in the
patch where domain_conf.c gets TPM support due to dependencies
on functions only introduced there.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
Tested-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
The JSON generator is able to represent only values less than LLONG_MAX, fix the
bandwidth limit checks when converting to value to catch overflows before they
reach the generator.
The VIR_ERR_NO_SUPPORT error code is reserved for cases where an
API is not implemented in a driver. It definitely should not be
used when an API execution fails due to unsupported operation.
If virCondInit fails (okay, so that's unlikely), then we end up
attempting a virObjectUnlock() on the cleanup path, even though
we don't hold a lock. This is not guaranteed to be safe. While
at it, I noticed a couple places where we were referencing mon->fd
outside locks.
* src/qemu/qemu_monitor.c (qemuMonitorOpenInternal): Minimize lock
duration. mon->watch doesn't need clean up on error.
(qemuMonitorGetBlockExtent, qemuMonitorBlockResize): Don't
dereference fd outside of lock.
This will be used with new migration scheme.
This patch creates basically just monitor stub
functions. Wiring them into something useful
is done in later patches.
This will be used with new migration scheme.
This patch creates basically just monitor stub
functions. Wiring them into something useful
is done in later patches.
As a side effect, this also fixes reporting disk migration process.
It was added to memory migration progress, which was wrong. Disk
progress has dedicated fields in virDomainJobInfo structure.
Add entry points for calling the qemu 'add-fd' and 'remove-fd'
monitor commands. There is no entry point for 'query-fdsets';
the assumption is that a developer can use
virsh qemu-monitor-command domain '{"execute":"query-fdsets"}'
when debugging issues, and that meanwhile, libvirt is responsible
enough to remember what fds it associated with what fdsets.
Likewise, on the 'add-fd' command, it is assumed that libvirt
will always pass a set id, rather than letting qemu autogenerate
the next available id number.
* src/qemu/qemu_monitor.c (qemuMonitorAddFd, qemuMonitorRemoveFd):
New functions.
* src/qemu/qemu_monitor.h (qemuMonitorAddFd, qemuMonitorRemoveFd):
New prototypes.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONAddFd)
(qemuMonitorJSONRemoveFd): New functions.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONAddFd)
(qemuMonitorJSONRemoveFd): New prototypes.
The virDomainObj, qemuAgent, qemuMonitor, lxcMonitor classes
all require a mutex, so can be switched to use virObjectLockable
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently all classes must directly inherit from virObject.
This allows for arbitrarily deep hierarchy. There's not much
to this aside from chaining up the 'dispose' handlers from
each class & providing APIs to check types.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Perform all the appropriate plumbing.
When qemu/KVM VMs are paused manually through a monitor not-owned by libvirt,
libvirt will think of them as "paused" event after they are resumed and
effectively running. With this patch the discrepancy goes away.
This is meant to address bug 892791.
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Noticed these while building on FreeBSD.
* src/qemu/qemu_monitor.c (qemuMonitorBlockInfoLookup): Rename
variable to avoid 'devname' collision.
* src/qemu/qemu_driver.c (qemuDomainInterfaceStats): Mark unused
variable.
Only one error in qemu_monitor was already using the relatively
new OPERATION_UNSUPPORTED error, even though it is a better fit
for all of the messages related to options that are unsupported
due to the version of qemu in use rather than due to a user's
XML or .conf file choice. Suggested by Osier Yang.
* src/qemu/qemu_monitor.c (qemuMonitorSendFileHandle)
(qemuMonitorAddHostNetwork, qemuMonitorRemoveHostNetwork)
(qemuMonitorAttachDrive, qemuMonitorDiskSnapshot)
(qemuMonitorDriveMirror, qemuMonitorTransaction)
(qemuMonitorBlockCommit, qemuMonitorDrivePivot)
(qemuMonitorBlockJob, qemuMonitorSystemWakeup)
(qemuMonitorGetVersion, qemuMonitorGetMachines)
(qemuMonitorGetCPUDefinitions, qemuMonitorGetCommands)
(qemuMonitorGetEvents, qemuMonitorGetKVMState)
(qemuMonitorGetObjectTypes, qemuMonitorGetObjectProps)
(qemuMonitorGetTargetArch): Use better error category.
Without this patch, attempts to create a disk snapshot when qemu
is too old results in a cryptic message:
virsh # snapshot-create 23 --disk-only
error: operation failed: Failed to take snapshot: unknown command: 'snapshot_blkdev'
Now it reports:
virsh # snapshot-create 23 --disk-only
error: unsupported configuration: live disk snapshot not supported with this QEMU binary
All versions of qemu that support live disk snapshot also support
QMP (basically upstream qemu 1.1 and later, and backports to RHEL 6.2).
* src/qemu/qemu_capabilities.h (QEMU_CAPS_DISK_SNAPSHOT): New
capability.
* src/qemu/qemu_capabilities.c (qemuCaps): Track it.
(qemuCapsProbeQMPCommands): Set it.
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateDiskActive): Use
it.
* src/qemu/qemu_monitor.c (qemuMonitorDiskSnapshot): Simplify.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONDiskSnapshot):
Likewise.
* src/qemu/qemu_monitor_text.h (qemuMonitorTextDiskSnapshot):
Delete.
* src/qemu/qemu_monitor_text.c (qemuMonitorTextDiskSnapshot):
Likewise.
https://bugzilla.redhat.com/show_bug.cgi?id=872292
Libvirt should not attempt to call a QMP command that has not been
documented in qemu.git - if future qemu introduces a command by the
same name but with subtly different semantics, then libvirt will be
broken when trying to use that command.
We also had some code that could never be reached - some of our
commands have an alternate for new vs. old qemu HMP commands; but
if we are new enough to support QMP, we only need a fallback to
the new HMP counterpart, and don't need to try for a QMP counterpart
for the old HMP version.
See also this attempt to convert the three snapshot commands to QMP:
https://lists.gnu.org/archive/html/qemu-devel/2012-07/msg01597.html
although it looks like that will still not happen before qemu 1.3.
That thread eventually decided that qemu would use the name
'save-vm' rather than 'savevm', which mitigates the fact that
libvirt's attempt to use a QMP 'savevm' would be broken, but we
might not be as lucky on the other commands.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONSetCPU)
(qemuMonitorJSONAddDrive, qemuMonitorJSONDriveDel)
(qemuMonitorJSONCreateSnapshot, qemuMonitorJSONLoadSnapshot)
(qemuMonitorJSONDeleteSnapshot): Use only HMP fallback for now.
(qemuMonitorJSONAddHostNetwork, qemuMonitorJSONRemoveHostNetwork)
(qemuMonitorJSONAttachDrive, qemuMonitorJSONGetGuestDriveAddress):
Delete; QMP implies QEMU_CAPS_DEVICE, which prefers AddNetdev,
RemoveNetdev, and AddDrive anyways (qemu_hotplug.c has all callers).
* src/qemu/qemu_monitor.c (qemuMonitorAddHostNetwork)
(qemuMonitorRemoveHostNetwork, qemuMonitorAttachDrive): Reflect
deleted commands.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONAddHostNetwork)
(qemuMonitorJSONRemoveHostNetwork, qemuMonitorJSONAttachDrive):
Likewise.
If qemuMonitorOpenUnix is called without a related pid, i.e. for
QMP probing, a connect failure can happen as the result of a race.
Without a pid there is no retry and thus we give up too early.
This changes the code to retry if no pid is supplied.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
The libvirt coding standard is to use 'function(...args...)'
instead of 'function (...args...)'. A non-trivial number of
places did not follow this rule and are fixed in this patch.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When there is no 'qemu-kvm' binary and the emulator used for a machine
is, for example, 'qemu-system-x86_64' that, by default, runs without
kvm enabled, libvirt still supplies '-no-kvm' option to this process,
even though it does not recognize such option (making the start of a
domain fail in that case).
This patch fixes building a command-line for QEMU machines without KVM
acceleration and is based on following assumptions:
- QEMU_CAPS_KVM flag means that QEMU is running KVM accelerated
machines by default (without explicitly requesting that using a
command-line option). It is the closest to the truth according to
the code with the only exception being the comment next to the
flag, so it's fixed in this patch as well.
- QEMU_CAPS_ENABLE_KVM flag means that QEMU is, by default, running
without KVM acceleration and in case we need KVM acceleration it
needs to be explicitly instructed to do so. This is partially
true for the past (this option essentially means that QEMU
recognizes the '-enable-kvm' option, even though it's almost the
same).
Upstream qemu 1.3 is adding two new monitor commands, 'drive-mirror'
and 'block-job-complete'[1], which can drive live block copy and
storage migration. [Additionally, RHEL 6.3 had backported an earlier
version of most of the same functionality, but under the names
'__com.redhat_drive-mirror' and '__com.redhat_drive-reopen' and with
slightly different JSON arguments, and has been using patches similar
to these upstream patches for several months now.]
The libvirt API virDomainBlockRebase as already committed for 0.9.12
is flexible enough to expose the basics of block copy, but some
additional features in the 'drive-mirror' qemu command, such as
setting error policy, setting granularity, or using a persistent
bitmap, may later require a new libvirt API virDomainBlockCopy. I
will wait to add that API until we know more about what qemu 1.3
will finally provide.
This patch caters only to the upstream qemu 1.3 interface, although
I have proven that the changes for RHEL 6.3 can be isolated to
just qemu_monitor_json.c, and the rest of this series will
gracefully handle either interface once the JSON differences are
papered over in a downstream patch.
For consistency with other block job commands, libvirt must handle
the bandwidth argument as MiB/sec from the user, even though qemu
exposes the speed argument as bytes/sec; then again, qemu rounds
up to cluster size internally, so using MiB hides the worst effects
of that rounding if you pass small numbers.
[1]https://lists.gnu.org/archive/html/qemu-devel/2012-10/msg04123.html
* src/qemu/qemu_capabilities.h (QEMU_CAPS_DRIVE_MIRROR)
(QEMU_CAPS_DRIVE_REOPEN): New bits.
* src/qemu/qemu_capabilities.c (qemuCaps): Name them.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONCheckCommands): Set
them.
(qemuMonitorJSONDriveMirror, qemuMonitorDrivePivot): New functions.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONDriveMirror)
(qemuMonitorDrivePivot): Declare them.
* src/qemu/qemu_monitor.c (qemuMonitorDriveMirror)
(qemuMonitorDrivePivot): New passthroughs.
* src/qemu/qemu_monitor.h (qemuMonitorDriveMirror)
(qemuMonitorDrivePivot): Declare them.
qemu 1.3 will be adding a 'block-commit' monitor command, per
qemu.git commit ed61fc1. It matches nicely to the libvirt API
virDomainBlockCommit.
* src/qemu/qemu_capabilities.h (QEMU_CAPS_BLOCK_COMMIT): New bit.
* src/qemu/qemu_capabilities.c (qemuCapsProbeQMPCommands): Set it.
* src/qemu/qemu_monitor.h (qemuMonitorBlockCommit): New prototype.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONBlockCommit):
Likewise.
* src/qemu/qemu_monitor.c (qemuMonitorBlockCommit): Implement it.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONBlockCommit):
Likewise.
(qemuMonitorJSONHandleBlockJobImpl)
(qemuMonitorJSONGetBlockJobInfoOne): Handle new event type.
This patch adds support for SUSPEND_DISK event; both lifecycle and
separated. The support is added for QEMU, machines are changed to
PMSUSPENDED, but as QEMU sends SHUTDOWN afterwards, the state changes
to shut-off. This and much more needs to be done in order for libvirt
to work with transient devices, wake-ups etc. This patch is not
aiming for that functionality.
After calling qemuMonitorClose(), it is still possible for
the QEMU monitor I/O event callback to get invoked. This
will trigger an error message because mon->fd has been set
to -1 at this point. Silently ignore the case where mon->fd
is -1, likewise for mon->watch being zero.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The qemuMonitorOpen method only needs a virDomainObjPtr in order
to access the QEMU pid. This is not critical when detecting the
QEMU capabilties, so can easily be skipped
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The qemuMonitorSetCapabilities() API is used to initialize the QMP
protocol capabilities. It has since been abused to initialize some
libvirt internal capabilities based on command/event existance too.
Move the latter code out into qemuCapsProbeQMP() in the QEMU
capabilities source file instead
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The qemu monitor does not require qemu_conf.h, and the
qemu capabilities code actually wants bitmap.h
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add a new qemuMonitorGetTargetArch() method to support invocation
of the 'query-target' JSON monitor command. No HMP equivalent
is required, since this will only be present for QEMU >= 1.2
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add a new qemuMonitorGetObjectProps() method to support invocation
of the 'device-list-properties' JSON monitor command. No HMP equivalent
is required, since this will only be present for QEMU >= 1.2
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add a new qemuMonitorGetObjectTypes() method to support invocation
of the 'qom-list-types' JSON monitor command. No HMP equivalent
is required, since this will only be present for QEMU >= 1.2
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add a new qemuMonitorGetEvents() method to support invocation
of the 'query-events' JSON monitor command. No HMP equivalent
is required, since this will only be used when JSON is available
The existing qemuMonitorJSONCheckEvents() method is refactored
to use this new method
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add a new qemuMonitorGetCPUCommands() method to support invocation
of the 'query-commands' JSON monitor command. No HMP equivalent
is required, since this will only be used when JSON is available
The existing qemuMonitorJSONCheckCommands() method is refactored
to use this new method
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add a new qemuMonitorGetCPUDefinitions() method to support invocation
of the 'query-cpu-definitions' JSON monitor command. No HMP equivalent
is required, since this will only be present for QEMU >= 1.2
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add a new qemuMonitorGetMachines() method to support invocation
of the 'query-machines' JSON monitor command. No HMP equivalent
is required, since this will only be present for QEMU >= 1.2
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add a new qemuMonitorGetVersion() method to support invocation
of the 'query-version' JSON monitor command. No HMP equivalent
is provided, since this will only be used for QEMU >= 1.2
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Recently, there have been some improvements made to qemu so it
supports seamless migration or something very close to it.
However, it requires libvirt interaction. Once qemu is migrated,
the SPICE server needs to send its internal state to the destination.
Once it's done, it fires SPICE_MIGRATE_COMPLETED event and this
fact is advertised in 'query-spice' output as well.
We must not kill qemu until SPICE server finishes the transfer.
There are a number of process related functions spread
across multiple files. Start to consolidate them by
creating a virprocess.{c,h} file
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
https://www.gnu.org/licenses/gpl-howto.html recommends that
the 'If not, see <url>.' phrase be a separate sentence.
* tests/securityselinuxhelper.c: Remove doubled line.
* tests/securityselinuxtest.c: Likewise.
* globally: s/; If/. If/
Upstream qemu has raised a concern about whether dumping guest
memory by reading guest paging tables is a security hole:
https://lists.gnu.org/archive/html/qemu-devel/2012-09/msg02607.html
While auditing libvirt to see if we would be impacted, I noticed
that we had some dead code. It is simpler to nuke the dead code
and limit our monitor code to just the subset we make use of.
* src/qemu/qemu_monitor.h (QEMU_MONITOR_DUMP): Drop poorly named
and mostly-unused enum.
* src/qemu/qemu_monitor.c (qemuMonitorDumpToFd): Drop arguments.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONDump): Likewise.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONDump): Likewise.
* src/qemu/qemu_driver.c (qemuDumpToFd): Update caller.
Don't bother checking for the existance of the HMP passthrough
command. Just try to execute it, and propagate the failure.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The current qemu capabilities are stored in a virBitmapPtr
object, whose type is exposed to callers. We want to store
more data besides just the flags, so we need to move to a
struct type. This object will also need to be reference
counted, since we'll be maintaining a cache of data per
binary. This change introduces a 'qemuCapsPtr' virObject
class. Most of the change is just renaming types and
variables in all the callers
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Technically speaking we should wait until we receive the QMP
greeting message before attempting to send any QMP monitor
commands. Mostly we've got away with this, but there is a race
in some QEMU which cause it to SEGV if you sent it data too
soon after startup. Waiting for the QMP greeting avoids the
race
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently qemuMonitorOpen() requires an address of the QEMU
monitor. When doing QMP based capabilities detection it is
easier if a pre-opened FD can be provided, since then the
monitor can be run on the STDIO console. Add a new API
qemuMonitorOpenFD() for such usage
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add some non-null annotations to qemuMonitorOpen and also
check that the error callback is set, since it is mandatory
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Move the functions the parse/format, and validate PCI addresses to
their own file so they can be conveniently used in other places
besides device_conf.c
Refactoring existing code without causing any functional changes to
prepare for new code.
This patch makes the code reusable.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Per the FSF address could be changed from time to time, and GNU
recommends the following now: (http://www.gnu.org/licenses/gpl-howto.html)
You should have received a copy of the GNU General Public License
along with Foobar. If not, see <http://www.gnu.org/licenses/>.
This patch removes the explicit FSF address, and uses above instead
(of course, with inserting 'Lesser' before 'General').
Except a bunch of files for security driver, all others are changed
automatically, the copyright for securify files are not complete,
that's why to do it manually:
src/security/security_selinux.h
src/security/security_driver.h
src/security/security_selinux.c
src/security/security_apparmor.h
src/security/security_apparmor.c
src/security/security_driver.c
If QEMU supports the BALLOON_EVENT QMP event, then we can
avoid invoking 'query-balloon' when returning XML or the
domain info.
* src/qemu/qemu_capabilities.c, src/qemu/qemu_capabilities.h:
Add QEMU_CAPS_BALLOON_EVENT
* src/qemu/qemu_driver.c: Skip query-balloon in
qemudDomainGetInfo and qemuDomainGetXMLDesc if we have
QEMU_CAPS_BALLOON_EVENT set
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h: Check
for BALLOON_EVENT at connect to monitor. Add callback
for balloon change notifications
* src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h:
Add handling of BALLOON_EVENT and impl 'query-events'
check
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
dump-guest-memory is a new dump mechanism, and it can work when the
guest uses host devices. This patch adds a API to use this new
monitor command.
We will always use json mode if qemu's version is >= 0.15, so I
don't implement the API for text mode.
While unescaping the commands the commands passed through to the monitor
function qemuMonitorUnescapeArg() initialized lenght of the input string
to strlen()+1 which is fine for alloc but not for iteration of the
string.
This patch fixes the off-by-one error and drops the pointless check for
a single trailing slash that is automaticaly handled by the default
branch of switch.
When building as driver modules, it is not possible for the QEMU
driver module to reference the DTrace/SystemTAP probes linked into
the main libvirt.so. Thus we need to move the QEMU probes into a
separate file 'libvirt_qemu_probes.d'. Also rename the existing
file from 'probes.d' to 'libvirt_probes.d' while we're at it
* daemon/Makefile.am, src/internal.h: Include libvirt_probes.h
instead of probes.h
* src/Makefile.am: Add rules for libvirt_qemu_probes.d
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor_json.c,
src/qemu/qemu_monitor_text.c: Include libvirt_qemu_probes.h
* src/libvirt_probes.d: Rename from probes.d
* src/libvirt_qemu_probes.d: QEMU specific probes formerly
in probes.d
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Commit cdce2f42d tried to silence a compiler warning on 32-bit builds,
but the gcc shipped with RHEL 5 is old enough that the type conversion
via multiplication by 1 was insufficient for the task.
* src/qemu/qemu_monitor.c (qemuMonitorBlockJob): Previous attempt
didn't get past all gcc versions.
On 32-bit platforms, gcc warns that the comparison between a long
and (ULLONG_MAX/1024/1024) is always false; throwing in a type
conversion shuts up the warning.
* src/qemu/qemu_monitor.c (qemuMonitorBlockJob): Shut gcc up.
With RHEL 6.2, virDomainBlockPull(dom, dev, bandwidth, 0) has a race
with non-zero bandwidth: there is a window between the block_stream
and block_job_set_speed monitor commands where an unlimited amount
of data was let through, defeating the point of a throttle.
This race was first identified in commit a9d3495e, and libvirt was
able to reduce the size of the window for that race. In the meantime,
the qemu developers decided to fix things properly; per this message:
https://lists.gnu.org/archive/html/qemu-devel/2012-04/msg03793.html
the fix will be in qemu 1.1, and changes block-job-set-speed to use
a different parameter name, as well as adding a new optional parameter
to block-stream, which eliminates the race altogether.
Since our documentation already mentioned that we can refuse a non-zero
bandwidth for some hypervisors, I think the best solution is to do
just that for RHEL 6.2 qemu, so that the race is obvious to the user
(anyone using stock RHEL 6.2 binaries won't have this patch, and anyone
building their own libvirt with this patch for RHEL can also rebuild
qemu to get the modern semantics, so it is no real loss in behavior).
Meanwhile the code must be fixed to honor actual qemu 1.1 naming.
Rename the parameter to 'modern', since the naming difference now
covers more than just 'async' block-job-cancel. And while at it,
fix an unchecked integer overflow.
* src/qemu/qemu_monitor.h (enum BLOCK_JOB_CMD): Drop unused value,
rename enum to match conventions.
* src/qemu/qemu_monitor.c (qemuMonitorBlockJob): Reflect enum rename.
* src/qemu_qemu_monitor_json.h (qemuMonitorJSONBlockJob): Likewise.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONBlockJob): Likewise,
and support difference between RHEL 6.2 and qemu 1.1 block pull.
* src/qemu/qemu_driver.c (qemuDomainBlockJobImpl): Reject
bandwidth during pull with too-old qemu.
* src/libvirt.c (virDomainBlockPull, virDomainBlockRebase):
Document this.
Most of our errors complaining about an inability to support a
particular action due to qemu limitations used CONFIG_UNSUPPORTED,
but we had a few outliers. Reported by Jiri Denemark.
* src/qemu/qemu_command.c (qemuBuildDriveDevStr): Prefer
CONFIG_UNSUPPORTED.
* src/qemu/qemu_driver.c (qemuDomainReboot)
(qemuDomainBlockJobImpl): Likewise.
* src/qemu/qemu_hotplug.c (qemuDomainAttachPciControllerDevice):
Likewise.
* src/qemu/qemu_monitor.c (qemuMonitorTransaction)
(qemuMonitorBlockJob, qemuMonitorSystemWakeup): Likewise.
RHEL 6.2 was released with an early version of block jobs, which only
worked on the qed file format, where the commands were spelled with
underscore (contrary to QMP style), and where 'block_job_cancel' was
synchronous and did not trigger an event.
The upcoming qemu 1.1 release has fixed these short-comings [1][2]:
the commands now work on multiple file types, are spelled with dash,
and 'block-job-cancel' is asynchronous and emits an event upon conclusion.
[1]qemu commit 370521a1d6f5537ea7271c119f3fbb7b0fa57063
[2]https://lists.gnu.org/archive/html/qemu-devel/2012-04/msg01248.html
This patch recognizes the new spellings, and fixes virDomainBlockRebase
to give a graceful error when talking to a too-old qemu on a partial
rebase attempt. Fixes for the new semantics will come later. This
patch also removes a bogus ATTRIBUTE_NONNULL mistakenly added in
commit 10ec36e2.
* src/qemu/qemu_capabilities.h (QEMU_CAPS_BLOCKJOB_SYNC)
(QEMU_CAPS_BLOCKJOB_ASYNC): New bits.
* src/qemu/qemu_capabilities.c (qemuCaps): Name them.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONCheckCommands): Set
them.
(qemuMonitorJSONBlockJob): Manage both command names.
(qemuMonitorJSONDiskSnapshot): Minor formatting fix.
* src/qemu/qemu_monitor.h (qemuMonitorBlockJob): Alter signature.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONBlockJob): Likewise.
* src/qemu/qemu_monitor.c (qemuMonitorBlockJob): Pass through
capability bit.
* src/qemu/qemu_driver.c (qemuDomainBlockJobImpl): Update callers.
The oVirt developers have stated that the real reasons they want
to have qemu reuse existing volumes when creating a snapshot are:
1. the management framework is set up so that creation has to be
done from a central node for proper resource tracking, and having
libvirt and/or qemu create things violates the framework, and
2. qemu defaults to creating snapshots with an absolute path to
the backing file, but oVirt wants to manage a backing chain that
uses just relative names, to allow for easier migration of a chain
across storage locations.
When 0.9.10 added VIR_DOMAIN_SNAPSHOT_CREATE_REUSE_EXT (commit
4e9953a4), it only addressed point 1, but libvirt was still using
O_TRUNC which violates point 2. Meanwhile, the new qemu
'transaction' monitor command includes a new optional mode argument
that will force qemu to reuse the metadata of the file it just
opened (with the burden on the caller to have valid metadata there
in the first place). So, this tweaks the meaning of the flag to
cover both points as intended for use by oVirt. It is not strictly
backward-compatible to 0.9.10 behavior, but it can be argued that
the O_TRUNC of 0.9.10 was a bug.
Note that this flag is all-or-nothing, and only selects between
'existing' and the default 'absolute-paths'. A more flexible
approach that would allow per-disk selections, as well as adding
support for the 'no-backing-file' mode, would be possible by
extending the <domainsnapshot> xml to have a per-disk mode, but
until we have a management application expressing a need for that
additional complexity, it is not worth doing.
* src/libvirt.c (virDomainSnapshotCreateXML): Tweak documentation.
* src/qemu/qemu_monitor.h (qemuMonitorDiskSnapshot): Add
parameters.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONDiskSnapshot):
Likewise.
* src/qemu/qemu_monitor.c (qemuMonitorDiskSnapshot): Pass them
through.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONDiskSnapshot): Use
new monitor command arguments.
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateDiskActive)
(qemuDomainSnapshotCreateSingleDiskActive): Adjust callers.
(qemuDomainSnapshotDiskPrepare): Allow qed, modify rules on reuse.
QEmu 1.1 is adding a 'transaction' command to the JSON monitor.
Each element of a transaction corresponds to a top-level command,
with the additional guarantee that the transaction flushes all
pending I/O, then guarantees that all actions will be successful
as a group or that failure will roll back the state to what it
was before the monitor command. The difference between a
top-level command:
{ "execute": "blockdev-snapshot-sync", "arguments":
{ "device": "virtio0", ... } }
and a transaction:
{ "execute": "transaction", "arguments":
{ "actions": [
{ "type": "blockdev-snapshot-sync", "data":
{ "device": "virtio0", ... } } ] } }
is just a couple of changed key names and nesting the shorter
command inside a JSON array to the longer command. This patch
just adds the framework; the next patch will actually use a
transaction.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONMakeCommand): Move
guts...
(qemuMonitorJSONMakeCommandRaw): ...into new helper. Add support
for array element.
(qemuMonitorJSONTransaction): New command.
(qemuMonitorJSONDiskSnapshot): Support use in a transaction.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONDiskSnapshot): Add
argument.
(qemuMonitorJSONTransaction): New declaration.
* src/qemu/qemu_monitor.h (qemuMonitorTransaction): Likewise.
(qemuMonitorDiskSnapshot): Add argument.
* src/qemu/qemu_monitor.c (qemuMonitorTransaction): New wrapper.
(qemuMonitorDiskSnapshot): Pass argument on.
* src/qemu/qemu_driver.c
(qemuDomainSnapshotCreateSingleDiskActive): Update caller.
This patch introduces a new event type for the QMP event
SUSPEND:
VIR_DOMAIN_EVENT_ID_PMSUSPEND
The event doesn't take any data, but considering there might
be reason for wakeup in future, the callback definition is:
typedef void
(*virConnectDomainEventSuspendCallback)(virConnectPtr conn,
virDomainPtr dom,
int reason,
void *opaque);
"reason" is unused currently, always passes "0".
This patch introduces a new event type for the QMP event
WAKEUP:
VIR_DOMAIN_EVENT_ID_PMWAKEUP
The event doesn't take any data, but considering there might
be reason for wakeup in future, the callback definition is:
typedef void
(*virConnectDomainEventWakeupCallback)(virConnectPtr conn,
virDomainPtr dom,
int reason,
void *opaque);
"reason" is unused currently, always passes "0".
This patch introduces a new event type for the QMP event
DEVICE_TRAY_MOVED, which occurs when the tray of a removable
disk is moved (i.e opened or closed):
VIR_DOMAIN_EVENT_ID_TRAY_CHANGE
The event's data includes the device alias and the reason
for tray status' changing, which indicates why the tray
status was changed. Thus the callback definition for the event
is:
enum {
VIR_DOMAIN_EVENT_TRAY_CHANGE_OPEN = 0,
VIR_DOMAIN_EVENT_TRAY_CHANGE_CLOSE,
\#ifdef VIR_ENUM_SENTINELS
VIR_DOMAIN_EVENT_TRAY_CHANGE_LAST
\#endif
} virDomainEventTrayChangeReason;
typedef void
(*virConnectDomainEventTrayChangeCallback)(virConnectPtr conn,
virDomainPtr dom,
const char *devAlias,
int reason,
void *opaque);
Using 'unsigned long' for memory values is risky on 32-bit platforms,
as a PAE guest can have more than 4GiB memory. Our API is
(unfortunately) locked at 'unsigned long' and a scale of 1024, but
the rest of our system should consistently use 64-bit values,
especially since the previous patch centralized overflow checking.
* src/conf/domain_conf.h (_virDomainDef): Always use 64-bit values
for memory. Change hugepage_backed to a bool.
* src/conf/domain_conf.c (virDomainDefParseXML)
(virDomainDefCheckABIStability, virDomainDefFormatInternal): Fix
clients.
* src/vmx/vmx.c (virVMXFormatConfig): Likewise.
* src/xenxs/xen_sxpr.c (xenParseSxpr, xenFormatSxpr): Likewise.
* src/xenxs/xen_xm.c (xenXMConfigGetULongLong): New function.
(xenXMConfigGetULong, xenXMConfigSetInt): Avoid truncation.
(xenParseXM, xenFormatXM): Fix clients.
* src/phyp/phyp_driver.c (phypBuildLpar): Likewise.
* src/openvz/openvz_driver.c (openvzDomainSetMemoryInternal):
Likewise.
* src/vbox/vbox_tmpl.c (vboxDomainDefineXML): Likewise.
* src/qemu/qemu_command.c (qemuBuildCommandLine): Likewise.
* src/qemu/qemu_process.c (qemuProcessStart): Likewise.
* src/qemu/qemu_monitor.h (qemuMonitorGetBalloonInfo): Likewise.
* src/qemu/qemu_monitor_text.h (qemuMonitorTextGetBalloonInfo):
Likewise.
* src/qemu/qemu_monitor_text.c (qemuMonitorTextGetBalloonInfo):
Likewise.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONGetBalloonInfo):
Likewise.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONGetBalloonInfo):
Likewise.
* src/qemu/qemu_driver.c (qemudDomainGetInfo)
(qemuDomainGetXMLDesc): Likewise.
* src/uml/uml_conf.c (umlBuildCommandLine): Likewise.
This actually wires up the new optional parameter to block_stream:
http://wiki.qemu.org/Features/LiveBlockMigration/ImageStreamingAPI
The error checking is still sparse, since libvirt must not use
qemu-img or header probing on a qcow2 file in use by qemu to
check if the backing file name is valid; so for now, libvirt is
relying on qemu to diagnose an incorrect backing name. Fixing this
will require libvirt to track the entire backing file chain at the
time qemu is started and keeps it updated with snapshot and pull
operations.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONBlockJob): Add
parameter, and update callers.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONBlockJob): Update
signature.
* src/qemu/qemu_monitor.h (qemuMonitorBlockJob): Likewise.
* src/qemu/qemu_driver.c (qemuDomainBlockJobImpl): Update caller.
* src/qemu/qemu_monitor.c (qemuMonitorBlockJob): Likewise.
Block job commands are not part of upstream qemu until 1.1; and
proper support of job completion and cancellation depends on being
able to receive QMP events, which implies the JSON monitor.
Additionally, some early versions of block job commands were
backported to RHEL qemu, but these versions lacked asynchronous
job cancellation and partial block pull, so there are several
patches that will still be needed in this area of libvirt code
to support both flavors of block job commands.
Due to earlier patches in libvirt, we are guaranteed that all versions
of qemu that support block job commands already require libvirt to
use the JSON monitor. That means that the text version of block jobs
will not be used, and having to refactor two copies of the block job
handlers makes no sense. So instead, we delete the text handlers.
* src/qemu/qemu_monitor.c (qemuMonitorBlockJob): Drop text monitor
support.
* src/qemu/qemu_monitor_text.h (qemuMonitorTextBlockJob): Delete.
* src/qemu/qemu_monitor_text.c (qemuMonitorTextParseBlockJobOne)
(qemuMonitorTextParseBlockJob, qemuMonitorTextBlockJob):
Likewise.
QMP commands don't need to be escaped since converting them to json
also escapes special characters. When a QMP command fails, however,
libvirt falls back to HMP commands. These fallback functions
(qemuMonitorText*) do their own escaping, and pass the result directly
to qemuMonitorHMPCommandWithFd. If the monitor is in json mode, these
pre-escaped commands will be escaped again when converted to json,
which can result in the wrong arguments being sent.
For example, a filename test\file would be sent in json as
test\\file.
This prevented attaching an image file with a " or \ in its name in
qemu 1.0.50, and also broke rbd attachment (which uses backslashes to
escape some internal arguments.)
Reported-by: Masuko Tomoya <tomoya.masuko@gmail.com>
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
using 'system-wakeup' monitor command. It is supported only in JSON,
as we are enabling it if possible. Moreover, this command is available
in qemu-1.1+ which definitely has JSON.
In the future (my next patch in fact) we may want to make
decisions depending on qemu having a monitor command or not.
Therefore, we want to set qemuCaps flag instead of querying
on the monitor each time we are about to make that decision.
QEMU always sends details about all available block devices as an answer
for "info block"/"query-block" command. On the other hand, our
qemuMonitorGetBlockInfo was made for a single block devices queries
only. Thus, when asking for multiple devices, we asked qemu multiple
times to always get the same answer from which different parts were
filtered. This patch makes qemuMonitorGetBlockInfo return a hash table
of all block devices, which may later be used for getting details about
specific devices.
If an async job run on a domain will stop the domain at the end of the
job, a concurrently run query job can hang in qemu monitor and nothing
can be done with that domain from this point on. An attempt to start
such domain results in "Timed out during operation: cannot acquire state
change lock" error.
However, quite a few things have to happen at the right time... There
must be an async job running which stops a domain at the end. This race
was reported with dump --crash but other similar jobs, such as
(managed)save and migration, should be able to trigger this bug as well.
While this async job is processing its last monitor command, that is a
query-migrate to which qemu replies with status "completed", a new
libvirt API that results in a query job must arrive and stay waiting
until the query-migrate command finishes. Once query-migrate is done but
before the async job closes qemu monitor while stopping the domain, the
other thread needs to wake up and call qemuMonitorSend to send its
command to qemu. Before qemu gets a chance to respond to this command,
the async job needs to close the monitor. At this point, the query job
thread is waiting for a condition that no-one will ever signal so it
never finishes the job.
Implement the block I/O throttle setting and getting support to qemu
driver.
Signed-off-by: Lei Li <lilei@linux.vnet.ibm.com>
Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Implements functions for both HMP and QMP mode.
For HMP mode, qemu uses "M" as the units by default, so the passed "sized"
is divided by 1024.
For QMP mode, qemu uses "Bytes" as the units by default, the passed "sized"
is multiplied by 1024.
All of the monitor functions return -1 on failure, 0 on success, or -2 if
not supported.
The QEMU monitor command 'add_client' can be used to connect to
a VNC or SPICE graphics display. This allows for implementation
of the virDomainOpenGraphics API
* src/qemu/qemu_driver.c: Implement virDomainOpenGraphics
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h:
Add binding for 'add_client' command
This change adds some systemtap/dtrace probes to the QEMU monitor
client code. In particular it allows watching of all operations
for a VM
* examples/systemtap/qemu-monitor.stp: Watch all monitor commands
* src/Makefile.am: Passing libdir/bindir/sbindir to dtrace2systemtap.pl
* src/dtrace2systemtap.pl: Accept libdir/bindir/sbindir as args
and look for '# binary:' comment to mark probes against libvirtd
vs libvirt.so
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor_json.c,
src/qemu/qemu_monitor_text.c: Add probes for key functions
If the daemon is restarted so we reconnect to monitor, cdrom media
can be ejected. In that case we don't want to show it in domain xml,
or require it on migration destination.
To check for disk status use 'info block' monitor command.
This patch adds handlers for modification of guest's interface
link state. Both HMP and QMP commands are supported, but as the
link state functionality is from the beginning supported in QMP
the HMP code will probably never be used.
The mainly changes are:
1) Update qemuMonitorGetBlockStatsInfo and it's children (Text/JSON)
functions to return the value of new latency fields.
2) Add new function qemuMonitorGetBlockStatsParamsNumber, which is
to count how many parameters the underlying QEMU supports.
3) Update virDomainBlockStats in src/qemu/qemu_driver.c to be
compatible with the changes by 1).
No one uses this yet, but it will be important once
virDomainSnapshotCreateXML learns a VIR_DOMAIN_SNAPSHOT_DISK_ONLY
flag, and the xml allows passing in the new file names.
* src/qemu/qemu_monitor.h (qemuMonitorDiskSnapshot): New prototype.
* src/qemu/qemu_monitor_text.h (qemuMonitorTextDiskSnapshot):
Likewise.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONDiskSnapshot):
Likewise.
* src/qemu/qemu_monitor.c (qemuMonitorDiskSnapshot): New
function.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONDiskSnapshot):
Likewise.
When an operation started by virDomainBlockPull completes (either with
success or with failure), raise an event to indicate the final status.
This API allow users to avoid polling on virDomainGetBlockJobInfo if
they would prefer to use an event mechanism.
* daemon/remote.c: Dispatch events to client
* include/libvirt/libvirt.h.in: Define event ID and callback signature
* src/conf/domain_event.c, src/conf/domain_event.h,
src/libvirt_private.syms: Extend API to handle the new event
* src/qemu/qemu_driver.c: Connect to the QEMU monitor event
for block_stream completion and emit a libvirt block pull event
* src/remote/remote_driver.c: Receive and dispatch events to application
* src/remote/remote_protocol.x: Wire protocol definition for the event
* src/remote_protocol-structs: structure definitions for protocol verification
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c: Watch for BLOCK_STREAM_COMPLETED event
from QEMU monitor
The virDomainBlockPull* family of commands are enabled by the
following HMP/QMP commands: 'block_stream', 'block_job_cancel',
'info block-jobs' / 'query-block-jobs', and 'block_job_set_speed'.
* src/qemu/qemu_driver.c src/qemu/qemu_monitor_text.[ch]: implement disk
streaming by using the proper qemu monitor commands.
* src/qemu/qemu_monitor_json.[ch]: implement commands using the qmp monitor
When qemuMonitorCloseFileHandle is called in error path, we need to
preserve the original error since a possible further error when running
closefd monitor command is not very useful to users.
Continuation of commit 313ac7fd, and enforce things with a syntax
check.
Technically, virNetServerClientCalculateHandleMode is not printing
a mode_t, but rather a collection of VIR_EVENT_HANDLE_* bits;
however, these bits are < 8, so there is no different in the
output, and that was the easiest way to silence the new syntax check.
* cfg.mk (sc_flags_debug): New syntax check.
(exclude_file_name_regexp--sc_flags_debug): Add exemptions.
* src/fdstream.c (virFDStreamOpenFileInternal): Print flags in
hex, mode_t in octal.
* src/libvirt-qemu.c (virDomainQemuMonitorCommand)
(virDomainQemuAttach): Likewise.
* src/locking/lock_driver_nop.c (virLockManagerNopInit): Likewise.
* src/locking/lock_driver_sanlock.c (virLockManagerSanlockInit):
Likewise.
* src/locking/lock_manager.c: Likewise.
* src/qemu/qemu_migration.c: Likewise.
* src/qemu/qemu_monitor.c: Likewise.
* src/rpc/virnetserverclient.c
(virNetServerClientCalculateHandleMode): Print mode with %o.
When attaching to an external QEMU process, it is neccessary
to check if the process is using KVM or not. This can be done
using a monitor command
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h: Add
API for checking if KVM is enabled
The 'char control[CMSG_SPACE(sizeof(int))];' was not being
wiped, so could potentially contain uninitialized bytes.
While this was harmless in this case, it caused complaints
from valgrind
* src/qemu/qemu_monitor.c: memset 'control' variable
in qemuMonitorIOWriteWithFD