Rather than have a separate routine to parse the alias of an iothread
returned from qemu in order to get the iothread_id value, parse the alias
when returning and just return the iothread_id in qemuMonitorIOThreadInfoPtr.
This set of patches removes the function, changes the "char *name" to
"unsigned int" and handles all the fallout.
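For illustration only, the alias parsing that moves into the monitor code could look roughly like the sketch below. It assumes the alias has the form "iothread<N>"; the helper name parse_iothread_alias is hypothetical and is not the function touched by these patches.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: extract N from an alias assumed to look like
 * "iothreadN".  Returns 0 and stores N in *id on success, -1 otherwise. */
int
parse_iothread_alias(const char *alias, unsigned int *id)
{
    static const char prefix[] = "iothread";
    const size_t plen = sizeof(prefix) - 1;
    char *end = NULL;
    unsigned long val;

    if (!alias || strncmp(alias, prefix, plen) != 0)
        return -1;

    /* Require at least one digit right after the prefix. */
    if (alias[plen] < '0' || alias[plen] > '9')
        return -1;

    errno = 0;
    val = strtoul(alias + plen, &end, 10);
    if (errno != 0 || *end != '\0' || val > UINT_MAX)
        return -1;

    *id = (unsigned int)val;
    return 0;
}
```

Callers then only ever see the numeric id and never have to look at the alias string.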
Among all the monitor APIs some were checking if mon is NULL and some
were not. It is possible for mon to be NULL if a second call is
attempted after the monitor has been entered, so every single API
has to check for the monitor.
This patch adds a macro that helps checking the state of the monitor and
either refactors existing checking code to use the macro or adds it in
case it was missing.
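A minimal sketch of the idea, not the macro added by this patch: the names CHECK_MONITOR, struct monitor and monitor_is_json are hypothetical, and the real macro reports the error through libvirt's error APIs rather than stderr.

```c
#include <stdio.h>

/* Stand-in for the real monitor object; only what the sketch needs. */
struct monitor {
    int json;
};

/* Hypothetical macro: leave the calling function with -1 when the
 * monitor pointer is NULL, so every API performs the same check. */
#define CHECK_MONITOR(mon) \
    do { \
        if (!(mon)) { \
            fprintf(stderr, "monitor must not be NULL\n"); \
            return -1; \
        } \
    } while (0)

/* Example API using the macro instead of open-coding the check. */
int
monitor_is_json(struct monitor *mon)
{
    CHECK_MONITOR(mon);
    return mon->json;
}
```

Because the macro contains a return statement, it can only be used in functions that return int and treat -1 as failure.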
Commit f6563bc3 introduced HMP impl of the function (so that a different
uglier function could be removed). Before the HMP code is called there's
a leftover check that the monitor is JSON which inhibits the code from
working.
qemuDomainBlockJobImpl has become an unmaintainable mess over the years of
adding new stuff to it. This patch starts splitting up individual
functions from it until it can be killed entirely.
In bulk this will add lines of code rather than delete them but it will
be traded for maintainability.
When using 'dimm' memory devices with qemu, some of the information
like the slot number and base address need to be reloaded from qemu
after process start so that it reflects the actual state. The state then
allows memory devices to be used across migrations.
In qemu 2.3, the migration status will include 'cancelling' in the
window between when an asynchronous cancel has been requested and
when the migration is actually halted. Previously, qemu hid this
state and reported 'active'. Libvirt manages the sequence okay
even when the string is unrecognized (that is, it will report an
unknown state:
Migration: [ 69 %]^Cerror: internal error: unexpected migration status in cancelling.
but the migration is still cancelled), but recognizing the string
makes for a smoother user experience.
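As a rough sketch of what "recognizing the string" means, a table-driven lookup such as the one below maps the status text to an enum value. The enum and function names are made up for the example and only a few status strings are listed; the real mapping lives in the functions named in the file list.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative enum; the names are not libvirt's. */
enum mig_status {
    MIG_STATUS_ACTIVE,
    MIG_STATUS_COMPLETED,
    MIG_STATUS_CANCELLING,
    MIG_STATUS_CANCELLED,
    MIG_STATUS_UNKNOWN
};

static const struct {
    const char *name;
    enum mig_status status;
} mig_status_map[] = {
    { "active",     MIG_STATUS_ACTIVE },
    { "completed",  MIG_STATUS_COMPLETED },
    { "cancelling", MIG_STATUS_CANCELLING },
    { "cancelled",  MIG_STATUS_CANCELLED },
};

/* Map a status string to the enum; unrecognized strings are reported
 * as MIG_STATUS_UNKNOWN so the caller can raise an error. */
enum mig_status
mig_status_from_string(const char *str)
{
    size_t i;

    for (i = 0; i < sizeof(mig_status_map) / sizeof(mig_status_map[0]); i++) {
        if (strcmp(str, mig_status_map[i].name) == 0)
            return mig_status_map[i].status;
    }
    return MIG_STATUS_UNKNOWN;
}
```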
* src/qemu/qemu_monitor.h
(QEMU_MONITOR_MIGRATION_STATUS_CANCELLING): Add enum.
* src/qemu/qemu_monitor.c (qemuMonitorMigrationStatus): Map it.
* src/qemu/qemu_migration.c (qemuMigrationUpdateJobStatus): Adjust
clients.
* src/qemu/qemu_monitor_json.c
(qemuMonitorJSONGetMigrationStatusReply): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1199182 documents that
after a series of disk snapshots into existing destination images,
followed by active commits of the top image, it is possible for
qemu 2.2 and earlier to end up tracking a different name for the
image than what it would have had when opening the chain afresh.
That is, when starting with the chain 'a <- b <- c', the name
associated with 'b' is how it was spelled in the metadata of 'c',
but when starting with 'a', taking two snapshots into 'a <- b <- c',
then committing 'c' back into 'b', the name associated with 'b' is
now the name used when taking the first snapshot.
Sadly, older qemu doesn't know how to treat different spellings of
the same filename as identical files (it uses strcmp() instead of
checking for the same inode), which means libvirt's attempt to
commit an image using solely the names learned from qcow2 metadata
fails with a cryptic:
error: internal error: unable to execute QEMU command 'block-commit': Top image file /tmp/images/c/../b/b not found
even though the file exists. Trying to teach libvirt the rules on
which name qemu will expect is not worth the effort (besides, we'd
have to remember it across libvirtd restarts, and track whether a
file was opened via metadata or via snapshot creation for a given
qemu process); it is easier to just always directly ask qemu what
string it expects to see in the first place.
As a safety valve, we validate that any name returned by qemu
still maps to the same local file as we have tracked it, so that
a compromised qemu cannot accidentally cause us to act on an
incorrect file.
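One way to perform such a check is to compare device and inode numbers rather than spellings. The sketch below shows that technique under the assumption that both names are local paths; paths_same_file is a hypothetical name and this is not necessarily how the patch implements the validation.

```c
#include <sys/stat.h>

/* Returns 1 if both paths refer to the same file (same device and
 * inode), 0 if they refer to different files, and -1 if either path
 * cannot be examined. */
int
paths_same_file(const char *a, const char *b)
{
    struct stat sa, sb;

    if (stat(a, &sa) < 0 || stat(b, &sb) < 0)
        return -1;

    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

With this, "/tmp/images/c/../b/b" and "/tmp/images/b/b" compare equal even though strcmp() would say otherwise.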
* src/qemu/qemu_monitor.h (qemuMonitorDiskNameLookup): New
prototype.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONDiskNameLookup):
Likewise.
* src/qemu/qemu_monitor.c (qemuMonitorDiskNameLookup): New function.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONDiskNameLookup)
(qemuMonitorJSONDiskNameLookupOne): Likewise.
* src/qemu/qemu_driver.c (qemuDomainBlockCommit)
(qemuDomainBlockJobImpl): Use it.
Signed-off-by: Eric Blake <eblake@redhat.com>
In order not to leave old error messages set, this patch refactors the
code so the error is reported only when acted upon. The only such place
already rewrites any error, so cleaning up all the error reporting in
qemuMonitorSetMemoryStatsPeriod() is enough.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Our virDomainBlockStatsFlags API uses the old approach where, when it's
called without the typed parameter array, it returns the count of parameters
supported by qemu.
The supported parameter count is obtained via separate monitor calls
which is a waste since we can calculate it when gathering the data.
This patch adds code to the qemuMonitorGetAllBlockStatsInfo workers that
tracks the count of supported fields reported by qemu, which will
allow the old duplicate code to be removed.
Add a different version of parser for "info blockstats" that basically
parses the same information as the existing copy of the function.
This will allow us to remove the single device version
qemuMonitorGetBlockStatsInfo in the future.
The new implementation uses a few new helpers so it should be more
understandable and provides a test case to verify that it works.
Allocate the hash table in the monitor wrapper function instead of the
worker itself so that the text monitor impl that will be added in the
next patch doesn't have to duplicate it.
In commit cc41c648 I refactored qemuMonitorFindBalloonObjectPath, but
missed that there is a memory leak. The "nextpath" variable is
overwritten on each iteration of the for loop, so we have to free it
before the next iteration.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
QEMU internally updates the size of video memory if the domain XML
provided too small a memory size, or if there are dependencies between a QXL
device's 'vgamem' and 'ram' sizes. We need to know about the changes and
store them in the status XML so as not to break migration or managedsave
across different libvirt versions.
The values are loaded only if the "vgamem_mb" property exists for
the device. The presence of "vgamem_mb" also indicates that
"ram_size" and "vram_size" exist for QXL devices.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
The search is done recursively only through QOM objects that have a type
prefixed with "child<", as this indicates that the QOM object is a parent of
other QOM objects.
To use it, pass a known device name together with the starting path from
which to search.
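The recursion can be sketched on a toy in-memory tree as shown below. This is only an illustration: struct qom_node and qom_find_path are hypothetical, the real code lists each path through the monitor instead of walking a static structure, and matching on the entry name is an assumption of the sketch.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Toy stand-in for one entry returned when listing a QOM path. */
struct qom_node {
    const char *name;                 /* entry name, e.g. "pci.0" */
    const char *type;                 /* type string, e.g. "child<PCI>" */
    const struct qom_node *children;  /* entries listed below this node */
    size_t nchildren;
};

/* Search below 'node' (located at 'path') for an entry called 'name'.
 * Only entries whose type starts with "child<" are considered and
 * descended into.  On success the full path is copied to 'out'. */
int
qom_find_path(const struct qom_node *node, const char *path,
              const char *name, char *out, size_t outlen)
{
    size_t i;

    for (i = 0; i < node->nchildren; i++) {
        const struct qom_node *child = &node->children[i];
        char next[256];
        int len = snprintf(next, sizeof(next), "%s/%s", path, child->name);

        if (len < 0 || (size_t)len >= sizeof(next))
            continue;   /* path too long for the sketch's fixed buffer */

        /* Anything that is not "child<...>" is a plain property or link. */
        if (strncmp(child->type, "child<", 6) != 0)
            continue;

        if (strcmp(child->name, name) == 0) {
            if ((size_t)len >= outlen)
                return -1;
            memcpy(out, next, (size_t)len + 1);
            return 0;
        }

        if (qom_find_path(child, next, name, out, outlen) == 0)
            return 0;
    }

    return -1;
}
```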
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1178652
We will get a warning when we have a guest in paused
status (caused by kernel panic) and restart libvirtd;
the warning message looks like this:
Qemu reported unknown VM status: 'guest-panicked'
This happens because we set a wrong status name in
qemu_monitor.c; from qemu's qapi-schema.json file
we know this status should be named 'guest-panicked'.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
A future patch will allow recursion into backing chains when
collecting block stats. This patch should not change behavior,
but merely moves out the common code that will be reused once
recursion is enabled, and adds the parameter that will turn on
recursion.
* src/qemu/qemu_monitor.h (qemuMonitorGetAllBlockStatsInfo)
(qemuMonitorBlockStatsUpdateCapacity): Add recursion parameter,
although it is ignored for now.
* src/qemu/qemu_monitor.h (qemuMonitorGetAllBlockStatsInfo)
(qemuMonitorBlockStatsUpdateCapacity): Likewise.
* src/qemu/qemu_monitor_json.h
(qemuMonitorJSONGetAllBlockStatsInfo)
(qemuMonitorJSONBlockStatsUpdateCapacity): Likewise.
* src/qemu/qemu_monitor_json.c
(qemuMonitorJSONGetAllBlockStatsInfo)
(qemuMonitorJSONBlockStatsUpdateCapacity): Add parameter, and
split...
(qemuMonitorJSONGetOneBlockStatsInfo)
(qemuMonitorJSONBlockStatsUpdateCapacityOne): ...into helpers.
(qemuMonitorJSONGetBlockStatsInfo): Update caller.
* src/qemu/qemu_driver.c (qemuDomainGetStatsBlock): Update caller.
* src/qemu/qemu_migration.c (qemuMigrationCookieAddNBD): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
Improve the monitor function to also retrieve the guest state of
character device (if provided) so that we can refresh the state of
virtio-serial channels and perhaps react to changes in the state in
future patches.
This patch changes the returned data from qemuMonitorGetChardevInfo to
return a structure containing the pty path and the state for all the
character devices.
The change to the testsuite makes sure that the data is parsed
correctly.
New qemu added a new event that is emitted when a virtio serial channel
is opened in the guest OS. This allows us to update the state of the
port in the output-only XML element.
This patch implements the monitor callbacks and necessary handlers to
update the state in the definition.
To unify future additions that require information from "query-chardev"
rename qemuMonitorGetPtyPaths and friends to qemuMonitorGetChardevInfo
and move the allocation of the returned hash into the top level
function.
We used to set migration capabilities only when a user asked for them in
flags. This is fine when migration succeeds since the QEMU process is
killed in the end but in case migration fails or if it's cancelled, some
capabilities may remain turned on with no way to turn them off. To fix
that, migration capabilities have to be turned on if requested but
explicitly turned off in case they were not requested but QEMU supports
them.
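The decision described above boils down to a small truth table, sketched here with hypothetical names. Treating "requested but unsupported" as an error is an assumption of the sketch.

```c
#include <stdbool.h>

enum cap_action {
    CAP_NOTHING,   /* not requested and not supported: nothing to do */
    CAP_ENABLE,    /* requested and supported: turn it on */
    CAP_DISABLE,   /* supported but not requested: explicitly turn it off */
    CAP_ERROR      /* requested but not supported (assumed to be an error) */
};

/* Decide what to do with one migration capability before migrating. */
enum cap_action
migration_cap_action(bool requested, bool supported)
{
    if (requested)
        return supported ? CAP_ENABLE : CAP_ERROR;

    return supported ? CAP_DISABLE : CAP_NOTHING;
}
```

The CAP_DISABLE case is the one that was missing before: it clears a capability left on by an earlier failed or cancelled migration.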
https://bugzilla.redhat.com/show_bug.cgi?id=1163953
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Add support for bps_max and friends in the driver part.
In the part checking whether qemu is running, check if the running binary
supports bps_max; if not, print an error message, and if so, add it to the
"info" variable.
Signed-off-by: Matthias Gatto <matthias.gatto@outscale.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
To allow live modification of device backends in qemu, libvirt needs to
be able to hot-add/remove "objects". Add monitor backend functions to
allow this.
This function will be used for hot-add/remove of RNG backends,
IOThreads, memory backing objects, etc.
NIC_RX_FILTER_CHANGED is sent by qemu any time a NIC driver in the
guest modified the NIC's RX Filter (for example, if the MAC address of
the NIC is changed by the guest).
This patch doesn't do anything useful with that event; it just sets up
all the plumbing to get news of the event into a worker thread with
all proper locking/reference counting, and provide an easy place to
add in desired functionality.
See src/qemu/EVENTHANDLERS.txt for information/instructions on adding
a libvirt-internal handler for a qemu event (using
NIC_RX_FILTER_CHANGED as an example).
This function can be called at any time to get the current status of a
guest's network device rx-filter. In particular it is useful to call
after libvirt receives a NIC_RX_FILTER_CHANGED event - this event only
tells you that something has changed in the rx-filter, the details are
retrieved with the query-rx-filter monitor command (only available in
the json monitor). The command sent to the qemu monitor looks like this:
{"execute":"query-rx-filter", "arguments": {"name":"net2"} }
and the results will look something like this:
{
  "return": [
    {
      "promiscuous": false,
      "name": "net2",
      "main-mac": "52:54:00:98:2d:e3",
      "unicast": "normal",
      "vlan": "normal",
      "vlan-table": [
        42,
        0
      ],
      "unicast-table": [
      ],
      "multicast": "normal",
      "multicast-overflow": false,
      "unicast-overflow": false,
      "multicast-table": [
        "33:33:ff:98:2d:e3",
        "01:80:c2:00:00:21",
        "01:00:5e:00:00:fb",
        "33:33:ff:98:2d:e2",
        "01:00:5e:00:00:01",
        "33:33:00:00:00:01"
      ],
      "broadcast-allowed": false
    }
  ],
  "id": "libvirt-14"
}
This is all parsed from JSON into a virNetDevRxFilter object for
easier consumption. (unicast-table is usually empty, but is also an
array of mac addresses similar to multicast-table).
(NB: LIBNL_CFLAGS was added to tests/Makefile.am because virnetdev.h
now includes util/virnetlink.h, which includes netlink/msg.h when
appropriate. Without LIBNL_CFLAGS, gcc can't find that file (if
libnl/netlink isn't available, LIBNL_CFLAGS will be empty and
virnetlink.h won't try to include netlink/msg.h anyway).)
While our code gathers block stats via "query-blockstats" some
information needs to be gathered via "query-block". Add a helper function
that will update the blockstats structure if requested.
The current block stats code matched up the disk name with the actual
stats by the order in the data returned from qemu. This unfortunately
isn't right as qemu may return the disks in any order. Fix this by
returning a hash of stats and index them by the disk alias.
Currently we only support TCP protocol for native QEMU migration but
this is going to be changed. Let's make the code more general and remove
hardcoded TCP protocol from several places.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
If the qemu being used doesn't support JSON, then querying for IOThread
data would fail. In that case, ensure the *iothreads is NULL and return 0
as the count of iothreads available.
This patch implements the VIR_DOMAIN_STATS_BLOCK group of statistics.
To do so, a helper function to get the block stats of all the disks of
a domain is added.
Signed-off-by: Francesco Romani <fromani@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Coverity complains about the calculation of the buf & len within
the PROBE macro. So to quiet things down, do the calculation prior
to usage in either write() or qemuMonitorIOWriteWithFD() calls and
then have the PROBE use the calculated values - which works.
Upstream qemu 1.4 added some drive-mirror tunables not present
when it was first introduced in 1.3. Management apps may want
to set these in some cases (for example, without tuning
granularity down to sector size, a copy may end up occupying
more bytes than the original because an entire cluster is
copied even when only a sector within the cluster is dirty,
although tuning it down results in more CPU time to do the
copy). I haven't personally needed to use the parameters, but
since they exist, and since the new API supports virTypedParams,
we might as well expose them.
Since the tuning parameters aren't often used, and omitted from
the QMP command when unspecified, I think it is safe to rely on
qemu 1.3 to issue an error about them being unsupported, rather
than trying to create a new capability bit in libvirt.
Meanwhile, all versions of qemu from 1.4 to 2.1 have a bug where
a bad granularity (such as non-power-of-2) gives a poor message:
error: internal error: unable to execute QEMU command 'drive-mirror': Invalid parameter 'drive-virtio-disk0'
because of abuse of QERR_INVALID_PARAMETER (which is supposed to
name the parameter that was given a bad value, rather than the
value passed to some other parameter). I don't see that a
capability check will help, so we'll just live with it (and it
has since been improved in upstream qemu).
* src/qemu/qemu_monitor.h (qemuMonitorDriveMirror): Add
parameters.
* src/qemu/qemu_monitor.c (qemuMonitorDriveMirror): Likewise.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONDriveMirror):
Likewise.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONDriveMirror):
Likewise.
* src/qemu/qemu_driver.c (qemuDomainBlockCopyCommon): Likewise.
(qemuDomainBlockRebase, qemuDomainBlockCopy): Adjust callers.
* src/qemu/qemu_migration.c (qemuMigrationDriveMirror): Likewise.
* tests/qemumonitorjsontest.c (qemuMonitorJSONDriveMirror): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
When QEMU fails during incoming migration after we successfully started
it (i.e., during Perform or Finish phase), we report a rather unhelpful
message
Unable to read from monitor: Connection reset by peer
We already have code that takes error messages from QEMU's error
output but we disable it once QEMU successfully starts. This patch
postpones this until the end of Finish phase during incoming migration
so that we can report a much better error message:
internal error: early end of file from monitor: possible problem:
Unknown savevm section or instance '0000:00:05.0/virtio-balloon' 0
load of migration failed
https://bugzilla.redhat.com/show_bug.cgi?id=1090093
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
While reviewing the new virDomainBlockCopy API, Peter Krempa
pointed out that our existing design of using MiB/s for block
job bandwidth is rather coarse, especially since qemu tracks
it in bytes/s; so virDomainBlockCopy only accepts bytes/s.
But once the new API is implemented for qemu, we will be in
the situation where it is possible to set a value that cannot
be accurately reflected back to the user, because the existing
virDomainGetBlockJobInfo defaults to the coarser units.
Fortunately, we have an escape hatch; and one that has already
served us well in the past: we can use the flags argument to
specify which scale to use (see virDomainBlockResize for prior
art). This patch fixes the query side of the API; made easier
by previous patches that split the query side out from the
modification code. Later patches will address the virsh
interface, as well as retrofitting all other blockjob APIs to
also accept a flag for toggling bandwidth units.
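To make the precision problem concrete, here is a small sketch of a query that picks the reporting scale from a flag. The flag name BLOCK_JOB_INFO_BANDWIDTH_BYTES and the function report_bandwidth are hypothetical stand-ins, and rounding up in the MiB/s case is an assumption of the sketch.

```c
/* Hypothetical flag: report the bandwidth in bytes/s instead of MiB/s. */
#define BLOCK_JOB_INFO_BANDWIDTH_BYTES (1u << 0)

/* Return the value a query would report for a limit that qemu tracks
 * in bytes/s, honoring the requested scale. */
unsigned long long
report_bandwidth(unsigned long long bytes_per_sec, unsigned int flags)
{
    const unsigned long long mib = 1024ULL * 1024ULL;

    if (flags & BLOCK_JOB_INFO_BANDWIDTH_BYTES)
        return bytes_per_sec;

    /* Round up so that a small non-zero limit is not reported as 0. */
    return bytes_per_sec / mib + (bytes_per_sec % mib != 0);
}
```

A limit of 1572864 bytes/s (1.5 MiB/s) comes back as 2 without the flag, which is why the coarser default cannot reflect every value that can be set.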
* include/libvirt/libvirt.h.in (_virDomainBlockJobInfo)
(VIR_DOMAIN_BLOCK_COPY_BANDWIDTH): Document sizing issues.
(virDomainBlockJobInfoFlags): New enum.
* src/libvirt.c (virDomainGetBlockJobInfo): Document new flag.
* src/qemu/qemu_monitor.h (qemuMonitorBlockJobInfo): Add parameter.
* src/qemu/qemu_monitor.c (qemuMonitorBlockJobInfo): Likewise.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONBlockJobInfo):
Likewise.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONBlockJobInfo)
(qemuMonitorJSONGetBlockJobInfoOne): Likewise. Don't scale here.
* src/qemu/qemu_migration.c (qemuMigrationDriveMirror): Update
callers.
* src/qemu/qemu_driver.c (qemuDomainBlockPivot)
(qemuDomainBlockJobImpl): Likewise.
(qemuDomainGetBlockJobInfo): Likewise, and support new flag.
Signed-off-by: Eric Blake <eblake@redhat.com>
qemu treats blockjob bandwidth as a 64-bit number, in the units
of bytes/second. But we stupidly modeled block job bandwidth
after migration bandwidth, which in turn was an 'unsigned long'
and therefore subject to 32-bit vs. 64-bit interpretations, and
with a scale of MiB/s. Our code already has to convert between
the two scales, and report overflow as appropriate; although
this conversion currently lives in the monitor code. In fact,
our conversion code limited things to 63 bits, because we
checked against LLONG_MAX and reject what would be negative
bandwidth if treated as signed.
On the bright side, our use of MiB/s means that even with a
32-bit unsigned long, we still have no problem representing a
bandwidth of 2GiB/s, which is starting to be more feasible as
10-gigabit or even faster interfaces are used. And once you
get past the physical speeds of existing interfaces, any larger
bandwidth number behaves the same - effectively unlimited.
But on the low side, the granularity of 1MiB/s tuning is rather
coarse. So the new virDomainBlockCopy API decided to go with
a direct 64-bit bytes/sec number instead of the scaled number
that prior blockjob APIs had used. But there is no point in
rounding this number to MiB/s just to scale it back to bytes/s
for handing to qemu.
In order to make future code sharing possible between the old
virDomainBlockRebase and the new virDomainBlockCopy, this patch
moves the scaling and overflow detection into the driver code.
Several of the block job calls that can set speed are fed
through a common interface, so it was easier to adjust all block
jobs at once, for consistency. This patch is just code motion;
there should be no user-visible change in behavior.
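The scaling and overflow check being moved amounts to something like the following sketch. The function name scale_bandwidth is hypothetical; the LLONG_MAX limit follows the description above.

```c
#include <limits.h>

/* Convert a bandwidth in MiB/s to bytes/s.  Returns -1 if the result
 * would exceed LLONG_MAX (i.e. would be negative if treated as a
 * signed 64-bit number), 0 on success. */
int
scale_bandwidth(unsigned long mib_per_sec, unsigned long long *bytes_per_sec)
{
    /* Compare before shifting so that the shift itself cannot overflow. */
    if (mib_per_sec > ((unsigned long long)LLONG_MAX >> 20))
        return -1;

    *bytes_per_sec = (unsigned long long)mib_per_sec << 20;
    return 0;
}
```

On platforms with a 32-bit unsigned long the check can never trigger, which matches the observation that the 32-bit case has plenty of headroom.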
* src/qemu/qemu_monitor.h (qemuMonitorBlockJob)
(qemuMonitorBlockCommit, qemuMonitorDriveMirror): Change
parameter type and scale.
* src/qemu/qemu_monitor.c (qemuMonitorBlockJob)
(qemuMonitorBlockCommit, qemuMonitorDriveMirror): Move scaling
and overflow detection...
* src/qemu/qemu_driver.c (qemuDomainBlockJobImpl)
(qemuDomainBlockRebase, qemuDomainBlockCommit): ...here.
(qemuDomainBlockCopy): Use bytes/sec.
Signed-off-by: Eric Blake <eblake@redhat.com>
Another layer of overly-multiplexed code that deserves to be
split into obviously separate paths for query vs. modify.
This continues the cleanup started in commit cefe0ba.
In the process, make some tweaks to simplify the logic when
parsing the JSON reply. There should be no user-visible
semantic changes.
* src/qemu/qemu_monitor.h (qemuMonitorBlockJob): Drop parameter.
(qemuMonitorBlockJobInfo): New prototype.
(BLOCK_JOB_INFO): Drop enum.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONBlockJob)
(qemuMonitorJSONBlockJobInfo): Likewise.
* src/qemu/qemu_monitor.c (qemuMonitorBlockJob): Split...
(qemuMonitorBlockJobInfo): ...into second function.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONBlockJob): Move
block info portions...
(qemuMonitorJSONGetBlockJobInfo): ...here, and rename...
(qemuMonitorJSONBlockJobInfo): ...and export.
(qemuMonitorJSONGetBlockJobInfoOne): Alter return semantics.
* src/qemu/qemu_driver.c (qemuDomainBlockPivot)
(qemuDomainBlockJobImpl, qemuDomainGetBlockJobInfo): Adjust
callers.
* src/qemu/qemu_migration.c (qemuMigrationDriveMirror)
(qemuMigrationCancelDriveMirror): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1103245
Some advice appeared on the qemu-devel list [1]. When a domain is
suspended and then resumed, the guest kernel is not aware of this. So we've
introduced the virDomainSetTime API that resets the time within the guest
using qemu-ga. On the other hand, qemu itself is trying to make the RTC
beat faster to catch up on the difference. But if we don't tell qemu that
the guest's time was reset via the other method, both mechanisms are
applied, again resulting in wrong guest time. In order to avoid summing
both corrections we need to tell qemu that it should not use the RTC
injection if the guest time is set via the guest agent.
1: http://www.mail-archive.com/qemu-devel@nongnu.org/msg236435.html
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
That can be achieved by having .param == NULL in the
virQEMUCapsCommandLineProps struct.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
To allow changing the name that is recorded in the top of the current
image chain used in a block pull/rebase operation, we need to specify
the backing name to qemu. This is done via the "backing-file" attribute
to the block-stream command.
To allow changing the name that is recorded in the overlay of the TOP
image used in a block commit operation, we need to specify the backing
name to qemu. This is done via the "backing-file" attribute to the
block-commit command.
We are about to turn on support for active block commit. Although
qemu 2.0 was the first version to mostly support it, that version
mis-handles 0-length files, and doesn't have anything available for
easy probing. But qemu 2.1 fixed bugs, and made life simpler by
letting the 'top' argument be optional. Unless someone begs for
active commit with qemu 2.0, for now we are just going to enable
it only by probing for qemu 2.1 behavior (anyone backporting active
commit can also backport the optional argument behavior). This
requires qemu.git commit 7676e2c597000eff3a7233b40cca768b358f9bc9.
Although all our actual uses of block-commit supply arguments for
both base and top, we can omit both arguments and use a bogus
device string to trigger an interesting behavior in qemu. All QMP
commands first do argument validation, failing with GenericError
if a mandatory argument is missing. Once that passes, the code
in the specific command gets to do further checking, and the qemu
developers made sure that if device is the only supplied argument,
then the block-commit code will look up the device first, with a
failure of DeviceNotFound, before attempting any further argument
validation (most other validations fail with GenericError). Thus,
the category of error class can reliably be used to decipher
whether the top argument was optional, which in turn implies a
working active commit. Since we expect our bogus device string to
trigger an error either way, the code is written to return a
distinct return value without spamming the logs.
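The interpretation of the probe result can be sketched as below. The enum and function names are hypothetical, and the actual return values used by the monitor code may differ; only the two error class names come from the description above.

```c
#include <stddef.h>
#include <string.h>

enum probe_result {
    PROBE_ACTIVE_COMMIT_SUPPORTED,    /* 'top' is optional: qemu 2.1 behavior */
    PROBE_ACTIVE_COMMIT_UNSUPPORTED,  /* 'top' is mandatory: older qemu */
    PROBE_UNEXPECTED                  /* success or some other error class */
};

/* 'error_class' is the class of the error returned for block-commit
 * issued with only a bogus device, or NULL if no error was returned. */
enum probe_result
interpret_commit_probe(const char *error_class)
{
    if (!error_class)
        return PROBE_UNEXPECTED;

    /* The device lookup ran, so argument validation accepted a missing 'top'. */
    if (strcmp(error_class, "DeviceNotFound") == 0)
        return PROBE_ACTIVE_COMMIT_SUPPORTED;

    /* Argument validation failed first: 'top' is still mandatory. */
    if (strcmp(error_class, "GenericError") == 0)
        return PROBE_ACTIVE_COMMIT_UNSUPPORTED;

    return PROBE_UNEXPECTED;
}
```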
* src/qemu/qemu_monitor.h (qemuMonitorSupportsActiveCommit): New
prototype.
* src/qemu/qemu_monitor.c (qemuMonitorSupportsActiveCommit):
Implement it.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONBlockCommit):
Allow NULL for top and base, for probing purposes.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONBlockCommit):
Likewise, implementing the probe.
* tests/qemumonitorjsontest.c (mymain): Enable...
(testQemuMonitorJSONqemuMonitorSupportsActiveCommit): ...a new test.
Signed-off-by: Eric Blake <eblake@redhat.com>
Replace:
    if (virBufferError(&buf)) {
        virBufferFreeAndReset(&buf);
        virReportOOMError();
        ...
    }
with:
    if (virBufferCheckError(&buf) < 0)
        ...
This should not be a functional change (unless some callers
misused the virBuffer APIs - a different error would be reported
then)
In "src/conf/domain_conf.h" there are many enum declarations. The
cleanup in this header file was started, but it wasn't enough and
there are many other files that have enum variables declared, so the
commit was becoming big. This commit finishes the cleanup in this
header file and in other files that have enum variables, parameters,
or functions declared.
Signed-off-by: Julio Faracco <jcfaracco@gmail.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
When passing migration bandwidth to QEMU, we multiply it by 1024 * 1024
to convert the speed to B/s and the result still needs to fit in
int64_t.
https://bugzilla.redhat.com/show_bug.cgi?id=1083483
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
I almost wrote a hash value free function that just called
VIR_FREE, then realized I couldn't be the first person to
do that. Sure enough, it was worth factoring into a common
helper routine.
* src/util/virhash.h (virHashValueFree): New function.
* src/util/virhash.c (virHashValueFree): Implement it.
* src/util/virobject.h (virObjectFreeHashData): New function.
* src/libvirt_private.syms (virhash.h, virobject.h): Export them.
* src/nwfilter/nwfilter_learnipaddr.c (virNWFilterLearnInit): Use
common function.
* src/qemu/qemu_capabilities.c (virQEMUCapsCacheNew): Likewise.
* src/qemu/qemu_command.c (qemuDomainCCWAddressSetCreate):
Likewise.
* src/qemu/qemu_monitor.c (qemuMonitorGetBlockInfo): Likewise.
* src/qemu/qemu_process.c (qemuProcessWaitForMonitor): Likewise.
* src/util/virclosecallbacks.c (virCloseCallbacksNew): Likewise.
* src/util/virkeyfile.c (virKeyFileParseGroup): Likewise.
* tests/qemumonitorjsontest.c
(testQemuMonitorJSONqemuMonitorJSONGetBlockInfo): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
This patch adds qemuMonitorGetDumpGuestMemoryCapability, which is used to check
whether the specified dump-guest-memory format is supported by qemu.
Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
Busy enterprise workloads hosted on large sized VM's tend to dirty
memory faster than the transfer rate achieved via live guest migration.
Despite some good recent improvements (& using dedicated 10Gig NICs
between hosts) the live migration may NOT converge.
Recently support was added in qemu (version 1.6) to allow a user to
choose if they wish to force convergence of their migration via a
new migration capability : "auto-converge". This feature allows for qemu
to auto-detect lack of convergence and trigger a throttle-down of the
VCPUs.
This patch includes the libvirt support needed to trigger this
feature. (Testing is in progress)
Signed-off-by: Chegu Vinod <chegu_vinod@hp.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=844378
When qemu dies early after connecting to its monitor but before we
actually try to read something from the monitor, we would just fail
domain start with a useless message:
"An error occurred, but the cause is unknown"
This is because the real error gets reported in a monitor EOF handler
executing within libvirt's event loop.
The fix is to take any error set in qemuMonitor structure and propagate
it into the thread-local error when qemuMonitorClose is called and no
thread-local error is set.
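In rough terms the fix behaves like the sketch below. All names are hypothetical stand-ins for libvirt's error machinery and monitor object, and errors are reduced to plain strings for brevity.

```c
#include <stddef.h>
#include <stdio.h>

/* Stand-in for the thread-local "last error" storage. */
static _Thread_local char thread_error[256];

/* Stand-in for the monitor object; the EOF handler running in the
 * event loop is assumed to have filled in last_error. */
struct monitor {
    char last_error[256];
};

void
set_thread_error(const char *msg)
{
    snprintf(thread_error, sizeof(thread_error), "%s", msg);
}

const char *
get_thread_error(void)
{
    return thread_error[0] ? thread_error : NULL;
}

/* On close, copy the error stored in the monitor into the calling
 * thread's error, but only if that thread has no error of its own. */
void
monitor_close(struct monitor *mon)
{
    if (mon->last_error[0] && !get_thread_error())
        set_thread_error(mon->last_error);

    /* ... release the monitor's resources here ... */
}
```

An error already set in the calling thread wins, so a more specific message is never overwritten by the one stored in the monitor.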
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Wire up all the pieces to send arbitrary qemu events to a
client using libvirt-qemu.so. If the extra bookkeeping of
generating event objects even when no one is listening turns
out to be noticeable, we can try to further optimize things
by adding a counter for how many connections are using events,
and only dump events when the counter is non-zero; but for
now, I didn't think it was worth the code complexity.
* src/qemu/qemu_driver.c
(qemuConnectDomainQemuMonitorEventRegister)
(qemuConnectDomainQemuMonitorEventDeregister): New functions.
* src/qemu/qemu_monitor.h (qemuMonitorEmitEvent): New prototype.
(qemuMonitorDomainEventCallback): New typedef.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONIOProcessEvent):
Report events.
* src/qemu/qemu_monitor.c (qemuMonitorEmitEvent): New function, to
pass events through.
* src/qemu/qemu_process.c (qemuProcessHandleEvent): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
Any source file which calls the logging APIs now needs
to have a VIR_LOG_INIT("source.name") declaration at
the start of the file. This provides a static variable
of the virLogSource type.
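The pattern can be pictured with the simplified sketch below. LOG_INIT, LOG_DEBUG and struct log_source are hypothetical stand-ins; the real VIR_LOG_INIT macro and virLogSource type carry more state than a name.

```c
#include <stdio.h>

/* Simplified stand-in for the per-file log source. */
struct log_source {
    const char *name;
};

/* Declares a file-scope static variable that the logging macros in
 * the same source file pick up implicitly. */
#define LOG_INIT(n) static struct log_source log_self = { .name = (n) }

/* Logging macro that refers to the static declared by LOG_INIT. */
#define LOG_DEBUG(msg) fprintf(stderr, "%s: %s\n", log_self.name, (msg))

/* At the start of the file, once: */
LOG_INIT("example.source");

/* Accessor used only to make the sketch observable. */
const char *
log_source_name(void)
{
    return log_self.name;
}

void
log_hello(void)
{
    LOG_DEBUG("hello");
}
```

A file that uses LOG_DEBUG without LOG_INIT fails to compile because log_self is undeclared, which is what forces every such file to carry the declaration.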
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The dtrace probe macros rely on the logging API. We can't make
the internal.h header include the virlog.h header though since
that'd be a circular include. Instead simply split the dtrace
probes into their own header file, since there's no compelling
reason for them to be in the main internal.h header.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
While investigating https://bugzilla.redhat.com/show_bug.cgi?id=1061827
I noticed that we pass user input unscathed for block-pull, but
always pass a canonical absolute name through for block-commit.
[Note that we probably _ought_ to validate that the user's request
for block-pull actually matches the backing chain, the way we already
do for block-commit - but that's a separate issue. Further note that
the ability to pass user input through unscathed allows backdoors
such as specifying a backing image that is a network URI such as
a gluster disk, instead of forcing things to the local file system;
which is an area still under active investigation on whether libvirt
needs to behave differently for network disks.]
Since qemu may write the name that the user passed in as the backing
file, a user may have a reason to want a relative file name passed
through to qemu, and always munging things to absolute prevents that.
Put another way, if you have the backing chain:
[A] <- [B(back=./A)] <- [C(back=./B)]
and commit B into A (virsh blockcommit $dom vda --base A --top B),
the metadata of C will have to be re-written. But should it be
rewritten as [C(back=./A)] or as [C(back=/path/to/A)]? Still up in
the air is whether qemu's decision should be based on whether B
and/or C had relative paths, or on whether the --base and/or
--top arguments to the command were relative paths; but if we always
pass a canonical name, we've prevented the spelling of the command
arguments from being part of the heuristics that qemu uses.
I also audited the code, and verified that we never call
qemuMonitorBlockCommit() with a NULL base, either before or after
the change to qemu_driver.c.
* src/qemu/qemu_driver.c (qemuDomainBlockCommit): Preserve user's
spelling, since absolute vs. relative matters to qemu.
* src/qemu/qemu_monitor.h (qemuMonitorBlockCommit): Base is never
null.
* src/qemu/qemu_monitor.c (qemuMonitorBlockCommit): Likewise.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONBlockCommit):
Likewise.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONBlockCommit):
Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
If virDomainMemoryStats was run on a domain with the virtio balloon driver
running on an old qemu which supports QMP but does not support the qom-list
QMP command, libvirtd would crash. The reason is we did not check if
qemuMonitorJSONGetObjectListPaths failed and moreover we even stored its
result in an unsigned integer type.
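The failure mode described above can be illustrated with a minimal sketch.
The helper list_paths() below is a hypothetical stand-in for a function that
returns a count or -1 on error; it is not the libvirt function itself.

```c
#include <stddef.h>

/* Hypothetical helper: returns the number of entries found, or -1 on
 * failure (for example when the monitor command is not supported). */
static int list_paths(int supported)
{
    return supported ? 3 : -1;
}

/* Buggy pattern: the -1 error value is stored in an unsigned type and
 * silently turns into a huge count. */
static size_t count_paths_buggy(int supported)
{
    size_t n = list_paths(supported);
    return n;
}

/* Fixed pattern: keep the result signed and check it before use. */
static int count_paths_checked(int supported, size_t *count)
{
    int n = list_paths(supported);

    if (n < 0)
        return -1;
    *count = (size_t)n;
    return 0;
}
```

A caller that loops up to the value returned by the buggy variant would walk
far past the end of any array, which is one way such a mistake can end in a
crash.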
There are a number of reported issues where we fail to start a domain.
Turns out that, in some scenarios like high load, 3 second timeout is
not enough for qemu to start up to the phase where the socket is
created. Since there is no downside of waiting longer, raise the
timeout right to 30 seconds.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Recent changes to events (commit 8a29ffcf) resulted in new compile
failures on some targets (such as ARM OMAP5):
conf/domain_event.c: In function 'virDomainEventDispatchDefaultFunc':
conf/domain_event.c:1198:30: error: cast increases required alignment of
target type [-Werror=cast-align]
conf/domain_event.c:1314:34: error: cast increases required alignment of
target type [-Werror=cast-align]
cc1: all warnings being treated as errors
The error is due to alignment; the base class is merely aligned
to the worst of 'int' and 'void*', while the child class must
be aligned to a 'long long'. The solution is to include a
'long long' (and for good measure, a function pointer) in the
base class to ensure correct alignment regardless of what a
child class may add, but to wrap the inclusion in a union so
as to not incur any wasted space. On a typical x86_64 platform,
the base class remains 16 bytes; on i686, the base class remains
12 bytes; and on the impacted ARM platform, the base class grows
from 12 bytes to 16 bytes due to the increase of alignment from
4 to 8 bytes.
Reported by Michele Paolino and others.
* src/util/virobject.h (_virObject): Use a union to ensure that
subclasses never have stricter alignment than the parent.
* src/util/virobject.c (virObjectNew, virObjectUnref)
(virObjectRef): Adjust clients.
* src/libvirt.c (virConnectRef, virDomainRef, virNetworkRef)
(virInterfaceRef, virStoragePoolRef, virStorageVolRef)
(virNodeDeviceRef, virSecretRef, virStreamRef, virNWFilterRef)
(virDomainSnapshotRef): Likewise.
* src/qemu/qemu_monitor.c (qemuMonitorOpenInternal)
(qemuMonitorClose): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
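The union trick described above can be sketched as follows. The struct and
member names are illustrative assumptions, not a copy of the real _virObject
definition; the point is only that the padding members raise the alignment of
the base without adding space beyond what the payload already needs on common
platforms.

```c
#include <stddef.h>

/* Hypothetical base object, loosely modelled on the description above. */
struct base_obj {
    union {
        long long dummy_align1;         /* forces 'long long' alignment */
        void (*dummy_align2)(void);     /* forces function-pointer alignment */
        struct {
            unsigned int magic;
            int refs;
        } s;                            /* real payload shares the storage */
    } u;
    void *klass;
};

/* Hypothetical child class that needs 'long long' alignment. */
struct child_obj {
    struct base_obj parent;
    long long timestamp;
};

/* Casting a base pointer to a child pointer cannot increase the required
 * alignment as long as this holds. */
_Static_assert(_Alignof(struct base_obj) >= _Alignof(long long),
               "base must be at least as aligned as long long");
```

With this layout the payload is reached through u.s.magic and u.s.refs, which
is why the clients listed in the commit had to be adjusted.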
Most of our code base uses space after comma but not before;
fix the remaining uses before adding a syntax check.
* src/qemu/qemu_cgroup.c: Consistently use commas.
* src/qemu/qemu_command.c: Likewise.
* src/qemu/qemu_conf.c: Likewise.
* src/qemu/qemu_driver.c: Likewise.
* src/qemu/qemu_monitor.c: Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
QEMU 1.6.0 introduced a new migration status: setup.
Libvirt does not expect such a string in QMP and refuses to migrate with the error
"unexpected migration status in setup"
This patch fixes it.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
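A minimal sketch of the kind of string-to-enum lookup involved, assuming a
simple table-driven parser. The enum, the table and the function name are
hypothetical and only illustrate why an unlisted status string is rejected
until it is added to the table.

```c
#include <string.h>

/* Hypothetical status enum; the real libvirt names differ. */
enum migration_status {
    MIGRATION_STATUS_INACTIVE,
    MIGRATION_STATUS_SETUP,      /* the newly recognized state */
    MIGRATION_STATUS_ACTIVE,
    MIGRATION_STATUS_COMPLETED,
    MIGRATION_STATUS_ERROR,
    MIGRATION_STATUS_CANCELLED,
    MIGRATION_STATUS_LAST
};

/* Illustrative status strings, indexed by the enum above. */
static const char *const migration_status_names[MIGRATION_STATUS_LAST] = {
    "inactive", "setup", "active", "completed", "failed", "cancelled",
};

/* Returns the enum value for a status string, or -1 if unrecognized. */
static int migration_status_from_string(const char *str)
{
    for (int i = 0; i < MIGRATION_STATUS_LAST; i++) {
        if (strcmp(migration_status_names[i], str) == 0)
            return i;
    }
    return -1;
}
```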
https://bugzilla.redhat.com/show_bug.cgi?id=1018267
The aim of virObject refing and unrefing is to tell where the object is
to be used and when it is no longer needed. Hence an object shouldn't be
used after it has been unrefed, as we might be the last to hold the
reference. The better way is to call virObjectUnref() *after* the last
object usage. In this specific case, the monitor EOF handler was called
after qemuMonitorIO called virObjectUnref. Not only was @mon disposed
(which is not used in the handler anyway), but so was @mon->vm, which
is what causes the SIGSEGV:
2013-11-15 10:17:54.425+0000: 20110: error : qemuMonitorIO:688 : internal error: early end of file from monitor: possible problem:
qemu-kvm: -incoming tcp:01.01.01.0:49152: Failed to bind socket: Cannot assign requested address
Program received signal SIGSEGV, Segmentation fault.
qemuProcessHandleMonitorEOF (mon=<optimized out>, vm=0x7fb728004170) at qemu/qemu_process.c:299
299 if (priv->beingDestroyed) {
(gdb) p *priv
Cannot access memory at address 0x0
(gdb) p vm
$1 = (virDomainObj *) 0x7fb728004170
(gdb) p *vm
$2 = {parent = {parent = {magic = 3735928559, refs = 0, klass = 0xdeadbeef}, lock = {lock = {__data = {__lock = 2, __count = 0, __owner = 20110, __nusers = 1, __kind = 0, __spins = 0, __list = {__prev = 0x0,
__next = 0x0}}, __size = "\002\000\000\000\000\000\000\000\216N\000\000\001", '\000' <repeats 26 times>, __align = 2}}}, pid = 0, state = {state = 0, reason = 0}, autostart = 0, persistent = 0,
updated = 0, def = 0x0, newDef = 0x0, snapshots = 0x0, current_snapshot = 0x0, hasManagedSave = false, privateData = 0x0, privateDataFreeFunc = 0x0, taint = 304}
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
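The ordering rule above (use first, unref last) can be shown with a toy
reference-counted object. All names here are hypothetical and the refcount is
deliberately not thread-safe; it is a sketch of the rule, not of virObject.

```c
#include <stdlib.h>

struct obj {
    int refs;
    int value;
};

static struct obj *obj_new(int value)
{
    struct obj *o = malloc(sizeof(*o));

    if (!o)
        return NULL;
    o->refs = 1;
    o->value = value;
    return o;
}

static void obj_ref(struct obj *o)
{
    o->refs++;
}

/* Drops one reference and frees the object when the count hits zero.
 * Returns 1 if the object is still alive, 0 if it was freed. */
static int obj_unref(struct obj *o)
{
    if (--o->refs > 0)
        return 1;
    free(o);
    return 0;
}

/* Handler that needs the object to still be alive. */
static int handle_eof(struct obj *o)
{
    return o->value;
}

/* Correct ordering: the handler runs first, the unref comes last. */
static int io_callback(struct obj *o)
{
    int seen = handle_eof(o);   /* last use of the object */

    obj_unref(o);               /* may free 'o'; do not touch it afterwards */
    return seen;
}
```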
The qemu monitor supports retrieval of actual CPUID bits presented to
the guest using QMP monitor. Add APIs to extract this information and
tests for them.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Since the 90139a62 commit the error is copied into mon->lastError but
it's never freed from there.
==31989== 395 bytes in 1 blocks are definitely lost in loss record 877 of 978
==31989== at 0x4A06C2B: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==31989== by 0x7EAF129: strdup (in /lib64/libc-2.15.so)
==31989== by 0x50D586C: virStrdup (virstring.c:554)
==31989== by 0x50976C1: virCopyError (virerror.c:191)
==31989== by 0x5097A35: virCopyLastError (virerror.c:312)
==31989== by 0x114909A9: qemuMonitorIO (qemu_monitor.c:690)
==31989== by 0x509BEDE: virEventPollDispatchHandles (vireventpoll.c:501)
==31989== by 0x509C701: virEventPollRunOnce (vireventpoll.c:648)
==31989== by 0x509A620: virEventRunDefaultImpl (virevent.c:274)
==31989== by 0x520D21C: virNetServerRun (virnetserver.c:1112)
==31989== by 0x11F368: main (libvirtd.c:1513)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
I've noticed a SIGSEGV-ing libvirtd on the destination when qemu
died too quickly, i.e. in the Prepare phase. What is happening here is:
1) [Thread 3493] We are in qemuMigrationPrepareAny() and calling
qemuProcessStart() which subsequently calls qemuProcessWaitForMonitor()
and qemuConnectMonitor(). So far so good. The qemuMonitorOpen()
succeeds, however switching monitor to QMP mode fails as qemu died
meanwhile. That is qemuMonitorSetCapabilities() returns -1.
2013-10-08 15:54:10.629+0000: 3493: debug : qemuMonitorSetCapabilities:1356 : mon=0x14a53da0
2013-10-08 15:54:10.630+0000: 3493: debug : qemuMonitorJSONCommandWithFd:262 : Send command '{"execute":"qmp_capabilities","id":"libvirt-1"}' for write with FD -1
2013-10-08 15:54:10.630+0000: 3493: debug : virEventPollUpdateHandle:147 : EVENT_POLL_UPDATE_HANDLE: watch=17 events=13
...
2013-10-08 15:54:10.631+0000: 3493: debug : qemuMonitorSend:956 : QEMU_MONITOR_SEND_MSG: mon=0x14a53da0 msg={"execute":"qmp_capabilities","id":"libvirt-1"}
fd=-1
2013-10-08 15:54:10.631+0000: 3262: debug : virEventPollRunOnce:641 : Poll got 1 event(s)
2) [Thread 3262] The event loop is trying to do the talking to monitor.
However, qemu is dead already, remember?
2013-10-08 15:54:13.436+0000: 3262: error : qemuMonitorIORead:551 : Unable to read from monitor: Connection reset by peer
2013-10-08 15:54:13.516+0000: 3262: debug : virFileClose:90 : Closed fd 25
...
2013-10-08 15:54:13.533+0000: 3493: debug : qemuMonitorSend:968 : Send command resulted in error internal error: early end of file from monitor: possible problem:
3) [Thread 3493] qemuProcessStart() failed. No big deal. Go to the
'endjob' label and subsequently to the 'cleanup'. Since the domain is
not persistent and ret is -1, the qemuDomainRemoveInactive() is called.
This has an (unpleasant) effect of virObjectUnref()-ing the @vm object.
Unpleasant because the event loop which is about to trigger EOF callback
still holds a pointer to the @vm (not the reference). See the valgrind
output below.
4) [Thread 3262] So the event loop starts triggering EOF:
2013-10-08 15:54:13.542+0000: 3262: debug : qemuMonitorIO:729 : Triggering EOF callback
2013-10-08 15:54:13.543+0000: 3262: debug : qemuProcessHandleMonitorEOF:294 : Received EOF on 0x14549110 'migt10'
And the monitor is cleaned up. This results in calling
qemuProcessHandleMonitorEOF with the @vm pointer passed. The pointer is
kept in qemuMonitor struct.
==3262== Thread 1:
==3262== Invalid read of size 4
==3262== at 0x77ECCAA: pthread_mutex_lock (in /lib64/libpthread-2.15.so)
==3262== by 0x52FAA06: virMutexLock (virthreadpthread.c:85)
==3262== by 0x52E3891: virObjectLock (virobject.c:320)
==3262== by 0x11626743: qemuProcessHandleMonitorEOF (qemu_process.c:296)
==3262== by 0x11642593: qemuMonitorIO (qemu_monitor.c:730)
==3262== by 0x52BD526: virEventPollDispatchHandles (vireventpoll.c:501)
==3262== by 0x52BDD49: virEventPollRunOnce (vireventpoll.c:648)
==3262== by 0x52BBC68: virEventRunDefaultImpl (virevent.c:274)
==3262== by 0x542D3D9: virNetServerRun (virnetserver.c:1112)
==3262== by 0x11F368: main (libvirtd.c:1513)
==3262== Address 0x14549128 is 24 bytes inside a block of size 136 free'd
==3262== at 0x4C2AF5C: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==3262== by 0x529B1FF: virFree (viralloc.c:580)
==3262== by 0x52E3703: virObjectUnref (virobject.c:270)
==3262== by 0x531557E: virDomainObjListRemove (domain_conf.c:2355)
==3262== by 0x1160E899: qemuDomainRemoveInactive (qemu_domain.c:2061)
==3262== by 0x1163A0C6: qemuMigrationPrepareAny (qemu_migration.c:2450)
==3262== by 0x1163A923: qemuMigrationPrepareDirect (qemu_migration.c:2626)
==3262== by 0x11682D71: qemuDomainMigratePrepare3Params (qemu_driver.c:10309)
==3262== by 0x53B0976: virDomainMigratePrepare3Params (libvirt.c:7266)
==3262== by 0x1502D3: remoteDispatchDomainMigratePrepare3Params (remote.c:4797)
==3262== by 0x12DECA: remoteDispatchDomainMigratePrepare3ParamsHelper (remote_dispatch.h:5741)
==3262== by 0x54322EB: virNetServerProgramDispatchCall (virnetserverprogram.c:435)
The mon->vm is set in qemuMonitorOpenInternal() which is the correct
place to increase @vm ref counter. The correct place to decrease the ref
counter is then qemuMonitorDispose().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
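The fix described above follows a general ownership rule: whoever stores a
pointer for later use takes a reference when storing it and drops it when
disposed. A minimal sketch, with hypothetical type and function names that
only mirror the roles of the monitor and the domain object:

```c
#include <stdlib.h>

struct vm_obj {
    int refs;
};

struct monitor {
    struct vm_obj *vm;   /* stored pointer, backed by a reference */
};

static void vm_ref(struct vm_obj *vm)
{
    vm->refs++;
}

static void vm_unref(struct vm_obj *vm)
{
    if (--vm->refs == 0)
        free(vm);
}

/* Opening the monitor stores the pointer, so it takes a reference. */
static struct monitor *monitor_open(struct vm_obj *vm)
{
    struct monitor *mon = malloc(sizeof(*mon));

    if (!mon)
        return NULL;
    vm_ref(vm);
    mon->vm = vm;
    return mon;
}

/* Disposing the monitor is the matching place to drop that reference. */
static void monitor_dispose(struct monitor *mon)
{
    vm_unref(mon->vm);
    free(mon);
}
```

With this pairing, another thread dropping its own reference to the domain
cannot free it while the monitor may still hand the pointer to a callback.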
Change the monitor error code to add the ability to access the qemu log
file using a file descriptor so that we can dig in it for a more useful
error message. The error is now logged on monitor hangups and overwrites
a possible lesser error. A hangup on the monitor usually means that qemu
has crashed and there's a significant chance it produced a useful error
message.
The functionality will be latent until the next patch.
Early VM startup errors usually produce a better error message in the
machine log file. Until now we accessed it only when the process
exited during certain phases of startup. This will help adding a more
comprehensive error extraction for early qemu startup phases.
This patch adds infrastructure to keep a file descriptor for the machine
log file that will be used in case an error happens.
The VIR_DOMAIN_PAUSED_GUEST_PANICKED constant is badly named,
leaking the QEMU event name. Elsewhere in the API we use
'CRASHED' rather than 'PANICKED', and the addition of 'GUEST'
is redundant since all events are guest related.
Thus rename it to VIR_DOMAIN_PAUSED_CRASHED, which matches
with VIR_DOMAIN_RUNNING_CRASHED and VIR_DOMAIN_EVENT_CRASHED.
It was added in commit 14e7e0ae8d
which post-dates v1.1.0, so is safe to rename before 1.1.1
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This patch will add the qemuMonitorJSONGetMemoryStats() to execute a
"guest-stats" on the balloonpath using "get-qom" replacing the former
mechanism which looked through the "query-balloon" returned data for
the fields. The "query-balloon" code only returns 'actual' memory.
Rather than duplicating the existing code, have the JSON API use the
GetBalloonInfo API.
A check in the qemuMonitorGetMemoryStats() will be made to ensure the
balloon driver path has been set. Since the underlying JSON code can
return data not associated with the balloon driver, we don't fail on
a failure to get the balloonpath. Of course since we've made the check,
we can then set the ballooninit flag. Getting the path here is primarily
due to the process reconnect path which doesn't attempt to set the
collection period.
At vm startup and attach attempt to set the balloon driver statistics
collection period based on the value found in the domain xml file. This
is not done at reconnect since it's possible that a collection period
was set on the live guest and making the set period call would reset to
whatever value is stored in the config file.
Setting the stats collection period has a side effect of searching through
the qom-list output for the virtio balloon driver and making sure that it
has the right properties in order to allow setting of a collection period
and eventually fetching of statistics.
The walk through the qom-list is expensive and thus the balloonpath will
be saved in the monitor private structure as well as a flag indicating
that the initialization has already been attempted (in the event that a
path is not found, no sense to keep checking).
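The caching scheme described above amounts to a lookup that runs at most
once, even when it finds nothing. A minimal sketch follows; the struct, the
function names and the returned path string are illustrative assumptions,
and expensive_search() merely stands in for the qom-list walk.

```c
#include <stddef.h>

struct mon_cache {
    int init_attempted;   /* set once the search has been tried */
    const char *path;     /* NULL if the search found nothing */
    int searches;         /* counts how often the expensive walk ran */
};

/* Hypothetical stand-in for the expensive walk through the object tree. */
static const char *expensive_search(struct mon_cache *c, int device_present)
{
    c->searches++;
    return device_present ? "/machine/peripheral/balloon0" : NULL;
}

/* Returns the cached path, searching only on the first call. */
static const char *get_balloon_path(struct mon_cache *c, int device_present)
{
    if (!c->init_attempted) {
        c->path = expensive_search(c, device_present);
        c->init_attempted = 1;   /* remember the attempt even on failure */
    }
    return c->path;
}
```

Keeping a separate flag, rather than testing the path for NULL, is what
prevents a repeated search when no suitable device exists.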
This processing model conforms to the qom object model which
requires setting object properties after device startup. That is, it's
not possible to pass the period along via the startup code as it won't
be recognized.
The function being introduced is responsible for preparing and
executing 'chardev-add' qemu monitor command. Moreover, in case
of PTY chardev, the corresponding pty path is updated.
Convert the type of loop iterators named 'i', 'j', 'k',
'ii', 'jj', 'kk', to be 'size_t' instead of 'int' or
'unsigned int', also sanitizing 'ii', 'jj', 'kk' to use
the normal 'i', 'j', 'k' naming
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add the monitor callback API domainGuestPanic, which implements the
'destroy', 'restart' and 'preserve' events of the 'on_crash'
setting in the XML when a domain has crashed.
A bug in Cygwin [1] and poor error messages from gcc [2] lead
to this confusing compilation error:
qemu/qemu_monitor.c:418:9: error: passing argument 2 of 'sendmsg' from incompatible pointer type
/usr/include/sys/socket.h:42:11: note: expected 'const struct msghdr *' but argument is of type 'struct msghdr *'
[1] http://cygwin.com/ml/cygwin/2013-05/msg00451.html
[2] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57475
* src/qemu/qemu_monitor.c (includes): Include <sys/socket.h>
before <sys/un.h>.
Signed-off-by: Eric Blake <eblake@redhat.com>
In order to teach libvirt multiqueue, several things must be done:
1) The '/dev/net/tun' device needs to be opened multiple times with
IFF_MULTI_QUEUE flag passed to ioctl(fd, TUNSETIFF, &ifr);
2) Similarly, '/dev/vhost-net' must be opened as many times as in 1)
in order to keep 1:1 ratio recommended by qemu and kernel folks.
3) The command line construction code needs to switch from 'fd=X' to
'fds=X:Y:...:Z' and from 'vhostfd=X' to 'vhostfds=X:Y:...:Z'.
4) The monitor handling code needs to learn to pass multiple FDs.
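Step 3 above boils down to joining several file descriptor numbers with
colons. A minimal sketch of such a formatter is shown below; the function
name and signature are hypothetical and not taken from the libvirt command
line builder.

```c
#include <stdio.h>
#include <stddef.h>

/* Formats "<key>=a:b:c" into buf.
 * Returns 0 on success, -1 if the result does not fit. */
static int format_fd_list(char *buf, size_t buflen, const char *key,
                          const int *fds, size_t nfds)
{
    size_t used;
    int n = snprintf(buf, buflen, "%s=", key);

    if (n < 0 || (size_t)n >= buflen)
        return -1;
    used = (size_t)n;

    for (size_t i = 0; i < nfds; i++) {
        /* Separate every entry except the first with a colon. */
        n = snprintf(buf + used, buflen - used, "%s%d",
                     i ? ":" : "", fds[i]);
        if (n < 0 || (size_t)n >= buflen - used)
            return -1;
        used += (size_t)n;
    }
    return 0;
}
```

Called with the key "fds" and the descriptors 3, 4 and 5, this produces the
string fds=3:4:5; the same helper would serve for "vhostfds".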
Ever since the conversion to using only QMP for probing features
of qemu 1.2 and newer, we have been unable to detect features
that are added only by additional command line options. For
example, we'd like to know if '-machine mem-merge=on' (added
in qemu 1.5) is present. To do this, we will take advantage
of qemu 1.5's query-command-line-parameters QMP call [1].
This patch wires up the framework for probing the command results;
if the QMP command is missing, or if a particular command line
option does not output any parameters (for example, -net uses
a polymorphic parser, which showed up as no parameters as of qemu
1.5), we silently treat that command as having no results.
[1] https://lists.gnu.org/archive/html/qemu-devel/2013-04/msg05180.html
* src/qemu/qemu_monitor.h (qemuMonitorGetOptions)
(qemuMonitorSetOptions)
(qemuMonitorGetCommandLineOptionParameters): New functions.
* src/qemu/qemu_monitor_json.h
(qemuMonitorJSONGetCommandLineOptionParameters): Likewise.
* src/qemu/qemu_monitor.c (_qemuMonitor): Add cache field.
(qemuMonitorDispose): Clean it.
(qemuMonitorGetCommandLineOptionParameters): Implement new function.
* src/qemu/qemu_monitor_json.c
(qemuMonitorJSONGetCommandLineOptionParameters): Likewise.
(testQemuMonitorJSONGetCommandLineParameters): Test it.
Signed-off-by: Eric Blake <eblake@redhat.com>
The source code base needs to be adapted as well. Some files
include virutil.h just for the string related functions (here,
the include is substituted to match the new file), some include
virutil.h without any need (here, the include is removed), and
some require both.
Probe for QEMU's QMP TPM support by querying the lists of
supported TPM models (query-tpm-models) and backend types
(query-tpm-types).
The setting of the capability flags following the strings
returned from the commands above is only provided in the
patch where domain_conf.c gets TPM support due to dependencies
on functions only introduced there.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
Tested-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
The JSON generator is able to represent only values less than LLONG_MAX. Fix the
bandwidth limit checks done when converting the value so that overflows are
caught before they reach the generator.
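A minimal sketch of such a check, assuming the bandwidth arrives in MiB/s and
is scaled to bytes/s before being handed to the generator. The function name
and the unit are assumptions made for illustration.

```c
#include <limits.h>

/* Converts MiB/s to bytes/s.
 * Returns 0 on success, -1 if the result would not fit in a signed
 * long long. The comparison is done before the shift, so the shift
 * itself can never overflow. */
static int bandwidth_mib_to_bytes(unsigned long long mib, long long *bytes)
{
    if (mib > (unsigned long long)(LLONG_MAX >> 20))
        return -1;
    *bytes = (long long)(mib << 20);
    return 0;
}
```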
The VIR_ERR_NO_SUPPORT error code is reserved for cases where an
API is not implemented in a driver. It definitely should not be
used when an API execution fails due to unsupported operation.
If virCondInit fails (okay, so that's unlikely), then we end up
attempting a virObjectUnlock() on the cleanup path, even though
we don't hold a lock. This is not guaranteed to be safe. While
at it, I noticed a couple places where we were referencing mon->fd
outside locks.
* src/qemu/qemu_monitor.c (qemuMonitorOpenInternal): Minimize lock
duration. mon->watch doesn't need clean up on error.
(qemuMonitorGetBlockExtent, qemuMonitorBlockResize): Don't
dereference fd outside of lock.
This will be used with new migration scheme.
This patch creates basically just monitor stub
functions. Wiring them into something useful
is done in later patches.
As a side effect, this also fixes reporting disk migration process.
It was added to memory migration progress, which was wrong. Disk
progress has dedicated fields in virDomainJobInfo structure.
Add entry points for calling the qemu 'add-fd' and 'remove-fd'
monitor commands. There is no entry point for 'query-fdsets';
the assumption is that a developer can use
virsh qemu-monitor-command domain '{"execute":"query-fdsets"}'
when debugging issues, and that meanwhile, libvirt is responsible
enough to remember what fds it associated with what fdsets.
Likewise, on the 'add-fd' command, it is assumed that libvirt
will always pass a set id, rather than letting qemu autogenerate
the next available id number.
* src/qemu/qemu_monitor.c (qemuMonitorAddFd, qemuMonitorRemoveFd):
New functions.
* src/qemu/qemu_monitor.h (qemuMonitorAddFd, qemuMonitorRemoveFd):
New prototypes.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONAddFd)
(qemuMonitorJSONRemoveFd): New functions.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONAddFd)
(qemuMonitorJSONRemoveFd): New prototypes.
The virDomainObj, qemuAgent, qemuMonitor, lxcMonitor classes
all require a mutex, so can be switched to use virObjectLockable
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently all classes must directly inherit from virObject.
This patch allows for an arbitrarily deep hierarchy. There's not much
to this aside from chaining up the 'dispose' handlers from
each class & providing APIs to check types.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Perform all the appropriate plumbing.
When qemu/KVM VMs are paused manually through a monitor not owned by libvirt,
libvirt will think of them as "paused" even after they are resumed and
effectively running. With this patch the discrepancy goes away.
This is meant to address bug 892791.
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Noticed these while building on FreeBSD.
* src/qemu/qemu_monitor.c (qemuMonitorBlockInfoLookup): Rename
variable to avoid 'devname' collision.
* src/qemu/qemu_driver.c (qemuDomainInterfaceStats): Mark unused
variable.
Only one error in qemu_monitor was already using the relatively
new OPERATION_UNSUPPORTED error, even though it is a better fit
for all of the messages related to options that are unsupported
due to the version of qemu in use rather than due to a user's
XML or .conf file choice. Suggested by Osier Yang.
* src/qemu/qemu_monitor.c (qemuMonitorSendFileHandle)
(qemuMonitorAddHostNetwork, qemuMonitorRemoveHostNetwork)
(qemuMonitorAttachDrive, qemuMonitorDiskSnapshot)
(qemuMonitorDriveMirror, qemuMonitorTransaction)
(qemuMonitorBlockCommit, qemuMonitorDrivePivot)
(qemuMonitorBlockJob, qemuMonitorSystemWakeup)
(qemuMonitorGetVersion, qemuMonitorGetMachines)
(qemuMonitorGetCPUDefinitions, qemuMonitorGetCommands)
(qemuMonitorGetEvents, qemuMonitorGetKVMState)
(qemuMonitorGetObjectTypes, qemuMonitorGetObjectProps)
(qemuMonitorGetTargetArch): Use better error category.
Without this patch, attempts to create a disk snapshot when qemu
is too old result in a cryptic message:
virsh # snapshot-create 23 --disk-only
error: operation failed: Failed to take snapshot: unknown command: 'snapshot_blkdev'
Now it reports:
virsh # snapshot-create 23 --disk-only
error: unsupported configuration: live disk snapshot not supported with this QEMU binary
All versions of qemu that support live disk snapshot also support
QMP (basically upstream qemu 1.1 and later, and backports to RHEL 6.2).
* src/qemu/qemu_capabilities.h (QEMU_CAPS_DISK_SNAPSHOT): New
capability.
* src/qemu/qemu_capabilities.c (qemuCaps): Track it.
(qemuCapsProbeQMPCommands): Set it.
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateDiskActive): Use
it.
* src/qemu/qemu_monitor.c (qemuMonitorDiskSnapshot): Simplify.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONDiskSnapshot):
Likewise.
* src/qemu/qemu_monitor_text.h (qemuMonitorTextDiskSnapshot):
Delete.
* src/qemu/qemu_monitor_text.c (qemuMonitorTextDiskSnapshot):
Likewise.
https://bugzilla.redhat.com/show_bug.cgi?id=872292
Libvirt should not attempt to call a QMP command that has not been
documented in qemu.git - if future qemu introduces a command by the
same name but with subtly different semantics, then libvirt will be
broken when trying to use that command.
We also had some code that could never be reached - some of our
commands have an alternate for new vs. old qemu HMP commands; but
if we are new enough to support QMP, we only need a fallback to
the new HMP counterpart, and don't need to try for a QMP counterpart
for the old HMP version.
See also this attempt to convert the three snapshot commands to QMP:
https://lists.gnu.org/archive/html/qemu-devel/2012-07/msg01597.html
although it looks like that will still not happen before qemu 1.3.
That thread eventually decided that qemu would use the name
'save-vm' rather than 'savevm', which mitigates the fact that
libvirt's attempt to use a QMP 'savevm' would be broken, but we
might not be as lucky on the other commands.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONSetCPU)
(qemuMonitorJSONAddDrive, qemuMonitorJSONDriveDel)
(qemuMonitorJSONCreateSnapshot, qemuMonitorJSONLoadSnapshot)
(qemuMonitorJSONDeleteSnapshot): Use only HMP fallback for now.
(qemuMonitorJSONAddHostNetwork, qemuMonitorJSONRemoveHostNetwork)
(qemuMonitorJSONAttachDrive, qemuMonitorJSONGetGuestDriveAddress):
Delete; QMP implies QEMU_CAPS_DEVICE, which prefers AddNetdev,
RemoveNetdev, and AddDrive anyways (qemu_hotplug.c has all callers).
* src/qemu/qemu_monitor.c (qemuMonitorAddHostNetwork)
(qemuMonitorRemoveHostNetwork, qemuMonitorAttachDrive): Reflect
deleted commands.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONAddHostNetwork)
(qemuMonitorJSONRemoveHostNetwork, qemuMonitorJSONAttachDrive):
Likewise.
If qemuMonitorOpenUnix is called without a related pid, i.e. for
QMP probing, a connect failure can happen as the result of a race.
Without a pid there is no retry and thus we give up too early.
This changes the code to retry if no pid is supplied.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
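The retry rule above can be sketched as a loop that only gives up early when
it knows the target process is gone. Everything below is hypothetical: the
callback types, the fake endpoint used to exercise the loop, and the
convention that a pid of 0 means "no pid supplied".

```c
#include <stdbool.h>

/* Hypothetical callbacks: 'try_connect' returns 0 on success;
 * 'process_alive' reports whether the given pid still exists. */
typedef int (*connect_fn)(void *opaque);
typedef bool (*alive_fn)(int pid);

static int connect_with_retry(connect_fn try_connect, void *opaque,
                              int pid, alive_fn process_alive,
                              unsigned int max_attempts)
{
    for (unsigned int i = 0; i < max_attempts; i++) {
        if (try_connect(opaque) == 0)
            return 0;
        /* With a known pid, stop once the process has died. Without a
         * pid (pid == 0) we cannot tell, so keep retrying. */
        if (pid != 0 && !process_alive(pid))
            return -1;
        /* A real implementation would sleep briefly here. */
    }
    return -1;
}

/* Fake endpoint that starts accepting on the Nth attempt. */
struct fake_server {
    int ready_after;
    int calls;
};

static int fake_connect(void *opaque)
{
    struct fake_server *s = opaque;

    s->calls++;
    return s->calls >= s->ready_after ? 0 : -1;
}

static bool never_alive(int pid)
{
    (void)pid;
    return false;
}
```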
The libvirt coding standard is to use 'function(...args...)'
instead of 'function (...args...)'. A non-trivial number of
places did not follow this rule and are fixed in this patch.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When there is no 'qemu-kvm' binary and the emulator used for a machine
is, for example, 'qemu-system-x86_64' that, by default, runs without
kvm enabled, libvirt still supplies '-no-kvm' option to this process,
even though it does not recognize such option (making the start of a
domain fail in that case).
This patch fixes building a command-line for QEMU machines without KVM
acceleration and is based on following assumptions:
- QEMU_CAPS_KVM flag means that QEMU is running KVM accelerated
machines by default (without explicitly requesting that using a
command-line option). It is the closest to the truth according to
the code with the only exception being the comment next to the
flag, so it's fixed in this patch as well.
- QEMU_CAPS_ENABLE_KVM flag means that QEMU is, by default, running
without KVM acceleration and in case we need KVM acceleration it
needs to be explicitly instructed to do so. This is partially
true for the past (this option essentially means that QEMU
recognizes the '-enable-kvm' option, even though it's almost the
same).
Upstream qemu 1.3 is adding two new monitor commands, 'drive-mirror'
and 'block-job-complete'[1], which can drive live block copy and
storage migration. [Additionally, RHEL 6.3 had backported an earlier
version of most of the same functionality, but under the names
'__com.redhat_drive-mirror' and '__com.redhat_drive-reopen' and with
slightly different JSON arguments, and has been using patches similar
to these upstream patches for several months now.]
The libvirt API virDomainBlockRebase as already committed for 0.9.12
is flexible enough to expose the basics of block copy, but some
additional features in the 'drive-mirror' qemu command, such as
setting error policy, setting granularity, or using a persistent
bitmap, may later require a new libvirt API virDomainBlockCopy. I
will wait to add that API until we know more about what qemu 1.3
will finally provide.
This patch caters only to the upstream qemu 1.3 interface, although
I have proven that the changes for RHEL 6.3 can be isolated to
just qemu_monitor_json.c, and the rest of this series will
gracefully handle either interface once the JSON differences are
papered over in a downstream patch.
For consistency with other block job commands, libvirt must handle
the bandwidth argument as MiB/sec from the user, even though qemu
exposes the speed argument as bytes/sec; then again, qemu rounds
up to cluster size internally, so using MiB hides the worst effects
of that rounding if you pass small numbers.
[1]https://lists.gnu.org/archive/html/qemu-devel/2012-10/msg04123.html
* src/qemu/qemu_capabilities.h (QEMU_CAPS_DRIVE_MIRROR)
(QEMU_CAPS_DRIVE_REOPEN): New bits.
* src/qemu/qemu_capabilities.c (qemuCaps): Name them.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONCheckCommands): Set
them.
(qemuMonitorJSONDriveMirror, qemuMonitorDrivePivot): New functions.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONDriveMirror)
(qemuMonitorDrivePivot): Declare them.
* src/qemu/qemu_monitor.c (qemuMonitorDriveMirror)
(qemuMonitorDrivePivot): New passthroughs.
* src/qemu/qemu_monitor.h (qemuMonitorDriveMirror)
(qemuMonitorDrivePivot): Declare them.
qemu 1.3 will be adding a 'block-commit' monitor command, per
qemu.git commit ed61fc1. It matches nicely to the libvirt API
virDomainBlockCommit.
* src/qemu/qemu_capabilities.h (QEMU_CAPS_BLOCK_COMMIT): New bit.
* src/qemu/qemu_capabilities.c (qemuCapsProbeQMPCommands): Set it.
* src/qemu/qemu_monitor.h (qemuMonitorBlockCommit): New prototype.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONBlockCommit):
Likewise.
* src/qemu/qemu_monitor.c (qemuMonitorBlockCommit): Implement it.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONBlockCommit):
Likewise.
(qemuMonitorJSONHandleBlockJobImpl)
(qemuMonitorJSONGetBlockJobInfoOne): Handle new event type.
This patch adds support for the SUSPEND_DISK event, both as a lifecycle
event and as a separate event. The support is added for QEMU; machines
are changed to PMSUSPENDED, but as QEMU sends SHUTDOWN afterwards, the
state changes to shut-off. This and much more needs to be done in order for libvirt
to work with transient devices, wake-ups etc. This patch is not
aiming for that functionality.
After calling qemuMonitorClose(), it is still possible for
the QEMU monitor I/O event callback to get invoked. This
will trigger an error message because mon->fd has been set
to -1 at this point. Silently ignore the case where mon->fd
is -1, likewise for mon->watch being zero.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
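A minimal sketch of the guard described above, using a hypothetical monitor
state struct. The closed-monitor case is checked first and returns quietly,
while a genuine mismatch between the event and the monitor is still treated
as an error.

```c
/* Hypothetical monitor state; field names only mirror the description. */
struct mon_state {
    int fd;        /* -1 once the monitor has been closed */
    int watch;     /* 0 once the event watch has been removed */
    int handled;   /* number of events actually processed */
};

/* Returns 1 if the event was processed, 0 if it was silently ignored,
 * -1 if the event does not belong to this monitor. */
static int monitor_io(struct mon_state *mon, int watch, int fd)
{
    /* A late event after close is expected and not worth an error. */
    if (mon->fd == -1 || mon->watch == 0)
        return 0;
    /* An event for a different watch or fd is still a real mismatch. */
    if (mon->fd != fd || mon->watch != watch)
        return -1;
    mon->handled++;
    return 1;
}
```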
The qemuMonitorOpen method only needs a virDomainObjPtr in order
to access the QEMU pid. This is not critical when detecting the
QEMU capabilities, so can easily be skipped
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The qemuMonitorSetCapabilities() API is used to initialize the QMP
protocol capabilities. It has since been abused to initialize some
libvirt internal capabilities based on command/event existence too.
Move the latter code out into qemuCapsProbeQMP() in the QEMU
capabilities source file instead
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The qemu monitor does not require qemu_conf.h, and the
qemu capabilities code actually wants bitmap.h
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add a new qemuMonitorGetTargetArch() method to support invocation
of the 'query-target' JSON monitor command. No HMP equivalent
is required, since this will only be present for QEMU >= 1.2
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add a new qemuMonitorGetObjectProps() method to support invocation
of the 'device-list-properties' JSON monitor command. No HMP equivalent
is required, since this will only be present for QEMU >= 1.2
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add a new qemuMonitorGetObjectTypes() method to support invocation
of the 'qom-list-types' JSON monitor command. No HMP equivalent
is required, since this will only be present for QEMU >= 1.2
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add a new qemuMonitorGetEvents() method to support invocation
of the 'query-events' JSON monitor command. No HMP equivalent
is required, since this will only be used when JSON is available
The existing qemuMonitorJSONCheckEvents() method is refactored
to use this new method
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add a new qemuMonitorGetCommands() method to support invocation
of the 'query-commands' JSON monitor command. No HMP equivalent
is required, since this will only be used when JSON is available
The existing qemuMonitorJSONCheckCommands() method is refactored
to use this new method
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add a new qemuMonitorGetCPUDefinitions() method to support invocation
of the 'query-cpu-definitions' JSON monitor command. No HMP equivalent
is required, since this will only be present for QEMU >= 1.2
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add a new qemuMonitorGetMachines() method to support invocation
of the 'query-machines' JSON monitor command. No HMP equivalent
is required, since this will only be present for QEMU >= 1.2
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add a new qemuMonitorGetVersion() method to support invocation
of the 'query-version' JSON monitor command. No HMP equivalent
is provided, since this will only be used for QEMU >= 1.2
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Recently, there have been some improvements made to qemu so it
supports seamless migration or something very close to it.
However, it requires libvirt interaction. Once qemu is migrated,
the SPICE server needs to send its internal state to the destination.
Once it's done, it fires SPICE_MIGRATE_COMPLETED event and this
fact is advertised in 'query-spice' output as well.
We must not kill qemu until SPICE server finishes the transfer.
There are a number of process related functions spread
across multiple files. Start to consolidate them by
creating a virprocess.{c,h} file
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
https://www.gnu.org/licenses/gpl-howto.html recommends that
the 'If not, see <url>.' phrase be a separate sentence.
* tests/securityselinuxhelper.c: Remove doubled line.
* tests/securityselinuxtest.c: Likewise.
* globally: s/; If/. If/
Upstream qemu has raised a concern about whether dumping guest
memory by reading guest paging tables is a security hole:
https://lists.gnu.org/archive/html/qemu-devel/2012-09/msg02607.html
While auditing libvirt to see if we would be impacted, I noticed
that we had some dead code. It is simpler to nuke the dead code
and limit our monitor code to just the subset we make use of.
* src/qemu/qemu_monitor.h (QEMU_MONITOR_DUMP): Drop poorly named
and mostly-unused enum.
* src/qemu/qemu_monitor.c (qemuMonitorDumpToFd): Drop arguments.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONDump): Likewise.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONDump): Likewise.
* src/qemu/qemu_driver.c (qemuDumpToFd): Update caller.
Don't bother checking for the existence of the HMP passthrough
command. Just try to execute it, and propagate the failure.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The current qemu capabilities are stored in a virBitmapPtr
object, whose type is exposed to callers. We want to store
more data besides just the flags, so we need to move to a
struct type. This object will also need to be reference
counted, since we'll be maintaining a cache of data per
binary. This change introduces a 'qemuCapsPtr' virObject
class. Most of the change is just renaming types and
variables in all the callers
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Technically speaking we should wait until we receive the QMP
greeting message before attempting to send any QMP monitor
commands. Mostly we've got away with this, but there is a race
in some QEMU versions which causes them to SEGV if you send data too
soon after startup. Waiting for the QMP greeting avoids the
race
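As a rough illustration only (this is not libvirt's implementation), a client could inspect the first line read from the monitor and send nothing until that line is the QMP greeting, a JSON object whose first key is "QMP". The helper name is_qmp_greeting is hypothetical, and plain text matching stands in for real JSON parsing:

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: returns true if 'line' looks like the QMP greeting,
 * i.e. a JSON object that starts with the key "QMP". A real client would
 * parse the JSON rather than match text. */
bool is_qmp_greeting(const char *line)
{
    /* Skip leading blanks before the opening brace. */
    while (*line == ' ' || *line == '\t')
        line++;
    if (*line != '{')
        return false;
    line++;
    /* Skip blanks between the brace and the first key. */
    while (*line == ' ' || *line == '\t')
        line++;
    return strncmp(line, "\"QMP\"", 5) == 0;
}
```

A caller would keep reading until this returns true and only then issue its first command.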
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently qemuMonitorOpen() requires an address of the QEMU
monitor. When doing QMP based capabilities detection it is
easier if a pre-opened FD can be provided, since then the
monitor can be run on the STDIO console. Add a new API
qemuMonitorOpenFD() for such usage
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add some non-null annotations to qemuMonitorOpen and also
check that the error callback is set, since it is mandatory
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Move the functions that parse, format, and validate PCI addresses to
their own file so they can be conveniently used in other places
besides device_conf.c
Refactoring existing code without causing any functional changes to
prepare for new code.
This patch makes the code reusable.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
The FSF address may change from time to time, so GNU now recommends
the following wording (http://www.gnu.org/licenses/gpl-howto.html):
You should have received a copy of the GNU General Public License
along with Foobar. If not, see <http://www.gnu.org/licenses/>.
This patch removes the explicit FSF address and uses the above instead
(of course, with 'Lesser' inserted before 'General').
All files were changed automatically except for a handful of security
driver files, whose copyright headers are incomplete and therefore
had to be edited manually:
src/security/security_selinux.h
src/security/security_driver.h
src/security/security_selinux.c
src/security/security_apparmor.h
src/security/security_apparmor.c
src/security/security_driver.c
If QEMU supports the BALLOON_EVENT QMP event, then we can
avoid invoking 'query-balloon' when returning XML or the
domain info.
* src/qemu/qemu_capabilities.c, src/qemu/qemu_capabilities.h:
Add QEMU_CAPS_BALLOON_EVENT
* src/qemu/qemu_driver.c: Skip query-balloon in
qemudDomainGetInfo and qemuDomainGetXMLDesc if we have
QEMU_CAPS_BALLOON_EVENT set
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h: Check
for BALLOON_EVENT at connect to monitor. Add callback
for balloon change notifications
* src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h:
Add handling of BALLOON_EVENT and impl 'query-events'
check
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
dump-guest-memory is a new dump mechanism, and it can work when the
guest uses host devices. This patch adds an API to use this new
monitor command.
We will always use json mode if qemu's version is >= 0.15, so I
don't implement the API for text mode.
While unescaping the commands passed through to the monitor, the
function qemuMonitorUnescapeArg() initialized the length of the input
string to strlen()+1, which is fine for allocation but not for
iterating over the string.
This patch fixes the off-by-one error and drops the pointless check for
a single trailing backslash, which is automatically handled by the
default branch of the switch.
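The distinction can be sketched as follows. This is an illustrative stand-in and not the libvirt function: the name unescape_arg and the supported escapes (\r, \n, \" and \\) are assumptions. The buffer is sized strlen()+1, the loop runs over strlen() characters only, and a trailing backslash falls into the default branch because the character after it is the terminating NUL:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch: undo backslash escapes in 'in'.
 * Returns a newly allocated string, or NULL on invalid input or OOM. */
char *unescape_arg(const char *in)
{
    size_t len = strlen(in);       /* iterate over len characters... */
    char *out = malloc(len + 1);   /* ...but allocate len + 1 for the NUL */
    size_t i, j;

    if (!out)
        return NULL;

    for (i = j = 0; i < len; i++) {
        char next = in[i];

        if (in[i] == '\\') {
            i++;
            switch (in[i]) {
            case 'r': next = '\r'; break;
            case 'n': next = '\n'; break;
            case '"':
            case '\\': next = in[i]; break;
            default:
                /* Invalid escape. A trailing '\' also lands here,
                 * because in[i] is then the terminating NUL. */
                free(out);
                return NULL;
            }
        }
        out[j++] = next;
    }
    out[j] = '\0';
    return out;
}
```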
When building as driver modules, it is not possible for the QEMU
driver module to reference the DTrace/SystemTAP probes linked into
the main libvirt.so. Thus we need to move the QEMU probes into a
separate file 'libvirt_qemu_probes.d'. Also rename the existing
file from 'probes.d' to 'libvirt_probes.d' while we're at it
* daemon/Makefile.am, src/internal.h: Include libvirt_probes.h
instead of probes.h
* src/Makefile.am: Add rules for libvirt_qemu_probes.d
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor_json.c,
src/qemu/qemu_monitor_text.c: Include libvirt_qemu_probes.h
* src/libvirt_probes.d: Rename from probes.d
* src/libvirt_qemu_probes.d: QEMU specific probes formerly
in probes.d
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Commit cdce2f42d tried to silence a compiler warning on 32-bit builds,
but the gcc shipped with RHEL 5 is old enough that the type conversion
via multiplication by 1 was insufficient for the task.
* src/qemu/qemu_monitor.c (qemuMonitorBlockJob): Previous attempt
didn't get past all gcc versions.
On 32-bit platforms, gcc warns that the comparison between a long
and (ULLONG_MAX/1024/1024) is always false; throwing in a type
conversion shuts up the warning.
* src/qemu/qemu_monitor.c (qemuMonitorBlockJob): Shut gcc up.
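A minimal sketch of this kind of range check, assuming the bandwidth arrives as an 'unsigned long' in MiB/s and must be scaled to bytes. The helper name is hypothetical, and whether a particular gcc version still warns is not something this example can demonstrate:

```c
#include <limits.h>

/* Hypothetical helper: convert a bandwidth in MiB/s to bytes/s.
 * Returns 0 on success, -1 if the result would overflow. */
int bandwidth_mib_to_bytes(unsigned long mib, unsigned long long *bytes)
{
    /* Widen first, so both sides of the comparison have the same type
     * even where 'unsigned long' is only 32 bits wide. */
    unsigned long long wide = mib;

    if (wide > ULLONG_MAX / 1024 / 1024)
        return -1;

    *bytes = wide * 1024 * 1024;
    return 0;
}
```

On a platform with a 32-bit 'unsigned long' the check can never fail, which is exactly the situation the compiler was warning about.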
With RHEL 6.2, virDomainBlockPull(dom, dev, bandwidth, 0) has a race
with non-zero bandwidth: there is a window between the block_stream
and block_job_set_speed monitor commands where an unlimited amount
of data was let through, defeating the point of a throttle.
This race was first identified in commit a9d3495e, and libvirt was
able to reduce the size of the window for that race. In the meantime,
the qemu developers decided to fix things properly; per this message:
https://lists.gnu.org/archive/html/qemu-devel/2012-04/msg03793.html
the fix will be in qemu 1.1, and changes block-job-set-speed to use
a different parameter name, as well as adding a new optional parameter
to block-stream, which eliminates the race altogether.
Since our documentation already mentioned that we can refuse a non-zero
bandwidth for some hypervisors, I think the best solution is to do
just that for RHEL 6.2 qemu, so that the race is obvious to the user
(anyone using stock RHEL 6.2 binaries won't have this patch, and anyone
building their own libvirt with this patch for RHEL can also rebuild
qemu to get the modern semantics, so it is no real loss in behavior).
Meanwhile the code must be fixed to honor actual qemu 1.1 naming.
Rename the parameter to 'modern', since the naming difference now
covers more than just 'async' block-job-cancel. And while at it,
fix an unchecked integer overflow.
* src/qemu/qemu_monitor.h (enum BLOCK_JOB_CMD): Drop unused value,
rename enum to match conventions.
* src/qemu/qemu_monitor.c (qemuMonitorBlockJob): Reflect enum rename.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONBlockJob): Likewise.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONBlockJob): Likewise,
and support difference between RHEL 6.2 and qemu 1.1 block pull.
* src/qemu/qemu_driver.c (qemuDomainBlockJobImpl): Reject
bandwidth during pull with too-old qemu.
* src/libvirt.c (virDomainBlockPull, virDomainBlockRebase):
Document this.
Most of our errors complaining about an inability to support a
particular action due to qemu limitations used CONFIG_UNSUPPORTED,
but we had a few outliers. Reported by Jiri Denemark.
* src/qemu/qemu_command.c (qemuBuildDriveDevStr): Prefer
CONFIG_UNSUPPORTED.
* src/qemu/qemu_driver.c (qemuDomainReboot)
(qemuDomainBlockJobImpl): Likewise.
* src/qemu/qemu_hotplug.c (qemuDomainAttachPciControllerDevice):
Likewise.
* src/qemu/qemu_monitor.c (qemuMonitorTransaction)
(qemuMonitorBlockJob, qemuMonitorSystemWakeup): Likewise.
RHEL 6.2 was released with an early version of block jobs, which only
worked on the qed file format, where the commands were spelled with
underscore (contrary to QMP style), and where 'block_job_cancel' was
synchronous and did not trigger an event.
The upcoming qemu 1.1 release has fixed these short-comings [1][2]:
the commands now work on multiple file types, are spelled with dash,
and 'block-job-cancel' is asynchronous and emits an event upon conclusion.
[1]qemu commit 370521a1d6f5537ea7271c119f3fbb7b0fa57063
[2]https://lists.gnu.org/archive/html/qemu-devel/2012-04/msg01248.html
This patch recognizes the new spellings, and fixes virDomainBlockRebase
to give a graceful error when talking to a too-old qemu on a partial
rebase attempt. Fixes for the new semantics will come later. This
patch also removes a bogus ATTRIBUTE_NONNULL mistakenly added in
commit 10ec36e2.
* src/qemu/qemu_capabilities.h (QEMU_CAPS_BLOCKJOB_SYNC)
(QEMU_CAPS_BLOCKJOB_ASYNC): New bits.
* src/qemu/qemu_capabilities.c (qemuCaps): Name them.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONCheckCommands): Set
them.
(qemuMonitorJSONBlockJob): Manage both command names.
(qemuMonitorJSONDiskSnapshot): Minor formatting fix.
* src/qemu/qemu_monitor.h (qemuMonitorBlockJob): Alter signature.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONBlockJob): Likewise.
* src/qemu/qemu_monitor.c (qemuMonitorBlockJob): Pass through
capability bit.
* src/qemu/qemu_driver.c (qemuDomainBlockJobImpl): Update callers.
The oVirt developers have stated that the real reasons they want
to have qemu reuse existing volumes when creating a snapshot are:
1. the management framework is set up so that creation has to be
done from a central node for proper resource tracking, and having
libvirt and/or qemu create things violates the framework, and
2. qemu defaults to creating snapshots with an absolute path to
the backing file, but oVirt wants to manage a backing chain that
uses just relative names, to allow for easier migration of a chain
across storage locations.
When 0.9.10 added VIR_DOMAIN_SNAPSHOT_CREATE_REUSE_EXT (commit
4e9953a4), it only addressed point 1, but libvirt was still using
O_TRUNC which violates point 2. Meanwhile, the new qemu
'transaction' monitor command includes a new optional mode argument
that will force qemu to reuse the metadata of the file it just
opened (with the burden on the caller to have valid metadata there
in the first place). So, this tweaks the meaning of the flag to
cover both points as intended for use by oVirt. It is not strictly
backward-compatible to 0.9.10 behavior, but it can be argued that
the O_TRUNC of 0.9.10 was a bug.
Note that this flag is all-or-nothing, and only selects between
'existing' and the default 'absolute-paths'. A more flexible
approach that would allow per-disk selections, as well as adding
support for the 'no-backing-file' mode, would be possible by
extending the <domainsnapshot> xml to have a per-disk mode, but
until we have a management application expressing a need for that
additional complexity, it is not worth doing.
* src/libvirt.c (virDomainSnapshotCreateXML): Tweak documentation.
* src/qemu/qemu_monitor.h (qemuMonitorDiskSnapshot): Add
parameters.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONDiskSnapshot):
Likewise.
* src/qemu/qemu_monitor.c (qemuMonitorDiskSnapshot): Pass them
through.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONDiskSnapshot): Use
new monitor command arguments.
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateDiskActive)
(qemuDomainSnapshotCreateSingleDiskActive): Adjust callers.
(qemuDomainSnapshotDiskPrepare): Allow qed, modify rules on reuse.
QEMU 1.1 is adding a 'transaction' command to the JSON monitor.
Each element of a transaction corresponds to a top-level command,
with the additional guarantee that the transaction flushes all
pending I/O, then guarantees that all actions will be successful
as a group or that failure will roll back the state to what it
was before the monitor command. The difference between a
top-level command:
  { "execute": "blockdev-snapshot-sync", "arguments":
    { "device": "virtio0", ... } }
and a transaction:
  { "execute": "transaction", "arguments":
    { "actions": [
      { "type": "blockdev-snapshot-sync", "data":
        { "device": "virtio0", ... } } ] } }
is just a couple of changed key names and nesting the shorter
command inside a JSON array to the longer command. This patch
just adds the framework; the next patch will actually use a
transaction.
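To make the renaming and nesting concrete, here is a small sketch that formats the same command both ways. It is not the libvirt code (which builds JSON objects rather than strings); the function names are hypothetical and 'args' is assumed to be an already valid JSON object:

```c
#include <stdio.h>
#include <string.h>

/* Format a top-level QMP command. 'args' is a pre-built JSON object. */
int format_command(char *buf, size_t size, const char *name, const char *args)
{
    return snprintf(buf, size,
                    "{ \"execute\": \"%s\", \"arguments\": %s }",
                    name, args);
}

/* Format the same command as the only action of a 'transaction':
 * "execute" becomes "type", "arguments" becomes "data", and the
 * result is placed inside the "actions" array. */
int format_transaction(char *buf, size_t size, const char *name, const char *args)
{
    return snprintf(buf, size,
                    "{ \"execute\": \"transaction\", \"arguments\": "
                    "{ \"actions\": [ { \"type\": \"%s\", \"data\": %s } ] } }",
                    name, args);
}
```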
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONMakeCommand): Move
guts...
(qemuMonitorJSONMakeCommandRaw): ...into new helper. Add support
for array element.
(qemuMonitorJSONTransaction): New command.
(qemuMonitorJSONDiskSnapshot): Support use in a transaction.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONDiskSnapshot): Add
argument.
(qemuMonitorJSONTransaction): New declaration.
* src/qemu/qemu_monitor.h (qemuMonitorTransaction): Likewise.
(qemuMonitorDiskSnapshot): Add argument.
* src/qemu/qemu_monitor.c (qemuMonitorTransaction): New wrapper.
(qemuMonitorDiskSnapshot): Pass argument on.
* src/qemu/qemu_driver.c
(qemuDomainSnapshotCreateSingleDiskActive): Update caller.
This patch introduces a new event type for the QMP event
SUSPEND:
VIR_DOMAIN_EVENT_ID_PMSUSPEND
The event doesn't take any data, but considering there might
be reason for wakeup in future, the callback definition is:
  typedef void
  (*virConnectDomainEventSuspendCallback)(virConnectPtr conn,
                                          virDomainPtr dom,
                                          int reason,
                                          void *opaque);
"reason" is currently unused and is always passed as 0.
This patch introduces a new event type for the QMP event
WAKEUP:
VIR_DOMAIN_EVENT_ID_PMWAKEUP
The event doesn't take any data, but considering there might
be reason for wakeup in future, the callback definition is:
  typedef void
  (*virConnectDomainEventWakeupCallback)(virConnectPtr conn,
                                         virDomainPtr dom,
                                         int reason,
                                         void *opaque);
"reason" is currently unused and is always passed as 0.
This patch introduces a new event type for the QMP event
DEVICE_TRAY_MOVED, which occurs when the tray of a removable
disk is moved (i.e. opened or closed):
VIR_DOMAIN_EVENT_ID_TRAY_CHANGE
The event's data includes the device alias and the reason
for the change in tray status. Thus the callback definition
for the event is:
  typedef enum {
      VIR_DOMAIN_EVENT_TRAY_CHANGE_OPEN = 0,
      VIR_DOMAIN_EVENT_TRAY_CHANGE_CLOSE,
  #ifdef VIR_ENUM_SENTINELS
      VIR_DOMAIN_EVENT_TRAY_CHANGE_LAST
  #endif
  } virDomainEventTrayChangeReason;

  typedef void
  (*virConnectDomainEventTrayChangeCallback)(virConnectPtr conn,
                                             virDomainPtr dom,
                                             const char *devAlias,
                                             int reason,
                                             void *opaque);
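A handler with this signature might look like the sketch below. So that it compiles without the libvirt headers, the two pointer types and the reason values are re-declared as stand-ins; real code would include <libvirt/libvirt.h> and drop them. The handler name and the use of 'opaque' as an event counter are assumptions made for illustration:

```c
#include <stdio.h>

/* Stand-ins so the sketch builds on its own; real code would take
 * these from <libvirt/libvirt.h>. */
typedef struct _virConnect *virConnectPtr;
typedef struct _virDomain *virDomainPtr;
enum {
    VIR_DOMAIN_EVENT_TRAY_CHANGE_OPEN = 0,
    VIR_DOMAIN_EVENT_TRAY_CHANGE_CLOSE,
};

/* Hypothetical handler matching the tray change callback signature.
 * 'opaque' is assumed to point at an int that counts received events. */
void on_tray_change(virConnectPtr conn, virDomainPtr dom,
                    const char *devAlias, int reason, void *opaque)
{
    int *count = opaque;

    (void)conn;
    (void)dom;

    printf("tray of %s %s\n", devAlias,
           reason == VIR_DOMAIN_EVENT_TRAY_CHANGE_OPEN ? "opened" : "closed");
    if (count)
        (*count)++;
}
```

An application would register such a handler for VIR_DOMAIN_EVENT_ID_TRAY_CHANGE through virConnectDomainEventRegisterAny().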
Using 'unsigned long' for memory values is risky on 32-bit platforms,
as a PAE guest can have more than 4GiB memory. Our API is
(unfortunately) locked at 'unsigned long' and a scale of 1024, but
the rest of our system should consistently use 64-bit values,
especially since the previous patch centralized overflow checking.
* src/conf/domain_conf.h (_virDomainDef): Always use 64-bit values
for memory. Change hugepage_backed to a bool.
* src/conf/domain_conf.c (virDomainDefParseXML)
(virDomainDefCheckABIStability, virDomainDefFormatInternal): Fix
clients.
* src/vmx/vmx.c (virVMXFormatConfig): Likewise.
* src/xenxs/xen_sxpr.c (xenParseSxpr, xenFormatSxpr): Likewise.
* src/xenxs/xen_xm.c (xenXMConfigGetULongLong): New function.
(xenXMConfigGetULong, xenXMConfigSetInt): Avoid truncation.
(xenParseXM, xenFormatXM): Fix clients.
* src/phyp/phyp_driver.c (phypBuildLpar): Likewise.
* src/openvz/openvz_driver.c (openvzDomainSetMemoryInternal):
Likewise.
* src/vbox/vbox_tmpl.c (vboxDomainDefineXML): Likewise.
* src/qemu/qemu_command.c (qemuBuildCommandLine): Likewise.
* src/qemu/qemu_process.c (qemuProcessStart): Likewise.
* src/qemu/qemu_monitor.h (qemuMonitorGetBalloonInfo): Likewise.
* src/qemu/qemu_monitor_text.h (qemuMonitorTextGetBalloonInfo):
Likewise.
* src/qemu/qemu_monitor_text.c (qemuMonitorTextGetBalloonInfo):
Likewise.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONGetBalloonInfo):
Likewise.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONGetBalloonInfo):
Likewise.
* src/qemu/qemu_driver.c (qemudDomainGetInfo)
(qemuDomainGetXMLDesc): Likewise.
* src/uml/uml_conf.c (umlBuildCommandLine): Likewise.
This actually wires up the new optional parameter to block_stream:
http://wiki.qemu.org/Features/LiveBlockMigration/ImageStreamingAPI
The error checking is still sparse, since libvirt must not use
qemu-img or header probing on a qcow2 file in use by qemu to
check if the backing file name is valid; so for now, libvirt is
relying on qemu to diagnose an incorrect backing name. Fixing this
will require libvirt to track the entire backing file chain at the
time qemu is started and keep it updated with snapshot and pull
operations.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONBlockJob): Add
parameter, and update callers.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONBlockJob): Update
signature.
* src/qemu/qemu_monitor.h (qemuMonitorBlockJob): Likewise.
* src/qemu/qemu_driver.c (qemuDomainBlockJobImpl): Update caller.
* src/qemu/qemu_monitor.c (qemuMonitorBlockJob): Likewise.
Block job commands are not part of upstream qemu until 1.1; and
proper support of job completion and cancellation depends on being
able to receive QMP events, which implies the JSON monitor.
Additionally, some early versions of block job commands were
backported to RHEL qemu, but these versions lacked asynchronous
job cancellation and partial block pull, so there are several
patches that will still be needed in this area of libvirt code
to support both flavors of block job commands.
Due to earlier patches in libvirt, we are guaranteed that all versions
of qemu that support block job commands already require libvirt to
use the JSON monitor. That means that the text version of block jobs
will not be used, and having to refactor two copies of the block job
handlers makes no sense. So instead, we delete the text handlers.
* src/qemu/qemu_monitor.c (qemuMonitorBlockJob): Drop text monitor
support.
* src/qemu/qemu_monitor_text.h (qemuMonitorTextBlockJob): Delete.
* src/qemu/qemu_monitor_text.c (qemuMonitorTextParseBlockJobOne)
(qemuMonitorTextParseBlockJob, qemuMonitorTextBlockJob):
Likewise.
QMP commands don't need to be escaped since converting them to json
also escapes special characters. When a QMP command fails, however,
libvirt falls back to HMP commands. These fallback functions
(qemuMonitorText*) do their own escaping, and pass the result directly
to qemuMonitorHMPCommandWithFd. If the monitor is in json mode, these
pre-escaped commands will be escaped again when converted to json,
which can result in the wrong arguments being sent.
For example, a filename test\file would be sent in json as
test\\file.
This prevented attaching an image file with a " or \ in its name in
qemu 1.0.50, and also broke rbd attachment (which uses backslashes to
escape some internal arguments.)
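The effect of escaping twice can be reproduced with a tiny stand-in escaper. This is only a sketch: escape_arg is a hypothetical helper that escapes backslashes and double quotes, standing in for both the text monitor escaping and the escaping done during JSON conversion:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: backslash-escape '\' and '"' in 'in'.
 * Returns a newly allocated string, or NULL on allocation failure. */
char *escape_arg(const char *in)
{
    size_t len = strlen(in);
    char *out = malloc(2 * len + 1);   /* worst case: every char escaped */
    size_t i, j = 0;

    if (!out)
        return NULL;

    for (i = 0; i < len; i++) {
        if (in[i] == '\\' || in[i] == '"')
            out[j++] = '\\';
        out[j++] = in[i];
    }
    out[j] = '\0';
    return out;
}
```

Escaping test\file once gives test\\file, which a receiver that unescapes once turns back into the right name. Escaping that result again gives test\\\\file, so the same receiver ends up with test\\file, a different file name.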
Reported-by: Masuko Tomoya <tomoya.masuko@gmail.com>
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
using 'system-wakeup' monitor command. It is supported only in JSON,
as we are enabling it if possible. Moreover, this command is available
in qemu-1.1+ which definitely has JSON.
In the future (my next patch in fact) we may want to make
decisions depending on qemu having a monitor command or not.
Therefore, we want to set qemuCaps flag instead of querying
on the monitor each time we are about to make that decision.
QEMU always sends details about all available block devices as an answer
for "info block"/"query-block" command. On the other hand, our
qemuMonitorGetBlockInfo was designed to query a single block device
only. Thus, when asking for multiple devices, we asked qemu multiple
times to always get the same answer from which different parts were
filtered. This patch makes qemuMonitorGetBlockInfo return a hash table
of all block devices, which may later be used for getting details about
specific devices.
If an async job run on a domain will stop the domain at the end of the
job, a concurrently run query job can hang in qemu monitor and nothing
can be done with that domain from this point on. An attempt to start
such a domain results in "Timed out during operation: cannot acquire state
change lock" error.
However, quite a few things have to happen at the right time... There
must be an async job running which stops a domain at the end. This race
was reported with dump --crash but other similar jobs, such as
(managed)save and migration, should be able to trigger this bug as well.
While this async job is processing its last monitor command, that is a
query-migrate to which qemu replies with status "completed", a new
libvirt API that results in a query job must arrive and stay waiting
until the query-migrate command finishes. Once query-migrate is done but
before the async job closes qemu monitor while stopping the domain, the
other thread needs to wake up and call qemuMonitorSend to send its
command to qemu. Before qemu gets a chance to respond to this command,
the async job needs to close the monitor. At this point, the query job
thread is waiting for a condition that no-one will ever signal so it
never finishes the job.
Implement support for setting and getting block I/O throttling in the
qemu driver.
Signed-off-by: Lei Li <lilei@linux.vnet.ibm.com>
Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Implements functions for both HMP and QMP mode.
For HMP mode, qemu uses "M" as the default unit, so the passed size
is divided by 1024.
For QMP mode, qemu uses bytes as the default unit, so the passed size
is multiplied by 1024.
All of the monitor functions return -1 on failure, 0 on success, or -2 if
not supported.
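The two scalings can be sketched as below, assuming (as the factors of 1024 imply) that the size passed in is expressed in KiB. The function names are hypothetical, and the overflow check on the QMP path is an addition of this sketch and not something stated above:

```c
#include <limits.h>

/* HMP path: KiB -> MiB. Integer division truncates, so any remainder
 * smaller than one MiB is dropped. */
unsigned long long kib_to_hmp_mib(unsigned long long kib)
{
    return kib / 1024;
}

/* QMP path: KiB -> bytes. Returns 0 on success, -1 on overflow. */
int kib_to_qmp_bytes(unsigned long long kib, unsigned long long *bytes)
{
    if (kib > ULLONG_MAX / 1024)
        return -1;
    *bytes = kib * 1024;
    return 0;
}
```

Note that the HMP path loses precision: a request for 1536 KiB becomes 1 MiB.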
The QEMU monitor command 'add_client' can be used to connect to
a VNC or SPICE graphics display. This allows for implementation
of the virDomainOpenGraphics API
* src/qemu/qemu_driver.c: Implement virDomainOpenGraphics
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h:
Add binding for 'add_client' command
This change adds some systemtap/dtrace probes to the QEMU monitor
client code. In particular it allows watching of all operations
for a VM
* examples/systemtap/qemu-monitor.stp: Watch all monitor commands
* src/Makefile.am: Passing libdir/bindir/sbindir to dtrace2systemtap.pl
* src/dtrace2systemtap.pl: Accept libdir/bindir/sbindir as args
and look for '# binary:' comment to mark probes against libvirtd
vs libvirt.so
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor_json.c,
src/qemu/qemu_monitor_text.c: Add probes for key functions
If the daemon is restarted so we reconnect to monitor, cdrom media
can be ejected. In that case we don't want to show it in domain xml,
or require it on migration destination.
To check for disk status use 'info block' monitor command.
This patch adds handlers for modification of guest's interface
link state. Both HMP and QMP commands are supported, but as the
link state functionality is from the beginning supported in QMP
the HMP code will probably never be used.
The main changes are:
1) Update qemuMonitorGetBlockStatsInfo and its children (Text/JSON)
functions to return the value of new latency fields.
2) Add new function qemuMonitorGetBlockStatsParamsNumber, which is
to count how many parameters the underlying QEMU supports.
3) Update virDomainBlockStats in src/qemu/qemu_driver.c to be
compatible with the changes by 1).
No one uses this yet, but it will be important once
virDomainSnapshotCreateXML learns a VIR_DOMAIN_SNAPSHOT_DISK_ONLY
flag, and the xml allows passing in the new file names.
* src/qemu/qemu_monitor.h (qemuMonitorDiskSnapshot): New prototype.
* src/qemu/qemu_monitor_text.h (qemuMonitorTextDiskSnapshot):
Likewise.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONDiskSnapshot):
Likewise.
* src/qemu/qemu_monitor.c (qemuMonitorDiskSnapshot): New
function.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONDiskSnapshot):
Likewise.
When an operation started by virDomainBlockPull completes (either with
success or with failure), raise an event to indicate the final status.
This API allows users to avoid polling on virDomainGetBlockJobInfo if
they would prefer to use an event mechanism.
* daemon/remote.c: Dispatch events to client
* include/libvirt/libvirt.h.in: Define event ID and callback signature
* src/conf/domain_event.c, src/conf/domain_event.h,
src/libvirt_private.syms: Extend API to handle the new event
* src/qemu/qemu_driver.c: Connect to the QEMU monitor event
for block_stream completion and emit a libvirt block pull event
* src/remote/remote_driver.c: Receive and dispatch events to application
* src/remote/remote_protocol.x: Wire protocol definition for the event
* src/remote_protocol-structs: structure definitions for protocol verification
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c: Watch for BLOCK_STREAM_COMPLETED event
from QEMU monitor
The virDomainBlockPull* family of commands are enabled by the
following HMP/QMP commands: 'block_stream', 'block_job_cancel',
'info block-jobs' / 'query-block-jobs', and 'block_job_set_speed'.
* src/qemu/qemu_driver.c src/qemu/qemu_monitor_text.[ch]: implement disk
streaming by using the proper qemu monitor commands.
* src/qemu/qemu_monitor_json.[ch]: implement commands using the qmp monitor
When qemuMonitorCloseFileHandle is called in error path, we need to
preserve the original error since a possible further error when running
closefd monitor command is not very useful to users.
Continuation of commit 313ac7fd, and enforce things with a syntax
check.
Technically, virNetServerClientCalculateHandleMode is not printing
a mode_t, but rather a collection of VIR_EVENT_HANDLE_* bits;
however, these bits are < 8, so there is no difference in the
output, and that was the easiest way to silence the new syntax check.
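For illustration, the convention amounts to formatting flags with %x and a mode_t with %o. The helper below is a hypothetical sketch, not code from the patch:

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Format a flags bitmask in hex and a file mode in octal. */
int format_open_args(char *buf, size_t size, unsigned int flags, mode_t mode)
{
    return snprintf(buf, size, "flags=%x mode=%o",
                    flags, (unsigned int)mode);
}
```

For any value below 8 the hex and octal renderings are the same single digit, which is why printing a small bitmask with %o does not change the output.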
* cfg.mk (sc_flags_debug): New syntax check.
(exclude_file_name_regexp--sc_flags_debug): Add exemptions.
* src/fdstream.c (virFDStreamOpenFileInternal): Print flags in
hex, mode_t in octal.
* src/libvirt-qemu.c (virDomainQemuMonitorCommand)
(virDomainQemuAttach): Likewise.
* src/locking/lock_driver_nop.c (virLockManagerNopInit): Likewise.
* src/locking/lock_driver_sanlock.c (virLockManagerSanlockInit):
Likewise.
* src/locking/lock_manager.c: Likewise.
* src/qemu/qemu_migration.c: Likewise.
* src/qemu/qemu_monitor.c: Likewise.
* src/rpc/virnetserverclient.c
(virNetServerClientCalculateHandleMode): Print mode with %o.
When attaching to an external QEMU process, it is necessary
to check if the process is using KVM or not. This can be done
using a monitor command
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h: Add
API for checking if KVM is enabled
The 'char control[CMSG_SPACE(sizeof(int))];' was not being
wiped, so could potentially contain uninitialized bytes.
While this was harmless in this case, it caused complaints
from valgrind
* src/qemu/qemu_monitor.c: memset 'control' variable
in qemuMonitorIOWriteWithFD
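A stand-alone sketch of sending a file descriptor over a Unix socket with a zeroed control buffer. It is not the libvirt function: send_with_fd is a hypothetical name, and the buffer is wrapped in a union purely to guarantee alignment for struct cmsghdr, which is a choice of this sketch:

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Send 'len' bytes of 'data' plus one file descriptor over a Unix
 * socket. Returns the result of sendmsg(). */
ssize_t send_with_fd(int sock, const char *data, size_t len, int fd)
{
    struct msghdr msg;
    struct iovec iov;
    struct cmsghdr *cmsg;
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;          /* forces suitable alignment */
    } control;

    memset(&msg, 0, sizeof(msg));
    /* Wipe the control buffer so that padding bytes are initialized. */
    memset(&control, 0, sizeof(control));

    iov.iov_base = (void *)data;
    iov.iov_len = len;

    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control.buf;
    msg.msg_controllen = sizeof(control.buf);

    /* One SCM_RIGHTS control message carrying the descriptor. */
    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0);
}
```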
For controlled shutdown we issue a 'system_powerdown' command
to the QEMU monitor. This triggers an ACPI event which most
guest OSes wire up to a controlled shutdown. There is no equivalent
ACPI event to trigger a controlled reboot. This patch attempts
to fake a reboot.
- In qemuDomainObjPrivatePtr we have a bool fakeReboot
flag.
- The virDomainReboot method sets this flag and then
triggers a normal 'system_powerdown'.
- The QEMU process is started with '-no-shutdown'
so that the guest CPUs pause when it powers off the
guest
- When we receive the 'POWEROFF' event from QEMU JSON
monitor if fakeReboot is not set we invoke the
qemuProcessKill command and shutdown continues
normally
- If fakeReboot was set, we spawn a background thread
which issues 'system_reset' to perform a warm reboot
of the guest hardware. Then it issues 'cont' to
start the CPUs again
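The decision made when the power-off event arrives can be sketched as a small state machine. Apart from the fakeReboot flag, every name here is hypothetical, and clearing the flag once it has been acted on is an assumption of this sketch:

```c
#include <stdbool.h>

/* What to do when the guest powers off. */
enum poweroff_action {
    ACTION_KILL_PROCESS,     /* normal shutdown: kill the QEMU process */
    ACTION_RESET_AND_CONT,   /* fake reboot: 'system_reset', then 'cont' */
};

struct domain_priv {
    bool fakeReboot;
};

/* Reboot request: remember the intent. The caller then issues the
 * usual 'system_powerdown'. */
void request_reboot(struct domain_priv *priv)
{
    priv->fakeReboot = true;
}

/* Power-off event: pick the follow-up action. The flag is cleared so
 * that a later plain shutdown is not mistaken for a reboot. */
enum poweroff_action on_poweroff_event(struct domain_priv *priv)
{
    if (priv->fakeReboot) {
        priv->fakeReboot = false;
        return ACTION_RESET_AND_CONT;
    }
    return ACTION_KILL_PROCESS;
}
```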
* src/qemu/qemu_command.c: Add -no-shutdown flag if
we have JSON support
* src/qemu/qemu_domain.h: Add 'fakeReboot' flag to
qemuDomainObjPrivate struct
* src/qemu/qemu_driver.c: Fake reboot using the
system_powerdown command if JSON support is available
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h: Add
binding for system_reset command
* src/qemu/qemu_process.c: Reset the guest & start CPUs if
fakeReboot is set
Since virEventRegisterDefaultImpl is now a public API, callers need
a way to invoke the default registered Handle and Timeout functions. We
already have general functions for these internally, so promote
them to the public API.
v2:
Actually add APIs to libvirt.h
When an operation started by virDomainBlockPullAll completes (either with
success or with failure), raise an event to indicate the final status. This
allows an API user to avoid polling on virDomainBlockPullInfo if they would
prefer to use the event mechanism.
* daemon/remote.c: Dispatch events to client
* include/libvirt/libvirt.h.in: Define event ID and callback signature
* src/conf/domain_event.c, src/conf/domain_event.h,
src/libvirt_private.syms: Extend API to handle the new event
* src/qemu/qemu_driver.c: Connect to the QEMU monitor event
for block_stream completion and emit a libvirt block pull event
* src/remote/remote_driver.c: Receive and dispatch events to application
* src/remote/remote_protocol.x: Wire protocol definition for the event
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c: Watch for BLOCK_STREAM_COMPLETED event
from QEMU monitor
Signed-off-by: Adam Litke <agl@us.ibm.com>
The virDomainBlockPull* family of commands are enabled by the
'block_stream' and 'info block_stream' qemu monitor commands.
* src/qemu/qemu_driver.c src/qemu/qemu_monitor_text.[ch]: implement disk
streaming by using the stream and info stream text monitor commands
* src/qemu/qemu_monitor_json.[ch]: implement commands using the qmp monitor
Signed-off-by: Adam Litke <agl@us.ibm.com>
Acked-by: Daniel P. Berrange <berrange@redhat.com>
The below patch decreases the response time of libvirt to errors reported by Qemu upon startup by checking whether the qemu process is still alive while polling for the local socket to show up.
This patch also introduces a special handling of signal for the Win32 part of virKillProcess.
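The liveness check described above can be sketched on POSIX as follows. This is a minimal illustration, not the patch itself: process_is_alive() is a hypothetical helper name, and it relies only on the standard behaviour of kill() with signal 0.

```c
#define _POSIX_C_SOURCE 200809L  /* for kill() under strict -std= modes */
#include <errno.h>
#include <signal.h>
#include <sys/types.h>

/* Hypothetical helper: returns 1 if a process with this pid exists,
 * 0 if it does not. Signal 0 performs the existence and permission
 * checks without delivering a signal. */
int process_is_alive(pid_t pid)
{
    if (kill(pid, 0) == 0)
        return 1;
    /* EPERM means the process exists but we may not signal it. */
    return errno == EPERM ? 1 : 0;
}
```

A loop waiting for the monitor socket could call such a helper on each iteration and give up as soon as it returns 0, instead of waiting for the full timeout.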
Commit 4454a9efc7 introduced bad
behaviour on the VIR_EVENT_HANDLE_ERROR condition. This condition
is only hit when an invalid FD is used in poll() (typically due
to a double-close bug). The QEMU monitor code was treating this
condition as non-fatal, and thus libvirt would poll() in a fast
loop forever burning 100% CPU. VIR_EVENT_HANDLE_ERROR must be
handled in the same way as VIR_EVENT_HANDLE_HANGUP, killing the
QEMU instance.
* src/qemu/qemu_monitor.c: Treat VIR_EVENT_HANDLE_ERROR as EOF
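A minimal sketch of the intended classification, assuming stand-in event bits rather than the real VIR_EVENT_HANDLE_* constants; monitor_event_is_eof() is a hypothetical name used only for illustration.

```c
/* Stand-in event bits for illustration; real code would use the
 * VIR_EVENT_HANDLE_* constants from libvirt.h. */
enum {
    EV_READABLE = 1 << 0,
    EV_WRITABLE = 1 << 1,
    EV_ERROR    = 1 << 2,
    EV_HANGUP   = 1 << 3
};

/* Returns 1 when the monitor must be torn down. An error on the fd is
 * treated exactly like a hangup, so the caller stops polling instead
 * of spinning on a descriptor that will never become usable. */
int monitor_event_is_eof(int events)
{
    return (events & (EV_ERROR | EV_HANGUP)) != 0;
}
```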
Currently the QEMU monitor I/O handler code uses errno values
to report errors. This results in sub-optimal error messages
on certain conditions; in particular, when parsing JSON strings,
malformed data simply results in 'EINVAL'.
This changes the code to use the standard libvirt error reporting
APIs. The virError is stored against the qemuMonitorPtr struct,
and when a monitor API is run, any existing stored error is copied
into that thread's local error.
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_text.c: Use
virError APIs for all monitor I/O handling code
Currently whenever there is any failure with parsing the monitor,
this is treated in the same way as end-of-file (ie QEMU quit).
The domain is terminated, if not already dead.
With this change, failures in parsing the monitor stream do not
result in the death of QEMU. The guest continues running unchanged,
but all further use of the monitor will be disabled.
The VMM_FAILURE event will be emitted, and the mgmt application
can decide when to kill/restart the guest to re-gain control
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h: Run a
different callback for monitor EOF vs error conditions.
* src/qemu/qemu_process.c: Emit VMM_FAILURE event when monitor
fails
Use the graphics information from the QEMU migration cookie to
issue a 'client_migrate_info' monitor command to QEMU. This causes
the SPICE client to automatically reconnect to the target host
when migration completes
* src/qemu/qemu_migration.c: Set data for SPICE client relocation
before starting migration on src
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h: Add
new qemuMonitorGraphicsRelocate() command
These VIR_XXXX0 APIs are confusing; use the non-0-suffix APIs instead.
How do these conversions work? The magic is the gcc extension of ##.
When __VA_ARGS__ is empty, "##" will swallow the "," in "fmt," to
avoid a compile error.
example:  origin                        after CPP
          high_level_api("%d", a_int)   low_level_api("%d", a_int)
          high_level_api("a string")    low_level_api("a string")
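The comma-swallowing behaviour can be tried with a small stand-alone macro. This is only a sketch: HIGH_LEVEL_API is a hypothetical name, snprintf() stands in for the low level API, and the "##" form is a GNU extension accepted by gcc and clang.

```c
#include <stdio.h>

/* Hypothetical wrapper macro. With the GNU "##" extension, the comma
 * after 'fmt' is dropped when no variadic arguments are passed, so a
 * plain string literal needs no separate "0" variant of the macro. */
#define HIGH_LEVEL_API(buf, size, fmt, ...) \
    snprintf((buf), (size), fmt, ##__VA_ARGS__)
```

Both HIGH_LEVEL_API(buf, sizeof(buf), "%d", a_int) and HIGH_LEVEL_API(buf, sizeof(buf), "a string") expand to valid calls. A non-literal message should still be passed as HIGH_LEVEL_API(buf, sizeof(buf), "%s", msg) so that a stray '%' in msg is not interpreted as a format directive.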
About 400 conversions.
8 special conversions:
VIR_XXXX0("") -> VIR_XXXX("msg") (avoid empty format) 2 conversions
VIR_XXXX0(string_literal_with_%) -> VIR_XXXX(%->%%) 0 conversions
VIR_XXXX0(non_string_literal) -> VIR_XXXX("%s", non_string_literal)
(for security) 6 conversions
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
We already have virAsprintf, so picking a similar name helps for
seeing a similar purpose. Furthermore, the prefix V before printf
generally implies 'va_list', even though this variant was '...', and
the old name got in the way of adding a new va_list version.
global rename performed with:
$ git grep -l virBufferVSprintf \
| xargs -L1 sed -i 's/virBufferVSprintf/virBufferAsprintf/g'
then revert the changes in ChangeLog-old.
If qemu quits unexpectedly when we call qemuMonitorJSONHMP(),
libvirt will crash.
Steps to reproduce this bug:
1. use gdb to attach libvirtd, and set a breakpoint in the function
qemuMonitorSetCapabilities()
2. start a vm
3. let libvirtd run until qemuMonitorJSONSetCapabilities() returns.
4. kill the qemu process
5. continue running libvirtd
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
If the monitor encounters an error, we call qemuProcessHandleMonitorEOF().
But we may still try to send a monitor command after
qemuProcessHandleMonitorEOF() has returned. libvirtd will then block in
qemuMonitorSend().
Steps to reproduce this bug:
1. use gdb to attach libvirtd, and set a breakpoint in the function
qemuConnectMonitor()
2. start a vm
3. let libvirtd run until qemuMonitorOpen() returns.
4. kill the qemu process
5. continue running libvirtd
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Add the compiler attribute to ensure we don't introduce any more
ref bugs like those just patched in commit 9741f34, then explicitly
mark the remaining places in code that are safe.
* src/qemu/qemu_monitor.h (qemuMonitorUnref): Mark
ATTRIBUTE_RETURN_CHECK.
* src/conf/domain_conf.h (virDomainObjUnref): Likewise.
* src/conf/domain_conf.c (virDomainObjParseXML)
(virDomainLoadStatus): Fix offenders.
* src/openvz/openvz_conf.c (openvzLoadDomains): Likewise.
* src/vmware/vmware_conf.c (vmwareLoadDomains): Likewise.
* src/qemu/qemu_domain.c (qemuDomainObjBeginJob)
(qemuDomainObjBeginJobWithDriver)
(qemuDomainObjExitRemoteWithDriver): Likewise.
* src/qemu/qemu_monitor.c (QEMU_MONITOR_CALLBACK): Likewise.
Suggested by Daniel P. Berrange.
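As a rough illustration of the attribute's effect, assuming gcc or clang: RETURN_CHECK below is a local stand-in for ATTRIBUTE_RETURN_CHECK, and struct object / object_unref() are hypothetical names, not the libvirt types.

```c
/* Local stand-in for libvirt's ATTRIBUTE_RETURN_CHECK. */
#if defined(__GNUC__)
# define RETURN_CHECK __attribute__((__warn_unused_result__))
#else
# define RETURN_CHECK
#endif

struct object {
    int refs;
};

/* Drops one reference and returns the number remaining. The attribute
 * makes the compiler warn when a caller discards that number. */
int object_unref(struct object *obj) RETURN_CHECK;

int object_unref(struct object *obj)
{
    return --obj->refs;
}
```

A caller that writes a bare object_unref(obj); now gets a warning, which forces each call site either to act on the remaining count or to state explicitly that ignoring it is safe.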
A future patch will change reference counting idioms; consolidating
this pattern now makes the next patch smaller (touch only the new
macro rather than every caller).
* src/qemu/qemu_monitor.c (QEMU_MONITOR_CALLBACK): New helper.
(qemuMonitorGetDiskSecret, qemuMonitorEmitShutdown)
(qemuMonitorEmitReset, qemuMonitorEmitPowerdown)
(qemuMonitorEmitStop, qemuMonitorEmitRTCChange)
(qemuMonitorEmitWatchdog, qemuMonitorEmitIOError)
(qemuMonitorEmitGraphics): Use it to reduce duplication.
With only a single caller to these two monitor commands, I
didn't need to wrap a new WithFds version, but just change
the command itself.
* src/qemu/qemu_monitor.h (qemuMonitorAddNetdev)
(qemuMonitorAddHostNetwork): Add parameters.
* src/qemu/qemu_monitor.c (qemuMonitorAddNetdev)
(qemuMonitorAddHostNetwork): Add support for fd passing.
* src/qemu/qemu_hotplug.c (qemuDomainAttachNetDevice): Use it to
simplify code.
This is also a bug fix - on the error path, qemu_hotplug would
leave the configfd file leaked into qemu. At least the next
attempt to hotplug a PCI device would reuse the same fdname,
and when the qemu getfd monitor command gets a new fd by the
same name as an earlier one, it closes the earlier one, so there
is no risk of qemu running out of fds.
* src/qemu/qemu_monitor.h (qemuMonitorAddDeviceWithFd): New
prototype.
* src/qemu/qemu_monitor.c (qemuMonitorAddDevice): Move guts...
(qemuMonitorAddDeviceWithFd): ...to new function, and add support
for fd passing.
* src/qemu/qemu_hotplug.c (qemuDomainAttachHostPciDevice): Use it
to simplify code.
Suggested by Daniel P. Berrange.
qemu_monitor was already returning -1 and setting errno to EINVAL
on any attempt to send an fd without a unix socket, but this was
a silent failure in the case of qemuDomainAttachHostPciDevice.
Meanwhile, qemuDomainAttachNetDevice was doing some sanity checking
for a better error message; it's better to consolidate that to a
central point in the API.
* src/qemu/qemu_hotplug.c (qemuDomainAttachNetDevice): Move sanity
checking...
* src/qemu/qemu_monitor.c (qemuMonitorSendFileHandle): ...into
central location.
Suggested by Chris Wright.
Steps to reproduce this bug:
# virsh qemu-monitor-command domain 'cpu_set 2 online' --hmp
The domain has 2 cpus, and we try to set the third cpu online.
QEMU crashes, and this command hangs.
The reason is that the reference count is not 1 when we unwatch the
monitor. We lock the monitor but never unlock it, so virCondWait()
will block.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
POSIX states about dd:
If the bs=expr operand is specified and no conversions other than
sync, noerror, or notrunc are requested, the data returned from each
input block shall be written as a separate output block; if the read
returns less than a full block and the sync conversion is not
specified, the resulting output block shall be the same size as the
input block. If the bs=expr operand is not specified, or a conversion
other than sync, noerror, or notrunc is requested, the input shall be
processed and collected into full-sized output blocks until the end of
the input is reached.
Since we aren't using conv=sync, there is no zero-padding, but our
use of bs= means that a short read results in a short write. If
instead we use ibs= and obs=, then short reads are collected and dd
only has to do a single write, which can make dd more efficient.
* src/qemu/qemu_monitor.c (qemuMonitorMigrateToFile):
Avoid 'dd bs=', since it can cause short writes.
JSON monitor command implementation can now just directly call text
monitor implementation and it will be automatically encapsulated into
QMP's human-monitor-command.
Done mechanically with:
$ git grep -l '\bDEBUG0\? *(' | xargs -L1 sed -i 's/\bDEBUG0\? *(/VIR_&/'
followed by manual deletion of qemudDebug in daemon/libvirtd.c, along
with a single 'make syntax-check' fallout in the same file, and the
actual deletion in src/util/logging.h.
* src/util/logging.h (DEBUG, DEBUG0): Delete.
* daemon/libvirtd.h (qemudDebug): Likewise.
* global: Change remaining clients over to VIR_DEBUG counterpart.
Suspending a VM which contains shell meta characters doesn't work with
libvirt-0.8.7:
/var/log/libvirt/qemu/andreas_231-ne\ doch\ nicht.log:
sh: -c: line 0: syntax error near unexpected token `doch'
sh: -c: line 0: `cat | { dd bs=4096 seek=1 if=/dev/null && dd bs=1048576; }
Although target="andreas_231-ne doch nicht" contains shell meta
characters (here: blanks), they are not properly escaped by
src/qemu/qemu_monitor_{json,text}.c#qemuMonitor{JSON,Text}MigrateToFile()
First, the filename needs to be properly escaped for the shell, then
this command line has to be properly escaped for qemu again.
For this to work, remove the old qemuMonitorEscapeArg() wrapper, rename
qemuMonitorEscape() to it removing the handling for shell=TRUE, and
implement a new qemuMonitorEscapeShell() returning strings using single
quotes.
Using double quotes or escaping special shell characters with backslashes
would also be possible, but the set of special characters heavily
depends on the concrete shell (dsh, bash, zsh) and its setting (history
expansion, interactive use, ...)
Signed-off-by: Philipp Hahn <hahn@univention.de>
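The single-quote approach can be sketched as below. This is an illustration of the technique only, not the code of qemuMonitorEscapeShell(); shell_quote() is a hypothetical name. Inside single quotes a POSIX shell treats every character literally, so only the single quote itself needs special handling.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of 'in' wrapped in single quotes, with
 * each embedded single quote rewritten as '\'' (close quote, escaped
 * quote, reopen quote). Returns NULL on allocation failure. The caller
 * frees the result. */
char *shell_quote(const char *in)
{
    size_t len = 2;   /* opening and closing quote */
    const char *p;
    char *out, *q;

    for (p = in; *p; p++)
        len += (*p == '\'') ? 4 : 1;

    out = malloc(len + 1);
    if (!out)
        return NULL;

    q = out;
    *q++ = '\'';
    for (p = in; *p; p++) {
        if (*p == '\'') {
            memcpy(q, "'\\''", 4);
            q += 4;
        } else {
            *q++ = *p;
        }
    }
    *q++ = '\'';
    *q = '\0';
    return out;
}
```

The quoted string would then still need a second escaping pass for the monitor protocol, as described above.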
Currently users who want to use the virDomainQemuMonitorCommand() API or
its virsh equivalent have to use the same protocol as libvirt uses for
communication with qemu. Since the protocol is QMP with current qemu and
HMP is much more usable for humans, one ends up typing something like the
following:
virsh qemu-monitor-command DOM \
'{"execute":"human-monitor-command","arguments":{"command-line":"info kvm"}}'
which is not a very convenient way of debugging qemu.
This patch introduces --hmp option to qemu-monitor-command, which says
that the provided command is in HMP. If libvirt uses QMP to talk with
qemu, the command will automatically be converted into QMP. So the
example above is simplified to just
virsh qemu-monitor-command --hmp DOM "info kvm"
Also the result is converted from
{"return":"kvm support: enabled\r\n"}
to just plain HMP:
kvm support: enabled
If libvirt talks to qemu in HMP, --hmp flag is obviously a noop.
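The wrapping step can be pictured with the sketch below. It is not the libvirt implementation: hmp_to_qmp() is a hypothetical name, and the escaping is deliberately minimal (only double quotes and backslashes, with control characters rejected).

```c
#include <stddef.h>
#include <string.h>

/* Wraps an HMP command line in a QMP human-monitor-command request.
 * Only '"' and '\' are escaped; input containing control characters is
 * rejected rather than encoded. Returns 0 on success, -1 on invalid
 * input or if 'out' is too small. */
int hmp_to_qmp(const char *hmp, char *out, size_t outlen)
{
    static const char prefix[] =
        "{\"execute\":\"human-monitor-command\","
        "\"arguments\":{\"command-line\":\"";
    static const char suffix[] = "\"}}";
    size_t pos = sizeof(prefix) - 1;
    const char *p;

    /* Need room for prefix, suffix and the terminating NUL. */
    if (outlen < sizeof(prefix) + sizeof(suffix) - 1)
        return -1;
    memcpy(out, prefix, pos);

    for (p = hmp; *p; p++) {
        unsigned char c = (unsigned char)*p;
        if (c < 0x20)
            return -1;
        /* Reserve room for a possible escape, the suffix and the NUL. */
        if (pos + 2 + sizeof(suffix) > outlen)
            return -1;
        if (c == '"' || c == '\\')
            out[pos++] = '\\';
        out[pos++] = (char)c;
    }
    memcpy(out + pos, suffix, sizeof(suffix));
    return 0;
}
```

Called with "info kvm", this produces the same JSON request as the long virsh example above.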
Currently libvirt doesn't confirm whether the guest has responded to the
disk removal request. In some cases this can leave the guest with
continued access to the device while the mgmt layer believes that it has
been removed. With a recent qemu monitor command[1] we can
deterministically revoke a guest's access to the disk (on the QEMU side)
to ensure no further access is permitted.
This patch adds support for the drive_del() command and introduces it
in the disk removal paths. If the guest is running in a QEMU without this
command we currently explicitly check for unknown command/CommandNotFound
and log the issue.
If QEMU supports the command we issue the drive_del command after we attempt
to remove the device. The guest may respond and remove the block device
before we get to attempt to call drive_del. In that case, we explicitly check
for 'Device not found' from the monitor indicating that the target drive
was auto-deleted when the guest responded to the device removal notification.
1. http://thread.gmane.org/gmane.comp.emulators.qemu/84745
Signed-off-by: Ryan Harper <ryanh@us.ibm.com>
Currently libvirt doesn't confirm whether the guest has responded to the
disk removal request. In some cases this can leave the guest with
continued access to the device while the mgmt layer believes that it has
been removed. With a recent qemu monitor command[1] we can
deterministically revoke a guest's access to the disk (on the QEMU side)
to ensure no further access is permitted.
This patch adds support for the drive_unplug() command and introduces it
in the disk removal paths. There is some discussion to be had about how
to handle the case where the guest is running in a QEMU without this
command (and the fact that we currently don't have a way of detecting
what monitor commands are available).
Changes since v2:
- use VIR_ERROR to report when unplug command not found
Changes since v1:
- return > 0 when command isn't present, < 0 on command failure
- detect when drive_unplug command isn't present and log error
instead of failing entire command
Signed-off-by: Ryan Harper <ryanh@us.ibm.com>
I am replacing the last instances of close() I found with VIR_CLOSE() / VIR_FORCE_CLOSE respectively.
The first part patches virsh, which I missed out on previously.
The 2nd patch I had left out intentionally to look at it more carefully:
The 'closed' variable could be easily removed since it wasn't used anywhere else. The possible race condition that could result from the file descriptor being closed and not set to -1 (and possibly let us write into 'something' totally different if the fd was allocated by another thread) seems to be prevented by the qemuMonitorLock() already placed around the code that reads from or writes to the fd. So the change of this code as shown in the patch should not have any side-effects.
QEMU allows forcing a CDROM eject even if the guest has locked the device.
Expose this via a new UpdateDevice flag, VIR_DOMAIN_DEVICE_MODIFY_FORCE.
This has been requested for RHEV:
https://bugzilla.redhat.com/show_bug.cgi?id=626305
v2: Change flag name, bool cleanups
Using automated replacement with sed and editing I have now replaced all
occurrences of close() with VIR_(FORCE_)CLOSE() except for one, of
course. Some replacements were straightforward; others needed closer
attention. I hope I paid attention in all the right places... Please
have a look. This should have at least solved one more double-close
error.
Implement the qemu driver's virDomainQemuMonitorCommand
and hook it into the API entry point.
Changes since v1:
- Rename the (external) qemuMonitorCommand to qemuDomainMonitorCommand
- Add virCheckFlags to qemuDomainMonitorCommand
Changes since v2:
- Drop ATTRIBUTE_UNUSED from the flags
Changes since v3:
- Add a flag to priv so we only print out monitor command warning once. Note
that this has not been plumbed into qemuDomainObjPrivateXMLFormat or
qemuDomainObjPrivateXMLParse, which means that if you run a monitor command,
restart libvirtd, and then run another monitor command, you may get an
erroneous VIR_INFO. It's a pretty minor matter, and I didn't think it
warranted the additional code.
- Add BeginJob/EndJob calls around EnterMonitor/ExitMonitor
Signed-off-by: Chris Lalancette <clalance@redhat.com>
Some, but not all, codepaths in the qemuMonitorOpen() method
would trigger the destroy callback. The caller does not expect
this to be invoked if construction fails, only during normal
release of the monitor. This resulted in a possible double-unref
of the virDomainObjPtr, because the caller explicitly unrefs
the virDomainObjPtr if qemuMonitorOpen() fails
* src/qemu/qemu_monitor.c: Don't invoke destroy callback from
qemuMonitorOpen() failure paths
The patches for shared storage migration were not correctly written
for json mode. Thus the 'blk' and 'inc' parameters were never being
set. In addition they didn't set the QEMU_MONITOR_MIGRATE_BACKGROUND
so migration was synchronous. Due to multiple bugs in QEMU's JSON
impl this wasn't noticed because it treated the sync migration request
as asynchronous anyway. Finally 'background' parameter was converted
to take arbitrary flags but not renamed, and not all uses were changed
to unsigned int.
* src/qemu/qemu_driver.c: Set QEMU_MONITOR_MIGRATE_BACKGROUND in
doNativeMigrate
* src/qemu/qemu_monitor_json.c: Process QEMU_MONITOR_MIGRATE_NON_SHARED_DISK
and QEMU_MONITOR_MIGRATE_NON_SHARED_INC flags
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.h, src/qemu/qemu_monitor_text.c,
src/qemu/qemu_monitor_text.h: change 'int background' to
'unsigned int flags' in migration APIs. Add logging of flags
parameter
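To illustrate why a single 'unsigned int flags' argument is easier to extend than 'int background', here is a hedged sketch that maps a flag set onto the options of the text monitor's migrate command (-d, -b, -i). The enum names and values are stand-ins for the QEMU_MONITOR_MIGRATE_* flags, and format_migrate_cmd() is hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Stand-ins for the QEMU_MONITOR_MIGRATE_* flags named above; the
 * numeric values are assumptions. */
enum {
    MIGRATE_BACKGROUND      = 1 << 0,
    MIGRATE_NON_SHARED_DISK = 1 << 1,
    MIGRATE_NON_SHARED_INC  = 1 << 2
};

/* Formats a text-monitor style migrate command into 'out'. Returns 0 on
 * success, -1 if the buffer is too small. */
int format_migrate_cmd(unsigned int flags, const char *uri,
                       char *out, size_t outlen)
{
    int n = snprintf(out, outlen, "migrate %s%s%s%s",
                     (flags & MIGRATE_BACKGROUND) ? "-d " : "",
                     (flags & MIGRATE_NON_SHARED_DISK) ? "-b " : "",
                     (flags & MIGRATE_NON_SHARED_INC) ? "-i " : "",
                     uri);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

In JSON mode the same flags would instead select boolean arguments such as the 'blk' and 'inc' parameters mentioned above.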
The current code pattern requires that callers of qemuMonitorClose
check for the return value == 0, and if so, set priv->mon = NULL
and release the reference held on the associated virDomainObjPtr
The change d84bb6d6a3 violated that
requirement, meaning that priv->mon never gets set to NULL, and
a reference count is leaked on virDomainObjPtr.
This design was a bad one, so remove the need to check the return
value of qemuMonitorClose(). Instead allow registration of a
callback that's invoked just when the last reference on qemuMonitorPtr
is released.
Finally there was a potential reference leak in qemuConnectMonitor
in the failure path.
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h: Add a destroy
callback invoked from qemuMonitorFree
* src/qemu/qemu_driver.c: Use the destroy callback to release the
reference on virDomainObjPtr when the monitor is freed. Fix other
potential reference count leak in connecting to monitor
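The destroy-callback pattern can be sketched in isolation as follows. The struct and function names are hypothetical and much simpler than qemuMonitorPtr; the point is only that the owner's cleanup runs exactly once, when the last reference goes away, rather than depending on a caller checking a return value.

```c
#include <stddef.h>

struct monitor;
typedef void (*monitor_destroy_cb)(struct monitor *mon, void *opaque);

struct monitor {
    int refs;
    monitor_destroy_cb destroy;  /* run once, when refs drops to zero */
    void *opaque;
};

void monitor_init(struct monitor *mon, monitor_destroy_cb cb, void *opaque)
{
    mon->refs = 1;
    mon->destroy = cb;
    mon->opaque = opaque;
}

void monitor_ref(struct monitor *mon)
{
    mon->refs++;
}

/* Returns the remaining reference count. */
int monitor_unref(struct monitor *mon)
{
    int remaining = --mon->refs;
    if (remaining == 0 && mon->destroy)
        mon->destroy(mon, mon->opaque);
    return remaining;
}

/* Example callback: counts how often it ran via 'opaque'. */
void count_destroy(struct monitor *mon, void *opaque)
{
    (void)mon;
    ++*(int *)opaque;
}
```

In the driver, the callback would be the place to drop the reference held on the associated domain object.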
The virDomainGetBlockInfo API allows querying the physical block
extent and the allocated block extent. These are normally the
same value unless storing a special format like qcow2
inside a block device. In this scenario we can query QEMU
to get the actual allocated extent.
Since last time:
- Return fatal error in text monitor
- Only invoke monitor command for block devices
- Fix error handling JSON code
* src/qemu/qemu_driver.c: Fill in block allocation extent when VM
is running
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h: Add
API to query the highest block extent via info blockstats
History has shown that there are frequent bugs in the QEMU driver
code leading to the monitor being invoked with a NULL pointer.
Although the QEMU driver code should always report an error in
this case before invoking the monitor, as a safety net put in a
generic check in the monitor code entry points.
* src/qemu/qemu_monitor.c: Safety net to check for NULL monitor
object
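A safety net of this kind might look like the sketch below. The struct and function are hypothetical stand-ins for a monitor entry point; real code would also report a libvirt error before returning.

```c
#include <stddef.h>

struct monitor {
    int fd;
};

/* Hypothetical entry point: rejects a NULL monitor before touching it,
 * so a driver bug surfaces as an error return instead of a crash. */
int monitor_get_fd(struct monitor *mon)
{
    if (!mon)
        return -1;   /* real code would also report an error here */
    return mon->fd;
}
```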
QEMU is gaining a new monitor command netdev_add for hotplugging
NICs using the netdev backend code. We already support this on
the command line, though it is disabled. This adds support for
hotplug too, also to remain disabled until 0.13 QEMU is released
* src/qemu/qemu_driver.c: Support netdev hotplug for NICs
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h: Add
support for netdev_add and netdev_remove commands
When closing a monitor using qemuMonitorClose(), we are aware of
the possibility the monitor is still being used somewhere:
/* NB: ordinarily one might immediately set mon->watch to -1
* and mon->fd to -1, but there may be a callback active
* that is still relying on these fields being valid. So
* we merely close them, but not clear their values and
* use this explicit 'closed' flag to track this state */
but since we call virEventAddHandle() on that monitor without increasing
its ref counter, the monitor is still freed which makes possible users
of it quite unhappy. The unhappiness can lead to a hang if qemuMonitorIO
tries to lock mutex which no longer exists.
Support for live migration between hosts that do not share storage was
added to qemu-kvm release 0.12.1.
It supports two flags:
-b migration without shared storage with full disk copy
-i migration without shared storage with incremental copy (same base image
shared between source and destination).
I tested the live migration without shared storage (both flags) for native
and p2p with and without tunnelling. I also verified that the fix doesn't
affect normal migration with shared storage.
This introduces a new event type
VIR_DOMAIN_EVENT_ID_IO_ERROR_REASON
This event is the same as the previous VIR_DOMAIN_EVENT_ID_IO_ERROR
event, but also includes a string describing the cause of
the event.
Thus there is a new callback definition for this event type
typedef void (*virConnectDomainEventIOErrorReasonCallback)(virConnectPtr conn,
virDomainPtr dom,
const char *srcPath,
const char *devAlias,
int action,
const char *reason,
void *opaque);
This is currently wired up to the QEMU block IO error events
* daemon/remote.c: Dispatch IO error events to client
* examples/domain-events/events-c/event-test.c: Watch for
IO error events
* include/libvirt/libvirt.h.in: Define new IO error event ID
and callback signature
* src/conf/domain_event.c, src/conf/domain_event.h,
src/libvirt_private.syms: Extend API to handle IO error events
* src/qemu/qemu_driver.c: Connect to the QEMU monitor event
for block IO errors and emit a libvirt IO error event
* src/remote/remote_driver.c: Receive and dispatch IO error
events to application
* src/remote/remote_protocol.x: Wire protocol definition for
IO error events
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c: Watch for BLOCK_IO_ERROR event
from QEMU monitor
The save process was relying on use of the shell >> append
operator to ensure the save data was placed after the libvirt
header + XML. This doesn't work for block devices though.
Replace this code with use of 'dd' and its 'seek' parameter.
This means that we need to pad the header + XML out to a
multiple of dd block size (in this case we choose 512).
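The padding arithmetic is simple enough to show directly. pad_to_block() is a hypothetical helper name; the block size of 512 comes from the text above.

```c
#include <stddef.h>

/* Rounds 'len' up to the next multiple of 'block' (block must be > 0). */
size_t pad_to_block(size_t len, size_t block)
{
    size_t rem = len % block;
    return rem ? len + (block - rem) : len;
}
```

For example, a 700 byte header + XML pads to 1024 bytes, so the save data would start at dd block 2 when the block size is 512.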
The qemuMonitorMigrateToCommand() monitor API is used for both
save/coredump, and migration via UNIX socket. We can't simply
switch this to use 'dd' since this causes problems with the
migration usage. Thus, create a dedicated qemuMonitorMigrateToFile
which can accept a filename + offset, and remove the filename
from the current qemuMonitorMigrateToCommand() API
* src/qemu/qemu_driver.c: Switch to qemuMonitorMigrateToFile
for save and core dump
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h: Create
a new qemuMonitorMigrateToFile, separate from the existing
qemuMonitorMigrateToCommand to allow handling file offsets
The parameter for the qemuMonitorDeviceDel() is a device alias,
not a device config string. Rename the parameter to reflect this
and avoid confusion to readers.
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h:
Rename devicestr to devalias in qemuMonitorDeviceDel()
The QEMU driver is mistakenly calling directly into the text
mode monitor for the domain memory stats query.
* src/qemu/qemu_driver.c: Replace qemuMonitorTextGetMemoryStats with
qemuMonitorGetMemoryStats
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h: Add the new
wrapper for qemuMonitorGetMemoryStats
* src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h: Add
qemuMonitorJSONGetMemoryStats implementation
Use the new virDomainUpdateDeviceFlags API to allow the VNC password
to be changed on the fly
* src/internal.h: Define STREQ_NULLABLE() which is like STREQ()
but does not crash if either argument is NULL, and treats two
NULLs as equal.
* src/libvirt_private.syms: Export virDomainGraphicsTypeToString
* src/qemu/qemu_driver.c: Support VNC password change on a live
machine
* src/qemu/qemu_monitor.c: Disable crazy debugging info. Treat a
NULL password as "" (empty string), allowing passwords to be
disabled in the monitor
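The semantics described for STREQ_NULLABLE() can be written as a small function. This is a sketch of the behaviour, not the macro from src/internal.h; streq_nullable() is a stand-in name.

```c
#include <string.h>

/* Stand-in for the STREQ_NULLABLE() described above: two NULLs compare
 * equal, NULL never equals a non-NULL string, otherwise strcmp decides. */
int streq_nullable(const char *a, const char *b)
{
    if (!a || !b)
        return a == b;
    return strcmp(a, b) == 0;
}
```

This lets the driver compare an old and a new VNC password without special-casing the "no password set" state.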
This introduces a new event type
VIR_DOMAIN_EVENT_ID_GRAPHICS
The same event can be emitted in 3 scenarios
typedef enum {
VIR_DOMAIN_EVENT_GRAPHICS_CONNECT = 0,
VIR_DOMAIN_EVENT_GRAPHICS_INITIALIZE,
VIR_DOMAIN_EVENT_GRAPHICS_DISCONNECT,
} virDomainEventGraphicsPhase;
Connect/disconnect are triggered at socket accept/close.
The initialize phase is immediately after the protocol
setup and authentication has completed. ie when the
client is authorized and about to start interacting with
the graphical desktop
This event comes with *a lot* of potential information
- IP address, port & address family of client
- IP address, port & address family of server
- Authentication scheme (arbitrary string)
- Authenticated subject identity. A subject may have
multiple identities with some authentication schemes.
For example, vencrypt+sasl results in a x509dname
and saslUsername identities.
This results in a very complicated callback :-(
typedef enum {
VIR_DOMAIN_EVENT_GRAPHICS_ADDRESS_IPV4,
VIR_DOMAIN_EVENT_GRAPHICS_ADDRESS_IPV6,
} virDomainEventGraphicsAddressType;
struct _virDomainEventGraphicsAddress {
int family;
const char *node;
const char *service;
};
typedef struct _virDomainEventGraphicsAddress virDomainEventGraphicsAddress;
typedef virDomainEventGraphicsAddress *virDomainEventGraphicsAddressPtr;
struct _virDomainEventGraphicsSubject {
int nidentity;
struct {
const char *type;
const char *name;
} *identities;
};
typedef struct _virDomainEventGraphicsSubject virDomainEventGraphicsSubject;
typedef virDomainEventGraphicsSubject *virDomainEventGraphicsSubjectPtr;
typedef void (*virConnectDomainEventGraphicsCallback)(virConnectPtr conn,
virDomainPtr dom,
int phase,
virDomainEventGraphicsAddressPtr local,
virDomainEventGraphicsAddressPtr remote,
const char *authScheme,
virDomainEventGraphicsSubjectPtr subject,
void *opaque);
The wire protocol is similarly complex
struct remote_domain_event_graphics_address {
int family;
remote_nonnull_string node;
remote_nonnull_string service;
};
const REMOTE_DOMAIN_EVENT_GRAPHICS_IDENTITY_MAX = 20;
struct remote_domain_event_graphics_identity {
remote_nonnull_string type;
remote_nonnull_string name;
};
struct remote_domain_event_graphics_msg {
remote_nonnull_domain dom;
int phase;
remote_domain_event_graphics_address local;
remote_domain_event_graphics_address remote;
remote_nonnull_string authScheme;
remote_domain_event_graphics_identity subject<REMOTE_DOMAIN_EVENT_GRAPHICS_IDENTITY_MAX>;
};
This is currently implemented in QEMU for the VNC graphics
protocol, but designed to be usable with SPICE graphics in
the future too.
* daemon/remote.c: Dispatch graphics events to client
* examples/domain-events/events-c/event-test.c: Watch for
graphics events
* include/libvirt/libvirt.h.in: Define new graphics event ID
and callback signature
* src/conf/domain_event.c, src/conf/domain_event.h,
src/libvirt_private.syms: Extend API to handle graphics events
* src/qemu/qemu_driver.c: Connect to the QEMU monitor event
for VNC events and emit a libvirt graphics event
* src/remote/remote_driver.c: Receive and dispatch graphics
events to application
* src/remote/remote_protocol.x: Wire protocol definition for
graphics events
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c: Watch for VNC_CONNECTED,
VNC_INITIALIZED & VNC_DISCONNECTED events from QEMU monitor
This introduces a new event type
VIR_DOMAIN_EVENT_ID_IO_ERROR
This event includes the action that is about to be taken
as a result of the IO error
typedef enum {
VIR_DOMAIN_EVENT_IO_ERROR_NONE = 0,
VIR_DOMAIN_EVENT_IO_ERROR_PAUSE,
VIR_DOMAIN_EVENT_IO_ERROR_REPORT,
} virDomainEventIOErrorAction;
In addition it has the source path of the disk that had the
error and its unique device alias. It does not include the
target device name (/dev/sda), since this would preclude
triggering IO errors from other file backed devices (eg
serial ports connected to a file)
Thus there is a new callback definition for this event type
typedef void (*virConnectDomainEventIOErrorCallback)(virConnectPtr conn,
virDomainPtr dom,
const char *srcPath,
const char *devAlias,
int action,
void *opaque);
This is currently wired up to the QEMU block IO error events
* daemon/remote.c: Dispatch IO error events to client
* examples/domain-events/events-c/event-test.c: Watch for
IO error events
* include/libvirt/libvirt.h.in: Define new IO error event ID
and callback signature
* src/conf/domain_event.c, src/conf/domain_event.h,
src/libvirt_private.syms: Extend API to handle IO error events
* src/qemu/qemu_driver.c: Connect to the QEMU monitor event
for block IO errors and emit a libvirt IO error event
* src/remote/remote_driver.c: Receive and dispatch IO error
events to application
* src/remote/remote_protocol.x: Wire protocol definition for
IO error events
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c: Watch for BLOCK_IO_ERROR event
from QEMU monitor
This introduces a new event type
VIR_DOMAIN_EVENT_ID_WATCHDOG
This event includes the action that is about to be taken
as a result of the watchdog triggering
typedef enum {
VIR_DOMAIN_EVENT_WATCHDOG_NONE = 0,
VIR_DOMAIN_EVENT_WATCHDOG_PAUSE,
VIR_DOMAIN_EVENT_WATCHDOG_RESET,
VIR_DOMAIN_EVENT_WATCHDOG_POWEROFF,
VIR_DOMAIN_EVENT_WATCHDOG_SHUTDOWN,
VIR_DOMAIN_EVENT_WATCHDOG_DEBUG,
} virDomainEventWatchdogAction;
Thus there is a new callback definition for this event type
typedef void (*virConnectDomainEventWatchdogCallback)(virConnectPtr conn,
virDomainPtr dom,
int action,
void *opaque);
* daemon/remote.c: Dispatch watchdog events to client
* examples/domain-events/events-c/event-test.c: Watch for
watchdog events
* include/libvirt/libvirt.h.in: Define new watchdog event ID
and callback signature
* src/conf/domain_event.c, src/conf/domain_event.h,
src/libvirt_private.syms: Extend API to handle watchdog events
* src/qemu/qemu_driver.c: Connect to the QEMU monitor event
for watchdogs and emit a libvirt watchdog event
* src/remote/remote_driver.c: Receive and dispatch watchdog
events to application
* src/remote/remote_protocol.x: Wire protocol definition for
watchdog events
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c: Watch for WATCHDOG event
from QEMU monitor
This introduces a new event type
VIR_DOMAIN_EVENT_ID_RTC_CHANGE
This event includes the new UTC offset measured in seconds.
Thus there is a new callback definition for this event type
typedef void (*virConnectDomainEventRTCChangeCallback)(virConnectPtr conn,
virDomainPtr dom,
long long utcoffset,
void *opaque);
If the guest XML configuration for the <clock> is set to
offset='variable', then the XML will automatically be
updated with the new UTC offset value. This ensures that
during migration/save/restore the new offset is preserved.
* daemon/remote.c: Dispatch RTC change events to client
* examples/domain-events/events-c/event-test.c: Watch for
RTC change events
* include/libvirt/libvirt.h.in: Define new RTC change event ID
and callback signature
* src/conf/domain_event.c, src/conf/domain_event.h,
src/libvirt_private.syms: Extend API to handle RTC change events
* src/qemu/qemu_driver.c: Connect to the QEMU monitor event
for RTC changes and emit a libvirt RTC change event
* src/remote/remote_driver.c: Receive and dispatch RTC change
events to application
* src/remote/remote_protocol.x: Wire protocol definition for
RTC change events
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c: Watch for RTC_CHANGE event
from QEMU monitor
QEMU has a monitor command 'set_cpu' which allows a specific
CPU to be toggled between online & offline state. libvirt CPU
hotplug does not work in terms of individually indexed CPUs.
Thus to support this, we iteratively toggle the online state
when the total number of vCPUs is adjusted via libvirt.
NB, currently untested since QEMU segvs when running this!
* src/qemu/qemu_driver.c: Toggle online state for CPUs when
doing hotplug
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h: Add
monitor API for toggling a CPU's online status via 'set_cpu'
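The iterative toggling described above could look roughly like this. It is a sketch only: fake_set_cpu() is a hypothetical stand-in for the monitor call, and the real signature in src/qemu/qemu_monitor.c is not shown in this log.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the 'set_cpu' monitor call: records the
 * requested state in an array instead of talking to QEMU. */
static int fake_set_cpu(int cpu, int online, int *state)
{
    state[cpu] = online;
    return 0;
}

/* Move the number of online vCPUs from 'cur' to 'target' one CPU at a
 * time. Returns the resulting count, or -1 on bad input or failure. */
static int adjust_vcpus(int cur, int target, int *state, int max)
{
    assert(state != NULL);
    if (cur < 0 || target < 1 || cur > max || target > max)
        return -1;
    while (cur < target) {              /* plug: bring CPU 'cur' online */
        if (fake_set_cpu(cur, 1, state) < 0)
            return -1;
        cur++;
    }
    while (cur > target) {              /* unplug: take the highest CPU offline */
        if (fake_set_cpu(cur - 1, 0, state) < 0)
            return -1;
        cur--;
    }
    return cur;
}
```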
when the underlying qemu supports the drive/device model and the
controller has been added this way.
* src/qemu/qemu_driver.c: use qemuMonitorDelDevice() when detaching
PCI controller and if supported
* src/qemu/qemu_monitor.[ch]: add new qemuMonitorDelDevice() function
* src/qemu/qemu_monitor_json.[ch]: JSON backend for DelDevice command
* src/qemu/qemu_monitor_text.[ch]: Text backend for DelDevice command
When in JSON mode, QEMU requires that 'qmp_capabilities' is run as
the first command in the monitor. This is a no-op when run in the
text mode monitor
* src/qemu/qemu_driver.c: Run capabilities negotiation when
connecting to the monitor
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h: Add
support for the 'qmp_capabilities' command, no-op in text mode.
The old text mode monitor prompts for a password when disks are
encrypted. This interactive approach doesn't work for JSON mode
monitor. Thus there is a new 'block_passwd' command that can be
used.
* src/qemu/qemu_driver.c: Split out code for looking up a disk
secret from findVolumeQcowPassphrase, into a new method
getVolumeQcowPassphrase. Enhance qemuInitPasswords() to also
set the disk encryption password via the monitor
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h: Add
support for the 'block_passwd' monitor command.
The way QEMU is started has been changed to use '-device' and
the new style '-drive' syntax. This needs to be mirrored in
the hotplug code, requiring addition of two new APIs.
* src/qemu/qemu_monitor.h, src/qemu/qemu_monitor.c: Define APIs
qemuMonitorAddDevice() and qemuMonitorAddDrive()
* src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h:
Implement the new monitor APIs
* src/util/json.c, src/util/json.h: Declare returned strings
to be const
* src/qemu/qemu_monitor.c: Wire up JSON mode for qemuMonitorGetPtyPaths
* src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h: Fix
const correctness. Add missing error message in the function
qemuMonitorJSONGetAllPCIAddresses. Add implementation of the
qemuMonitorGetPtyPaths function calling 'query-chardev'.
Hotunplug of devices requires that we know their PCI address. Even
hotplug of SCSI drives requires that we know the PCI address of
the SCSI controller to attach the drive to. We can find this out
by running 'info pci' and then correlating the vendor/product IDs
with the devices we booted with.
Although this approach is somewhat fragile, it is the only viable
option with QEMU < 0.12, since there is no way for libvirt to set
explicit PCI addresses when creating devices in the first place.
For QEMU >= 0.12, this code will not be used.
* src/qemu/qemu_driver.c: Assign all dynamic PCI addresses on
startup of QEMU VM, matching vendor/product IDs
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h: Add
API for fetching PCI device address mapping
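A rough sketch of the vendor/product correlation described above. The struct and function names are hypothetical, and the input is assumed to be already parsed from 'info pci'. Each configured device takes the first unclaimed probed device with matching IDs, which also shows why the approach is fragile: it relies on devices of the same type appearing in the same order.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    unsigned slot;
    unsigned vendor, product;
    int claimed;
} ProbedDev;          /* one entry parsed from 'info pci' (assumed layout) */

typedef struct {
    unsigned vendor, product;
    int slot;         /* -1 until an address has been assigned */
} ConfigDev;          /* one device the guest was booted with */

/* Give each configured device the first unclaimed probed device with the
 * same vendor/product IDs. Returns the number of devices left unmatched. */
static size_t assign_pci_slots(ConfigDev *cfg, size_t ncfg,
                               ProbedDev *probed, size_t nprobed)
{
    size_t missing = 0;
    assert(cfg != NULL && probed != NULL);
    for (size_t i = 0; i < ncfg; i++) {
        cfg[i].slot = -1;
        for (size_t j = 0; j < nprobed; j++) {
            if (!probed[j].claimed &&
                probed[j].vendor == cfg[i].vendor &&
                probed[j].product == cfg[i].product) {
                probed[j].claimed = 1;
                cfg[i].slot = (int)probed[j].slot;
                break;
            }
        }
        if (cfg[i].slot < 0)
            missing++;
    }
    return missing;
}
```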
The current SCSI hotplug support attaches a brand new SCSI controller
for every disk. This is broken because the semantics differ from those
used when starting the VM initially. In the latter case, each SCSI
controller is filled before a new one is added.
If the user specifies a high drive index (sdazz) then at initial
startup, many intermediate SCSI controllers may be added with no
drives.
This patch changes SCSI hotplug so that it exactly matches the
behaviour of initial startup. First the SCSI controller number is
determined for the drive to be hotplugged. If any controller up to
and including that controller number is not yet present, it is
attached. Then finally the drive is attached to the last controller.
NB, this breaks SCSI hotunplug, because there is no 'drive_del'
command in current QEMU. Previous SCSI hotunplug was broken in
any case because it was unplugging the entire controller, not
just the drive in question.
A future QEMU will allow proper SCSI hotunplug of a drive.
This patch is derived from work done by Wolfgang Mauerer on disk
controllers.
* src/qemu/qemu_driver.c: Fix SCSI hotplug to add a drive to
the correct controller, instead of just attaching a new
controller.
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h: Add
support for 'drive_add' command
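The controller selection described above can be sketched as follows. The figure of 7 drives per controller is an assumption made for illustration, as are the function names. Attaching a controller is reduced to incrementing a counter.

```c
#include <assert.h>
#include <string.h>

/* Assumption: number of drives per SCSI controller. Illustrative only. */
#define UNITS_PER_CONTROLLER 7

/* Convert a name such as "sda" or "sdazz" to a zero-based drive index
 * (sda = 0, sdz = 25, sdaa = 26). Returns -1 on malformed input.
 * Very long names would overflow int; not handled in this sketch. */
static int drive_index(const char *name)
{
    assert(name != NULL);
    if (strncmp(name, "sd", 2) != 0 || name[2] == '\0')
        return -1;
    int idx = 0;
    for (const char *p = name + 2; *p; p++) {
        if (*p < 'a' || *p > 'z')
            return -1;
        idx = idx * 26 + (*p - 'a' + 1);
    }
    return idx - 1;
}

/* Attach every controller up to and including the one the drive needs.
 * '*present' is the number of controllers already attached and is
 * updated in place. Returns the controller the drive lands on. */
static int controllers_for_drive(const char *name, int *present)
{
    assert(present != NULL);
    int idx = drive_index(name);
    if (idx < 0)
        return -1;
    int ctrl = idx / UNITS_PER_CONTROLLER;
    while (*present <= ctrl)
        (*present)++;       /* stand-in for attaching one controller */
    return ctrl;
}
```

Under that assumption, "sdazz" maps to drive index 1377 and therefore to controller 196, which is why a high index drags in many empty intermediate controllers.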
This patch allows for explicit hotplug/unplug of SCSI controllers.
Ordinarily this is not required, since QEMU/libvirt will attach
a new SCSI controller whenever one is required. Allowing explicit
hotplug of controllers though, enables the caller to specify a
static PCI address, instead of auto-assigning the next available
PCI slot. Or it will when we have static PCI addressing.
This patch is derived from Wolfgang Mauerer's disk controller
patch series.
* src/qemu/qemu_driver.c: Support hotplug & unplug of SCSI
controllers
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h: Add
new API for attaching PCI SCSI controllers
Convert the QEMU monitor APIs over to use virDomainDeviceAddress
structs for passing addresses in/out, instead of individual bits.
This makes the number of parameters smaller & easier to deal with.
No functional change
* src/qemu/qemu_driver.c, src/qemu/qemu_monitor.c,
src/qemu/qemu_monitor.h, src/qemu/qemu_monitor_text.c,
src/qemu/qemu_monitor_text.h: Change monitor hotplug APIs to
take an explicit address ptr for all host/guest addresses
This change makes the QEMU driver get pty paths from the output of the
monitor 'info chardev' command. This output is structured, and contains
both the name of the device and the path on the same line. This is
considerably more reliable than parsing the startup log output, which
requires the parsing code to know which order QEMU will print pty
information in.
Note that we still need to parse the log output as the monitor itself
may be on a pty. This should be rare, however, and the new code will
replace all pty paths parsed by the log output method once the monitor
is available.
* src/qemu/qemu_monitor.(c|h), src/qemu/qemu_monitor_text.(c|h): Implement
qemuMonitorGetPtyPaths().
* src/qemu/qemu_driver.c: Get pty path information using
qemuMonitorGetPtyPaths().
The QEMU 0.10.0 release (and possibly other 0.10.x) has a bug where
it sometimes/often forgets to display the initial monitor greeting
line, solely printing a (qemu). This in turn confuses the text
console parsing because it has a '(qemu)' it is not expecting. The
confusion results in a negative malloc. Bad things follow.
This re-writes the text console handling to be more robust. The key
idea is that it should only look for a (qemu), once it has seen the
original command echo'd back. This ensures it'll skip the bogus stray
(qemu) with broken QEMUs.
* src/qemu/qemu_monitor.c: Add some (disabled) debug code
* src/qemu/qemu_monitor_text.c: Re-write way command replies
are detected
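The key idea above (only accept a prompt once the command echo has been seen) might be sketched like this. It is a simplification: the real monitor stream also contains terminal control sequences, and find_reply() is a hypothetical name, not the function in qemu_monitor_text.c.

```c
#include <assert.h>
#include <string.h>

#define PROMPT "(qemu) "

/* Look for a complete reply to 'cmd' in 'buf'. The prompt only counts
 * once the echoed command has been seen, so a stray leading "(qemu) "
 * is skipped. On success returns a pointer to the start of the reply
 * and stores its length in *len; returns NULL if more data is needed. */
static const char *find_reply(const char *buf, const char *cmd, size_t *len)
{
    assert(buf != NULL && cmd != NULL && len != NULL);
    const char *echo = strstr(buf, cmd);
    if (!echo)
        return NULL;                    /* command not echoed yet */
    const char *start = echo + strlen(cmd);
    const char *end = strstr(start, PROMPT);
    if (!end)
        return NULL;                    /* reply still incomplete */
    *len = (size_t)(end - start);
    return start;
}
```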
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h: Add callbacks
for reset, shutdown, poweroff and stop events. Add convenience
methods for emitting those events
With addition of events there will be a lot of callbacks.
To avoid having to add many APIs to register callbacks,
provide them all at once in a big table
* src/qemu/qemu_driver.c: Pass in a callback table to QEMU
monitor code
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h: Replace
the EOF and disk secret callbacks with a callback table
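A callback table of the kind described above could be shaped roughly as follows. The type and field names are hypothetical and do not reproduce the actual qemu_monitor.h definitions. count_stop() is a sample handler included only to exercise the emitter.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Monitor Monitor;

/* Hypothetical callback table; one slot per event, all supplied at once. */
typedef struct {
    void (*eof)(Monitor *mon, void *opaque);
    void (*reset)(Monitor *mon, void *opaque);
    void (*shutdown)(Monitor *mon, void *opaque);
    void (*stop)(Monitor *mon, void *opaque);
} MonitorCallbacks;

struct Monitor {
    const MonitorCallbacks *cb;
    void *opaque;
};

/* Convenience emitter: a callback left NULL is simply skipped.
 * Returns 1 if a handler ran, 0 otherwise. */
static int monitor_emit_stop(Monitor *mon)
{
    assert(mon != NULL);
    if (mon->cb && mon->cb->stop) {
        mon->cb->stop(mon, mon->opaque);
        return 1;
    }
    return 0;
}

/* Sample handler: counts how many stop events were delivered. */
static void count_stop(Monitor *mon, void *opaque)
{
    (void)mon;
    ++*(int *)opaque;
}
```

With a table, adding a new event means adding one field and one emitter, rather than a new registration API.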
Initial support for the new QEMU monitor protocol using JSON
as the data encoding format instead of plain text
* po/POTFILES.in: Add src/qemu/qemu_monitor_json.c
* src/qemu/qemu_conf.c, src/qemu/qemu_conf.h: Hack to turn on QMP
mode. Replace with a version number check on >= 0.12 later
* src/qemu/qemu_monitor.c: Delegate to json monitor if enabled
* src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h: Add
impl of QMP protocol
* src/Makefile.am: Add src/qemu/qemu_monitor_json.{c,h}
Now that drivers are using a private domain object state blob,
the virDomainObjFormat/Parse methods are no longer able to
directly serialize all necessary state to/from XML. It is
thus necessary to introduce a pair of callbacks for serializing
private state.
The code for serializing vCPU PIDs and the monitor device
config can now move out of domain_conf.c and into the
qemu_driver.c where they belong.
* src/conf/capabilities.h: Add callbacks for serializing private
state to/from XML
* src/conf/domain_conf.c, src/conf/domain_conf.h: Remove the
monitor, monitor_chr, monitorWatch, nvcpupids and vcpupids
fields from virDomainObjPtr. Remove code that serialized
those fields
* src/libvirt_private.syms: Export virXPathBoolean
* src/qemu/qemu_driver.c: Add callbacks for serializing monitor
and vcpupid data to/from XML
* src/qemu/qemu_monitor.h, src/qemu/qemu_monitor.c: Pass monitor
char device config into qemuMonitorOpen directly.
The current QEMU disk media change does not support setting the
disk format. The new JSON monitor will support this, so add an
extra parameter to pass this info in
* src/qemu/qemu_driver.c: Pass in disk format when changing media
* src/qemu/qemu_monitor.h, src/qemu/qemu_monitor.c,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h:
Add a 'format' arg to qemuMonitorChangeMedia()
The qemuMonitorEscape() method, and the VIR_ENUM for migration
status will be needed by the JSON monitor too, so move that code
into the shared qemu_monitor.c file instead of qemu_monitor_text.c
* src/qemu/qemu_monitor.h: Declare qemuMonitorMigrationStatus enum
and qemuMonitorEscapeArg and qemuMonitorEscapeShell methods
* src/qemu/qemu_monitor.c: Implement qemuMonitorMigrationStatus enum
and qemuMonitorEscapeArg and qemuMonitorEscapeShell methods
* src/qemu/qemu_monitor_text.c: Remove above methods/enum
If QEMU shuts down while we're in the middle of processing a
monitor command, the monitor will be freed, and upon cleaning
up we attempt to do qemuMonitorUnlock(priv->mon) when priv->mon
is NULL.
To address this we introduce proper reference counting into
the qemuMonitorPtr object, and hold an extra reference whenever
executing a command.
* src/qemu/qemu_driver.c: Hold a reference on the monitor while
executing commands, and only NULL-ify the priv->mon field when
the last reference is released
* src/qemu/qemu_monitor.h, src/qemu/qemu_monitor.c: Add reference
counting to handle safe deletion of monitor objects
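The reference counting scheme above can be illustrated with a minimal, single-threaded sketch. The names are hypothetical and the locking that a real monitor object needs is omitted.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    int refs;
    int fd;     /* placeholder for real monitor state */
} Monitor;

/* New objects start with one reference, owned by the creator. */
static Monitor *monitor_new(void)
{
    Monitor *mon = calloc(1, sizeof(*mon));
    if (mon)
        mon->refs = 1;
    return mon;
}

/* Taken before executing a command so the object outlives it. */
static void monitor_ref(Monitor *mon)
{
    assert(mon != NULL && mon->refs > 0);
    mon->refs++;
}

/* Returns the remaining count; the object is freed when it hits zero,
 * and only then may the owner clear its pointer. */
static int monitor_unref(Monitor *mon)
{
    assert(mon != NULL && mon->refs > 0);
    int left = --mon->refs;
    if (left == 0)
        free(mon);
    return left;
}
```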
The QEMU monitor open method would not take a reference on
the virDomainObjPtr until it had successfully opened the
monitor. The cleanup code upon failure to open though would
call qemuMonitorClose() which would in turn decrement the
reference count. This caused the virDomainObjPtr to be mistakenly
freed and then the whole driver crashes.
* src/qemu/qemu_monitor.c: Fix reference counting in
qemuMonitorOpen
Change the QEMU monitor file handle watch to poll for both
read & write events, as well as EOF. All I/O to/from the
QEMU monitor FD is now done in the event callback thread.
When the QEMU driver needs to send a command, it puts the
data to be sent into a qemuMonitorMessagePtr object instance,
queues it for dispatch, and then goes to sleep on a condition
variable. The event thread sends all the data, and then waits
for the reply to arrive, putting the response / error data
back into the qemuMonitorMessagePtr and notifying the condition
variable.
There is a temporary hack in the disk passphrase callback to
avoid acquiring the domain lock. This avoids a deadlock in
the command processing, since the domain lock is still held
when running monitor commands. The next commit will remove
the locking when running commands & thus allow re-introduction
of locking in the disk passphrase callback.
* src/qemu/qemu_driver.c: Temporarily don't acquire lock in
disk passphrase callback. To be reverted in next commit
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h: Remove
raw I/O functions, and add a generic qemuMonitorSend() for
invoking a command
* src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h:
Remove all low level I/O, and use the new qemuMonitorSend()
API. Provide a qemuMonitorTextIOProcess() method for detecting
command/reply/prompt boundaries in the monitor data stream
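The send/wait handshake described above might be sketched with POSIX threads as follows. This is an illustration under simplifying assumptions: the Message type and function names are hypothetical, the "event thread" is spawned per message rather than being a long-lived event loop, and the reply is fabricated instead of read from a file descriptor.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  notify;
    const char *tx;     /* command to send */
    char rx[64];        /* reply filled in by the event thread */
    int finished;
} Message;

/* Stand-in for the event thread: "sends" the command, stores a reply,
 * then wakes up the waiting caller. */
static void *event_thread(void *arg)
{
    Message *msg = arg;
    pthread_mutex_lock(&msg->lock);
    snprintf(msg->rx, sizeof(msg->rx), "reply to %s", msg->tx);
    msg->finished = 1;
    pthread_cond_signal(&msg->notify);
    pthread_mutex_unlock(&msg->lock);
    return NULL;
}

/* Caller side: hand the message over, then sleep on the condition
 * variable until the reply has arrived. Returns 0 on success. */
static int monitor_send(Message *msg, const char *cmd)
{
    pthread_t tid;
    assert(msg != NULL && cmd != NULL);
    pthread_mutex_init(&msg->lock, NULL);
    pthread_cond_init(&msg->notify, NULL);
    msg->tx = cmd;
    msg->rx[0] = '\0';
    msg->finished = 0;
    if (pthread_create(&tid, NULL, event_thread, msg) != 0)
        return -1;
    pthread_mutex_lock(&msg->lock);
    while (!msg->finished)              /* guard against spurious wakeups */
        pthread_cond_wait(&msg->notify, &msg->lock);
    pthread_mutex_unlock(&msg->lock);
    pthread_join(tid, NULL);
    pthread_cond_destroy(&msg->notify);
    pthread_mutex_destroy(&msg->lock);
    return 0;
}
```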