Commit Graph

1746 Commits

Author SHA1 Message Date
Peter Krempa
6bc80fa86d conf: numa: Rename virDomainNumatune to virDomainNuma
The structure will gradually become the only place for NUMA related
config, thus rename it appropriately.
2015-02-20 17:43:04 +01:00
Peter Krempa
0df2f0404f qemu: Exit job on error path of qemuDomainSetVcpusFlags()
Commit e105dc9814 moved some code but
didn't adjust the jump labels so that the job would be terminated.
2015-02-18 18:17:54 +01:00
Luyao Huang
337265bb52 qemu: fix vm deadlock when try to use numatune in session mode
https://bugzilla.redhat.com/show_bug.cgi?id=1126762

Commit 43b67f introduced a deadlock issue when we use numatune
to change numa settings to a vm in session mode.

Jump to endjob instead of jump to cleanup.

Signed-off-by: Luyao Huang <lhuang@redhat.com>
2015-02-17 11:08:00 +01:00
Daniel P. Berrange
9358b63a0d qemu: do upfront check for vcpupids being null when querying pinning
The qemuDomainHelperGetVcpus attempted to report an error when the
vcpupids info was NULL. Unfortunately earlier code would clamp the
value of 'maxinfo' to 0 when nvcpupids was 0, so the error reporting
would end up being skipped.

This lead to 'virsh vcpuinfo <dom>' just returning an empty list
instead of giving the user a clear error.
2015-02-12 10:02:50 +00:00
Martin Kletzander
104ba5966a qemu: Add support for setting vCPU and I/O thread scheduler setting
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1178986

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2015-02-11 17:30:07 +01:00
John Ferlan
92f09dab50 qemu: qemuOpenFileAs - set flag VIR_FILE_OPEN_FORCE_MODE
In the event we're falling into the code that tries to create the file
in a forked environment (VIR_FILE_OPEN_FORK) we pass different mode bits,
but those are never set because the virFileOpenForceOwnerMode has a check
if the OPEN_FORCE_MODE bit is set before attempting to change the mode.

Since this is a special case it seems reasonable to set u+rw,g+rw,o
2015-02-11 07:29:29 -05:00
Ján Tomko
a7c9c7a6ab Fix qemu job handling in SetSchedulerParameters
Commit c5ee5cf added a job to SetSchedulerParameters, but
forgot to change one label in the SCHED_RANGE_CHECK macro.
2015-02-10 14:36:03 +01:00
Luyao Huang
862473fa12 qemu: Implement random number generator hotunplug
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
2015-02-10 13:05:23 +01:00
Luyao Huang
980b265d08 qemu: Implement random number generator hotplug
Export the required helpers and add backend code to hotplug RNG devices.

Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
2015-02-10 13:05:22 +01:00
Peter Krempa
fe6acfbd0e qemu: Implement random number generator cold (un)plug
Add support for using the attach/detach device APIs on the inactive
configuration to add RNG devices.
2015-02-10 13:05:22 +01:00
Daniel P. Berrange
b38da58423 Make tests independant of system page size
Some code paths have special logic depending on the page size
reported by sysconf, which in turn affects the test results.
We must mock this so tests always have a consistent page size.
2015-02-02 20:27:43 +00:00
Tony Krowiak
79a8769479 qemu: change macvtap device options in response to NIC_RX_FILTER_CHANGED
This patch enables synchronization of the host macvtap
device options with the guest device's in response to the
NIC_RX_FILTER_CHANGED event.

The following device options will be synchronized:
* PROMISC
* MULTICAST
* ALLMULTI

Signed-off-by: Tony Krowiak <akrowiak@linux.vnet.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2015-01-30 13:16:28 +01:00
John Ferlan
7879d03197 qemu: Don't unconditionally delete file in qemuOpenFileAs
https://bugzilla.redhat.com/show_bug.cgi?id=1158034

If we're expecting to create a file somewhere and that fails for some
reason during qemuOpenFileAs, then we unlink the path we're attempting
to create leaving no way to determine what the "existing" privileges,
protections, or labels are that caused the failure (open, change owner
and group, change mode, etc.).

Furthermore, if we fall into the path where we'll be opening / creating
the file using VIR_FILE_OPEN_FORK, we need to first unlink/delete the file
we created in the first path; otherwise, the attempt by the child process
to open as some specific user:group may fail because the file was already
created using nfsnobody:nfsnobody. Again, if we didn't create the file we
don't want to blindly delete what already exists. Thus, a second reason for
the original check to set need_unlink to false when we find the file with
CREAT set, but already existing.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2015-01-29 15:37:34 -05:00
John Ferlan
8ff383366b qemu: Adjust EndAsyncJob for qemuDomainSaveInternal error path
Commit id '540c339a' to fix issues with reference counting and transient
domains moved the qemuDomainObjEndAsyncJob call prior to the attempt to
restart the guest CPU's resulting in an error:

    error: Failed to save domain rhel70 to /tmp/pl/rhel70.save
    error: internal error: unexpected async job 3

when (ret != 0) - eg, the error path from qemuDomainSaveMemory.

This patch will adjust the logic to call the EndAsyncJob only after
we've tried to restart the guest CPUs. It also needs to adjust the
test for qemuDomainRemoveInactive to add the ret == 0 condition.

Additionally, if we get to endjob: because of some error earlier, then
we need to save that error in the event the CPU restart logic fails.
We don't want to return the error from CPU restart failure, rather we
want to return the error from the failed save that caused us to fall
into the retry to start the CPU logic.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2015-01-29 12:10:41 -05:00
Luyao Huang
f76df311e8 qemu: fix cannot set graphic passwd via qemuDomainSaveImageDefineXML
https://bugzilla.redhat.com/show_bug.cgi?id=1183890

When we try to update a xml to a image file, we will clear the
graphics passwd settings, because we do not pass VIR_DOMAIN_XML_SECURE
to qemuDomainDefCopy, qemuDomainDefFormatBuf won't format the passwd.

Add VIR_DOMAIN_XML_SECURE flag when we call qemuDomainDefCopy
in qemuDomainSaveImageUpdateDef.

Signed-off-by: Luyao Huang <lhuang@redhat.com>
2015-01-28 16:56:34 +01:00
Daniel P. Berrange
f7afeddce9 qemu: report TAP device indexes to systemd
Record the index of each TAP device created and report them to
systemd, so they show up in machinectl status for the VM.
2015-01-27 13:57:02 +00:00
Daniel P. Berrange
55ea7be7d9 Removing probing of secondary drivers
For stateless, client side drivers, it is never correct to
probe for secondary drivers. It is only ever appropriate to
use the secondary driver that is associated with the
hypervisor in question. As a result the ESX & HyperV drivers
have both been forced to do hacks where they register no-op
drivers for the ones they don't implement.

For stateful, server side drivers, we always just want to
use the same built-in shared driver. The exception is
virtualbox which is really a stateless driver and so wants
to use its own server side secondary drivers. To deal with
this virtualbox has to be built as 3 separate loadable
modules to allow registration to work in the right order.

This can all be simplified by introducing a new struct
recording the precise set of secondary drivers each
hypervisor driver wants

struct _virConnectDriver {
    virHypervisorDriverPtr hypervisorDriver;
    virInterfaceDriverPtr interfaceDriver;
    virNetworkDriverPtr networkDriver;
    virNodeDeviceDriverPtr nodeDeviceDriver;
    virNWFilterDriverPtr nwfilterDriver;
    virSecretDriverPtr secretDriver;
    virStorageDriverPtr storageDriver;
};

Instead of registering the hypervisor driver, we now
just register a virConnectDriver instead. This allows
us to remove all probing of secondary drivers. Once we
have chosen the primary driver, we immediately know the
correct secondary drivers to use.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2015-01-27 12:02:04 +00:00
Daniel P. Berrange
7b1ba9566b Remove use of nwfilterPrivateData from nwfilter driver
The nwfilter driver can rely on its global state instead
of the connect private data.
2015-01-27 12:02:03 +00:00
Peter Krempa
d13f56f08a qemu: Fix job handling in qemuDomainSetMetadata
The code modifies the domain configuration but doesn't take a MODIFY
type job to do so.
2015-01-27 10:39:21 +01:00
Peter Krempa
fb2ed975c3 qemu: Fix job type in qemuDomainGetBlockIoTune
The function just queries status so there's no need for a MODIFY type
job.
2015-01-27 10:39:21 +01:00
Peter Krempa
c5ee5cfb18 qemu: Fix job handling in qemuDomainSetSchedulerParametersFlags
The code modifies the domain configuration but doesn't take a MODIFY
type job to do so.
2015-01-27 10:38:47 +01:00
Peter Krempa
4fd7a72075 qemu: Fix job handling in qemuDomainSetMemoryParameters
The code modifies the domain configuration but doesn't take a MODIFY
type job to do so.
2015-01-27 10:24:04 +01:00
Peter Krempa
e3e72743df qemu: Fix job handling in qemuDomainSetAutostart
The code modifies the domain configuration but doesn't take a MODIFY
type job to do so.

This patch also fixes a few very long lines of code around the touched
parts.
2015-01-27 10:24:04 +01:00
Peter Krempa
79e5603307 qemu: Fix job handling in qemuDomainPinEmulator
The code modifies the domain configuration but doesn't take a MODIFY
type job to do so.
2015-01-27 10:24:04 +01:00
Peter Krempa
46d950443d qemu: Fix job handling in qemuDomainPinVcpuFlags
The domain modifies the domain configuration but doesn't take a MODIFY
type job to do it.
2015-01-27 10:24:03 +01:00
Peter Krempa
b347c0c2a3 CVE-2015-0236: qemu: Check ACLs when dumping security info from snapshots
The ACL check didn't check the VIR_DOMAIN_XML_SECURE flag and the
appropriate permission for it. Found via code inspection while fixing
permissions for save images.
2015-01-22 14:32:54 +01:00
Peter Krempa
03c3c0c874 CVE-2015-0236: qemu: Check ACLs when dumping security info from save image
The ACL check didn't check the VIR_DOMAIN_XML_SECURE flag and the
appropriate permission for it.
2015-01-22 14:32:54 +01:00
Ján Tomko
5c703ca396 Always check return value of qemuDomainObjExitMonitor
Depending on the context, either error out if the domain
has disappeared in the meantime, or just ignore the value
to allow marking the function as ATTRIBUTE_RETURN_CHECK.
2015-01-19 10:12:32 +01:00
Ján Tomko
6edb97f29a Fix vmdef usage after domain crash in monitor on device detach
https://bugzilla.redhat.com/show_bug.cgi?id=1161024

In the device type-specific functions, exit early
if the domain has disappeared, because the cleanup
should have been done by qemuProcessStop.

Check the return value in processDeviceDeletedEvent
and qemuProcessUpdateDevices.

Skip audit and removing the device from live def because
it has already been cleaned up.
2015-01-19 10:12:07 +01:00
Daniel P. Berrange
dd69a14f90 Add support for schema validation when passing in XML
The virDomainDefineXMLFlags and virDomainCreateXML APIs both
gain new flags allowing them to be told to validate XML.
This updates all the drivers to turn on validation in the
XML parser when the flags are set
2015-01-15 16:40:27 +00:00
Ján Tomko
3f21398437 Fix vmdef usage while in monitor in BlockStat* APIs
Make a local copy of the disk alias instead of pointing
to the domain definition, which might get freed if
the domain dies while we're in monitor.

Also exit early if that happens.
2015-01-14 19:30:32 +01:00
Ján Tomko
051add2ff9 Fix vmdef usage while in monitor in qemuDomainHotplugVcpus
Exit the monitor right after we've done with it to get
the virDomainObjPtr lock back, otherwise we might be accessing
vm->def while it's being cleaned up by qemuProcessStop.

If the domain crashed while we were in the monitor, exit
early instead of changing vm->def which is now the persistent
definition.
2015-01-14 19:30:32 +01:00
Pavel Hrdina
e105dc9814 qemu_driver: fix setting vcpus for offline domain
Commit e3435caf fixed hot-plugging of vcpus with strict memory pinning
on NUMA hosts, but unfortunately it also broke updating number of vcpus
for offline guests using our API.

The issue is that we try to create a cpu cgroup for non-running guest
which fails as there are no cgroups for that domain. We should create
cgroups and update cpuset.mems only if we are hot-plugging.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2015-01-14 10:34:20 +01:00
Shanzhi Yu
9f974858dd qemu: snapshot: inactive external snapshot can't work after libvirtd restart
When create inactive external snapshot, after update disk definitions,
virDomainSaveConfig is needed, if not after restart libvirtd the new snapshot
file definitions in xml will be lost.

Reproduce steps:

1. prepare a shut off guest
$ virsh domstate rhel7 && virsh domblklist rhel7
shut off

Target     Source
------------------------------------------------
vda        /var/lib/libvirt/images/rhel7.img

2. create external disk snapshot
$ virsh snapshot-create rhel7 --disk-only && virsh domblklist rhel7
Domain snapshot 1417882967 created
Target     Source
------------------------------------------------
vda        /var/lib/libvirt/images/rhel7.1417882967

3. restart libvirtd then check guest source file
$ service  libvirtd restart && virsh domblklist rhel7
Redirecting to /bin/systemctl restart  libvirtd.service
Target     Source
------------------------------------------------
vda        /var/lib/libvirt/images/rhel7.img

This was first reported by Eric Blake
http://www.redhat.com/archives/libvir-list/2014-December/msg00369.html

Signed-off-by: Shanzhi Yu <shyu@redhat.com>
2015-01-13 15:59:06 -05:00
Daniel P. Berrange
0ecd685109 Give virDomainDef parser & formatter their own flags
The virDomainDefParse* and virDomainDefFormat* methods both
accept the VIR_DOMAIN_XML_* flags defined in the public API,
along with a set of other VIR_DOMAIN_XML_INTERNAL_* flags
defined in domain_conf.c.

This is seriously confusing & error prone for a number of
reasons:

 - VIR_DOMAIN_XML_SECURE, VIR_DOMAIN_XML_MIGRATABLE and
   VIR_DOMAIN_XML_UPDATE_CPU are only relevant for the
   formatting operation
 - Some of the VIR_DOMAIN_XML_INTERNAL_* flags only apply
   to parse or to format, but not both.

This patch cleanly separates out the flags. There are two
distint VIR_DOMAIN_DEF_PARSE_* and VIR_DOMAIN_DEF_FORMAT_*
flags that are used by the corresponding methods. The
VIR_DOMAIN_XML_* flags received via public API calls must
be converted to the VIR_DOMAIN_DEF_FORMAT_* flags where
needed.

The various calls to virDomainDefParse which hardcoded the
use of the VIR_DOMAIN_XML_INACTIVE flag change to use the
VIR_DOMAIN_DEF_PARSE_INACTIVE flag.
2015-01-13 16:26:12 +00:00
Eric Blake
e1125cebfc qemu: forbid second blockcommit during active commit
https://bugzilla.redhat.com/show_bug.cgi?id=1135339 documents some
confusing behavior when a user tries to start an inactive block
commit in a second connection while there is already an on-going
active commit from a first connection.  Eventually, qemu will
support multiple simultaneous block jobs, but as of now, it does
not; furthermore, libvirt also needs an overhaul before we can
support simultaneous jobs.  So, the best way to avoid confusing
ourselves is to quit relying on qemu to tell us about the situation
(where we risk getting in weird states) and instead forbid a
duplicate block commit ourselves.

Note that we are still relying on qemu to diagnose attempts to
interrupt an inactive commit (since we only track XML of an active
commit), but as inactive commit is less confusing for libvirt to
manage, there is less that can go wrong by leaving that detection
up to qemu.

* src/qemu/qemu_driver.c (qemuDomainBlockCommit): Hoist check for
active commit to occur earlier outside of conditions.

Signed-off-by: Eric Blake <eblake@redhat.com>
2015-01-13 08:21:20 -07:00
Daniel P. Berrange
4d2ebc71ce Add stub virDomainDefineXMLFlags impls
Make sure every virt driver implements virDomainDefineXMLFlags
by adding a trivial passthrough from the existing impl with
no flags set.
2015-01-13 10:38:56 +00:00
Luyao Huang
565d049fd1 qemu: Restore old bandwidth rules when setting new fails
https://bugzilla.redhat.com/show_bug.cgi?id=1177723

When setting new bandwidth limits via
virDomainSetInterfaceParameters, the old ones are cleared first.
However, if setting the new ones fails, the old are already gone
and interface is left in inconsistent state.  Therefore, right
before failing we ought to try to restore the old bandwidth.

Signed-off-by: Luyao Huang <lhuang@redhat.com>
2015-01-06 13:27:43 +01:00
Cédric Bosdonnat
aa2cc72100 Domain conf: allow more than one IP address for net devices
Add the possibility to have more than one IP address configured for a
domain network interface. IP addresses can also have a prefix to define
the corresponding netmask.
2015-01-05 20:24:04 +01:00
Martin Kletzander
31354b5b32 qemu: Fix coverity issues after refcount refactoring
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-12-23 05:34:05 +01:00
Martin Kletzander
540c339a25 qemu: completely rework reference counting
There is one problem that causes various errors in the daemon.  When
domain is waiting for a job, it is unlocked while waiting on the
condition.  However, if that domain is for example transient and being
removed in another API (e.g. cancelling incoming migration), it get's
unref'd.  If the first call, that was waiting, fails to get the job, it
unref's the domain object, and because it was the last reference, it
causes clearing of the whole domain object.  However, when finishing the
call, the domain must be unlocked, but there is no way for the API to
know whether it was cleaned or not (unless there is some ugly temporary
variable, but let's scratch that).

The root cause is that our APIs don't ref the objects they are using and
all use the implicit reference that the object has when it is in the
domain list.  That reference can be removed when the API is waiting for
a job.  And because each domain doesn't do its ref'ing, it results in
the ugly checking of the return value of virObjectUnref() that we have
everywhere.

This patch changes qemuDomObjFromDomain() to ref the domain (using
virDomainObjListFindByUUIDRef()) and adds qemuDomObjEndAPI() which
should be the only function in which the return value of
virObjectUnref() is checked.  This makes all reference counting
deterministic and makes the code a bit clearer.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-12-21 10:48:56 +01:00
Eric Blake
af5c3a1015 qemu: fix memory leak in blockinfo
Coverity flagged commit 0282ca45 as introducing a memory leak;
in all my refactoring to make capacity probing conditional on
whether the image is non-raw, I missed deleting the unconditional
probe.

* src/qemu/qemu_driver.c (qemuStorageLimitsRefresh): Drop
redundant assignment.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-12-17 16:10:45 -07:00
Eric Blake
3937ef9cf4 getstats: crawl backing chain for qemu
Wire up backing chain recursion.  For the first time, it is now
possible to get libvirt to expose that qemu tracks read statistics
on backing files, as well as report maximum extent written on a
backing file during a block-commit operation.

For a running domain, where one of the two images has a backing
file, I see the traditional output:

$ virsh domstats --block testvm2
Domain: 'testvm2'
  block.count=2
  block.0.name=vda
  block.0.path=/tmp/wrapper.qcow2
  block.0.rd.reqs=1
  block.0.rd.bytes=512
  block.0.rd.times=28858
  block.0.wr.reqs=0
  block.0.wr.bytes=0
  block.0.wr.times=0
  block.0.fl.reqs=0
  block.0.fl.times=0
  block.0.allocation=0
  block.0.capacity=1310720000
  block.0.physical=200704
  block.1.name=vdb
  block.1.path=/dev/sda7
  block.1.rd.reqs=0
  block.1.rd.bytes=0
  block.1.rd.times=0
  block.1.wr.reqs=0
  block.1.wr.bytes=0
  block.1.wr.times=0
  block.1.fl.reqs=0
  block.1.fl.times=0
  block.1.allocation=0
  block.1.capacity=1310720000

vs. the new output:

$ virsh domstats --block --backing testvm2
Domain: 'testvm2'
  block.count=3
  block.0.name=vda
  block.0.path=/tmp/wrapper.qcow2
  block.0.rd.reqs=1
  block.0.rd.bytes=512
  block.0.rd.times=28858
  block.0.wr.reqs=0
  block.0.wr.bytes=0
  block.0.wr.times=0
  block.0.fl.reqs=0
  block.0.fl.times=0
  block.0.allocation=0
  block.0.capacity=1310720000
  block.0.physical=200704
  block.1.name=vda
  block.1.path=/dev/sda6
  block.1.backingIndex=1
  block.1.rd.reqs=0
  block.1.rd.bytes=0
  block.1.rd.times=0
  block.1.wr.reqs=0
  block.1.wr.bytes=0
  block.1.wr.times=0
  block.1.fl.reqs=0
  block.1.fl.times=0
  block.1.allocation=327680
  block.1.capacity=786432000
  block.2.name=vdb
  block.2.path=/dev/sda7
  block.2.rd.reqs=0
  block.2.rd.bytes=0
  block.2.rd.times=0
  block.2.wr.reqs=0
  block.2.wr.bytes=0
  block.2.wr.times=0
  block.2.fl.reqs=0
  block.2.fl.times=0
  block.2.allocation=0
  block.2.capacity=1310720000

I may later do a patch that trims the output to avoid 0 stats,
particularly for backing files (which are more likely to have
0 stats, at least for write statistics when no block-commit
is performed).  Also, I still plan to expose physical size
information (qemu doesn't expose it yet, so it requires a stat,
and for block devices, a further open/seek operation).  But
this patch is good enough without worrying about that yet.

* src/qemu/qemu_driver.c (QEMU_DOMAIN_STATS_BACKING): New internal
enum bit.
(qemuConnectGetAllDomainStats): Recognize new user flag, and pass
details to...
(qemuDomainGetStatsBlock): ...here, where we can do longer recursion.
(qemuDomainGetStatsOneBlock): Output new field.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-12-17 02:07:44 -07:00
Eric Blake
c2d380bff8 getstats: split block stats reporting for easier recursion
In order to report stats on backing chains, we need to separate
the output of stats for one block from how we traverse blocks.

* src/qemu/qemu_driver.c (qemuDomainGetStatsBlock): Split...
(qemuDomainGetStatsOneBlock): ...into new helper.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-12-17 02:07:44 -07:00
Eric Blake
14ef1f62e3 getstats: prepare for dynamic block.count stat
A coming patch will make it optionally possible to list backing
chain block stats; in this mode of operation, block.counts is no
longer the number of <disks> in the domain, but the number of
blocks in the array being reported.  We still want block.count
listed first, but rather than iterate the tree twice (once to
count, and once to list stats), it's easier to just touch things
up after the fact.

* src/qemu/qemu_driver.c (qemuDomainGetStatsBlock): Compute count
after the fact.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-12-17 00:20:21 -07:00
Eric Blake
596a137134 getstats: report block sizes for offline domains
The prior refactoring can now be put to use. With the same domain
as the earlier commit 7b49926 (one qcow2 disk and an empty
cdrom drive):
$ virsh domstats --block foo
Domain: 'foo'
  block.count=2
  block.0.name=hda
  block.0.path=/var/lib/libvirt/images/foo.qcow2
  block.0.allocation=1309614080
  block.0.capacity=42949672960
  block.0.physical=1309671424
  block.1.name=hdc

* src/qemu/qemu_driver.c (qemuDomainGetStatsBlock): Use
qemuStorageLimitsRefresh to report offline statistics.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-12-17 00:20:21 -07:00
Eric Blake
8de6544e98 qemu: refactor blockinfo data gathering
Create a helper function that can be reused for gathering block
info from virDomainListGetStats.

* src/qemu/qemu_driver.c (qemuDomainGetBlockInfo): Split guts...
(qemuStorageLimitsRefresh): ...into new helper function.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-12-16 23:28:36 -07:00
Eric Blake
0282ca45a0 qemu: fix bugs in blockstats
The documentation for virDomainBlockInfo was confusing: it stated
that 'physical' was the size of the container, then gave an example
of it being the amount of storage used by a sparse file (that is,
for a sparse raw image on a regular file, the wording implied
capacity==physical, while allocation was smaller; but the example
instead claimed physical==allocation).  Since we use 'physical' for
the last offset of a block device, we should do likewise for
regular files.

Furthermore, the example claimed that for a qcow2 regular file,
allocation==physical.  At the time the code was first written,
this was true (qcow2 files were allocated sequentially, and were
never sparse, so the last sector written happened to also match
the disk space occupied); but modern qemu does much better and
can punch holes for a qcow2 with allocation < physical.

Basically, after this patch, the three fields are now reliably
mapped as:
 'capacity' - how much storage the guest can see (equal to
physical for raw images, determined by image metadata otherwise)
 'allocation' - how much storage the image occupies (similar to
what 'du' would report)
 'physical' - the last offset of the image (similar to what 'ls'
would report)

'capacity' can be larger than 'physical' (such as for a qcow2
image that does not vary much from a backing file) or smaller
(such as for a qcow2 file with lots of internal snapshots).
Likewise, 'allocation' can be (slightly) larger than 'physical'
(such as counting the tail of cluster allocations required to
round a file size up to filesystem granularity) or smaller
(for a sparse file).  A block-resize operation changes capacity
(which, for raw images, also changes physical); many non-raw
images automatically grow physical and allocation as necessary
when starting with an allocation smaller than capacity; and even
when capacity and physical stay unchanged, allocation can change
when converting sectors from holes to data or back.

Note that this does not change semantics for qcow2 images stored
on block devices; there, we still rely on qemu to report the
highest written extent for allocation.  So using this API to
track when to extend a block device because a qcow2 image is
about to exceed a threshold will not see any changes.

Also, note that virStorageVolInfo is unfortunately limited to
just 'capacity' and 'allocation' (we can't expand it to add
'physical', although we can expand the XML to add it there);
historically, that struct's 'allocation' value has reported
file size for qcow2 files (what this patch terms 'physical'
for a domain block device), but disk usage for raw files (what
this patch terms 'allocation').  So follow-up patches will be
needed to make storage volumes report the same allocation
values and get at physical values, where those differ.

* include/libvirt/libvirt-domain.h (_virDomainBlockInfo): Tweak
documentation to match saner definition.
* src/qemu/qemu_driver.c (qemuDomainGetBlockInfo): For regular
files, physical size is capacity, not allocation.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-12-16 23:19:08 -07:00
Eric Blake
05e702cfd4 getstats: rearrange blockinfo gathering
Ultimately, we want to avoid read()ing a file while qemu is running.
We still have to open() block devices to determine their physical
size, but that is safer.  This patch rearranges code to group
together all code that reads the image, to make it easier for later
patches to skip the metadata collection when possible.

* src/qemu/qemu_driver.c (qemuDomainGetBlockInfo): Check for empty
disk up front.  Place metadata reading next to use.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-12-16 23:13:04 -07:00
Eric Blake
7b11f5e554 getstats: prepare monitor collection for recursion
A future patch will allow recursion into backing chains when
collecting block stats.  This patch should not change behavior,
but merely moves out the common code that will be reused once
recursion is enabled, and adds the parameter that will turn on
recursion.

* src/qemu/qemu_monitor.h (qemuMonitorGetAllBlockStatsInfo)
(qemuMonitorBlockStatsUpdateCapacity): Add recursion parameter,
although it is ignored for now.
* src/qemu/qemu_monitor.h (qemuMonitorGetAllBlockStatsInfo)
(qemuMonitorBlockStatsUpdateCapacity): Likewise.
* src/qemu/qemu_monitor_json.h
(qemuMonitorJSONGetAllBlockStatsInfo)
(qemuMonitorJSONBlockStatsUpdateCapacity): Likewise.
* src/qemu/qemu_monitor_json.c
(qemuMonitorJSONGetAllBlockStatsInfo)
(qemuMonitorJSONBlockStatsUpdateCapacity): Add parameter, and
split...
(qemuMonitorJSONGetOneBlockStatsInfo)
(qemuMonitorJSONBlockStatsUpdateCapacityOne): ...into helpers.
(qemuMonitorJSONGetBlockStatsInfo): Update caller.
* src/qemu/qemu_driver.c (qemuDomainGetStatsBlock): Update caller.
* src/qemu/qemu_migration.c (qemuMigrationCookieAddNBD): Likewise.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-12-16 16:08:04 -07:00