Commit Graph

4136 Commits

Author SHA1 Message Date
Ján Tomko
c75f42f331 Really fix XML formatting flags in SaveImageUpdateDef
Commit cf2d4c6 used a logical or instead of bitwise or,
effectively passing 1, that is VIR_DOMAIN_XML_INACTIVE.

This was caught by a warning when building with clang.

https://bugzilla.redhat.com/show_bug.cgi?id=1183869
2015-02-27 12:01:31 +01:00
Laine Stump
4bbe1029f2 qemu: fix ifindex array reported to systemd
Commit f7afeddc added code to report to systemd an array of interface
indexes for all tap devices used by a guest. Unfortunately it not only
didn't add code to report the ifindexes for macvtap interfaces
(interface type='direct') or the tap devices used by type='ethernet',
it ended up sending "-1" as the ifindex for each macvtap or hostdev
interface. This resulted in a failure to start any domain that had a
macvtap or hostdev interface (or actually any type other than
"network" or "bridge").

This patch does the following with the nicindexes array:

1) Modify qemuBuildInterfaceCommandLine() to only fill in the
nicindexes array if given a non-NULL pointer to an array (and modifies
the test jig calls to the function to send NULL). This is because
there are tests in the test suite that have type='ethernet' and still
have an ifname specified, but that device of course doesn't actually
exist on the test system, so attempts to call virNetDevGetIndex() will
fail.

2) Even then, only add an entry to the nicindexes array for
appropriate types, and to do so for all appropriate types ("network",
"bridge", and "direct"), but only if the ifname is known (since that
is required to call virNetDevGetIndex().
2015-02-25 13:11:14 -05:00
Laine Stump
118b240808 network: only clear bandwidth if it has been set
libvirt was unconditionally calling virNetDevBandwidthClear() for
every interface (and network bridge) of a type that supported
bandwidth, whether it actually had anything set or not. This doesn't
hurt anything (unless ifname == NULL!), but is wasteful.

This patch makes sure that all calls to virNetDevBandwidthClear() are
qualified by checking that the interface really had some bandwidth
setup done, and checks for a null ifname inside
virNetDevBandwidthClear(), silently returning success if it is null
(as well as removing the ATTRIBUTE_NONNULL from that function's
prototype, since we can't guarantee that it is never null,
e.g. sometimes a type='ethernet' interface has no ifname as it is
provided on the fly by qemu).
2015-02-25 13:09:34 -05:00
Yuri Chornoivan
8a833d1eb0 Fix typos in messages
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2015-02-25 14:12:51 +01:00
Ján Tomko
52a166f493 Assign default SCSI controller model before checking attribute validity
If the qemu binary on x86 does not support lsi SCSI controller,
but it supports virtio-scsi, we reject the virtio-specific attributes
for no reason.

Move the default controller assignment before the check.

https://bugzilla.redhat.com/show_bug.cgi?id=1168849
2015-02-25 10:04:58 +01:00
Michal Privoznik
cf2d4c603c qemu: Use correct flags for ABI stability check in SaveImageUpdateDef
https://bugzilla.redhat.com/show_bug.cgi?id=1183869

Soo. you've successfully started yourself a domain. And since you want
to use it on your host exclusively you are confident enough to
passthrough the host CPU model, like this:

  <cpu mode='host-passthrough'/>

Then, after a while, you want to save the domain into a file (e.g.
virsh save dom dom.save). And here comes the trouble. The file consist
of two parts: Libvirt header (containing domain XML among other
things), and qemu migration data. Now, the domain XML in the header is
formatted using special flags (VIR_DOMAIN_XML_SECURE |
VIR_DOMAIN_XML_UPDATE_CPU | VIR_DOMAIN_XML_INACTIVE |
VIR_DOMAIN_XML_MIGRATABLE).

Then, on your way back from the bar, you think of changing something
in the XML in the saved file (we have a command for it after all), say
listen address for graphics console. So you successfully type in the
command:

  virsh save-image-edit dom.save

Change all the bits, and exit the editor. But instead of success
you're left with sad error message:

  error: unsupported configuration: Target CPU model <null> does not
  match source Pentium Pro

Sigh. Digging into the code you see lines, where we check for ABI
stability. The new XML you've produced is compared with the old one
from the saved file to see if qemu ABI will break or not. Wait, what?
We are using different flags to parse the XML you've provided so we
were just lucky it worked in some cases? Yep, that's right.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2015-02-25 09:28:54 +01:00
Pavel Hrdina
efd30e2e1c qemu: fix memory leak while starting a guest
In commit cc41c648 I've re-factored qemuMonitorFindBalloonObjectPath, but
missed that there is a memory leak. The "nextpath" variable is
overwritten while looping in for cycle and we have to free it before next
cycle.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2015-02-24 16:38:50 +01:00
Stefan Zimmermann
8e6ee9f280 Rework s390 architecture checking
Making use of the ARCH_IS_S390 macro introduced with
e808357528

Signed-off-by: Stefan Zimmermann <stzi@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
2015-02-23 14:51:02 -05:00
Stefan Zimmermann
09ab9dcc85 Prevent default creation of usb controller on s390 and s390x
Since s390 does not support usb the default creation of a usb controller
for a domain should not occur.

Also adjust s390 test cases by removing usb device instances since
usb devices are no longer created by default for s390 the s390
test cases need to be adjusted.

Signed-off-by: Stefan Zimmermann <stzi@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
2015-02-23 14:50:15 -05:00
Cole Robinson
f2f1e388e1 qemu: Fix AAVMF/OVMF #define names
The AAVMF and OVMF names were swapped. Reorder the one usage where it
matters so behavior doesn't change.
2015-02-21 14:44:46 -05:00
Peter Krempa
103707d4b7 qemu: caps: Add capability bit for the "pc-dimm" device
The pc-dimm device represents a RAM memory module.
2015-02-20 19:25:09 +01:00
Peter Krempa
181742d43f conf: Move all NUMA configuration to virDomainNuma
For historical reasons data regarding NUMA configuration were split
between the CPU definition and numatune. We cannot do anything about the
XML still being split, but we certainly can at least store the relevant
data in one place.

This patch moves the NUMA stuff to the right place.
2015-02-20 17:50:08 +01:00
Peter Krempa
b9ddb25822 conf: numa: Add setter/getter for NUMA node memory size
Add the helpers and refactor places where the value is accessed without
them.
2015-02-20 17:50:08 +01:00
Peter Krempa
7800d473f5 conf: numa: Add accessor to NUMA node's memory access mode 2015-02-20 17:50:08 +01:00
Peter Krempa
d9a779a36e conf: numa: Add accessor for the NUMA node cpu mask
Add virDomainNumaGetNodeCpumask() and refactor a few places that would
get the cpu mask without the helper.
2015-02-20 17:50:08 +01:00
Peter Krempa
be22d07315 conf: numa: Add helper to get guest NUMA node count and refactor users
Add an accessor so that a later refactor is simpler.
2015-02-20 17:50:07 +01:00
Peter Krempa
ba2183a331 qemu: command: Unify retrieval of NUMA cell count in qemuBuildNumaArgStr
The function uses the cell count in 6 places. Add a temp variable to
hold the count as it will greatly simplify the refactor.
2015-02-20 17:50:07 +01:00
Peter Krempa
fa9930720b numa: conf: Tweak parameters of virDomainNumatuneSet
As virDomainNumatuneSet now doesn't allocate the virDomainNuma object
any longer it's not necessary to pass the pointer to a pointer to store
the object as it will not change any longer.

While touching the parameter definitions I've also changed the name of
the parameter to "numa".
2015-02-20 17:50:07 +01:00
Peter Krempa
c03411199e conf: Allocate domain definition with the new helper
Use the virDomainDefNew() helper to allocate the definition instead of
doing it via VIR_ALLOC.
2015-02-20 17:43:05 +01:00
Peter Krempa
a3673b225d conf: Move enum virMemAccess to the NUMA code and rename it
Name it virNumaMemAccess and add it to conf/numa_conf.[ch]

Note that to avoid a circular dependency the type of the NUMA cell
memAccess variable was changed to int. It will be turned back later
after the circular dependency will not exist.
2015-02-20 17:43:04 +01:00
Peter Krempa
6bc80fa86d conf: numa: Rename virDomainNumatune to virDomainNuma
The structure will gradually become the only place for NUMA related
config, thus rename it appropriately.
2015-02-20 17:43:04 +01:00
Michal Privoznik
af20423264 virQEMUCapsCacheLookupCopy: Filter qemuCaps based on machineType
Not all machine types support all devices, device properties, backends,
etc. So until we create a matrix of [machineType, qemuCaps], lets just
filter out some capabilities before we return them to the consumer
(which is going to make decisions based on them straight away).
Currently, as qemu is unable to tell which capabilities are (not)
enabled for given machine types, it's us who has to hardcode the matrix.
One day maybe the hardcoding will go away and we can create the matrix
dynamically on the fly based on a few monitor calls.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2015-02-20 13:28:04 +01:00
Michal Privoznik
37cf163ab2 virQEMUCapsCacheLookupCopy: Pass machine type
It will come handy in the near future when we will filter some
capabilities based on it.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2015-02-20 13:27:59 +01:00
Michal Privoznik
80c5f10e86 qemuMigrationDriveMirror: Listen to events
https://bugzilla.redhat.com/show_bug.cgi?id=1179678

When migrating with storage, libvirt iterates over domain disks and
instruct qemu to migrate the ones we are interested in (shared, RO and
source-less disks are skipped). The disks are migrated in series. No
new disk is transferred until the previous one hasn't been quiesced.
This is checked on the qemu monitor via 'query-jobs' command. If the
disk has been quiesced, it practically went from copying its content
to mirroring state, where all disk writes are mirrored to the other
side of migration too. Having said that, there's one inherent error in
the design. The monitor command we use reports only active jobs. So if
the job fails for whatever reason, we will not see it anymore in the
command output. And this can happen fairly simply: just try to migrate
a domain with storage. If the storage migration fails (e.g. due to
ENOSPC on the destination) we resume the host on the destination and
let it run on partly copied disk.

The proper fix is what even the comment in the code says: listen for
qemu events instead of polling. If storage migration changes state an
event is emitted and we can act accordingly: either consider disk
copied and continue the process, or consider disk mangled and abort
the migration.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2015-02-19 14:12:38 +01:00
Michal Privoznik
76c61cdca2 qemuProcessHandleBlockJob: Take status into account
Upon BLOCK_JOB_COMPLETED event delivery, we check if the job has
completed (in qemuMonitorJSONHandleBlockJobImpl()). For better image,
the event looks something like this:

"timestamp": {"seconds": 1423582694, "microseconds": 372666}, "event":
"BLOCK_JOB_COMPLETED", "data": {"device": "drive-virtio-disk0", "len":
8412790784, "offset": 409993216, "speed": 8796093022207, "type":
"mirror", "error": "No space left on device"}}

If "len" does not equal "offset" it's considered an error, and we can
clearly see "error" field filled in. However, later in the event
processing this case was handled no differently to case of job being
aborted via separate API. It's time that we start differentiate these
two because of the future work.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2015-02-19 14:12:38 +01:00
Michal Privoznik
c37943a068 qemuProcessHandleBlockJob: Set disk->mirrorState more often
Currently, upon BLOCK_JOB_* event, disk->mirrorState is not updated
each time. The callback code handling the events checks if a blockjob
was started via our public APIs prior to setting the mirrorState.
However, some block jobs may be started internally (e.g. during
storage migration), in which case we don't bother with setting
disk->mirror (there's nothing we can set it to anyway), or other
fields. But it will come handy if we update the mirrorState in these
cases too. The event wasn't delivered just for fun - we've started the
job after all.

So, in this commit, the mirrorState is set to whatever job status
we've obtained. Of course, there are some actions on some statuses
that we want to perform. But instead of if {} else if {} else {} ...
enumeration, let's move to switch().

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2015-02-19 14:12:38 +01:00
Peter Krempa
0df2f0404f qemu: Exit job on error path of qemuDomainSetVcpusFlags()
Commit e105dc9814 moved some code but
didn't adjust the jump labels so that the job would be terminated.
2015-02-18 18:17:54 +01:00
Pavel Hrdina
77a9dc0b8d qemu_cgroup: initialize mem_mask to NULL
If 'virNumaGetHostNodeset()' fails then the error path will try to free
uninitialized pointer mem_mask. Introduced by commit af2a1f058.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2015-02-17 14:22:50 +01:00
Prerna Saxena
5e4f49ab8a PowerPC : Forbid NULL CPU model with 'host-model' mode.
PowerPC : Forbid NULL CPU model with 'host-model' mode in qemu command line.

This ensures that an XML such as following:
...
  <cpu mode='host-model'>
    <model fallback='allow'/>
  </cpu>
...

will not generate a '-cpu host,compat=(null)' command line with qemu-system-ppc64.

Signed-off-by: Prerna Saxena <prerna@linux.vnet.ibm.com>
2015-02-17 12:20:40 +01:00
Prerna Saxena
bdbe723fcd PowerPC : Make 'qemu-system-ppc64' the default emulator on ppc64[le].
PowerPC : Explicitly associate 'qemu-system-ppc64' as the
 default emulator for all 64-bit PowerPC guests ( both Big & Little Endian )

Signed-off-by: Prerna Saxena <prerna@linux.vnet.ibm.com>
2015-02-17 12:20:40 +01:00
Luyao Huang
337265bb52 qemu: fix vm deadlock when try to use numatune in session mode
https://bugzilla.redhat.com/show_bug.cgi?id=1126762

Commit 43b67f introduced a deadlock issue when we use numatune
to change numa settings to a vm in session mode.

Jump to endjob instead of jump to cleanup.

Signed-off-by: Luyao Huang <lhuang@redhat.com>
2015-02-17 11:08:00 +01:00
Michal Privoznik
7832fac847 qemuBuildMemoryBackendStr: Report backend requirement more appropriately
So, when building the '-numa' command line, the
qemuBuildMemoryBackendStr() function does quite a lot of checks to
chose the best backend, or to check if one is in fact needed. However,
it returned that backend is needed even for this little fella:

  <numatune>
    <memory mode="strict" nodeset="0,2"/>
  </numatune>

This can be guaranteed via CGroups entirely, there's no need to use
memory-backend-ram to let qemu know where to get memory from. Well, as
long as there's no <memnode/> element, which explicitly requires the
backend. Long story short, we wouldn't have to care, as qemu works
either way. However, the problem is migration (as always). Previously,
libvirt would have started qemu with:

  -numa node,memory=X

in this case and restricted memory placement in CGroups. Today, libvirt
creates more complicated command line:

  -object memory-backend-ram,id=ram-node0,size=X
  -numa node,memdev=ram-node0

Again, one wouldn't find anything wrong with these two approaches.
Both work just fine. Unless you try to migrated from the older libvirt
into the newer one. These two approaches are, unfortunately, not
compatible. My suggestion is, in order to allow users to migrate, lets
use the older approach for as long as the newer one is not needed.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2015-02-17 09:07:09 +01:00
Erik Skultety
c3d9d3bbc9 security: introduce virSecurityManagerCheckAllLabel function
We do have a check for valid per-domain security model, however we still
do permit an invalid security model for a domain's device (those which
are specified with <source> element).
This patch introduces a new function virSecurityManagerCheckAllLabel
which compares user specified security model against currently
registered security drivers. That being said, it also permits 'none'
being specified as a device security model.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1165485
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2015-02-13 14:37:54 +01:00
Ján Tomko
6ba5d1afec Wire up mrg_rxbuf option for qemu
<interface ...>
  ...
  <model type='virtio'/>
  <driver ...>
    <host mrg_rxbuf='off'/>
  </driver>
</interface>

will result in:
-device virtio-net-pci,mrg_rxbuf=off,...

https://bugzilla.redhat.com/show_bug.cgi?id=1186886
2015-02-13 12:31:38 +01:00
Daniel P. Berrange
9358b63a0d qemu: do upfront check for vcpupids being null when querying pinning
The qemuDomainHelperGetVcpus attempted to report an error when the
vcpupids info was NULL. Unfortunately earlier code would clamp the
value of 'maxinfo' to 0 when nvcpupids was 0, so the error reporting
would end up being skipped.

This lead to 'virsh vcpuinfo <dom>' just returning an empty list
instead of giving the user a clear error.
2015-02-12 10:02:50 +00:00
Daniel P. Berrange
a103bb105c qemu: fix setting of VM CPU affinity with TCG
If a previous commit I fixed the incorrect handling of vcpu pids
for TCG mode QEMU:

  commit b07f3d821d
  Author: Daniel P. Berrange <berrange@redhat.com>
  Date:   Thu Dec 18 16:34:39 2014 +0000

    Don't setup fake CPU pids for old QEMU

    The code assumes that def->vcpus == nvcpupids, so when we setup
    fake CPU pids for old QEMU with nvcpupids == 1, we cause the
    later code to read off the end of the array. This has fun results
    like sche_setaffinity(0, ...) which changes libvirtd's own CPU
    affinity, or even better sched_setaffinity($RANDOM, ...) which
    changes the affinity of a random OS process.

The intent was that this would merely disable the ability to set
per-vCPU affinity. It should still have been possible to set VM
level host CPU affinity.

Unfortunately, when you set  <vcpu cpuset='0-1'>4</vcpu>, the XML
parser will internally take this & initialize an entry in the
def->cputune.vcpupin array for every VCPU. IOW this is implicitly
being treated as

  <cputune>
    <vcpupin cpuset='0-1' vcpu='0'/>
    <vcpupin cpuset='0-1' vcpu='1'/>
    <vcpupin cpuset='0-1' vcpu='2'/>
    <vcpupin cpuset='0-1' vcpu='3'/>
  </cputune>

Even more fun, the faked cputune elements are hidden from view when
querying the live XML, because their cpuset mask is the same as the
VM default cpumask.

The upshot was that it was impossible to set VM level CPU affinity.

To fix this we must update qemuProcessSetVcpuAffinities so that it
only reports a fatal error if the per-VCPU cpu mask is different
from the VM level cpu mask.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2015-02-12 10:02:50 +00:00
Martin Kletzander
104ba5966a qemu: Add support for setting vCPU and I/O thread scheduler setting
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1178986

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2015-02-11 17:30:07 +01:00
John Ferlan
92f09dab50 qemu: qemuOpenFileAs - set flag VIR_FILE_OPEN_FORCE_MODE
In the event we're falling into the code that tries to create the file
in a forked environment (VIR_FILE_OPEN_FORK) we pass different mode bits,
but those are never set because the virFileOpenForceOwnerMode has a check
if the OPEN_FORCE_MODE bit is set before attempting to change the mode.

Since this is a special case it seems reasonable to set u+rw,g+rw,o
2015-02-11 07:29:29 -05:00
Luyao Huang
45853b5289 qemu: fix crash when migrateuri has no scheme
https://bugzilla.redhat.com/show_bug.cgi?id=1191355

When we attempt to migrate a vm with a migrateuri that has no scheme:

 # virsh migrate test4 --live qemu+ssh://lhuang/system --migrateuri 127.0.0.1

target libvirtd will crash because uri->scheme is NULL in
qemuMigrationPrepareDirect on this line:

     if (STRNEQ(uri->scheme, "tcp") &&

Add a value check before this line. Also fix a bug like this in
doNativeMigrate, that could only happen when destination libvirtd
returned an incorrect URI.

Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2015-02-11 13:20:30 +01:00
Ján Tomko
a7c9c7a6ab Fix qemu job handling in SetSchedulerParameters
Commit c5ee5cf added a job to SetSchedulerParameters, but
forgot to change one label in the SCHED_RANGE_CHECK macro.
2015-02-10 14:36:03 +01:00
Luyao Huang
862473fa12 qemu: Implement random number generator hotunplug
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
2015-02-10 13:05:23 +01:00
Luyao Huang
980b265d08 qemu: Implement random number generator hotplug
Export the required helpers and add backend code to hotplug RNG devices.

Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
2015-02-10 13:05:22 +01:00
Peter Krempa
fe6acfbd0e qemu: Implement random number generator cold (un)plug
Add support for using the attach/detach device APIs on the inactive
configuration to add RNG devices.
2015-02-10 13:05:22 +01:00
Peter Krempa
25e2d89788 qemu: command: Refactor creation of RNG device commandline
As the RNG device is using an -object as backend refactor the code to
use the JSON to commandline generator so that we can reuse the code
later in hotplug.
2015-02-10 13:05:22 +01:00
Peter Krempa
b9f2d781d9 qemu: command: Break some very long lines in qemuBuildRNGDevStr() 2015-02-10 13:05:22 +01:00
Peter Krempa
d7ec244f6e qemu: command: Shuffle around formatting of alias for RNG device backend
Move the alias name right after the object type for rng-egd backend so
that we can later use the JSON to commandline generator to create the
command line.
2015-02-10 13:05:22 +01:00
Luyao Huang
98e982b455 qemu: command: Make RNG backend device IDs unique
Libvirt didn't prefix the random number generator backend object alias
with any string thus the device alias and object alias were identical.

To avoid possible problems, rename the alias for the backend object and
tweak tests to comply with the change.

Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
2015-02-10 13:05:22 +01:00
Luyao Huang
58a4eee81a qemu: refactor qemuBuildRNGDeviceArgs to allow reuse in RNG hotplug
Rename qemuBuildRNGDeviceArgs to qemuBuildRNGDevStr and change the
return type so that it can be reused in the device hotplug code later.

Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
2015-02-10 13:05:22 +01:00
Luyao Huang
3921d13581 qemu: Add helper to assign RNG device aliases
This function is used to assign an alias for a RNG device. It will be
later reused when hotplugging RNGs.

Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
2015-02-10 13:05:22 +01:00
Ján Tomko
8e724e9f3e Error out when custom tap device path makes no sense
It is only usable for NETWORK and BRIDGE type interfaces.
Error out when trying to start a domain where the custom
tap device path is specified for interfaces of other types,
or when the daemon is not privileged.

Note that this cannot be checked at definition time, because
the comparison is against actual type.

https://bugzilla.redhat.com/show_bug.cgi?id=1147195
2015-02-06 12:52:50 +01:00
Daniel P. Berrange
95fd6a91c6 qemu: include libvirt & QEMU versions in QEMU log files
It is often helpful to know which version of libvirt and QEMU
was present when a guest was first launched. Ensure this info
is written into the QEMU log file for each guest.
2015-02-06 10:22:07 +00:00
Luyao Huang
1b2c9ce752 qemu: Properly report error on uuid mismatch in the migration cookie
Add the missing jump to the error label when the uuid in the
migration cookie XML does not match the uuid of the migrated
domain.

Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2015-02-05 08:14:36 +01:00
Daniel P. Berrange
b38da58423 Make tests independant of system page size
Some code paths have special logic depending on the page size
reported by sysconf, which in turn affects the test results.
We must mock this so tests always have a consistent page size.
2015-02-02 20:27:43 +00:00
Peter Krempa
b92a003710 qemu: command: Don't combine old and modern NUMA node creation
Change done by commit f309db1f4d wrongly
assumes that qemu can start with a combination of NUMA nodes specified
with the "memdev" option and the appropriate backends, and the legacy
way by specifying only "mem" as a size argument. QEMU rejects such
commandline though:

$ /usr/bin/qemu-system-x86_64 -S -M pc -m 1024 -smp 2 \
-numa node,nodeid=0,cpus=0,mem=256 \
-object memory-backend-ram,id=ram-node1,size=12345 \
-numa node,nodeid=1,cpus=1,memdev=ram-node1
qemu-system-x86_64: -numa node,nodeid=1,cpus=1,memdev=ram-node1: qemu: memdev option must be specified for either all or no nodes

To fix this issue we need to check if any of the nodes requires the new
definition with the backend and if so, then all other nodes have to use
it too.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1182467
2015-01-31 08:53:22 +01:00
Peter Krempa
8795adf7d1 qemu: command: Refactor NUMA backend object formatting to use JSON objs
With the new JSON to argv formatter we are now able to represent the
memory backend definitions in the JSON object format that is reusable
for monitor use (hotplug) and then convert it into the shell string.
This will avoid having two separate instances of the same code that
would create the different formats.

Previous refactors now allow to make this step without changes to the
test suite.
2015-01-31 08:53:22 +01:00
Peter Krempa
b50b4ef30c qemu: command: Switch to bytes when formatting size for memory backends
QEMU's command line visitor as well as the JSON interface take bytes by
default for memory object sizes. Convert mebibytes to bytes so that we
can later refactor the existing code for hotplug purposes.
2015-01-31 08:53:22 +01:00
Peter Krempa
a47174c508 qemu: command: Unify values for boolean values when formating memory backends
QEMU's qapi visitor code allows yes/on/y for true and no/off/n for false
value of boolean properities. Unify the used style so that we can
generate it later and fix test cases.
2015-01-31 08:53:22 +01:00
Peter Krempa
172100ac85 qemu: command: Shuffle around formating of alias for memory backend objs
Move the alias as the second formated argument and tweak the tests so
that a future refactor that will change the order doesn't break tests.
2015-01-31 08:53:22 +01:00
Peter Krempa
db3b1c4a1c qemu: Extract code to setup memory backing objects
Extract the memory backend device code into a separate function so that
it can be later easily refactored and reused.

Few small changes for future reusability, namely:
- new (currently unused) parameter for user specified page size
- size of the memory is specified in kibibytes, divided up in the
function
- new (currently unused) parameter for user specifed source nodeset
- option to enforce capability check
2015-01-31 08:53:22 +01:00
Peter Krempa
331b2583ec qemu: command: Add helper to format -object strings from JSON representation
Unlike -device, qemu uses a JSON object to add backend "objects" via the
monitor rather than the string that would be passed on the commandline.

To be able to reuse code parts that configure backends for various
devices, this patch adds a helper that will allow generating the command
line representations from the JSON property object.
2015-01-31 08:53:22 +01:00
Tony Krowiak
79a8769479 qemu: change macvtap device options in response to NIC_RX_FILTER_CHANGED
This patch enables synchronization of the host macvtap
device options with the guest device's in response to the
NIC_RX_FILTER_CHANGED event.

The following device options will be synchronized:
* PROMISC
* MULTICAST
* ALLMULTI

Signed-off-by: Tony Krowiak <akrowiak@linux.vnet.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2015-01-30 13:16:28 +01:00
John Ferlan
7879d03197 qemu: Don't unconditionally delete file in qemuOpenFileAs
https://bugzilla.redhat.com/show_bug.cgi?id=1158034

If we're expecting to create a file somewhere and that fails for some
reason during qemuOpenFileAs, then we unlink the path we're attempting
to create leaving no way to determine what the "existing" privileges,
protections, or labels are that caused the failure (open, change owner
and group, change mode, etc.).

Furthermore, if we fall into the path where we'll be opening / creating
the file using VIR_FILE_OPEN_FORK, we need to first unlink/delete the file
we created in the first path; otherwise, the attempt by the child process
to open as some specific user:group may fail because the file was already
created using nfsnobody:nfsnobody. Again, if we didn't create the file we
don't want to blindly delete what already exists. Thus, a second reason for
the original check to set need_unlink to false when we find the file with
CREAT set, but already existing.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2015-01-29 15:37:34 -05:00
John Ferlan
8ff383366b qemu: Adjust EndAsyncJob for qemuDomainSaveInternal error path
Commit id '540c339a' to fix issues with reference counting and transient
domains moved the qemuDomainObjEndAsyncJob call prior to the attempt to
restart the guest CPU's resulting in an error:

    error: Failed to save domain rhel70 to /tmp/pl/rhel70.save
    error: internal error: unexpected async job 3

when (ret != 0) - eg, the error path from qemuDomainSaveMemory.

This patch will adjust the logic to call the EndAsyncJob only after
we've tried to restart the guest CPUs. It also needs to adjust the
test for qemuDomainRemoveInactive to add the ret == 0 condition.

Additionally, if we get to endjob: because of some error earlier, then
we need to save that error in the event the CPU restart logic fails.
We don't want to return the error from CPU restart failure, rather we
want to return the error from the failed save that caused us to fall
into the retry to start the CPU logic.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2015-01-29 12:10:41 -05:00
Michal Privoznik
436dcf0b74 qemu: Add AAVMF to the list of known UEFIs
Well, even though users can pass the list of UEFI:NVRAM pairs at the
configure time, we may maintain the list of widely available UEFI
ourselves too. And as arm64 begin to rises, OVMF was ported there too.
With a slight name change - it's called AAVMF, with AAVMF_CODE.fd
being the UEFI firmware and AAVMF_VARS.fd being the NVRAM store file.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2015-01-29 15:20:47 +01:00
Michal Privoznik
bc03a23149 qemu: Allow UEFI paths to be specified at compile time
Up until now there are just two ways how to specify UEFI paths to
libvirt. The first one is editing qemu.conf, the other is editing
qemu_conf.c and recompile which is not that fancy. So, new
configure option is introduced: --with-loader-nvram which takes a
list of pairs of UEFI firmware and NVRAM store. This way, the
compiled in defaults can be passed during compile time without
need to change the code itself.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2015-01-29 15:20:42 +01:00
Luyao Huang
f76df311e8 qemu: fix cannot set graphic passwd via qemuDomainSaveImageDefineXML
https://bugzilla.redhat.com/show_bug.cgi?id=1183890

When we try to update a xml to a image file, we will clear the
graphics passwd settings, because we do not pass VIR_DOMAIN_XML_SECURE
to qemuDomainDefCopy, qemuDomainDefFormatBuf won't format the passwd.

Add VIR_DOMAIN_XML_SECURE flag when we call qemuDomainDefCopy
in qemuDomainSaveImageUpdateDef.

Signed-off-by: Luyao Huang <lhuang@redhat.com>
2015-01-28 16:56:34 +01:00
Ján Tomko
21e0e8866e hotplug: only add a chardev to vmdef after monitor call
https://bugzilla.redhat.com/show_bug.cgi?id=1161024

This way the device is in vmdef only if ret = 0 and the caller
(qemuDomainAttachDeviceFlags) does not free it.

Otherwise it might get double freed by qemuProcessStop
and qemuDomainAttachDeviceFlags if the domain crashed
in monitor after we've added it to vm->def.
2015-01-28 10:10:54 +01:00
Ján Tomko
daf51be5f1 Split qemuDomainChrInsert into two parts
Do the allocation first, then add the actual device.
The second part should never fail. This is good
for live hotplug where we don't want to remove the device
on OOM after the monitor command succeeded.

The only change in behavior is that on failure, the
vmdef->consoles array is freed, not just the first console.
2015-01-27 18:30:15 +01:00
Daniel P. Berrange
f7afeddce9 qemu: report TAP device indexes to systemd
Record the index of each TAP device created and report them to
systemd, so they show up in machinectl status for the VM.
2015-01-27 13:57:02 +00:00
Daniel P. Berrange
55ea7be7d9 Removing probing of secondary drivers
For stateless, client side drivers, it is never correct to
probe for secondary drivers. It is only ever appropriate to
use the secondary driver that is associated with the
hypervisor in question. As a result the ESX & HyperV drivers
have both been forced to do hacks where they register no-op
drivers for the ones they don't implement.

For stateful, server side drivers, we always just want to
use the same built-in shared driver. The exception is
virtualbox which is really a stateless driver and so wants
to use its own server side secondary drivers. To deal with
this virtualbox has to be built as 3 separate loadable
modules to allow registration to work in the right order.

This can all be simplified by introducing a new struct
recording the precise set of secondary drivers each
hypervisor driver wants

struct _virConnectDriver {
    virHypervisorDriverPtr hypervisorDriver;
    virInterfaceDriverPtr interfaceDriver;
    virNetworkDriverPtr networkDriver;
    virNodeDeviceDriverPtr nodeDeviceDriver;
    virNWFilterDriverPtr nwfilterDriver;
    virSecretDriverPtr secretDriver;
    virStorageDriverPtr storageDriver;
};

Instead of registering the hypervisor driver, we now
just register a virConnectDriver instead. This allows
us to remove all probing of secondary drivers. Once we
have chosen the primary driver, we immediately know the
correct secondary drivers to use.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2015-01-27 12:02:04 +00:00
Daniel P. Berrange
7b1ba9566b Remove use of nwfilterPrivateData from nwfilter driver
The nwfilter driver can rely on its global state instead
of the connect private data.
2015-01-27 12:02:03 +00:00
Peter Krempa
d13f56f08a qemu: Fix job handling in qemuDomainSetMetadata
The code modifies the domain configuration but doesn't take a MODIFY
type job to do so.
2015-01-27 10:39:21 +01:00
Peter Krempa
fb2ed975c3 qemu: Fix job type in qemuDomainGetBlockIoTune
The function just queries status so there's no need for a MODIFY type
job.
2015-01-27 10:39:21 +01:00
Peter Krempa
c5ee5cfb18 qemu: Fix job handling in qemuDomainSetSchedulerParametersFlags
The code modifies the domain configuration but doesn't take a MODIFY
type job to do so.
2015-01-27 10:38:47 +01:00
Peter Krempa
4fd7a72075 qemu: Fix job handling in qemuDomainSetMemoryParameters
The code modifies the domain configuration but doesn't take a MODIFY
type job to do so.
2015-01-27 10:24:04 +01:00
Peter Krempa
e3e72743df qemu: Fix job handling in qemuDomainSetAutostart
The code modifies the domain configuration but doesn't take a MODIFY
type job to do so.

This patch also fixes a few very long lines of code around the touched
parts.
2015-01-27 10:24:04 +01:00
Peter Krempa
79e5603307 qemu: Fix job handling in qemuDomainPinEmulator
The code modifies the domain configuration but doesn't take a MODIFY
type job to do so.
2015-01-27 10:24:04 +01:00
Peter Krempa
46d950443d qemu: Fix job handling in qemuDomainPinVcpuFlags
The domain modifies the domain configuration but doesn't take a MODIFY
type job to do it.
2015-01-27 10:24:03 +01:00
Richard W.M. Jones
ee4c13ce1d aarch64: Support versioned machine types.
For distros that want to add versioned machine types, they will add
(downstream) machine types like "virt-foo-1.2.3".  Detect these as
MMIO too.

Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
2015-01-23 15:12:33 +00:00
Erik Skultety
b7e6f2fc80 qemu: Add check for PCI bridge placement if there are too many PCI devices
Previous patch of this series fixed the issue with adding a new PCI bridge
when all the slots were reserved by devices with user specified addresses.
In case there are still some PCI devices waiting to get a slot reserved
by qemuAssignDevicePCISlots, this means a new bus needs to be
created along with a corresponding bridge controller. By adding an
additional check, this scenario now results in a reasonable error
instead of generating wrong qemu command line.
2015-01-23 14:35:03 +01:00
Erik Skultety
5d6904b991 qemu: Fix auto-adding PCI bridge when all slots are reserved
Commit 93c8ca tried to fix the issue with auto-adding of a PCI bridge
controller, but didn't work properly in all scenarios.

This patch provides a better fix of the issue when all slots on a PCI bus
are reserved by devices with user specified addresses and no additional
bridges need to be created.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1132900
2015-01-23 14:32:18 +01:00
Erik Skultety
a3ecd63e92 qemu: move PCI slot assignment for PIIX3, Q35 into a separate function
In order to be able to test for fully reserved PCI buses, assignment of
PCI slots for integrated devices needs to be moved to a separate function.
This also might be a good preparation if we decide to add support for
other chipsets as well.
2015-01-23 14:26:55 +01:00
Erik Skultety
3fb2a69284 qemu: reorder PCI slot assignment functions
Move qemuDomainAssignPCIAddresses after the definition
of the static function qemuDomainValidateDevicePCISlotsQ35.

This lets us define a new static function using
qemuDomainValidateDevicePCISlots* and use it in
qemuDomainAssignPCIAddresses without a forward declaration.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
2015-01-23 14:16:40 +01:00
Peter Krempa
165c34778b qemu: command: Honor const-correctnes in qemuBuildNumaArgStr
@def is modified in the function indirectly although it's marked as
const.
2015-01-23 13:18:04 +01:00
Erik Skultety
2fbfb3ac41 qemu: Remove dead code in qemuDomainAssignPCIAddresses revert patch
As it turned out, fix of dead code 419a22 changed the affected condition
from "never true" to "always true", so better fix would be to change the
return code of virDomainMaybeAddController from 0 to 1 if
a new bridge has been added, thus distinguishing case when we didn't need to
add any controller and case we successfully added one.

The return code is changed in the next commit
2015-01-23 11:03:45 +01:00
Peter Krempa
b347c0c2a3 CVE-2015-0236: qemu: Check ACLs when dumping security info from snapshots
The ACL check didn't check the VIR_DOMAIN_XML_SECURE flag and the
appropriate permission for it. Found via code inspection while fixing
permissions for save images.
2015-01-22 14:32:54 +01:00
Peter Krempa
03c3c0c874 CVE-2015-0236: qemu: Check ACLs when dumping security info from save image
The ACL check didn't check the VIR_DOMAIN_XML_SECURE flag and the
appropriate permission for it.
2015-01-22 14:32:54 +01:00
Luyao Huang
860522d26b qemu: output error when try to hotplug unsupported console type
https://bugzilla.redhat.com/show_bug.cgi?id=1164627

When using 'virsh attach-device' to hotplug an unsupported console type
into a qemu guest the attachment would succeed as the command line
formatter didn't report error in such case.

Signed-off-by: Luyao Huang <lhuang@redhat.com>
2015-01-22 11:17:14 +01:00
Ján Tomko
280ece4af9 qemu: format server interface without a listen address
https://bugzilla.redhat.com/show_bug.cgi?id=1130390

The listen address is not mandatory for <interface type='server'>
but when it's not specified, we've been formatting it as:
-netdev socket,listen=(null):5558,id=hostnet0
which failed with:
Device 'socket' could not be initialized

Omit the address completely and only format the port in the listen
attribute.

Also fix the schema to allow specifying a model.
2015-01-21 13:22:36 +01:00
Ján Tomko
d16704fd60 qemu_conf: check for duplicate security drivers
Using the same driver multiple times is pointless and
it can result in confusing errors:

$ virsh start test
error: Failed to start domain test
error: internal error: security label already defined for VM

https://bugzilla.redhat.com/show_bug.cgi?id=1153891
2015-01-19 12:46:37 +01:00
Ján Tomko
5c703ca396 Always check return value of qemuDomainObjExitMonitor
Depending on the context, either error out if the domain
has disappeared in the meantime, or just ignore the value
to allow marking the function as ATTRIBUTE_RETURN_CHECK.
2015-01-19 10:12:32 +01:00
Ján Tomko
3070bc8ee5 Fix vmdef usage after domain crash in monitor on device attach
https://bugzilla.redhat.com/show_bug.cgi?id=1161024

If the domain crashed while we were in monitor,
we cannot rely on the REALLOC done on live definition,
since vm->def now points to the persistent definition.
Skip adding the attached devices to domain definition
if the domain crashed.

In AttachChrDevice, the chardev was already added to the
live definition and freed by qemuProcessStop in the case
of a crash. Skip the device removal in that case.

Also skip audit if the domain crashed in the meantime.
2015-01-19 10:12:32 +01:00
Ján Tomko
6edb97f29a Fix vmdef usage after domain crash in monitor on device detach
https://bugzilla.redhat.com/show_bug.cgi?id=1161024

In the device type-specific functions, exit early
if the domain has disappeared, because the cleanup
should have been done by qemuProcessStop.

Check the return value in processDeviceDeletedEvent
and qemuProcessUpdateDevices.

Skip audit and removing the device from live def because
it has already been cleaned up.
2015-01-19 10:12:07 +01:00
Dmitry Guryanov
c8a6f844c3 add ploop fs driver type
Ploop is a pseudo device which makeit possible to access
to an image in a file as a block device. Like loop devices,
but with additional features, like snapshots, write tracker
and without double-caching.

It used in PCS for containers and in OpenVZ. You can manage
ploop devices and images with ploop utility
(http://git.openvz.org/?p=ploop).

Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
2015-01-16 14:07:46 +01:00
Martin Kletzander
6514c04c18 qemu: Add support for enabling/disabling PMU
This is used as a boolean parameter for the '-cpu' option.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1178853

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2015-01-16 13:43:46 +01:00
Erik Skultety
419a22d5db Remove dead code in qemuDomainAssignPCIAddresses
We tested for positive return value from virDomainMaybeAddController,
but it returns 0 or -1 only resulting in a dead code.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2015-01-16 10:59:13 +01:00
Erik Skultety
93c8ca9974 qemu: Tweak auto adding PCI bridge controller when extending default PCI bus
In case we find out, there are more PCI devices to be connected
than there are available slots on the default PCI bus, we automatically add a
new bus and a related PCI bridge controller as well. As there are no free slots
left on the default PCI bus, PCI bridge controller gets a free slot on a
newly created PCI bus which causes qemu to refuse to start the guest.
This fix introduces a new function qemuDomainPCIBusFullyReserved which
is checked right before we possibly try to reserve a slot for PCI bridge
controller.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1132900
2015-01-16 10:38:29 +01:00
Daniel P. Berrange
dd69a14f90 Add support for schema validation when passing in XML
The virDomainDefineXMLFlags and virDomainCreateXML APIs both
gain new flags allowing them to be told to validate XML.
This updates all the drivers to turn on validation in the
XML parser when the flags are set
2015-01-15 16:40:27 +00:00
Daniel P. Berrange
c5b6a4a5cb Change int to size_t in size var for tap/vhost FDs
A number of methods take an int for a parameter that indicates
the size of an array. The correct type for array sizes is
size_t
2015-01-15 11:07:13 +00:00
Daniel P. Berrange
318df5a05f Add support for systemd-machined CreateMachineWithNetwork
systemd-machined introduced a new method CreateMachineWithNetwork
that obsoletes CreateMachine. It expects to be given a list of
VETH/TAP device indexes for the host side device(s) associated
with a container/machine.

This falls back to the old CreateMachine method when the new
one is not supported.
2015-01-15 11:07:07 +00:00
Luyao Huang
5035279198 qemu: free priv->origname when qemuMigrationPrepareAny fails
https://bugzilla.redhat.com/show_bug.cgi?id=1181182

When we meet error in qemuMigrationPrepareAny and goto
cleanup with rc < 0, we forget clear the priv->origname and this
will make this vm migrate fail next time because leave a wrong
origname in  priv, and will Generate a wrong cookie when do
migrate next time.

This patch will make priv->origname is NULL when migrate fail
in target host.

Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2015-01-15 11:32:50 +01:00
Ján Tomko
c749eda4a2 Fix vmdef usage while in monitor in qemu process
Make local copy of the disk alias in qemuProcessInitPasswords,
instead of referencing the one in domain definition, which
might get freed if the domain crashes while we're in monitor.

Also copy the memballoon period value.
2015-01-14 19:30:32 +01:00
Ján Tomko
3f21398437 Fix vmdef usage while in monitor in BlockStat* APIs
Make a local copy of the disk alias instead of pointing
to the domain definition, which might get freed if
the domain dies while we're in monitor.

Also exit early if that happens.
2015-01-14 19:30:32 +01:00
Ján Tomko
051add2ff9 Fix vmdef usage while in monitor in qemuDomainHotplugVcpus
Exit the monitor right after we've done with it to get
the virDomainObjPtr lock back, otherwise we might be accessing
vm->def while it's being cleaned up by qemuProcessStop.

If the domain crashed while we were in the monitor, exit
early instead of changing vm->def which is now the persistent
definition.
2015-01-14 19:30:32 +01:00
Ján Tomko
dc2fd51fd7 Check for domain liveness in qemuDomainObjExitMonitor
The domain might disappear during the time in monitor when
the virDomainObjPtr is unlocked, so the caller needs to check
if it's still alive.

Since most of the callers are going to need it, put the
check inside qemuDomainObjExitMonitor and return -1 if
the domain died in the meantime.
2015-01-14 19:30:32 +01:00
Pavel Hrdina
ce745914b3 qemu_process: detect updated video ram size values from QEMU
QEMU internally updates the size of video memory if the domain XML had
provided too low memory size or there are some dependencies for a QXL
devices 'vgamem' and 'ram' size. We need to know about the changes and
store them into the status XML to not break migration or managedsave
through different libvirt versions.

The values would be loaded only if the "vgamem_mb" property exists for
the device.  The presence of the "vgamem_mb" also tells that the
"ram_size" and "vram_size" exists for QXL devices.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2015-01-14 11:55:51 +01:00
Pavel Hrdina
cc41c64878 qemu_monitor: introduce new function to get QOM path
The search is done recursively only through QOM object that has a type
prefixed with "child<" as this indicate that the QOM is a parent for
other QOM objects.

The usage is that you give known device name with starting path where to
search.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2015-01-14 11:55:51 +01:00
Pavel Hrdina
e105dc9814 qemu_driver: fix setting vcpus for offline domain
Commit e3435caf fixed hot-plugging of vcpus with strict memory pinning
on NUMA hosts, but unfortunately it also broke updating number of vcpus
for offline guests using our API.

The issue is that we try to create a cpu cgroup for non-running guest
which fails as there are no cgroups for that domain. We should create
cgroups and update cpuset.mems only if we are hot-plugging.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2015-01-14 10:34:20 +01:00
Michal Privoznik
04cf99a6b6 qemu, lxc: Warn if setting QoS on unsupported vNIC types
https://bugzilla.redhat.com/show_bug.cgi?id=1165993

So, there are still plenty of vNIC types that we don't know how to set
bandwidth on. Let's warn explicitly in case user has requested it
instead of pretending everything was set.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2015-01-14 08:54:49 +01:00
Shanzhi Yu
9f974858dd qemu: snapshot: inactive external snapshot can't work after libvirtd restart
When create inactive external snapshot, after update disk definitions,
virDomainSaveConfig is needed, if not after restart libvirtd the new snapshot
file definitions in xml will be lost.

Reproduce steps:

1. prepare a shut off guest
$ virsh domstate rhel7 && virsh domblklist rhel7
shut off

Target     Source
------------------------------------------------
vda        /var/lib/libvirt/images/rhel7.img

2. create external disk snapshot
$ virsh snapshot-create rhel7 --disk-only && virsh domblklist rhel7
Domain snapshot 1417882967 created
Target     Source
------------------------------------------------
vda        /var/lib/libvirt/images/rhel7.1417882967

3. restart libvirtd then check guest source file
$ service  libvirtd restart && virsh domblklist rhel7
Redirecting to /bin/systemctl restart  libvirtd.service
Target     Source
------------------------------------------------
vda        /var/lib/libvirt/images/rhel7.img

This was first reported by Eric Blake
http://www.redhat.com/archives/libvir-list/2014-December/msg00369.html

Signed-off-by: Shanzhi Yu <shyu@redhat.com>
2015-01-13 15:59:06 -05:00
Daniel P. Berrange
0ecd685109 Give virDomainDef parser & formatter their own flags
The virDomainDefParse* and virDomainDefFormat* methods both
accept the VIR_DOMAIN_XML_* flags defined in the public API,
along with a set of other VIR_DOMAIN_XML_INTERNAL_* flags
defined in domain_conf.c.

This is seriously confusing & error prone for a number of
reasons:

 - VIR_DOMAIN_XML_SECURE, VIR_DOMAIN_XML_MIGRATABLE and
   VIR_DOMAIN_XML_UPDATE_CPU are only relevant for the
   formatting operation
 - Some of the VIR_DOMAIN_XML_INTERNAL_* flags only apply
   to parse or to format, but not both.

This patch cleanly separates out the flags. There are two
distint VIR_DOMAIN_DEF_PARSE_* and VIR_DOMAIN_DEF_FORMAT_*
flags that are used by the corresponding methods. The
VIR_DOMAIN_XML_* flags received via public API calls must
be converted to the VIR_DOMAIN_DEF_FORMAT_* flags where
needed.

The various calls to virDomainDefParse which hardcoded the
use of the VIR_DOMAIN_XML_INACTIVE flag change to use the
VIR_DOMAIN_DEF_PARSE_INACTIVE flag.
2015-01-13 16:26:12 +00:00
Eric Blake
e1125cebfc qemu: forbid second blockcommit during active commit
https://bugzilla.redhat.com/show_bug.cgi?id=1135339 documents some
confusing behavior when a user tries to start an inactive block
commit in a second connection while there is already an on-going
active commit from a first connection.  Eventually, qemu will
support multiple simultaneous block jobs, but as of now, it does
not; furthermore, libvirt also needs an overhaul before we can
support simultaneous jobs.  So, the best way to avoid confusing
ourselves is to quit relying on qemu to tell us about the situation
(where we risk getting in weird states) and instead forbid a
duplicate block commit ourselves.

Note that we are still relying on qemu to diagnose attempts to
interrupt an inactive commit (since we only track XML of an active
commit), but as inactive commit is less confusing for libvirt to
manage, there is less that can go wrong by leaving that detection
up to qemu.

* src/qemu/qemu_driver.c (qemuDomainBlockCommit): Hoist check for
active commit to occur earlier outside of conditions.

Signed-off-by: Eric Blake <eblake@redhat.com>
2015-01-13 08:21:20 -07:00
Daniel P. Berrange
4d2ebc71ce Add stub virDomainDefineXMLFlags impls
Make sure every virt driver implements virDomainDefineXMLFlags
by adding a trivial passthrough from the existing impl with
no flags set.
2015-01-13 10:38:56 +00:00
Martin Kletzander
adff345e1e qemu: Allow enabling/disabling features with host-passthrough
QEMU supports feature specification with -cpu host and we just skip
using that.  Since QEMU developers themselves would like to use this
feature, this patch modifies the code to work.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1178850

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2015-01-13 08:51:01 +01:00
Pavel Hrdina
0e502466ac qxl: change the default value for vgamem_mb to 16 MiB
The default value should be 16 MiB instead of 8 MiB. Only really old
version of upstream QEMU used the 8 MiB as default for vga framebuffer.

Without this change if you update your libvirt where we introduced the
"vgamem" attribute for QXL video device the value will be set to 8 MiB,
but previously your guest had 16 MiB because we didn't pass any value to
QEMU command line which means QEMU used its own 16 MiB as default.

This will affect all users with guest's display resolution higher than
1920x1080.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2015-01-12 14:51:13 +01:00
Michal Privoznik
732586d979 qemu: Fix system pages handling in <memoryBacking/>
In one of my previous commits (311b4a67) I've tried to allow to
pass regular system pages to <hugepages>. However, there was a
little bug that wasn't caught. If domain has guest NUMA topology
defined, qemuBuildNumaArgStr() function takes care of generating
corresponding command line. The hugepages backing for guest NUMA
nodes is handled there too. And here comes the bug: the hugepages
setting from XML is stored in KiB internally, however, the system
pages size was queried and stored in Bytes. So the check whether
these two are equal was failing even if it shouldn't.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2015-01-07 18:32:07 +01:00
Ján Tomko
b073179085 Indentation 2015-01-07 16:35:18 +01:00
Peter Krempa
79bb49a83d qemu: Don't unref domain after exit from nested async job
In commit 540c339a25 the whole domain
reference counting was refactored in the qemu driver. Domain jobs now
don't need to reference the domain object as they now expect the
reference from the calling function.

However, the patch forgot to remove the unref call in case we exit the
monitor when we were acquiring a nested job. This caused the daemon to
crash on a subsequent access to the domain object once we've done an
operation requiring a nested job for a monitor access.

An easy reproducer case:

1) Start a vm with qcow disks
2) virsh snapshot-create-as DOMNAME
3) virsh dumpxml DOMNAME
4) daemon crashes in a semi-random spot while accessing a now-removed VM
object.

Fortunately, the commit wasn't released yet, so there are no security
implications.

Reported-by: Shanzi Yu <shyu@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
2015-01-07 13:47:31 +01:00
Luyao Huang
565d049fd1 qemu: Restore old bandwidth rules when setting new fails
https://bugzilla.redhat.com/show_bug.cgi?id=1177723

When setting new bandwidth limits via
virDomainSetInterfaceParameters, the old ones are cleared first.
However, if setting the new ones fails, the old are already gone
and interface is left in inconsistent state.  Therefore, right
before failing we ought to try to restore the old bandwidth.

Signed-off-by: Luyao Huang <lhuang@redhat.com>
2015-01-06 13:27:43 +01:00
Luyao Huang
a791599cc6 qemu: fix miss goto cleanup in qemuDomainAttachNetDevice
This place have a wrong logic, maybe forget goto cleanup.
Also fix some small things.

Signed-off-by: Luyao Huang <lhuang@redhat.com>
2015-01-06 11:07:13 +01:00
Luyao Huang
39449f70b9 qemu: use a wrong name for guest panic status
https://bugzilla.redhat.com/show_bug.cgi?id=1178652

We will get a warning when we have a guest in paused
status (caused by kernel panic) and restart libvirtd,
warning message like this:

Qemu reported unknown VM status: 'guest-panicked'

and this seems because we set a wrong status name in
qemu_monitor.c, and from qemu qapi-schema.json file
we know this status should named 'guest-panicked'.

Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2015-01-05 16:55:35 -07:00
Cédric Bosdonnat
aa2cc72100 Domain conf: allow more than one IP address for net devices
Add the possibility to have more than one IP address configured for a
domain network interface. IP addresses can also have a prefix to define
the corresponding netmask.
2015-01-05 20:24:04 +01:00
Martin Kletzander
31354b5b32 qemu: Fix coverity issues after refcount refactoring
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-12-23 05:34:05 +01:00
Martin Kletzander
540c339a25 qemu: completely rework reference counting
There is one problem that causes various errors in the daemon.  When
domain is waiting for a job, it is unlocked while waiting on the
condition.  However, if that domain is for example transient and being
removed in another API (e.g. cancelling incoming migration), it get's
unref'd.  If the first call, that was waiting, fails to get the job, it
unref's the domain object, and because it was the last reference, it
causes clearing of the whole domain object.  However, when finishing the
call, the domain must be unlocked, but there is no way for the API to
know whether it was cleaned or not (unless there is some ugly temporary
variable, but let's scratch that).

The root cause is that our APIs don't ref the objects they are using and
all use the implicit reference that the object has when it is in the
domain list.  That reference can be removed when the API is waiting for
a job.  And because each domain doesn't do its ref'ing, it results in
the ugly checking of the return value of virObjectUnref() that we have
everywhere.

This patch changes qemuDomObjFromDomain() to ref the domain (using
virDomainObjListFindByUUIDRef()) and adds qemuDomObjEndAPI() which
should be the only function in which the return value of
virObjectUnref() is checked.  This makes all reference counting
deterministic and makes the code a bit clearer.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-12-21 10:48:56 +01:00
Daniel P. Berrange
65686e5a81 disable vCPU pinning with TCG mode
Although QMP returns info about vCPU threads in TCG mode, the
data it returns is mostly lies. Only the first vCPU has a valid
thread_id returned. The thread_id given for the other vCPUs is
in fact the main emulator thread. All vCPUs actually run under
the same thread in TCG mode.

Our vCPU pinning code is not at all able to cope with this
so if you try to set CPU affinity per-vCPU you end up with
wierd errors

error: Failed to start domain instance-00000007
error: cannot set CPU affinity on process 24365: Invalid argument

Since few people will care about the performance of TCG with
strict CPU pinning, lets just disable that for now, so we get
a clear error message

error: Failed to start domain instance-00000007
error: Requested operation is not valid: cpu affinity is not supported
2014-12-19 11:32:21 +00:00
Daniel P. Berrange
b07f3d821d Don't setup fake CPU pids for old QEMU
The code assumes that def->vcpus == nvcpupids, so when we setup
fake CPU pids for old QEMU with nvcpupids == 1, we cause the
later code to read off the end of the array. This has fun results
like sche_setaffinity(0, ...) which changes libvirtd's own CPU
affinity, or even better sched_setaffinity($RANDOM, ...) which
changes the affinity of a random OS process.
2014-12-19 11:32:21 +00:00
Michal Privoznik
f309db1f4d qemu: Create memory-backend-{ram,file} iff needed
Libvirt BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1175397
QEMU BZ:    https://bugzilla.redhat.com/show_bug.cgi?id=1170093

In qemu there are two interesting arguments:

1) -numa to create a guest NUMA node
2) -object memory-backend-{ram,file} to tell qemu which memory
region on which host's NUMA node it should allocate the guest
memory from.

Combining these two together we can instruct qemu to create a
guest NUMA node that is tied to a host NUMA node. And it works
just fine. However, depending on machine type used, there might
be some issued during migration when OVMF is enabled (see QEMU
BZ). While this truly is a QEMU bug, we can help avoiding it. The
problem lies within the memory backend objects somewhere. Having
said that, fix on our side consists on putting those objects on
the command line if and only if needed. For instance, while
previously we would construct this (in all ways correct) command
line:

    -object memory-backend-ram,size=256M,id=ram-node0 \
    -numa node,nodeid=0,cpus=0,memdev=ram-node0

now we create just:

    -numa node,nodeid=0,cpus=0,mem=256

because the backend object is obviously not tied to any specific
host NUMA node.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-12-19 07:44:44 +01:00
Ján Tomko
1adda68a1b Remove redundant cleanup in qemuDomainAttachVirtioDiskDevice
Commit ca91ba7 moved these into the qemuDomainPrepareDisk helper,
but forgot to remove them from here as well.
2014-12-18 12:53:56 +01:00
Ján Tomko
1cddf0001f Fix hotplugging of block device-backed usb disks
Commit ca91ba7 moved qemuSetupDiskCgroup into the qemuDomainPrepareDisk
helper, but failed to call it for usb disks.

https://bugzilla.redhat.com/show_bug.cgi?id=1175668`
2014-12-18 12:53:56 +01:00
Eric Blake
af5c3a1015 qemu: fix memory leak in blockinfo
Coverity flagged commit 0282ca45 as introducing a memory leak;
in all my refactoring to make capacity probing conditional on
whether the image is non-raw, I missed deleting the unconditional
probe.

* src/qemu/qemu_driver.c (qemuStorageLimitsRefresh): Drop
redundant assignment.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-12-17 16:10:45 -07:00
Ján Tomko
952f8a7394 Fix error message on redirdev caps detection 2014-12-17 16:23:45 +01:00
Luyao Huang
dddd832735 conf: fix cannot start a guest have a shareable network iscsi hostdev
https://bugzilla.redhat.com/show_bug.cgi?id=1174569

There's nothing we need to do for shared iSCSI devices in
qemuAddSharedHostdev and qemuRemoveSharedHostdev. The iSCSI layer
takes care about that for us.

Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-12-17 11:23:00 +01:00
Eric Blake
3937ef9cf4 getstats: crawl backing chain for qemu
Wire up backing chain recursion.  For the first time, it is now
possible to get libvirt to expose that qemu tracks read statistics
on backing files, as well as report maximum extent written on a
backing file during a block-commit operation.

For a running domain, where one of the two images has a backing
file, I see the traditional output:

$ virsh domstats --block testvm2
Domain: 'testvm2'
  block.count=2
  block.0.name=vda
  block.0.path=/tmp/wrapper.qcow2
  block.0.rd.reqs=1
  block.0.rd.bytes=512
  block.0.rd.times=28858
  block.0.wr.reqs=0
  block.0.wr.bytes=0
  block.0.wr.times=0
  block.0.fl.reqs=0
  block.0.fl.times=0
  block.0.allocation=0
  block.0.capacity=1310720000
  block.0.physical=200704
  block.1.name=vdb
  block.1.path=/dev/sda7
  block.1.rd.reqs=0
  block.1.rd.bytes=0
  block.1.rd.times=0
  block.1.wr.reqs=0
  block.1.wr.bytes=0
  block.1.wr.times=0
  block.1.fl.reqs=0
  block.1.fl.times=0
  block.1.allocation=0
  block.1.capacity=1310720000

vs. the new output:

$ virsh domstats --block --backing testvm2
Domain: 'testvm2'
  block.count=3
  block.0.name=vda
  block.0.path=/tmp/wrapper.qcow2
  block.0.rd.reqs=1
  block.0.rd.bytes=512
  block.0.rd.times=28858
  block.0.wr.reqs=0
  block.0.wr.bytes=0
  block.0.wr.times=0
  block.0.fl.reqs=0
  block.0.fl.times=0
  block.0.allocation=0
  block.0.capacity=1310720000
  block.0.physical=200704
  block.1.name=vda
  block.1.path=/dev/sda6
  block.1.backingIndex=1
  block.1.rd.reqs=0
  block.1.rd.bytes=0
  block.1.rd.times=0
  block.1.wr.reqs=0
  block.1.wr.bytes=0
  block.1.wr.times=0
  block.1.fl.reqs=0
  block.1.fl.times=0
  block.1.allocation=327680
  block.1.capacity=786432000
  block.2.name=vdb
  block.2.path=/dev/sda7
  block.2.rd.reqs=0
  block.2.rd.bytes=0
  block.2.rd.times=0
  block.2.wr.reqs=0
  block.2.wr.bytes=0
  block.2.wr.times=0
  block.2.fl.reqs=0
  block.2.fl.times=0
  block.2.allocation=0
  block.2.capacity=1310720000

I may later do a patch that trims the output to avoid 0 stats,
particularly for backing files (which are more likely to have
0 stats, at least for write statistics when no block-commit
is performed).  Also, I still plan to expose physical size
information (qemu doesn't expose it yet, so it requires a stat,
and for block devices, a further open/seek operation).  But
this patch is good enough without worrying about that yet.

* src/qemu/qemu_driver.c (QEMU_DOMAIN_STATS_BACKING): New internal
enum bit.
(qemuConnectGetAllDomainStats): Recognize new user flag, and pass
details to...
(qemuDomainGetStatsBlock): ...here, where we can do longer recursion.
(qemuDomainGetStatsOneBlock): Output new field.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-12-17 02:07:44 -07:00
Eric Blake
c2d380bff8 getstats: split block stats reporting for easier recursion
In order to report stats on backing chains, we need to separate
the output of stats for one block from how we traverse blocks.

* src/qemu/qemu_driver.c (qemuDomainGetStatsBlock): Split...
(qemuDomainGetStatsOneBlock): ...into new helper.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-12-17 02:07:44 -07:00
Eric Blake
14ef1f62e3 getstats: prepare for dynamic block.count stat
A coming patch will make it optionally possible to list backing
chain block stats; in this mode of operation, block.counts is no
longer the number of <disks> in the domain, but the number of
blocks in the array being reported.  We still want block.count
listed first, but rather than iterate the tree twice (once to
count, and once to list stats), it's easier to just touch things
up after the fact.

* src/qemu/qemu_driver.c (qemuDomainGetStatsBlock): Compute count
after the fact.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-12-17 00:20:21 -07:00
Eric Blake
596a137134 getstats: report block sizes for offline domains
The prior refactoring can now be put to use. With the same domain
as the earlier commit 7b49926 (one qcow2 disk and an empty
cdrom drive):
$ virsh domstats --block foo
Domain: 'foo'
  block.count=2
  block.0.name=hda
  block.0.path=/var/lib/libvirt/images/foo.qcow2
  block.0.allocation=1309614080
  block.0.capacity=42949672960
  block.0.physical=1309671424
  block.1.name=hdc

* src/qemu/qemu_driver.c (qemuDomainGetStatsBlock): Use
qemuStorageLimitsRefresh to report offline statistics.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-12-17 00:20:21 -07:00
Eric Blake
8de6544e98 qemu: refactor blockinfo data gathering
Create a helper function that can be reused for gathering block
info from virDomainListGetStats.

* src/qemu/qemu_driver.c (qemuDomainGetBlockInfo): Split guts...
(qemuStorageLimitsRefresh): ...into new helper function.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-12-16 23:28:36 -07:00
Eric Blake
0282ca45a0 qemu: fix bugs in blockstats
The documentation for virDomainBlockInfo was confusing: it stated
that 'physical' was the size of the container, then gave an example
of it being the amount of storage used by a sparse file (that is,
for a sparse raw image on a regular file, the wording implied
capacity==physical, while allocation was smaller; but the example
instead claimed physical==allocation).  Since we use 'physical' for
the last offset of a block device, we should do likewise for
regular files.

Furthermore, the example claimed that for a qcow2 regular file,
allocation==physical.  At the time the code was first written,
this was true (qcow2 files were allocated sequentially, and were
never sparse, so the last sector written happened to also match
the disk space occupied); but modern qemu does much better and
can punch holes for a qcow2 with allocation < physical.

Basically, after this patch, the three fields are now reliably
mapped as:
 'capacity' - how much storage the guest can see (equal to
physical for raw images, determined by image metadata otherwise)
 'allocation' - how much storage the image occupies (similar to
what 'du' would report)
 'physical' - the last offset of the image (similar to what 'ls'
would report)

'capacity' can be larger than 'physical' (such as for a qcow2
image that does not vary much from a backing file) or smaller
(such as for a qcow2 file with lots of internal snapshots).
Likewise, 'allocation' can be (slightly) larger than 'physical'
(such as counting the tail of cluster allocations required to
round a file size up to filesystem granularity) or smaller
(for a sparse file).  A block-resize operation changes capacity
(which, for raw images, also changes physical); many non-raw
images automatically grow physical and allocation as necessary
when starting with an allocation smaller than capacity; and even
when capacity and physical stay unchanged, allocation can change
when converting sectors from holes to data or back.

Note that this does not change semantics for qcow2 images stored
on block devices; there, we still rely on qemu to report the
highest written extent for allocation.  So using this API to
track when to extend a block device because a qcow2 image is
about to exceed a threshold will not see any changes.

Also, note that virStorageVolInfo is unfortunately limited to
just 'capacity' and 'allocation' (we can't expand it to add
'physical', although we can expand the XML to add it there);
historically, that struct's 'allocation' value has reported
file size for qcow2 files (what this patch terms 'physical'
for a domain block device), but disk usage for raw files (what
this patch terms 'allocation').  So follow-up patches will be
needed to make storage volumes report the same allocation
values and get at physical values, where those differ.

* include/libvirt/libvirt-domain.h (_virDomainBlockInfo): Tweak
documentation to match saner definition.
* src/qemu/qemu_driver.c (qemuDomainGetBlockInfo): For regular
files, physical size is capacity, not allocation.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-12-16 23:19:08 -07:00
Eric Blake
05e702cfd4 getstats: rearrange blockinfo gathering
Ultimately, we want to avoid read()ing a file while qemu is running.
We still have to open() block devices to determine their physical
size, but that is safer.  This patch rearranges code to group
together all code that reads the image, to make it easier for later
patches to skip the metadata collection when possible.

* src/qemu/qemu_driver.c (qemuDomainGetBlockInfo): Check for empty
disk up front.  Place metadata reading next to use.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-12-16 23:13:04 -07:00
Eric Blake
b1802714da getstats: perform recursion in monitor collection
When requested in a later patch, the QMP command results are now
examined recursively.  As qemu_driver will eventually have to
read items out of the hash table as stored by this patch, the
computation of backing alias string is done in a shared location.

* src/qemu/qemu_domain.h (qemuDomainStorageAlias): New prototype.
* src/qemu/qemu_domain.c (qemuDomainStorageAlias): Implement it.
* src/qemu/qemu_monitor_json.c
(qemuMonitorJSONGetOneBlockStatsInfo)
(qemuMonitorJSONBlockStatsUpdateCapacityOne): Perform recursion.
(qemuMonitorJSONGetAllBlockStatsInfo)
(qemuMonitorJSONBlockStatsUpdateCapacity): Update callers.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-12-16 16:14:55 -07:00
Eric Blake
7b11f5e554 getstats: prepare monitor collection for recursion
A future patch will allow recursion into backing chains when
collecting block stats.  This patch should not change behavior,
but merely moves out the common code that will be reused once
recursion is enabled, and adds the parameter that will turn on
recursion.

* src/qemu/qemu_monitor.h (qemuMonitorGetAllBlockStatsInfo)
(qemuMonitorBlockStatsUpdateCapacity): Add recursion parameter,
although it is ignored for now.
* src/qemu/qemu_monitor.h (qemuMonitorGetAllBlockStatsInfo)
(qemuMonitorBlockStatsUpdateCapacity): Likewise.
* src/qemu/qemu_monitor_json.h
(qemuMonitorJSONGetAllBlockStatsInfo)
(qemuMonitorJSONBlockStatsUpdateCapacity): Likewise.
* src/qemu/qemu_monitor_json.c
(qemuMonitorJSONGetAllBlockStatsInfo)
(qemuMonitorJSONBlockStatsUpdateCapacity): Add parameter, and
split...
(qemuMonitorJSONGetOneBlockStatsInfo)
(qemuMonitorJSONBlockStatsUpdateCapacityOne): ...into helpers.
(qemuMonitorJSONGetBlockStatsInfo): Update caller.
* src/qemu/qemu_driver.c (qemuDomainGetStatsBlock): Update caller.
* src/qemu/qemu_migration.c (qemuMigrationCookieAddNBD): Likewise.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-12-16 16:08:04 -07:00
Eric Blake
89646e69ac qemu: let blockinfo reuse virStorageSource
Right now, grabbing blockinfo always calls stat on the disk, then
opens the image to determine the capacity, using a throw-away
virStorageSourcePtr.  This has a couple of drawbacks:

1. We are calling stat and opening a file on every invocation of
the API.  However, there are cases where the stats should NOT be
changing between successive calls (if a domain is running, no
one should be changing the physical size of a block device or raw
image behind our backs; capacity of read-only files should not
be changing; and we are the gateway to the block-resize command
to know when the capacity of read-write files should be changing).
True, we still have to use stat in some cases (a sparse raw file
changes allocation if it is read-write and the amount of holes is
changing, and a read-write qcow2 image stored in a file changes
physical size if it was not fully pre-allocated).  But for
read-only images, even this should be something we can remember
from the previous time, rather than repeating every call.

2. We want to enhance the power of virDomainListGetStats, by
sharing code.  But we already have a virStorageSourcePtr for
each disk, and it would be easier to reuse the common structure
than to have to worry about the one-off virDomainBlockInfoPtr.

While this patch does not optimize reuse of information in point
1, it does get us closer to being able to do so; by updating a
structure that survives between consecutive calls.

* src/util/virstoragefile.h (_virStorageSource): Add physical, to
mirror virDomainBlockInfo; rearrange fields to match public struct.
(virStorageSourceCopy): Copy the new field.
* src/qemu/qemu_driver.c (qemuDomainGetBlockInfo): Store into
storage source, then copy to block info.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-12-16 16:05:47 -07:00
Eric Blake
a20c3aafbe qemu: refactor blockinfo job handling
In order for a future patch to virDomainListGetStats to reuse
some code for determining disk usage of offline domains, we
need to make it easier to pull out part of the guts of grabbing
blockinfo.  The current implementation grabs a job fairly late
in the game, while getstats will already own a job; reordering
things so that the job is always grabbed up front in both
functions will make it easier to pull out the common code.
This patch results in grabbing a job in cases where one was not
previously needed, but as it is a query job, it should not be
noticeably slower.

This patch touches the same code as the fix for CVE-2014-6458
(commit b799259); in that patch, we avoided hotplug changing
a disk reference during the time of obtaining a monitor lock
by copying all data we needed and no longer referencing disk;
this patch goes the other way and ensures that by holding the
job, the disk cannot be changed so we no longer need to worry
about the disk being invalidated across the monitor lock.

* src/qemu/qemu_driver.c (qemuDomainGetBlockInfo): Rearrange job
control to be outside of disk information.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-12-16 14:12:24 -07:00
Martin Kletzander
4d1e3943d6 qemu: Free saved error in qemuDomainSetVcpusFlags
Commit e3435caf added cleanup code to qemuDomainSetVcpusFlags() that was
not supposed to reset the error.  Usual procedure was done, saving the
error to temporary variable, but it was never free'd, but rather leaked.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-12-16 20:45:05 +01:00
Martin Kletzander
86759ec61a qemu: Add missing goto error in qemuRestoreCgroupState
Commit af2a1f05 tried clearly separating each condition in
qemuRestoreCgroupState() for the sake of readability, however somehow
one condition body was missing.  That means that the body of the next
condition got executed only if both of there were true, which is
impossible, thus resulting in a dead code and a logic error.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-12-16 20:44:33 +01:00
Martin Kletzander
e3435caf6a qemu: Fix hotplugging cpus with strict memory pinning
When hot-plugging a VCPU into the guest, kvm needs to allocate some data
from the DMA zone, which might be in a memory node that's not allowed in
cpuset.mems.  Basically the same problem as there was with starting the
domain and due to which commit 7e72ac7878
exists.  This patch just extends it to hotplugging as well.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1161540

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-12-16 11:15:27 +01:00
Martin Kletzander
af2a1f0587 qemu: Leave cpuset.mems in parent cgroup alone
Instead of setting the value of cpuset.mems once when the domain starts
and then re-calculating the value every time we need to change the child
cgroup values, leave the cgroup alone and rather set the child data
every time there is new cgroup created.  We don't leave any task in the
parent group anyway.  This will ease both current and future code.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-12-16 11:15:27 +01:00
Martin Kletzander
c74d58ad47 qemu: Save numad advice into qemuDomainObjPrivate
Thanks to that we don't need to drag the pointer everywhere and future
code will get cleaner.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-12-16 11:15:27 +01:00
Martin Kletzander
f801a81208 qemu: Remove unnecessary qemuSetupCgroupPostInit function
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-12-16 11:15:27 +01:00
Luyao Huang
98dee71759 qemu: Auto generate a controller when attach hostdev and chr device
https://bugzilla.redhat.com/show_bug.cgi?id=1174154

When we use attach-device add a hostdev or chr device which have a
iscsi address or others (just like guest agent, subsys iscsi disk...),
we will find there is no basic controller for our new attached device.
Somtimes this will make guest cannot start after we add them (although
they can start at the second time).

Signed-off-by: Luyao Huang <lhuang@redhat.com>
2014-12-15 16:24:01 +01:00
Laine Stump
44292e48a0 qemu: add/remove bridge fdb entries as guest CPUs are started/stopped
When libvirt is managing a bridge's forwarding database (FDB)
(macTableManager='libvirt'), if we add FDB entries for a new guest
interface even before the qemu process is created, then in the case of
a migration any other guest attached to the "destination" bridge will
have its traffic immediately sent to the destination of the migration
even while the source domain is still running (and the destination, of
course, isn't). To make sure that traffic from other guests on the new
host continues flowing to the old guest until the new one is ready, we
have to wait until the new guest CPUs are started to add the FDB
entries.

Conversely, we need to remove the FDB entries from the bridge any time
the guest CPUs are stopped; among other things, this will assure
proper operation during a post-copy migration (which is just the
opposite of the problem described in the previous paragraph).
2014-12-15 10:07:06 -05:00
Wang Rui
9603bce7b1 qemu: make persistent update of graphics device supported
We can change vnc password by using virDomainUpdateDeviceFlags API with
live flag. But it can't be changed with config flag. Error is reported as
below.

error: Operation not supported: persistent update of device 'graphics' is not supported

This patch supports the graphics arguments changed with config flag.

Signed-off-by: Wang Rui <moon.wangrui@huawei.com>
2014-12-15 15:45:24 +01:00
Wang Rui
dec5f07b9e qemu: fix alignment of qemuDomainFindGraphics
Signed-off-by: Wang Rui <moon.wangrui@huawei.com>
2014-12-15 15:45:24 +01:00
Wang Rui
2609479b54 qemu: report properer error number when change graphics failed
It's not supported to change some graphics arguments with '--live'.
Replace some error code VIR_ERR_INTERNAL_ERROR and VIR_ERR_INVALID_ARG
with VIR_ERR_OPERATION_UNSUPPORTED.

Signed-off-by: Wang Rui <moon.wangrui@huawei.com>
2014-12-15 15:45:24 +01:00
Michal Privoznik
311b4a677f qemu: Allow system pages to <memoryBacking/>
https://bugzilla.redhat.com/show_bug.cgi?id=1173507

It occurred to me that OpenStack uses the following XML when not using
regular huge pages:

  <memoryBacking>
    <hugepages>
      <page size='4' unit='KiB'/>
    </hugepages>
  </memoryBacking>

However, since we are expecting to see huge pages only, we fail to
startup the domain with following error:

  libvirtError: internal error: Unable to find any usable hugetlbfs
  mount for 4 KiB

While regular system pages are not huge pages technically, our code is
prepared for that and if it helps OpenStack (or other management
applications) we should cope with that.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-12-15 13:36:47 +01:00
Michal Privoznik
ca4f9518b8 virconf: Introduce VIR_CONF_ULONG
https://bugzilla.redhat.com/show_bug.cgi?id=1160995

In our config files users are expected to pass several integer values
for different configuration knobs. However, majority of them expect a
nonnegative number and only a few of them accept a negative number too
(notably keepalive_interval in libvirtd.conf).
Therefore, a new type to config value is introduced: VIR_CONF_ULONG
that is set whenever an integer is positive or zero. With this
approach knobs accepting VIR_CONF_LONG should accept VIR_CONF_ULONG
too.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-12-15 10:34:18 +01:00
Laine Stump
c5a54917d5 qemu: add a qemuInterfaceStopDevices(), called when guest CPUs stop
We now have a qemuInterfaceStartDevices() which does the final
activation needed for the host-side tap/macvtap devices that are used
for qemu network connections. It will soon make sense to have the
converse qemuInterfaceStopDevices() which will undo whatever was done
during qemuInterfaceStartDevices().

A function to "stop" a single device has also been added, and is
called from the appropriate place in qemuDomainDetachNetDevice(),
although this is currently unnecessary - the device is going to
immediately be deleted anyway, so any extra "deactivation" will be for
naught. The call is included for completeness, though, in anticipation
that in the future there may be some required action that *isn't*
nullified by deleting the device.

This patch is a part of a more complete fix for:

  https://bugzilla.redhat.com/show_bug.cgi?id=1081461
2014-12-13 22:20:28 -05:00
Laine Stump
879c13d6cc qemu: always call qemuInterfaceStartDevices() when starting CPUs
The patch that added qemuInterfaceStartDevices() (upstream commit
82977058f5) had an extra conditional to
prevent calling it if the reason for starting the CPUs was
VIR_DOMAIN_RUNNING_UNPAUSED or VIR_DOMAIN_RUNNING_SAVE_CANCELED.  This
was put in by the author as the result of a reviewer asking if it was
necessary to ifup the interfaces in *all* occasions (because these
were the two cases where the CPU would have already been started (and
stopped) once, so the interface would already be ifup'ed).

It turns out that, as long as there is no corresponding
qemuInterfaceStopDevices() to ifdown the interfaces anytime the CPUs
are stopped, neglecting to ifup when reason is RUNNING_UNPAUSED or
RUNNING_SAVE_CANCELED doesn't cause any problems (because it just
happens that the interface will have already been ifup'ed by a prior
call when the CPU was previously started for some other reason).

However, it also doesn't *help*, and there will soon be a
qemuInterfaceStopDevices() function which *will* ifdown these
interfaces when the guest CPUs are stopped, and once that is done, the
interfaces will be left down in some cases when they should be up (for
example, if a domain is paused and then unpaused).

So, this patch is removing the condition in favor of always calling
qemuInterfaeStartDevices() when the guest CPUs are started.

This patch (and the aforementioned patch) resolve:

  https://bugzilla.redhat.com/show_bug.cgi?id=1081461
2014-12-13 21:44:45 -05:00
Francesco Romani
cb104ef734 qemu: bulk stats: Fix logic in monitor handling
A logic bug in qemuConnectGetAllDomainStats makes the code mark the
monitor as available when qemuDomainObjBeginJob fails, instead of when
it succeeds, as the correct flow requires.

This patch fixes the check and updates the code documentation
accordingly.

Broken by commit 57023c0a3a.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2014-12-11 11:02:05 +01:00
Matthew Rosato
82977058f5 network: Bring netdevs online later
Currently, MAC registration occurs during device creation, which is
early enough that, during live migration, you end up with duplicate
MAC addresses on still-running source and target devices, even though
the target device isn't actually being used yet.
This patch proposes to defer MAC registration until right before
the guest can actually use the device -- In other words, right
before starting guest CPUs.

Signed-off-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
Signed-off-by: Laine Stump <laine@laine.org>
2014-12-10 15:09:01 -05:00
Wang Rui
6ee1c0ff67 maint: clean up the unused variable 'caps' in src/qemu/qemu_*.c
Signed-off-by: Wang Rui <moon.wangrui@huawei.com>
2014-12-10 11:21:31 +01:00
Martin Kletzander
57023c0a3a CVE-2014-8131: Fix possible deadlock and segfault in qemuConnectGetAllDomainStats()
When user doesn't have read access on one of the domains he requested,
the for loop could exit abruptly or continue and override pointer which
pointed to locked object.

This patch fixed two issues at once.  One is that domflags might have
had QEMU_DOMAIN_STATS_HAVE_JOB even when there was no job started (this
is fixed by doing domflags |= QEMU_DOMAIN_STATS_HAVE_JOB only when the
job was acquired and cleaning domflags on every start of the loop.
Second one is that the domain is kept locked when
virConnectGetAllDomainStatsCheckACL() fails and continues the loop when
it didn't end.  Adding a simple virObjectUnlock() and clearing the
pointer ought to do.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-12-10 09:11:57 +01:00
Peter Krempa
2bdcd29c71 qemu: migration: Unlock vm on failed ACL check in protocol v2 APIs
Avoid leaving the domain locked on a failed ACL check in
qemuDomainMigratePerform() and qemuDomainMigrateFinish2().

Introduced in commit abf75aea24 (Add ACL checks into the QEMU driver).
2014-12-09 10:10:24 +01:00
Laine Stump
4aae2ed6fb qemu: always use virDomainNetGetActualBridgeName to get interface's bridge
qemuNetworkIfaceConnect() used to have a special case for
actualType='network' (a network with forward mode of route, nat, or
isolated) to call the libvirt public API to retrieve the bridge being
used by a network. That is no longer necessary - since all network
types that use a bridge and tap device now get the bridge name stored
in the ActualNetDef, we can just always use
virDomainNetGetActualBridgeName() instead.

(an audit of the two callers to qemuNetworkIfaceConnect() confirms
that it is never called for any other type of network, so the dead
code in the else statement (logging an internal error if it is called
for any other type of network) is eliminated in the process.)
2014-12-08 14:50:50 -05:00
Laine Stump
7cb822c2a5 qemu: setup tap devices for macTableManager='libvirt'
When libvirt is managing the MAC table of a Linux host bridge, it must
turn off learning and unicast_flood for each tap device attached to
that bridge, then add a Forwarding Database (fdb) entry for the tap
device using the MAC address from the domain interface config.

Once we have disabled learning and flooding, any packet that has a
destination MAC address not present in the fdb will be dropped by the
bridge. This, along with the opportunistic disabling of promiscuous
mode[*], can result in enhanced network performance. and a potential
slight security improvement.

[*] If there is only one device on the bridge with learning/unicast_flood
enabled, then that device will automatically have promiscuous mode
disabled. If there are *no* devices with learning/unicast_flood
enabled (e.g. for a libvirt "route", "nat", or isolated network that
has no physical device attached), then all non-tap devices will have
promiscuous mode disabled (tap devices always have promiscuous mode
enabled, which may be a bug in the kernel, but in practice has 0
effect).

None of this has any effect for kernels prior to 3.15 (upstream kernel
commit 2796d0c648c940b4796f84384fbcfb0a2399db84 "bridge: Automatically
manage port promiscuous mode"). Even after that, until kernel 3.17
(upstream commit 5be5a2df40f005ea7fb7e280e87bbbcfcf1c2fc0 "bridge: Add
filtering support for default_pvid") traffic will not be properly
forwarded without manually adding vlan table entries. Unfortunately,
although the presence of the first patch is signalled by existence of
the "learning" and "unicast_flood" options in sysfs, there is no
reliable way to query whether or not the system's kernel has the
second of those patches installed, the only thing that can be done is
to try the setting and see if traffic continues to pass.
2014-12-08 14:49:09 -05:00
Eric Blake
7b499262cb getstats: add block.n.path stat
I'm about to make block stats optionally more complex to cover
backing chains, where block.count will no longer equal the number
of <disks> for a domain.  For these reasons, it is nicer if the
statistics output includes the source path (for local files).
This patch doesn't add anything for network disks, although we
may decide to add that later.

With this patch, I now see the following for the same domain as
in the previous patch (one qcow2 file, and an empty cdrom drive):
$ virsh domstats --block foo
Domain: 'foo'
  block.count=2
  block.0.name=hda
  block.0.path=/var/lib/libvirt/images/foo.qcow2
  block.1.name=hdc

* src/libvirt-domain.c (virConnectGetAllDomainStats): Document
new field.
* tools/virsh.pod (domstats): Document new field.
* src/qemu/qemu_driver.c (qemuDomainGetStatsBlock): Return the new
stat for local files/block devices.
(QEMU_ADD_NAME_PARAM): Add parameter.
(qemuDomainGetStatsInterface): Update caller.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-12-08 11:58:39 -07:00
Eric Blake
56b21dfe0c getstats: start giving offline block stats
I noticed that for an offline domain, 'virsh domstats --block $dom'
was producing just the domain name, with no stats.  But the older
'virsh domblkinfo' works just fine on offline domains.  This patch
starts to get us closer, by at least reporting the disk names for
an offline domain.

With this patch, I now see the following for an offline domain
with one qcow2 disk and an empty cdrom drive:
$ virsh domstats --block foo
Domain: 'foo'
  block.count=2
  block.0.name=hda
  block.1.name=hdc

* src/qemu/qemu_driver.c (qemuDomainGetStatsBlock): Don't short-circuit
output of block name.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-12-08 11:55:12 -07:00
Eric Blake
2f61602edb getstats: avoid memory leak on OOM
qemuDomainGetStatsBlock() could leak a stats hash table if it
encountered OOM while populating the virTypedParameters.
Oddly, the fix doesn't even touch qemuDomainGetStatsBlock :)

* src/qemu/qemu_driver.c (QEMU_ADD_COUNT_PARAM)
(QEMU_ADD_NAME_PARAM): Don't return early.
(qemuDomainGetStatsInterface): Adjust caller.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-12-08 09:43:35 -07:00
Daniel P. Berrange
25bf888a66 Report original error when QMP probing fails with new QEMU
If probing capabilities via QMP fails, we now have a check
that prevents us falling back to -help parsing. Unfortunately
the error message

  "Failed to probe capabilities for /usr/bin/qemu-kvm:
   unsupported configuration: QEMU 2.1.2 is too new for help parsing"

is proving rather unhelpful to the user. We need to be telling
them why QMP failed (the root cause), rather than they can't
use -help (the side effect).

To do this we should capture stderr during QMP probing, and
if -help parsing then sees a new QEMU version, we know that
QMP should have worked, and so we can show the messages from
stderr. The message thus becomes

  "Failed to probe capabilities for /usr/bin/qemu-kvm:
   internal error: QEMU / QMP failed: Could not access
   KVM kernel module: No such file or directory
   failed to initialize KVM: No such file or directory"
2014-12-05 10:57:46 +00:00
Shanzhi Yu
d1e460136a qemu: snapshot: Forbid internal snapshot with passthrough devices
When attempting to create internal system checkpoint with a passthrough
device qemu will report the following error:

error: operation failed: Error -22 while writing VM

This patch calls the function to check if migration is possible with
given VM and thus improves the error to:

error: Requested operation is not valid: domain has assigned non-USB host devices

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=874418#c19
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
2014-12-05 11:08:45 +01:00
Peter Krempa
38bde5776a qemu: process: Avoid uninitialized use two vars when reconnecting to vm
3ecebf0711 breaks the build as it adds a
way to jump to cleanup before the 'cfg' object is retrieved and 'priv'
is initialized.
2014-12-04 16:24:25 +01:00
Peter Krempa
3ecebf0711 qemu: process: Refactor reconnecting to qemu processes
Move entering the job into the thread to simplify the program flow. Also
as the code holds a separate reference to the domain object some
conditions can be simplified.

After this patch qemuDomainObjTransferJob is no longer needed so this
patch removes it.
2014-12-04 15:28:39 +01:00
Erik Skultety
fe3691f663 qemu: Fix virsh freeze when blockcopy storage file is removed
If someone removes blockcopy storage file when still in mirroring phase
and then requesting blockjob abort using pivot, virsh cmd freezes. This
is not an issue with older qemu versions which did not support
asynchronous jobs (which we prefer by default).
As we have reached the mirroring phase successfully, polling monitor for
blockjob info always returns 1 and the loop never ends.
This fix introduces a check for qemuDomainBlockPivot return code, possibly
skipping the asynchronous waiting completely, if an error occurred and
asynchronous waiting was the preferred method.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1139567
2014-12-04 09:05:59 +01:00
Peter Krempa
48a055607c qemu: driver: Reload snapshots and managedsaves prior to reconnecting
Reconnect to the VM is a possibly long-running job spawned in a separate
thread. We should reload the snapshot defs and managedsave state prior
to spawning the thread to avoid blocking of the daemon startup which
would serialize on the VM lock.

Also the reloading code would violate the domain job held while
reconnecting as the loader functions don't create jobs.
2014-12-03 18:50:22 +01:00
Michal Privoznik
cf54c60699 qemu_migration: Precreate missing storage
Based on previous commit, we can now precreate missing volumes. While
digging out the functionality from storage driver would be nicer, if
you've seen the code it's nearly impossible. So I'm going from the
other end:

1) For given disk target, disk path is looked up.
2) For the disk path, storage pool is looked up, a volume XML is
constructed and then passed to virStorageVolCreateXML() which has all
the knowledge how to create raw images, (encrypted) qcow(2) images,
etc.

One of the advantages of this approach is, we don't have to care about
image conversion - qemu does that for us. So for instance, users can
transform qcow2 into raw on migration (if the correct XML is passed to
the migration API).

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-12-02 18:02:13 +01:00
Michal Privoznik
e1466dc7fa qemu_migration: Send disk sizes to the other side
Up 'til now, users need to precreate non-shared storage on migration
themselves. This is not very friendly requirement and we should do
something about it. In this patch, the migration cookie is extended,
so that <nbd/> section does not only contain NBD port, but info on
disks being migrated. This patch sends a list of pairs of:

    <disk target; disk size>

to the destination. The actual storage allocation is left for next
commit.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-12-02 17:51:57 +01:00
Michal Privoznik
a714533b2b qemuMonitorJSONBlockStatsUpdateCapacity: Don't skip disks
The function queries the block devices visible to qemu
('query-block') and parses the qemu's output. The info is
returned in a hash table which is expected to be pre-filled by
qemuMonitorJSONGetAllBlockStatsInfo(). However, in the next patch
we are not going to call the latter function at all, so we should
make the former function add devices into the hash table if not
found there.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-12-02 17:51:57 +01:00
John Ferlan
c8230c4ded Replace virDomainSnapshotFree with virObjectUnref
Since virDomainSnapshotFree will call virObjectUnref anyway, let's just use
that directly so as to avoid the possibility that we inadvertently clear out
a pending error message when using the public API.
2014-12-02 11:03:41 -05:00
John Ferlan
121c09a90b Replace virNetworkFree with virObjectUnref
Since virNetworkFree will call virObjectUnref anyway, let's just use that
directly so as to avoid the possibility that we inadvertently clear out
a pending error message when using the public API.
2014-12-02 11:03:40 -05:00
John Ferlan
8fb3aee2f8 Replace virDomainFree with virObjectUnref
Since virDomainFree will call virObjectUnref anyway, let's just use that
directly so as to avoid the possibility that we inadvertently clear out
a pending error message when using the public API.
2014-12-02 11:03:40 -05:00
Eduardo Costa
ff018e686a Fix race condition in qemuGetProcessInfo
There is a race condition between the fopen and fscanf calls
in qemuGetProcessInfo. If fopen succeeds, there is a small
possibility that the file no longer exists before reading from it.
Now, if either fopen or fscanf calls fail, the function will behave
just as only fopen had failed.

Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1169055

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-12-01 15:42:47 -07:00
John Ferlan
59802f23bc hotplug: Resolve Coverity FORWARD_NULL
Coverity complained that because the cfg->macFilter call checked
net->ifname != NULL before calling ebtablesRemoveForwardAllowIn, then
the virNetDevOpenvswitchRemovePort call should have the same check.

However, if I move the ebtables call prior to the check for TYPE_DIRECT
(where there is a VIR_FREE(net->ifname)), then it seems Coverity is
happy.  Since firewall info is tacked on last during setup, removing
it in the opposite order of initialization seems to be natural anyway
2014-12-01 11:07:31 -05:00
Luyao Huang
f8c1fb3d2e qemu: Make pid available for security managers in qemuProcessAttach
There are some small issue in qemuProcessAttach:

1.Fix virSecurityManagerGetProcessLabel always get pid = 0,
move 'vm->pid = pid' before call virSecurityManagerGetProcessLabel.

2.Use virSecurityManagerGenLabel to get image label.

3.Fix always set selinux label for other security driver label.

Signed-off-by: Luyao Huang <lhuang@redhat.com>
2014-12-01 12:04:38 +01:00
Martin Kletzander
03caa543c2 conf: Add device-related code for panic devices
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1169183

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-12-01 12:01:27 +01:00
Martin Kletzander
bfeee8dee4 conf: Add device-related code for TPM devices
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1169183

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-12-01 12:01:27 +01:00
Erik Skultety
8e23e0e977 qemu: fix block{commit,copy} abort handling
When a block{commit,copy} job was aborted on a domain, block job handler
did not process it correctly, leaving a phantom job in the background.
Any further calls to any blockjob causes "block <jobtype> still active"
error. This patch fixes the blockjob handler so that it checks not only
for VIR_DOMAIN_BLOCK_JOB_FAILED status, but VIR_DOMAIN_BLOCK_JOB_CANCELED
status as well, followed by our existing cleanup routine.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1135169

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2014-12-01 10:09:03 +01:00
Wang Rui
111198210b qemu: set jobinfo type to FAILED if job is failed in qemuMigrationRun
If job is failed in qemuMigrationRun, we expect the jobinfo type as
FAILED. But jobinfo type won't be updated until entering
qemuMigrationWaitForCompletion. We should make it updated in all
conditions. Moreover, we can't use qemuMigrationUpdateJobStatus
here because job may fail in libvirt, so we can't query job status
from QEMU.

Signed-off-by: Wang Rui <moon.wangrui@huawei.com>
2014-12-01 08:17:24 +01:00
Wang Rui
0b0cba4dba qemu: set jobinfo type to CANCELLED if migration is cancelled in all conditions
The migration job status is traced in qemuMigrationUpdateJobStatus
which is called in qemuMigrationRun. But if migration is cancelled
before the trace such as in qemuMigrationDriveMirror, the jobinfo
type won't be updated to CANCELLED. After this patch, we can get
jobinfo type CANCELLED if migration is cancelled during drive
mirror.  Moreover, we can't use qemuMigrationUpdateJobStatus
because from qemu's point of view it's just the drive mirror being
cancelled and the migration hasn't even started yet.

Signed-off-by: Wang Rui <moon.wangrui@huawei.com>
2014-12-01 08:17:24 +01:00
Michal Privoznik
6085d917d5 qemu: Don't track quiesced state of FSs
https://bugzilla.redhat.com/show_bug.cgi?id=1160084

As of b6d4dad11b (1.2.5) we are trying to keep the status of FSFreeze
in the guest. Even though I've tried to fixed couple of corner cases
(6ea54769ba), it occurred to me just recently, that the approach is
broken by design. Firstly, there are many other ways to talk to
qemu-ga (even through libvirt) that filesystems can be thawed (e.g.
qemu-agent-command) without libvirt noticing. Moreover, there are
plenty of ways to thaw filesystems without even qemu-ga noticing (yes,
qemu-ga keeps internal track of FSFreeze status). So, instead of
keeping the track ourselves, or asking qemu-ga for stale state, it's
the best to let qemu-ga deal with that (and possibly let guest kernel
propagate an error).

Moreover, there's one bug with the following approach, if fsfreeze
command failed, we've executed fsthaw subsequently. So issuing
domfsfreeze in virsh gave the following result:

virsh # domfsfreeze gentoo
Froze 1 filesystem(s)

virsh # domfsfreeze gentoo
error: Unable to freeze filesystems
error: internal error: unable to execute QEMU agent command 'guest-fsfreeze-freeze': The command guest-fsfreeze-freeze has been disabled for this instance

virsh # domfsfreeze gentoo
Froze 1 filesystem(s)

virsh # domfsfreeze gentoo
error: Unable to freeze filesystems
error: internal error: unable to execute QEMU agent command 'guest-fsfreeze-freeze': The command guest-fsfreeze-freeze has been disabled for this instance

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-11-28 11:22:24 +01:00
Jiri Denemark
9340528a7f Fix usage of virReportSystemError
virReportSystemError is reserved for reporting system errors, calling it
with VIR_ERR_* error codes produces error messages that do not make any
sense, such as

    internal error: guest failed to start: Kernel doesn't support user
    namespace: Link has been severed

We should prohibit wrong usage with a syntax-check rule.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2014-11-28 09:42:13 +01:00
Wang Rui
64b84911ce qemu: add the missing jobinfo type in qemuDomainGetJobInfo
Commit 6fcddfcd refactored job statistics but missed the jobinfo type updated
in qemuDomainGetJobInfo. After this patch, we can use virDomainGetJobInfo to
get jobinfo type again.

Signed-off-by: Wang Rui <moon.wangrui@huawei.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2014-11-25 14:40:19 +01:00
Pavel Hrdina
742d49fa17 qemu-command: introduce new vgamem attribute for QXL video device
Add attribute to set vgamem_mb parameter of QXL device for QEMU. This
value sets the size of VGA framebuffer for QXL device. Default value in
QEMU is 8MB so reuse it also in libvirt to not break things.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1076098

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2014-11-24 22:20:13 +01:00
Pavel Hrdina
24c6ca860e qemu-command: use vram attribute for all video devices
So far we didn't have any option to set video memory size for qemu video
devices. There was only the vram (ram for QXL) attribute but it was valid
only for the QXL video device.

To provide this feature to users QEMU has a dedicated device attribute
called 'vgamem_mb' to set the video memory size. We will use the 'vram'
attribute for setting video memory size for other QEMU video devices.

For the cirrus device we will ignore the vram value because it has
hardcoded video size in QEMU.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1076098

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2014-11-24 22:18:18 +01:00
Pavel Hrdina
f480a87aa6 caps: introduce new QEMU capability for vgamem_mb device property
Allow setting vgamem size for video devices.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1076098

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2014-11-24 22:05:56 +01:00
Pavel Hrdina
c32cfc6d3f QXL: fix setting ram and vram values for QEMU QXL device
QEMU has two different type of QXL display device. The first "qxl-vga"
is for primary video device and second "qxl" is for secondary video
device.

There are also two different ways how to specify those devices on qemu
command line, the first one and obsolete is using "-vga" option and the
current new one is using "-device" option. The "-vga" could be used only
to setup primary video device, so the "-vga qxl" equal to
"-device qxl-vga". Unfortunately the "-vga qxl" doesn't support setting
additional parameters for the device and "-global" option must be used
for this purpose. It's mandatory to use "-global qxl-vga...." to set the
parameters of primary video device previously defined with "-vga qxl".

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1076098

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2014-11-24 22:05:56 +01:00
Pavel Hrdina
81ba2298b2 video: cleanup usage of vram attribute and update documentation
The vram attribute was introduced to set the video memory but it is
usable only for few hypervisors excluding QEMU/KVM and the old XEN
driver. Only in case of QEMU the vram was used for QXL.

This patch updates the documentation to reflect current code in libvirt
and also changes the cases when we will set the default vram attribute.
It also fixes existing strange default value for VGA devices 9MB to 16MB
because the video ram should be rounded to power of two.

The change of default value could affect migrations but I found out that
QEMU always round the video ram to power of two internally so it's safe
to change the default value to the next closest power of two and also
silently correct every domain XML definition. And it's also safe because
we don't pass the value to QEMU.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1076098

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2014-11-24 22:05:55 +01:00
Tomoki Sekiyama
5c9cfa4976 qemu: Implement the qemu driver for virDomainGetFSInfo
Get mounted filesystems list, which contains hardware info of disks and its
controllers, from QEMU guest agent 2.2+. Then, convert the hardware info
to corresponding device aliases for the disks.

Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
2014-11-24 10:29:12 -05:00
Peter Krempa
b29f2436ac qemu: Emit the guest agent lifecycle event
Add code to emit the event on change of the channel state and reconnect
to the qemu process.
2014-11-24 15:50:59 +01:00
Peter Krempa
21c676c2aa qemu: process: Refresh virtio channel guest state when connecting to mon
Use data provided by "query-chardev" to refresh the guest frontend state
of virtio channels.
2014-11-24 08:58:30 +01:00
Peter Krempa
4d7eb90311 qemu: chardev: Extract more information about character devices
Improve the monitor function to also retrieve the guest state of
character device (if provided) so that we can refresh the state of
virtio-serial channels and perhaps react to changes in the state in
future patches.

This patch changes the returned data from qemuMonitorGetChardevInfo to
return a structure containing the pty path and the state for all the
character devices.

The change to the testsuite makes sure that the data is parsed
correctly.
2014-11-24 08:58:30 +01:00
Peter Krempa
b7d1bee2b9 storage: rbd: Implement support for passing config file option
To be able to express some use cases of the RBD backing with libvirt, we
need to be able to specify a config file for the RBD client to qemu as
that is one of the commonly used options.
2014-11-21 14:37:03 +01:00
Peter Krempa
0255660658 storage: rbd: qemu: Add support for specifying internal RBD snapshots
Some storage systems have internal support for snapshots. Libvirt should
be able to select a correct snapshot when starting a VM.

This patch adds a XML element to select a storage source snapshot for
the RBD protocol which supports this feature.
2014-11-21 14:37:02 +01:00
Peter Krempa
5604c056bf util: split out qemuParseRBDString into a common helper
To allow reuse this non-trivial parser code in the backing store parser
this part of the command line parser needs to be split out into a
separate funciton.
2014-11-21 14:37:02 +01:00
Peter Krempa
dc0175f535 qemu: Refactor qemuBuildNetworkDriveURI to take a virStorageSourcePtr
Instead of splitting out various fields, pass the complete structure and
let the function pick various things of it.

As one of the callers isn't using virStorageSourcePtr to store the data,
this patch adds glue code that fills the data into a dummy
virStorageSourcePtr before calling the func.

This change will help when adding new fields that need output processing
in the future.
2014-11-21 14:37:02 +01:00
Peter Krempa
15bbaaf014 qemu: Add handling for VSERPORT_CHANGE event
New qemu added a new event that is emitted when a virtio serial channel
is opened in the guest OS. This allows us to update the state of the
port in the output-only XML element.

This patch implements the monitor callbacks and necessary handlers to
update the state in the definition.
2014-11-21 11:00:11 +01:00
Peter Krempa
e9a4506963 qemu: monitor: Rename and improve qemuMonitorGetPtyPaths
To unify future additions that require information from "query-chardev"
rename qemuMonitorGetPtyPaths and friends to qemuMonitorGetChardevInfo
and move the allocation of the returned hash into the top level
function.
2014-11-21 11:00:10 +01:00
Peter Krempa
6692ba731b qemu: process: report useful error if alias formatting fails
When retrieving the paths for PTY devices the alias gets formatted into
a static string. If it doesn't fit we wouldn't report an error.
2014-11-21 11:00:10 +01:00
Peter Krempa
7e130e8b35 storage: qemu: Fix security labelling of new image chain elements
When creating a disk image snapshot the libvirt code would blindly copy
the parents label to the newly created image. This runs into problems
when you start a VM from an image hosted on NFS (or other storage system
that doesn't support selinux labels) and the snapshot destination is on
a storage system that does support selinux labels. Libvirt's code in
that case generates a different security label for the image hosted on
NFS. This label is valid only for NFS images and doesn't allow access in
case of a locally stored image.

To fix this issue libvirt needs to refrain from copying security
information in cases where the default domain seclabel is a better
choice.

This patch repurposes the now unused @force argument of
virStorageSourceInitChainElement to denote whether a copy of the
security labelling stuff should be attempted or not. This allows to
fine-control the copy operation for cases where we need to keep the
label of the old disk vs. the cases where we need to keep the label
unset to use the default domain imagelabel.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1151718
2014-11-21 09:28:26 +01:00
Jiri Denemark
800454e45e qemu: Really fix crash in tunnelled migration
Oops, I forgot to squash one more instance of the same check in the
previous commit (v1.2.10-144-g52691f9).

https://bugzilla.redhat.com/show_bug.cgi?id=1147331
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2014-11-20 13:51:08 +01:00
Jiri Denemark
52691f99fa qemu: Fix crash in tunnelled migration
Any attempt to start a tunnelled migration with libvirtd that supports
RDMA migration (specifically commit v1.2.8-226-ged22a47) crashes
libvirtd on the destination host.

The crash is inevitable because qemuMigrationPrepareAny is always called
with NULL protocol in case of tunnelled migration.

https://bugzilla.redhat.com/show_bug.cgi?id=1147331
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2014-11-20 13:22:20 +01:00
Michal Privoznik
36148120c1 qemu: Drop OVMF whitelist
As discussed on the upstream list, it's better not to make this
kind of predictions in libvirt. It may happen that qemu learns
how to enable OVMF on other architectures too and we shouldn't
try to chase that.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-11-19 18:16:12 +01:00
Michal Privoznik
6d8054b684 qemu: Support OVMF on armv7l aarch64 guests
Currently, we are whitelisting architectures, that we know how to run
OVMF on. So far, only x86_64 was enabled. However, looking at qemu
code, the same commandline can be used to enable OVMF for armv7l and
aarch64.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-11-19 17:31:07 +01:00
Eric Blake
eb9093763f maint: forbid 'int foo = true'
I noticed this while working on qemuDomainGetBlockInfo.  Assigning
a bool value to an int variable compiles fine, but raises red flags
on the maintenance front as it becomes too easy to assign -1 or 2
or any other non-bool value to the same variable.

* cfg.mk (sc_prohibit_int_assign_bool): New rule.
* src/conf/snapshot_conf.c (virDomainSnapshotRedefinePrep): Fix
offenders.
* src/qemu/qemu_driver.c (qemuDomainGetBlockInfo)
(qemuDomainSnapshotCreateXML): Likewise.
* src/test/test_driver.c (testDomainSnapshotAlignDisks):
Likewise.
* src/util/vircgroup.c (virCgroupSupportsCpuBW): Likewise.
* src/util/virpci.c (virPCIDeviceBindToStub): Likewise.
* src/util/virutil.c (virIsCapableVport): Likewise.
* tools/virsh-domain-monitor.c (cmdDomMemStat): Likewise.
* tools/virsh-domain.c (cmdBlockResize, cmdScreenshot)
(cmdInjectNMI, cmdSendKey, cmdSendProcessSignal)
(cmdDetachInterface): Likewise.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-11-19 08:20:39 -07:00
Anirban Chakraborty
22cff52a2b network: Add network bandwidth support to ethernet interfaces
Ethernet interfaces in libvirt currently do not support bandwidth setting.
For example, following xml file for an interface will not apply these
settings to corresponding qdiscs.

    <interface type="ethernet">
      <mac address="02:36:1d:18:2a:e4"/>
      <model type="virtio"/>
      <script path=""/>
      <target dev="tap361d182a-e4"/>
      <bandwidth>
        <inbound average="984" peak="1024" burst="64"/>
        <outbound average="2000" peak="2048" burst="128"/>
      </bandwidth>
    </interface>

Signed-off-by: Anirban Chakraborty <abchak@juniper.net>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-11-19 10:36:49 +01:00
John Ferlan
121fc4f9f3 qemu: Resolve Coverity UNINIT
For some reason, commit id '72b4151f' triggered a Coverity uninitialized
'reply' variable check when referenced within the for loop.

It seems Coverity doesn't know that flags will have to be either AFFECT_LIVE
or AFFECT_CONFIG after the virDomainLiveConfigHelperMethod call.

By adding a "sa_assert()" to confirm that fact, Coverity is happy again.
2014-11-15 08:09:53 -05:00
Luyao Huang
72b4151f85 qemu: Fix get blkiodevtune for a disk that has been hot unplugged
https://bugzilla.redhat.com/show_bug.cgi?id=1164080

After a disk is hotunplugged a subsequent call to qemuDomainGetBlockIoTune
to get the --config settings of that disk will fail because the disk is no
longer found by qemuDiskPathToAlias causing an unexpected failure.

Since only the --live flag needs to have the disk device pointer, move the
fetch inside the (flags & VIR_DOMAIN_AFFECT_LIVE) condition. This will also
affect the results if no flags are provided or the --current flag is provided.

Signed-off-by: Luyao Huang <lhuang@redhat.com>
2014-11-14 17:30:55 -05:00
John Ferlan
a01eea3020 qemu: Add checks for blkdeviotune 'size_iops_sec' and adjust error
Seems the 'size_iops_sec' was a late add and the checks for whether
the field was defined, but unsupported and the maximum size of the
field were not being made.

Also, adjust blkdeviotune support error message for grammar, spelling
(paramater), and remove the "(need QEMU 1.7 or superior)".  None of
our other similar error messages list which QEMU version is required.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2014-11-14 11:57:03 -05:00
Martin Kletzander
5cca4cd16f Remove unnecessary curly brackets in src/qemu/
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-11-14 17:13:01 +01:00
Jiri Denemark
ae3e29e6e7 qemu: Don't try to parse -help for new QEMU
Since QEMU 1.2.0, we switched to QMP probing instead of parsing -help
(and other commands, such as -cpu ?) output. However, if QMP probing
failed, we still tried starting QEMU with various options and parsing
the output, which was guaranteed to fail because the output changed.
Let's just refuse parsing -help for QEMU >= 1.2.0.

https://bugzilla.redhat.com/show_bug.cgi?id=1160318
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2014-11-13 21:25:50 +01:00
Jiri Denemark
ab393383c8 qemu: Always set migration capabilities
We used to set migration capabilities only when a user asked for them in
flags. This is fine when migration succeeds since the QEMU process is
killed in the end but in case migration fails or if it's cancelled, some
capabilities may remain turned on with no way to turn them off. To fix
that, migration capabilities have to be turned on if requested but
explicitly turned off in case they were not requested but QEMU supports
them.

https://bugzilla.redhat.com/show_bug.cgi?id=1163953
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2014-11-13 20:33:28 +01:00
Pavel Hrdina
41127244fb nwfilter: fix deadlock caused updating network device and nwfilter
Commit 6e5c79a1 tried to fix deadlock between nwfilter{Define,Undefine}
and starting of guest, but this same deadlock exists for
updating/attaching network device to domain.

The deadlock was introduced by removing global QEMU driver lock because
nwfilter was counting on this lock and ensure that all driver locks are
locked inside of nwfilter{Define,Undefine}.

This patch extends usage of virNWFilterReadLockFilterUpdates to prevent
the deadlock for all possible paths in QEMU driver. LXC and UML drivers
still have global lock.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1143780

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2014-11-13 10:45:19 +01:00
Michal Privoznik
54ddc08ddb qemuPrepareNVRAM: Save domain conf only if domain's persistent
In one of my previous patches (3a3c3780b) I've tried to fix the
problem of nvram path disappearing on a domain that's been
started and shut down again. I fixed this by explicitly saving
domain's config file.  However, I did a bit of clumsy without
realizing we have a transient domains for which we don't save the
config file. Hence, any domain using UEFI became persistent.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-11-13 09:35:25 +01:00
Matthias Gatto
6c1347ec06 qemu: Resolve Coverity DEADCODE.
reported here: http://www.redhat.com/archives/libvir-list/2014-November/msg00327.html

I could have just remove bool supportMaxOptions variable, but
if I had do this, we could not check anymore if the nparams variable is
superior to QEMU_NB_BLOCK_IO_TUNE_PARAM_MAX.

v2: change following this proposal:
http://www.redhat.com/archives/libvir-list/2014-November/msg00379.html
2014-11-12 09:43:55 -05:00
Matthias Gatto
5fb007b035 qemu: Fix copy_paste_error in qemuBuildDriveStr.
Fix for this: http://www.redhat.com/archives/libvir-list/2014-November/msg00324.html

Signed-off-by: Matthias Gatto <matthias.gatto@outscale.com>
2014-11-12 09:43:49 -05:00
Ján Tomko
cce8e5f739 Display nicer error message for unsupported chardev hotplug
Use the device type name if we know it instead of its number,
even if we can't hotplug it:
qemuMonitorJSONAttachCharDevCommand:6094 : operation failed: Unsupported
char device type '10'
2014-11-11 14:21:08 +01:00
Wang Rui
c6e9024867 qemu: fix domain startup failing with 'strict' mode in numatune
If the memory mode is specified as 'strict' and with one node, we
get the following error when starting domain.

error: Unable to write to '$cgroup_path/cpuset.mems': Device or resource busy

XML is configured with numatune as follows:
  <numatune>
    <memory mode='strict' nodeset='0'/>
  </numatune>

It's broken by Commit 411cea638f
which moved qemuSetupCgroupForEmulator() before setting cpuset.mems
in qemuSetupCgroupPostInit.

Directory '$cgroup_path/emulator/' is created in qemuSetupCgroupForEmulator.
But '$cgroup_path/emulator/cpuset.mems' it not set and has a default value
(all nodes, such as 0-1). Then we setup '$cgroup_path/cpuset.mems' to the
nodemask (in this case it's '0') in qemuSetupCgroupPostInit. It must fail.

This patch makes '$cgroup_path/emulator/cpuset.mems' is set before
'$cgroup_path/cpuset.mems'. The action is similar with that in
qemuDomainSetNumaParamsLive.

Signed-off-by: Wang Rui <moon.wangrui@huawei.com>
2014-11-11 12:14:09 +01:00
Wang Rui
38a0f6df64 qemu: don't setup cpuset.mems if memory mode in numatune is not 'strict'
If the memory mode in numatune is specified as 'preferred' with one node
(such as nodeset='0'), domain's memory is not all in node 0 absolutely.
Assumption that node 0 doesn't have enough memory, memory can be allocated
on node 1 when qemu process startup. Then if we set cpuset.mems to '0',
it may invoke OOM.

Commit 1a7be8c600 changed the former logic of
checking memory mode in virDomainNumatuneGetNodeset. This patch adds the
check as before.

Signed-off-by: Wang Rui <moon.wangrui@huawei.com>
2014-11-11 12:14:09 +01:00
Matthias Gatto
12952bb14a qemu: Add bps_max and friends to qemu command generation
Check the arability of the options with the current qemu binary,
add them in the varable opt if yes, print a message if not.

Signed-off-by: Matthias Gatto <matthias.gatto@outscale.com>
2014-11-10 17:19:25 +01:00
Matthias Gatto
901ffda286 qemu: Add bps_max and friends QMP suport
Detect if the the qemu binary currently in use support the bps_max option,
If yes add it to the command, if not, just ignore the option.
We don't print error here, because the check for invalide arguments
has alerady been made in qemu_driver.c

Signed-off-by: Matthias Gatto <matthias.gatto@outscale.com>
2014-11-10 17:19:25 +01:00
Matthias Gatto
d506a51aeb qemu: Add bps_max and friends qemu driver
Add support for bps_max and friends in the driver part.
In the part checking if a qemu is running, check if the running binary
support bps_max, if not print an error message, if yes add it to
"info" variable

Signed-off-by: Matthias Gatto <matthias.gatto@outscale.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-11-10 17:18:17 +01:00
Matthias Gatto
c5b71619bd qemu: Add Qemu capability for bps_max and friends
Add the capability to detect if the qemu binary have the capability
to use bps_max and friends
Add a value in the enum virQEMUCapsFlags for the qemu capability.
Set it with virQEMUCapsSet if the binary suport bps_max and they friends.

Signed-off-by: Matthias Gatto <matthias.gatto@outscale.com>
2014-11-10 15:48:59 +01:00
Prerna Saxena
addce06c92 PowerPC : Add support for launching VM in 'compat' mode.
PowerISA allows processors to run VMs in binary compatibility ("compat")
mode supporting an older version of ISA. QEMU has recently added support to
explicitly denote a VM running in compatibility mode through commit 6d9412ea
& 8dfa3a5e85. Now, a "compat" mode VM can be run by invoking this qemu
commandline on a POWER8 host:  -cpu host,compat=power7.

This patch allows libvirt to exploit cpu mode 'host-model' to describe this
new mode for PowerKVM guests. For example, when a user wants to request a
power7 vm to run in compatibility mode on a Power8 host, this can be
described in XML as follows :

  <cpu mode='host-model'>
    <model>power7</model>
  </cpu>

Signed-off-by: Prerna Saxena <prerna@linux.vnet.ibm.com>
Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
Signed-off-by: Pradipta Kr. Banerjee <bpradip@in.ibm.com>
Acked-by: Michal Privoznik <mprivozn@redhat.com>
2014-11-07 09:18:50 +01:00
Prerna Saxena
da636d83dc Cpu: Add support for Power LE Architecture.
This adds support for PowerPC Little Endian architecture.,
and allows libvirt to spawn VMs based on 'ppc64le' architecture.

Signed-off-by: Pradipta Kr. Banerjee <bpradip@in.ibm.com>
Signed-off-by: Prerna Saxena <prerna@linux.vnet.ibm.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2014-11-07 09:16:37 +01:00
Michal Privoznik
6ea54769ba qemu: Update fsfreeze status on domain state transitions
https://bugzilla.redhat.com/show_bug.cgi?id=1160084

As of b6d4dad1 (1.2.5) libvirt keeps track if domain disks have been
frozen. However, this falls into that set of information which don't
survive domain restart. Therefore, we need to clear the flag upon some
state transitions. Moreover, once we clear the flag we must update the
status file too.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-11-06 15:20:01 +01:00
Boris Fiuczynski
b84be34f43 qemu: Allow use of iothreads for virtio ccw disk definitions
Extending the iothread disk support from pci to pci and ccw.

Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-11-06 15:13:55 +01:00
Boris Fiuczynski
8402be5c10 qemu: Correct disk type checking logic for iothreads
Finding the right type of disk should check for virtio as bus and
pci as device address type.

Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
2014-11-06 15:13:55 +01:00
Martin Kletzander
c63ef0452b numa: split util/ and conf/ and support non-contiguous nodesets
This is a reaction to Michal's fix [1] for non-NUMA systems that also
splits out conf/ out of util/ because libvirt_util shouldn't require
libvirt_conf if it is the other way around.  This particular use case
worked, but we're trying to avoid it as mentioned [2], many times.

The only functions from virnuma.c that needed numatune_conf were
virDomainNumatuneNodesetIsAvailable() and virNumaSetupMemoryPolicy().
The first one should be in numatune_conf as it works with
virDomainNumatune, the second one just needs nodeset and mode, both of
which can be passed without the need of numatune_conf.

Apart from fixing that, this patch also fixes recently added
code (between commits d2460f85^..5c8515620) that doesn't support
non-contiguous nodesets.  It uses new function
virNumaNodesetIsAvailable(), which doesn't need a stub as it doesn't use
any libnuma functions, to check if every specified nodeset is available.

[1] https://www.redhat.com/archives/libvir-list/2014-November/msg00118.html
[2] http://www.redhat.com/archives/libvir-list/2011-June/msg01040.html

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-11-06 15:13:55 +01:00
Erik Skultety
74ae5be44e qemu: revert patch - bandwidth tuning in session mode
Since there was a valid note to patch 43b67f2e about the best spot to
check for bandwidth set call while having libvirt daemon run in session
mode, this patch reverts previous changes dealing with bandwith
(also reverts adding variable @cfg in qemuDomainGetNumaParameters which
 does not have any use at the moment, but getting and unreferencing
 driver's config) in qemu_driver.c and qemu_command.c. There will be
another patch in the series which introduces the fix itself.
2014-11-06 14:28:37 +01:00
Ján Tomko
1d1c5ecd13 Free job statistics from the migration cookie
==404== 232 bytes in 1 blocks are definitely lost in loss record 669 of 758
==404==    at 0x4C2B934: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==404==    by 0x52A2BF3: virAlloc (viralloc.c:144)
==404==    by 0x1D49AD70: qemuMigrationCookieAddStatistics (qemu_migration.c:554)
==404==    by 0x1D49AD70: qemuMigrationBakeCookie (qemu_migration.c:1228)
==404==    by 0x1D4A43B8: qemuMigrationFinish (qemu_migration.c:5002)
==404==    by 0x1D4C9339: qemuDomainMigrateFinish3Params (qemu_driver.c:11526)

Introduced by commit 5d6fb96
2014-11-06 13:52:33 +01:00
Michal Privoznik
11e058ca58 qemuDomainUpdateDeviceConfig: Allow startupPolicy update
https://bugzilla.redhat.com/show_bug.cgi?id=1159219

Users might want to update startupPolicy via the
virDomainUpdateDeviceFlags API too. This patch
implements the feature on config layer.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-11-05 18:34:08 +01:00
Prerna Saxena
d426431fde Memory: Use consistent type for all memory elements.
Domain memory elements such as max_balloon and cur_balloon are
implemented as 'unsigned long long', whereas the 'memory' element
in NUMA cells is implemented as 'unsigned int'.

Use the same data type (unsigned long long) for 'memory' element
in NUMA cells.

Signed-off-by: Prerna Saxena <prerna@linux.vnet.ibm.com>
2014-11-05 14:21:15 +01:00
Weiwei Li
c3012a023f qemu: stop NBD server after successful migration
In qemuMigrationFinish mig->nbd can not be initialized by
qemuMigrationEatCookie without the QEMU_MIGRATION_COOKIE_NBD flag.
That causes qemuMigrationStopNBDServer to return early without
stopping the NBD server properly.

Signed-off-by: Weiwei Li <nuonuoli@tencent.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2014-11-04 10:54:53 +01:00
Chen Fan
902864184e numatune: add check for numatune nodeset range
There was no check for 'nodeset' attribute in numatune-related
elements.  This patch adds validation that any nodeset specified does
not exceed maximum host node.

Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
2014-11-04 07:03:36 +01:00
Martin Kletzander
b629c64e5e qemu: avoid rare race when undefining domain
When one domain is being undefined and at the same time started, for
example, there is a possibility of a rare problem occuring.

 - Thread 1 does virDomainUndefine(), has the lock, checks that the
   domain is active and because it's not, calls
   virDomainObjListRemove().

 - Thread 2 does virDomainCreate() and tries to lock the domain.

 - Thread 1 needs to lock domain list in order to remove the domain from
   it, but must unlock domain first (proper order is to lock domain list
   first and the domain itself second).

 - Thread 2 grabs the lock, starts the domain and releases the lock.

 - Thread 1 grabs the lock and removes the domain from list.

With this patch:

 - qemuDomainRemoveInactive() creates a QEMU_JOB_MODIFY if that's
   possible, but since it must remove the domain from list either way,
   it continues even when starting the job failed.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1150505

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-11-03 16:43:23 +01:00
Martin Kletzander
278bf0acbf qemu: improve error message for invalid blkiotune settings
Before:
  $ virsh blkiotune dummy --device-read-bytes-sec /dev/sda,-1
  error: Unable to change blkio parameters
  error: invalid argument: unable to parse blkio device
  'device_read_bytes_sec' '/dev/sda,-1'

After:
  $ virsh blkiotune dummy --device-read-bytes-sec /dev/sda,-1
  error: Unable to change blkio parameters
  error: invalid argument: invalid value '-1' for parameter
  'device_read_bytes_sec' of device '/dev/sda'

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1131306

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-11-03 16:43:23 +01:00
Martin Kletzander
0ed1b55b20 qemu: make sure capability probing process can start
When daemon is killed right in the middle of probing a qemu binary for
its capabilities, the qemu process is left running.  Next time the
daemon is starting, it cannot start the probing qemu process because the
one that's already running does have the pidfile flock()'d.

Reported-by: Wang Yufei <james.wangyufei@huawei.com>

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-11-03 16:43:23 +01:00
Martin Kletzander
11a48758a7 qemu: make advice from numad available when building commandline
Particularly in qemuBuildNumaArgStr(), there was a need for the advice
due to memory backing, which needs to know the nodeset it will be pinned
to.  With newer qemu this caused the following error when starting
domain:

  error: internal error: Advice from numad is needed in case of
  automatic numa placement

even when starting perfectly valid domain, e.g.:

  ...
  <vcpu placement='auto'>4</vcpu>
  <numatune>
    <memory mode='strict' placement='auto'/>
  </numatune>
  <cpu>
    <numa>
      <cell id='0' cpus='0' memory='524288'/>
      <cell id='1' cpus='1' memory='524288'/>
    </numa>
  </cpu>
  ...

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1138545

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-11-03 16:43:22 +01:00
Pavel Hrdina
e7e05801e5 hotplug: fix char device detach
Hotplugging and hotunplugging char devices is only supported through
'-device' and the check for device capability should be independently.

Coverity also complains about 'tmpChr->info.alias' could be NULL and we
are dereferencing it but it somehow only in this case don't recognize
that the value is set by 'qemuAssignDeviceChrAlias' so it's clearly
false positive. Add sa_assert to make coverity happy.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2014-11-01 16:18:34 +01:00
weiwei li
be598c5ff8 qemu: Release nbd port from migrationPorts instead of remotePorts
commit 3e1e16aa8d (Use a port from the
migration range for NBD as well) changed ndb port allocation from
remotePorts to migrationPorts, but did not change the port releasing
process, which makes an error when migrating several times (above 64):
error: internal error: Unable to find an unused port in range
'migration' (49152-49215)

https://bugzilla.redhat.com/show_bug.cgi?id=1159245

Signed-off-by: Weiwei Li <nuonuoli@tencent.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2014-10-31 12:20:06 +01:00
Ján Tomko
4abcf04e7c Reject live update of offloading options
https://bugzilla.redhat.com/show_bug.cgi?id=1155441
2014-10-30 13:32:00 +01:00
Eric Blake
00331bfbc9 qemu: better error message when block job can't succeed
https://bugzilla.redhat.com/show_bug.cgi?id=1140981 reports that
the qemu-kvm shipped as part of RHEL 7.0 intentionally[1] cripples
block jobs by removing the 'block-stream' QMP command, while still
leaving 'block-job-cancel' as an unusable no-op.  Meanwhile, we
already had existing code that checked whether block jobs were
completely missing (such as qemu 0.15), old style (cancel is
synchronous, and all commands spelled with '_'), or new style
(cancel is asynchronous, and all commands spelled with '-'), and
used that three-way probe to give decent error messages.  At the
time that code was added, all existing qemu versions fell in one
of three buckets, and the code was using the presence of
'block-job-cancel' as the witness of which of the three buckets.
But now that RHEL qemu has shipped with intentionally crippled
'block-stream', we have a fourth bucket, which results in ugly
error messages when trying 'virsh blockpull':

 error: Requested operation is not valid: Command 'block-stream' is not found

In reality, the fourth bucket should be treated the same as the
first bucket (no block job support); we can do that by realizing
that no existing build of qemu has working block-stream while
lacking block-job-cancel, so it is easiest to change our witness
to the command that starts a job rather than ends one.  We still
act correctly regarding command spelling and whether cancel is
asynchronous.  And on crippled RHEL builds, we now get the desired:

 error: unsupported configuration: block jobs not supported with this qemu binary

[1] The intentional cripple is limited to qemu-kvm of RHEL; when using
qemu-kvm-rhev of RHEV, block job functionality is supported.  Don't ask
me to explain the "why" behind it all - I'm just dealing with fallout
from someone else's decision.

* src/qemu/qemu_capabilities.h (QEMU_CAPS_BLOCKJOB_SYNC): Tweak comment.
* src/qemu/qemu_capabilities.c (virQEMUCapsCommands): Look for stream
rather than cancel when determining the flavor of block jobs supported.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-10-29 14:57:44 -06:00
Eric Blake
85f2d0dd55 maint: add syntax check to prohibit static zero init
Now that all offenders have been cleaned, turn on a syntax-check
rule to prevent future offenders.

* cfg.mk (sc_prohibit_static_zero_init): New rule.
* src/qemu/qemu_driver.c (qemuDomainBlockJobImpl): Avoid false
positive.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-10-29 09:55:09 -06:00
John Ferlan
e3a52afcfc qemu-attach: Assign device aliases
https://bugzilla.redhat.com/show_bug.cgi?id=1141621

As part of attach processing, assign the device aliases by calling
qemuAssignDeviceAliases during qemuDomainQemuAttach once all the devices
are found after the qemuParseCommandLinePid processing.

This will alleviate a symptom that caused a libvirtd crash during an
attempted device detach.
2014-10-28 21:12:08 -04:00
John Ferlan
96af61ddc1 hotplug: Check for alias in net detach
https://bugzilla.redhat.com/show_bug.cgi?id=1141621

If the QEMU_CAPS_DEVICE is set, then ensure the host device alias has
been properly set before making the calls to detach the device
2014-10-28 21:12:08 -04:00
John Ferlan
4d8a4165a7 hotplug: Check for alias in chrdev detach
If the QEMU_CAPS_DEVICE is set, then ensure the chr device alias has
been properly set before making the calls to detach the device
2014-10-28 21:12:08 -04:00
John Ferlan
9de26f27cf hotplug: Check for alias in hostdev detach
If the QEMU_CAPS_DEVICE is set, then ensure the host device alias has
been properly set before making the calls to detach the device
2014-10-28 21:12:08 -04:00
John Ferlan
5d02a9a0c5 hotplug: Check for alias in disk detach
If the QEMU_CAPS_DEVICE is set, then ensure the disk device alias has
been properly set in prior to making the calls to detach the device.
2014-10-28 21:12:08 -04:00
John Ferlan
65be7572d2 hotplug: Check for alias in controller detach
In qemuDomainDetachControllerDevice if the info.alias already exists
a call to qemuAssignDeviceControllerAlias would overwrite the existing
so avoid this possibility.
2014-10-28 21:12:08 -04:00
Michal Privoznik
b7fe5a6555 qemu_agent: Produce more readable error messages
Not every error message from qemu-ga has to have the 'class' field
filled out. For instance, I've seen this error message lately:

  qemuAgentCheckError:1047 : unable to execute QEMU agent command \
  {"execute":"guest-set-time"}: \
  {"error":{"desc":"Invalid parameter type, expected: integer"}}

However, this got translated into rather generic error message:

  internal error: unable to execute QEMU agent command
  'guest-set-time': unknown QEMU command error

So we've dropped better error message in favor of a generic one.
This is due to our code which expects 'class' which is not
present here.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-10-28 20:06:27 +01:00
Tony Krowiak
d70cc1fa72 qemu: change macvtap multicast list in response to NIC_RX_FILTER_CHANGED
This patch adds functionality to processNicRxFilterChangedEvent().
The old and new multicast lists are compared and the filters in
the macvtap are programmed to match the guest's filters.

Signed-off-by: Tony Krowiak <akrowiak@linux.vnet.ibm.com>
2014-10-28 14:14:25 -04:00
Eric Blake
2086a9905a qemu: forbid snapshot-delete --children-only on external snapshot
https://bugzilla.redhat.com/show_bug.cgi?id=956506 documents that
given a domain where an internal snapshot parent has an external
snapshot child, we lacked a safety check when trying to use the
--children-only option to snapshot-delete:

$ virsh start dom
$ virsh snapshot-create-as dom internal
$ virsh snapshot-create-as dom external --disk-only
$ virsh snapshot-delete dom external
error: Failed to delete snapshot external
error: unsupported configuration: deletion of 1 external disk snapshots not supported yet
$ virsh snapshot-delete dom internal --children
error: Failed to delete snapshot internal
error: unsupported configuration: deletion of 1 external disk snapshots not supported yet
$ virsh snapshot-delete dom internal --children-only
Domain snapshot internal children deleted

While I'd still like to see patches that actually do proper external
snapshot deletion, we should at least fix the inconsistency in the
meantime.  With this patch:

$ virsh snapshot-delete dom internal --children-only
error: Failed to delete snapshot internal
error: unsupported configuration: deletion of 1 external disk snapshots not supported yet

* src/qemu/qemu_driver.c (qemuDomainSnapshotDelete): Fix condition.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-10-27 14:04:47 -06:00
Daniel P. Berrange
931dff992e Rename virDriver to virHypervisorDriver
To prepare for introducing a single global driver, rename the
virDriver struct to virHypervisorDriver and the registration
API to virRegisterHypervisorDriver()
2014-10-23 11:09:54 +01:00
Erik Skultety
43b67f2e71 qemu: Disallow NUMA/network tuning for session mode
Tuning NUMA or network interface parameters requires root
privileges to manage cgroups. Thus an attempt to set some of these
parameters in session mode on a running domain should be invalid
followed by an error. An example might be memory tuning which raises
an error in such case.

The following behavior in session mode will be present after applying
this patch:

  Tuning  |      SET      |   GET  |
----------|---------------|--------|
NUMA      | shut off only | always |
Memory    |     never     | never  |
Interface |     never     | always |

Resolves https://bugzilla.redhat.com/show_bug.cgi?id=1126762
2014-10-22 14:35:06 -04:00
Peter Krempa
19b1ee42b4 qemu: migration: Make check for empty hook XML robust
Also consider whitespace only strings returned from the hook as empty
result.
2014-10-22 17:51:31 +02:00
Peter Krempa
e386779937 qemu: restore: Fix restoring of VM when the restore hook returns empty XML
The documentation for the restore hook states that returning an empty
XML is equivalent with copying the input. There was a bug in the code
checking the returned string by checking the string instead of the
contents. Use the new helper to check if the string is empty.
2014-10-22 17:51:31 +02:00
Martin Kletzander
9661ac2f46 qemu: unref cfg after TerminateMachine has been called
Commit 4882618ed1 added the code that
requests driver cfg, but forgot to unref it.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-10-21 13:54:09 +02:00
Lubomir Rintel
afe8f4200f qemu: x86_64 is good enough for i686
virt-manager on Fedora sets up i686 hosts with "/usr/bin/qemu-kvm" emulator,
which in turn unconditionally execs qemu-system-x86_64 querying capabilities
then fails:

Error launching details: invalid argument: architecture from emulator 'x86_64' doesn't match given architecture 'i686'

Traceback (most recent call last):
  File "/usr/share/virt-manager/virtManager/engine.py", line 748, in _show_vm_helper
    details = self._get_details_dialog(uri, vm.get_connkey())
  File "/usr/share/virt-manager/virtManager/engine.py", line 726, in _get_details_dialog
    obj = vmmDetails(conn.get_vm(connkey))
  File "/usr/share/virt-manager/virtManager/details.py", line 399, in __init__
    self.init_details()
  File "/usr/share/virt-manager/virtManager/details.py", line 784, in init_details
    domcaps = self.vm.get_domain_capabilities()
  File "/usr/share/virt-manager/virtManager/domain.py", line 518, in get_domain_capabilities
    self.get_xmlobj().os.machine, self.get_xmlobj().type)
  File "/usr/lib/python2.7/site-packages/libvirt.py", line 3492, in getDomainCapabilities
    if ret is None: raise libvirtError ('virConnectGetDomainCapabilities() failed', conn=self)
libvirtError: invalid argument: architecture from emulator 'x86_64' doesn't match given architecture 'i686'

Journal:

Oct 16 21:08:26 goatlord.localdomain libvirtd[1530]: invalid argument: architecture from emulator 'x86_64' doesn't match given architecture 'i686'
2014-10-21 13:36:25 +02:00
Zhou yimin
411cea638f qemu: move setting emulatorpin ahead of monitor showing up
If VM is configured with many devices(including passthrough devices)
and large memory, libvirtd will take seconds(in the worst case) to
wait for monitor. In this period the qemu process may run on any
PCPU though I intend to pin emulator to the specified PCPU in xml
configuration.

Actually qemu process takes high cpu usage during vm startup.
So this is not the strict CPU isolation in this case.

Signed-off-by: Zhou yimin <zhouyimin@huawei.com>
2014-10-21 12:26:38 +02:00
Peter Krempa
e9a1c4384c qemu: Convert qemuDomainUpdateDeviceConfig to typecasted enum 2014-10-15 12:39:30 +02:00
Peter Krempa
fa3701a94c qemu: Convert qemuDomainDetachDeviceConfig to typecasted enum 2014-10-15 12:39:30 +02:00
Peter Krempa
2536b1b952 qemu: Convert qemuDomainAttachDeviceConfig to typecasted enum 2014-10-15 12:39:29 +02:00
Peter Krempa
714dff938c qemu: Convert qemuDomainUpdateDeviceLive to typecasted enum 2014-10-15 12:39:29 +02:00
Peter Krempa
9bb21f4287 qemu: Convert qemuDomainDetachDeviceLive to typecasted enum 2014-10-15 12:39:29 +02:00
Peter Krempa
6908f8cab3 qemu: monitor: Add functions for object hot-add/remove
To allow live modification of device backends in qemu libvirt needs to
be able to hot-add/remove "objects". Add monitor backend functions to
allow this.

This function will be used for hot-add/remove of RNG backends,
IOThreads, memory backing objects, etc.
2014-10-15 10:27:50 +02:00
Peter Krempa
881c46595e util: json: Split out code to create json value objects
Our qemu monitor code has a converter from key-value pairs to a json
value object. I want to re-use the code later and having it part of the
monitor command generator is inflexible. Split it out into a separate
helper.
2014-10-15 10:27:50 +02:00
Peter Krempa
3444fdefb1 qemu: hotplug: Use typecasted switch statement when plugging new devices 2014-10-15 10:27:50 +02:00
Chen Fan
5e0561e115 conf: Check whether migration_address is localhost
When enabling the migration_address option, by default it is
set to "127.0.0.1", but it's not a valid address for migration.
so we should add verification and set the default migration_address
to "0.0.0.0".

Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2014-10-15 09:25:33 +02:00
Chen Fan
24c1603762 conf: add check if migration_host is a localhost address
Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>

Signed-off-by: Ján Tomko <jtomko@redhat.com>
2014-10-15 09:25:33 +02:00
Chen Fan
69f7b67d55 migration: add migration_host support for IPv6 address without brackets
if specifying migration_host to an Ipv6 address without brackets,
it was resolved to an incorrect address, such as:
    tcp:2001:0DB8::1428:4444,
but the correct address should be:
    tcp:[2001:0DB8::1428]:4444
so we should add brackets when parsing it.

Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
2014-10-15 09:25:33 +02:00
Shanzhi Yu
566d5de7bf qemu: save domain status after set domain's numa parameters
After set domain's numa parameters for running domain, save the change,
save the change into live xml is needed to survive restarting the libvirtd,
same story with bug 1146511; meanwihle add call
qemuDomainObjBeginJob/qemuDomainObjEndJob in qemuDomainSetNumaParameters

Signed-off-by: Shanzhi Yu <shyu@redhat.com>
2014-10-09 11:50:51 +02:00
Shanzhi Yu
99fe8755b9 qemu: call qemuDomainObjBeginJob/qemuDomainObjEndJob in qemuDomainSetInterfaceParameters
add call qemuDomainObjBeginJob/qemuDomainObjEndJob in
qemuDomainSetInterfaceParameters

Signed-off-by: Shanzhi Yu <shyu@redhat.com>
2014-10-09 11:50:39 +02:00
Shanzhi Yu
bde879c184 qemu: save domain status after set the blkio parameters
After set the blkio parameters for running domain, save the change into
live xml is needed to survive restarting the libvirtd, same story with
bug 1146511, meanwhile add call qemuDomainObjBeginJob/qemuDomainObjEndJob
in qemuDomainSetBlkioParameters

Signed-off-by: Shanzhi Yu <shyu@redhat.com>
2014-10-09 11:50:26 +02:00
Laine Stump
db6b738dde qemu: change macvtap device MAC address in response to NIC_RX_FILTER_CHANGED
This patch fills in the functionality of
processNicRxFilterChangedEvent().  It now checks if it is appropriate
to respond to the NIC_RX_FILTER_CHANGED event (based on device type
and configuration) and takes appropriate action. Currently it checks
if the guest interface has been configured with
trustGuestRxFilters='yes', and if the host side device is macvtap. If
so, and the MAC address on the guest has changed, the MAC address of
the macvtap device is changed to match.

The result of this is that networking from the guest will continue to
work if the mac address of a macvtap-connected network device is
changed from within the guest, as long as trustGuestRxFilters='yes'
(previously changing the MAC address in the guest would break
networking).
2014-10-06 13:52:37 -04:00
Laine Stump
b6bdda458a qemu: setup infrastructure to handle NIC_RX_FILTER_CHANGED event
NIC_RX_FILTER_CHANGED is sent by qemu any time a NIC driver in the
guest modified the NIC's RX Filter (for example, if the MAC address of
the NIC is changed by the guest).

This patch doesn't do anything useful with that event; it just sets up
all the plumbing to get news of the event into a worker thread with
all proper locking/reference counting, and provide an easy place to
add in desired functionality.

See src/qemu/EVENTHANDLERS.txt for information/instructions on adding
a libvirt-internal handler for a qemu event (using
NIC_RX_FILTER_CHANGED as an example).
2014-10-06 13:50:57 -04:00
Laine Stump
ac4f8be422 qemu: add short document on qemu event handlers
This text was in the commit log for the patch that added the event
handler for NIC_RX_FILTER_CHANGED, and John Ferlan expressed a desire
that the information not be "lost", so I've put it into a file in the
qemu directory, hoping that it might catch the attention of future
writers of handlers for qemu events.
2014-10-06 13:50:57 -04:00
Laine Stump
ab989962d4 qemu: qemuMonitorQueryRxFilter - retrieve guest netdev rx-filter
This function can be called at any time to get the current status of a
guest's network device rx-filter. In particular it is useful to call
after libvirt recieves a NIC_RX_FILTER_CHANGED event - this event only
tells you that something has changed in the rx-filter, the details are
retrieved with the query-rx-filter monitor command (only available in
the json monitor). The command sent to the qemu monitor looks like this:

  {"execute":"query-rx-filter", "arguments": {"name":"net2"} }'

and the results will look something like this:

{
    "return": [
        {
            "promiscuous": false,
            "name": "net2",
            "main-mac": "52:54:00:98:2d:e3",
            "unicast": "normal",
            "vlan": "normal",
            "vlan-table": [
                42,
                0
            ],
            "unicast-table": [

            ],
            "multicast": "normal",
            "multicast-overflow": false,
            "unicast-overflow": false,
            "multicast-table": [
                "33:33:ff:98:2d:e3",
                "01:80:c2:00:00:21",
                "01:00:5e:00:00:fb",
                "33:33:ff:98:2d:e2",
                "01:00:5e:00:00:01",
                "33:33:00:00:00:01"
            ],
            "broadcast-allowed": false
        }
    ],
    "id": "libvirt-14"
}

This is all parsed from JSON into a virNetDevRxFilter object for
easier consumption. (unicast-table is usually empty, but is also an
array of mac addresses similar to multicast-table).

(NB: LIBNL_CFLAGS was added to tests/Makefile.am because virnetdev.h
now includes util/virnetlink.h, which includes netlink/msg.h when
appropriate. Without LIBNL_CFLAGS, gcc can't find that file (if
libnl/netlink isn't available, LIBNL_CFLAGS will be empty and
virnetlink.h won't try to include netlink/msg.h anyway).)
2014-10-06 13:32:38 -04:00
John Ferlan
b7890a8c28 qemu: Remove possible NULL deref in debug output
Check for !dev->info.alias was done after a VIR_DEBUG() statement
that already tried to print - just flip sequence
2014-10-06 10:35:26 -04:00
John Ferlan
99186c4103 qemu: Remove need for virConnectPtr in hotunplug detach host, net
Prior patch removed the need for the virConnectPtr in the unplug
detach host path which caused ripple effect to remove in multiple
callers.  The previous patch just left things as ATTRIBUTE_UNUSED -
this patch will remove the variable.
2014-10-06 10:35:26 -04:00
John Ferlan
d2774e54cd qemu: Fix hot unplug of SCSI_HOST device
https://bugzilla.redhat.com/show_bug.cgi?id=1141732

Introduced by commit id '8f76ad99' the logic to detach a scsi_host
device (SCSI or iSCSI) fails when attempting to remove the 'drive'
because as I found in my investigation - the DelDevice takes care of
that for us.

The investigation turned up commits to adjust the logic for the
qemuMonitorDelDevice and qemuMonitorDriveDel processing for interfaces
(commit id '81f76598'), disk bus=VIRTIO,SCSI,USB (commit id '0635785b'),
and chr devices (commit id '55b21f9b'), but nothing with the host devices.

This commit uses the model for the previous set of changes and applies
it to the hostdev path. The call to qemuDomainDetachHostSCSIDevice will
return to qemuDomainDetachThisHostDevice handling either the audit of
the failure or the wait for the removal and then call into
qemuDomainRemoveHostDevice for the event, removal from the domain hostdev
list, and audit of the removal similar to other paths.

NOTE: For now the 'conn' param to +qemuDomainDetachHostSCSIDevice is left
as ATTRIBUTE_UNUSED.  Removing requires a cascade of other changes to be
left for a future patch.
2014-10-06 10:35:25 -04:00
Martin Kletzander
34f514778b minor shmem clean-ups
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-10-04 10:46:22 +02:00
Martin Kletzander
b90a9a6374 qemu: Build command line for ivshmem device
This patch implements support for the ivshmem device in QEMU.

Signed-off-by: Maxime Leroy <maxime.leroy@6wind.com>
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-10-03 22:43:09 +02:00
Maxime Leroy
e3d478eb51 qemu: add capability probing for ivshmem device
Ivshmem is supported by QEMU since 0.13 release.

Signed-off-by: Maxime Leroy <maxime.leroy@6wind.com>
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-10-03 22:43:08 +02:00
Martin Kletzander
540a84ec89 docs, conf, schema: add support for shmem device
This patch adds parsing/formatting code as well as documentation for
shared memory devices.  This will currently be only accessible in QEMU
using it's ivshmem device, but is designed as generic as possible to
allow future expansion for other hypervisors.

In the devices section in the domain XML users may specify:

- For shmem device using a server:

 <shmem name='shmem0'>
   <server path='/tmp/socket-ivshmem0'/>
   <size unit='M'>32</size>
   <msi vectors='32' ioeventfd='on'/>
 </shmem>

- For ivshmem device not using an ivshmem server:

 <shmem name='shmem1'>
   <size unit='M'>32</size>
 </shmem>

Most of the configuration is made optional so it also allows
specifications like:

 <shmem name='shmem1/>
 <shmem name='shmem2'>
   <server/>
 </shmem>

Signed-off-by: Maxime Leroy <maxime.leroy@6wind.com>
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-10-03 22:43:08 +02:00
Eric Blake
e9392e48d4 qemu: support nospace reason in io error event
Aeons ago (commit 34dcbbb4, v0.8.2), we added a new libvirt event
(VIR_DOMAIN_EVENT_ID_IO_ERROR_REASON) in order to tell the user WHY
the guest halted.  This is because at least VDSM wants to react
differently to ENOSPC events (resize the lvm partition to be larger,
and resume the guest as if nothing had happened) from all other events
(I/O is hosed, throw up our hands and flag things as broken).  At the
time this was done, downstream RHEL qemu added a vendor extension
'__com.redhat_reason', which would be exactly one of these strings:
"enospc", "eperm", "eio", and "eother".  In our stupidity, we exposed
those exact strings to clients, rather than an enum, and we also
return "" if we did not have access to a reason (which was the case
for upstream qemu).

Fast forward to now: upstream qemu commit c7c2ff0c (will be qemu 2.2)
FINALLY adds a 'nospace' boolean, after discussion with multiple
projects determined that VDSM really doesn't care about distinction
between any other error types.  So this patch converts 'nospace' into
the string "enospc" for compatibility with RHEL clients that were
already used to the downstream extension, while leaving the reason
blank for all other cases (no change from the status quo).

See also https://bugzilla.redhat.com/show_bug.cgi?id=1119784

* src/qemu/qemu_monitor_json.c (qewmuMonitorJSONHandleIOError):
Parse reason field from modern qemu.
* include/libvirt/libvirt.h.in
(virConnectDomainEventIOErrorReasonCallback): Document it.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-10-03 12:43:53 -06:00
Cole Robinson
445a09bdc9 qemu: Don't compare CPU against host for TCG
Right now when building the qemu command line, we try to do various
unconditional validations of the guest CPU against the host CPU. However
this checks are overly applied. The only time we should use the checks
are:

- The user requests host-model/host-passthrough, or

- When KVM is requsted. CPU features requested in TCG mode are always
  emulated by qemu and are independent of the host CPU, so no host CPU
  checks should be performed.

Right now if trying to specify a CPU for arm on an x86 host, it attempts
to do non-sensical validation and falls over.

Switch all the test cases that were intending to test CPU validation to
use KVM, so they continue to test the intended code.

Amend some aarch64 XML tests with a CPU model, to ensure things work
correctly.
2014-10-03 11:30:29 -04:00
Cole Robinson
3bc6dda6c5 qemu_command: Split qemuBuildCpuArgStr
Move the CPU mode/model handling to its own function. This is just
code movement and re-indentation.
2014-10-03 11:30:29 -04:00
Shanzhi Yu
a4771c5860 qemu: Improve domainSetTime error info report
check domain's status before call virQEMUCapsGet to report a accurate
error when domain is shut off

Resolve: https://bugzilla.redhat.com/show_bug.cgi?id=1147847
Signed-off-by: Shanzhi Yu <shyu@redhat.com>
2014-10-03 15:48:07 +02:00
Erik Skultety
e3a7b8740f qemu: Fix updating balloon period in live XML
Up until now, we set memballoon period in monitor successfully, however
we did not update domain definition structure, thus dumpxml was omitting
period attribute in memballoon element

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1140960
2014-10-02 16:10:53 -04:00
Erik Skultety
f4ba3385ba qemu: Fix updating bandwidth limits in live XML
When trying to update bandwidth limits on a running domain, limits get
updated in our internal structures, however XML parser reads
bandwidth limits from network 'actual' definition. Committing this patch
it is now available to update bandwidth 'actual' definition as well,
thus updating domain runtime XML.
2014-10-02 16:10:53 -04:00
Guido Günther
4882618ed1 qemu: use systemd's TerminateMachine to kill all processes
If we don't properly clean up all processes in the
machine-<vmname>.scope systemd won't remove the cgroup and subsequent vm
starts fail with

  'CreateMachine: File exists'

Additional processes can e.g. be added via

  echo $PID > /sys/fs/cgroup/systemd/machine.slice/machine-${VMNAME}.scope/tasks

but there are other cases like

  http://bugs.debian.org/761521

Invoke TerminateMachine to be on the safe side since systemd tracks the
cgroup anyway. This is a noop if all processes have terminated already.
2014-10-01 20:17:46 +02:00