Commit f7afeddc added code to report to systemd an array of interface
indexes for all tap devices used by a guest. Unfortunately it not only
didn't add code to report the ifindexes for macvtap interfaces
(interface type='direct') or the tap devices used by type='ethernet',
it ended up sending "-1" as the ifindex for each macvtap or hostdev
interface. This resulted in a failure to start any domain that had a
macvtap or hostdev interface (or actually any type other than
"network" or "bridge").
This patch does the following with the nicindexes array:
1) Modify qemuBuildInterfaceCommandLine() to only fill in the
nicindexes array if given a non-NULL pointer to an array (and modifies
the test jig calls to the function to send NULL). This is because
there are tests in the test suite that have type='ethernet' and still
have an ifname specified, but that device of course doesn't actually
exist on the test system, so attempts to call virNetDevGetIndex() will
fail.
2) Even then, only add an entry to the nicindexes array for
appropriate types, and to do so for all appropriate types ("network",
"bridge", and "direct"), but only if the ifname is known (since that
is required to call virNetDevGetIndex().
libvirt was unconditionally calling virNetDevBandwidthClear() for
every interface (and network bridge) of a type that supported
bandwidth, whether it actually had anything set or not. This doesn't
hurt anything (unless ifname == NULL!), but is wasteful.
This patch makes sure that all calls to virNetDevBandwidthClear() are
qualified by checking that the interface really had some bandwidth
setup done, and checks for a null ifname inside
virNetDevBandwidthClear(), silently returning success if it is null
(as well as removing the ATTRIBUTE_NONNULL from that function's
prototype, since we can't guarantee that it is never null,
e.g. sometimes a type='ethernet' interface has no ifname as it is
provided on the fly by qemu).
If the qemu binary on x86 does not support lsi SCSI controller,
but it supports virtio-scsi, we reject the virtio-specific attributes
for no reason.
Move the default controller assignment before the check.
https://bugzilla.redhat.com/show_bug.cgi?id=1168849
https://bugzilla.redhat.com/show_bug.cgi?id=1183869
Soo. you've successfully started yourself a domain. And since you want
to use it on your host exclusively you are confident enough to
passthrough the host CPU model, like this:
<cpu mode='host-passthrough'/>
Then, after a while, you want to save the domain into a file (e.g.
virsh save dom dom.save). And here comes the trouble. The file consist
of two parts: Libvirt header (containing domain XML among other
things), and qemu migration data. Now, the domain XML in the header is
formatted using special flags (VIR_DOMAIN_XML_SECURE |
VIR_DOMAIN_XML_UPDATE_CPU | VIR_DOMAIN_XML_INACTIVE |
VIR_DOMAIN_XML_MIGRATABLE).
Then, on your way back from the bar, you think of changing something
in the XML in the saved file (we have a command for it after all), say
listen address for graphics console. So you successfully type in the
command:
virsh save-image-edit dom.save
Change all the bits, and exit the editor. But instead of success
you're left with sad error message:
error: unsupported configuration: Target CPU model <null> does not
match source Pentium Pro
Sigh. Digging into the code you see lines, where we check for ABI
stability. The new XML you've produced is compared with the old one
from the saved file to see if qemu ABI will break or not. Wait, what?
We are using different flags to parse the XML you've provided so we
were just lucky it worked in some cases? Yep, that's right.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
In commit cc41c648 I've re-factored qemuMonitorFindBalloonObjectPath, but
missed that there is a memory leak. The "nextpath" variable is
overwritten while looping in for cycle and we have to free it before next
cycle.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Making use of the ARCH_IS_S390 macro introduced with
e808357528d8be1ebc3970424b4a7b7c04eda2b6
Signed-off-by: Stefan Zimmermann <stzi@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Since s390 does not support usb the default creation of a usb controller
for a domain should not occur.
Also adjust s390 test cases by removing usb device instances since
usb devices are no longer created by default for s390 the s390
test cases need to be adjusted.
Signed-off-by: Stefan Zimmermann <stzi@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
For historical reasons data regarding NUMA configuration were split
between the CPU definition and numatune. We cannot do anything about the
XML still being split, but we certainly can at least store the relevant
data in one place.
This patch moves the NUMA stuff to the right place.
As virDomainNumatuneSet now doesn't allocate the virDomainNuma object
any longer it's not necessary to pass the pointer to a pointer to store
the object as it will not change any longer.
While touching the parameter definitions I've also changed the name of
the parameter to "numa".
Name it virNumaMemAccess and add it to conf/numa_conf.[ch]
Note that to avoid a circular dependency the type of the NUMA cell
memAccess variable was changed to int. It will be turned back later
after the circular dependency will not exist.
Not all machine types support all devices, device properties, backends,
etc. So until we create a matrix of [machineType, qemuCaps], lets just
filter out some capabilities before we return them to the consumer
(which is going to make decisions based on them straight away).
Currently, as qemu is unable to tell which capabilities are (not)
enabled for given machine types, it's us who has to hardcode the matrix.
One day maybe the hardcoding will go away and we can create the matrix
dynamically on the fly based on a few monitor calls.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1179678
When migrating with storage, libvirt iterates over domain disks and
instruct qemu to migrate the ones we are interested in (shared, RO and
source-less disks are skipped). The disks are migrated in series. No
new disk is transferred until the previous one hasn't been quiesced.
This is checked on the qemu monitor via 'query-jobs' command. If the
disk has been quiesced, it practically went from copying its content
to mirroring state, where all disk writes are mirrored to the other
side of migration too. Having said that, there's one inherent error in
the design. The monitor command we use reports only active jobs. So if
the job fails for whatever reason, we will not see it anymore in the
command output. And this can happen fairly simply: just try to migrate
a domain with storage. If the storage migration fails (e.g. due to
ENOSPC on the destination) we resume the host on the destination and
let it run on partly copied disk.
The proper fix is what even the comment in the code says: listen for
qemu events instead of polling. If storage migration changes state an
event is emitted and we can act accordingly: either consider disk
copied and continue the process, or consider disk mangled and abort
the migration.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Upon BLOCK_JOB_COMPLETED event delivery, we check if the job has
completed (in qemuMonitorJSONHandleBlockJobImpl()). For better image,
the event looks something like this:
"timestamp": {"seconds": 1423582694, "microseconds": 372666}, "event":
"BLOCK_JOB_COMPLETED", "data": {"device": "drive-virtio-disk0", "len":
8412790784, "offset": 409993216, "speed": 8796093022207, "type":
"mirror", "error": "No space left on device"}}
If "len" does not equal "offset" it's considered an error, and we can
clearly see "error" field filled in. However, later in the event
processing this case was handled no differently to case of job being
aborted via separate API. It's time that we start differentiate these
two because of the future work.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Currently, upon BLOCK_JOB_* event, disk->mirrorState is not updated
each time. The callback code handling the events checks if a blockjob
was started via our public APIs prior to setting the mirrorState.
However, some block jobs may be started internally (e.g. during
storage migration), in which case we don't bother with setting
disk->mirror (there's nothing we can set it to anyway), or other
fields. But it will come handy if we update the mirrorState in these
cases too. The event wasn't delivered just for fun - we've started the
job after all.
So, in this commit, the mirrorState is set to whatever job status
we've obtained. Of course, there are some actions on some statuses
that we want to perform. But instead of if {} else if {} else {} ...
enumeration, let's move to switch().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
If 'virNumaGetHostNodeset()' fails then the error path will try to free
uninitialized pointer mem_mask. Introduced by commit af2a1f058.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
PowerPC : Forbid NULL CPU model with 'host-model' mode in qemu command line.
This ensures that an XML such as following:
...
<cpu mode='host-model'>
<model fallback='allow'/>
</cpu>
...
will not generate a '-cpu host,compat=(null)' command line with qemu-system-ppc64.
Signed-off-by: Prerna Saxena <prerna@linux.vnet.ibm.com>
PowerPC : Explicitly associate 'qemu-system-ppc64' as the
default emulator for all 64-bit PowerPC guests ( both Big & Little Endian )
Signed-off-by: Prerna Saxena <prerna@linux.vnet.ibm.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1126762
Commit 43b67f introduced a deadlock issue when we use numatune
to change numa settings to a vm in session mode.
Jump to endjob instead of jump to cleanup.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
So, when building the '-numa' command line, the
qemuBuildMemoryBackendStr() function does quite a lot of checks to
chose the best backend, or to check if one is in fact needed. However,
it returned that backend is needed even for this little fella:
<numatune>
<memory mode="strict" nodeset="0,2"/>
</numatune>
This can be guaranteed via CGroups entirely, there's no need to use
memory-backend-ram to let qemu know where to get memory from. Well, as
long as there's no <memnode/> element, which explicitly requires the
backend. Long story short, we wouldn't have to care, as qemu works
either way. However, the problem is migration (as always). Previously,
libvirt would have started qemu with:
-numa node,memory=X
in this case and restricted memory placement in CGroups. Today, libvirt
creates more complicated command line:
-object memory-backend-ram,id=ram-node0,size=X
-numa node,memdev=ram-node0
Again, one wouldn't find anything wrong with these two approaches.
Both work just fine. Unless you try to migrated from the older libvirt
into the newer one. These two approaches are, unfortunately, not
compatible. My suggestion is, in order to allow users to migrate, lets
use the older approach for as long as the newer one is not needed.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
We do have a check for valid per-domain security model, however we still
do permit an invalid security model for a domain's device (those which
are specified with <source> element).
This patch introduces a new function virSecurityManagerCheckAllLabel
which compares user specified security model against currently
registered security drivers. That being said, it also permits 'none'
being specified as a device security model.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1165485
Signed-off-by: Ján Tomko <jtomko@redhat.com>
The qemuDomainHelperGetVcpus attempted to report an error when the
vcpupids info was NULL. Unfortunately earlier code would clamp the
value of 'maxinfo' to 0 when nvcpupids was 0, so the error reporting
would end up being skipped.
This lead to 'virsh vcpuinfo <dom>' just returning an empty list
instead of giving the user a clear error.
If a previous commit I fixed the incorrect handling of vcpu pids
for TCG mode QEMU:
commit b07f3d821dfb11a118ee75ea275fd6ab737d9500
Author: Daniel P. Berrange <berrange@redhat.com>
Date: Thu Dec 18 16:34:39 2014 +0000
Don't setup fake CPU pids for old QEMU
The code assumes that def->vcpus == nvcpupids, so when we setup
fake CPU pids for old QEMU with nvcpupids == 1, we cause the
later code to read off the end of the array. This has fun results
like sche_setaffinity(0, ...) which changes libvirtd's own CPU
affinity, or even better sched_setaffinity($RANDOM, ...) which
changes the affinity of a random OS process.
The intent was that this would merely disable the ability to set
per-vCPU affinity. It should still have been possible to set VM
level host CPU affinity.
Unfortunately, when you set <vcpu cpuset='0-1'>4</vcpu>, the XML
parser will internally take this & initialize an entry in the
def->cputune.vcpupin array for every VCPU. IOW this is implicitly
being treated as
<cputune>
<vcpupin cpuset='0-1' vcpu='0'/>
<vcpupin cpuset='0-1' vcpu='1'/>
<vcpupin cpuset='0-1' vcpu='2'/>
<vcpupin cpuset='0-1' vcpu='3'/>
</cputune>
Even more fun, the faked cputune elements are hidden from view when
querying the live XML, because their cpuset mask is the same as the
VM default cpumask.
The upshot was that it was impossible to set VM level CPU affinity.
To fix this we must update qemuProcessSetVcpuAffinities so that it
only reports a fatal error if the per-VCPU cpu mask is different
from the VM level cpu mask.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
In the event we're falling into the code that tries to create the file
in a forked environment (VIR_FILE_OPEN_FORK) we pass different mode bits,
but those are never set because the virFileOpenForceOwnerMode has a check
if the OPEN_FORCE_MODE bit is set before attempting to change the mode.
Since this is a special case it seems reasonable to set u+rw,g+rw,o
https://bugzilla.redhat.com/show_bug.cgi?id=1191355
When we attempt to migrate a vm with a migrateuri that has no scheme:
# virsh migrate test4 --live qemu+ssh://lhuang/system --migrateuri 127.0.0.1
target libvirtd will crash because uri->scheme is NULL in
qemuMigrationPrepareDirect on this line:
if (STRNEQ(uri->scheme, "tcp") &&
Add a value check before this line. Also fix a bug like this in
doNativeMigrate, that could only happen when destination libvirtd
returned an incorrect URI.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Export the required helpers and add backend code to hotplug RNG devices.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
As the RNG device is using an -object as backend refactor the code to
use the JSON to commandline generator so that we can reuse the code
later in hotplug.
Move the alias name right after the object type for rng-egd backend so
that we can later use the JSON to commandline generator to create the
command line.
Libvirt didn't prefix the random number generator backend object alias
with any string thus the device alias and object alias were identical.
To avoid possible problems, rename the alias for the backend object and
tweak tests to comply with the change.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Rename qemuBuildRNGDeviceArgs to qemuBuildRNGDevStr and change the
return type so that it can be reused in the device hotplug code later.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
This function is used to assign an alias for a RNG device. It will be
later reused when hotplugging RNGs.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
It is only usable for NETWORK and BRIDGE type interfaces.
Error out when trying to start a domain where the custom
tap device path is specified for interfaces of other types,
or when the daemon is not privileged.
Note that this cannot be checked at definition time, because
the comparison is against actual type.
https://bugzilla.redhat.com/show_bug.cgi?id=1147195
It is often helpful to know which version of libvirt and QEMU
was present when a guest was first launched. Ensure this info
is written into the QEMU log file for each guest.