Commit 93c8ca tried to fix the issue with auto-adding of a PCI bridge
controller, but didn't work properly in all scenarios.
This patch provides a better fix of the issue when all slots on a PCI bus
are reserved by devices with user specified addresses and no additional
bridges need to be created.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1132900
In order to be able to test for fully reserved PCI buses, assignment of
PCI slots for integrated devices needs to be moved to a separate function.
This also might be a good preparation if we decide to add support for
other chipsets as well.
Move qemuDomainAssignPCIAddresses after the definition
of the static function qemuDomainValidateDevicePCISlotsQ35.
This lets us define a new static function using
qemuDomainValidateDevicePCISlots* and use it in
qemuDomainAssignPCIAddresses without a forward declaration.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
As it turned out, fix of dead code 419a22 changed the affected condition
from "never true" to "always true", so better fix would be to change the
return code of virDomainMaybeAddController from 0 to 1 if
a new bridge has been added, thus distinguishing case when we didn't need to
add any controller and case we successfully added one.
The return code is changed in the next commit
The ACL check didn't check the VIR_DOMAIN_XML_SECURE flag and the
appropriate permission for it. Found via code inspection while fixing
permissions for save images.
https://bugzilla.redhat.com/show_bug.cgi?id=1164627
When using 'virsh attach-device' to hotplug an unsupported console type
into a qemu guest the attachment would succeed as the command line
formatter didn't report error in such case.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1130390
The listen address is not mandatory for <interface type='server'>
but when it's not specified, we've been formatting it as:
-netdev socket,listen=(null):5558,id=hostnet0
which failed with:
Device 'socket' could not be initialized
Omit the address completely and only format the port in the listen
attribute.
Also fix the schema to allow specifying a model.
Using the same driver multiple times is pointless and
it can result in confusing errors:
$ virsh start test
error: Failed to start domain test
error: internal error: security label already defined for VM
https://bugzilla.redhat.com/show_bug.cgi?id=1153891
Depending on the context, either error out if the domain
has disappeared in the meantime, or just ignore the value
to allow marking the function as ATTRIBUTE_RETURN_CHECK.
https://bugzilla.redhat.com/show_bug.cgi?id=1161024
If the domain crashed while we were in monitor,
we cannot rely on the REALLOC done on live definition,
since vm->def now points to the persistent definition.
Skip adding the attached devices to domain definition
if the domain crashed.
In AttachChrDevice, the chardev was already added to the
live definition and freed by qemuProcessStop in the case
of a crash. Skip the device removal in that case.
Also skip audit if the domain crashed in the meantime.
https://bugzilla.redhat.com/show_bug.cgi?id=1161024
In the device type-specific functions, exit early
if the domain has disappeared, because the cleanup
should have been done by qemuProcessStop.
Check the return value in processDeviceDeletedEvent
and qemuProcessUpdateDevices.
Skip audit and removing the device from live def because
it has already been cleaned up.
Ploop is a pseudo device which makeit possible to access
to an image in a file as a block device. Like loop devices,
but with additional features, like snapshots, write tracker
and without double-caching.
It used in PCS for containers and in OpenVZ. You can manage
ploop devices and images with ploop utility
(http://git.openvz.org/?p=ploop).
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
We tested for positive return value from virDomainMaybeAddController,
but it returns 0 or -1 only resulting in a dead code.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
In case we find out, there are more PCI devices to be connected
than there are available slots on the default PCI bus, we automatically add a
new bus and a related PCI bridge controller as well. As there are no free slots
left on the default PCI bus, PCI bridge controller gets a free slot on a
newly created PCI bus which causes qemu to refuse to start the guest.
This fix introduces a new function qemuDomainPCIBusFullyReserved which
is checked right before we possibly try to reserve a slot for PCI bridge
controller.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1132900
The virDomainDefineXMLFlags and virDomainCreateXML APIs both
gain new flags allowing them to be told to validate XML.
This updates all the drivers to turn on validation in the
XML parser when the flags are set
systemd-machined introduced a new method CreateMachineWithNetwork
that obsoletes CreateMachine. It expects to be given a list of
VETH/TAP device indexes for the host side device(s) associated
with a container/machine.
This falls back to the old CreateMachine method when the new
one is not supported.
https://bugzilla.redhat.com/show_bug.cgi?id=1181182
When we meet error in qemuMigrationPrepareAny and goto
cleanup with rc < 0, we forget clear the priv->origname and this
will make this vm migrate fail next time because leave a wrong
origname in priv, and will Generate a wrong cookie when do
migrate next time.
This patch will make priv->origname is NULL when migrate fail
in target host.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Make local copy of the disk alias in qemuProcessInitPasswords,
instead of referencing the one in domain definition, which
might get freed if the domain crashes while we're in monitor.
Also copy the memballoon period value.
Make a local copy of the disk alias instead of pointing
to the domain definition, which might get freed if
the domain dies while we're in monitor.
Also exit early if that happens.
Exit the monitor right after we've done with it to get
the virDomainObjPtr lock back, otherwise we might be accessing
vm->def while it's being cleaned up by qemuProcessStop.
If the domain crashed while we were in the monitor, exit
early instead of changing vm->def which is now the persistent
definition.
The domain might disappear during the time in monitor when
the virDomainObjPtr is unlocked, so the caller needs to check
if it's still alive.
Since most of the callers are going to need it, put the
check inside qemuDomainObjExitMonitor and return -1 if
the domain died in the meantime.
QEMU internally updates the size of video memory if the domain XML had
provided too low memory size or there are some dependencies for a QXL
devices 'vgamem' and 'ram' size. We need to know about the changes and
store them into the status XML to not break migration or managedsave
through different libvirt versions.
The values would be loaded only if the "vgamem_mb" property exists for
the device. The presence of the "vgamem_mb" also tells that the
"ram_size" and "vram_size" exists for QXL devices.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
The search is done recursively only through QOM object that has a type
prefixed with "child<" as this indicate that the QOM is a parent for
other QOM objects.
The usage is that you give known device name with starting path where to
search.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Commit e3435caf fixed hot-plugging of vcpus with strict memory pinning
on NUMA hosts, but unfortunately it also broke updating number of vcpus
for offline guests using our API.
The issue is that we try to create a cpu cgroup for non-running guest
which fails as there are no cgroups for that domain. We should create
cgroups and update cpuset.mems only if we are hot-plugging.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1165993
So, there are still plenty of vNIC types that we don't know how to set
bandwidth on. Let's warn explicitly in case user has requested it
instead of pretending everything was set.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When create inactive external snapshot, after update disk definitions,
virDomainSaveConfig is needed, if not after restart libvirtd the new snapshot
file definitions in xml will be lost.
Reproduce steps:
1. prepare a shut off guest
$ virsh domstate rhel7 && virsh domblklist rhel7
shut off
Target Source
------------------------------------------------
vda /var/lib/libvirt/images/rhel7.img
2. create external disk snapshot
$ virsh snapshot-create rhel7 --disk-only && virsh domblklist rhel7
Domain snapshot 1417882967 created
Target Source
------------------------------------------------
vda /var/lib/libvirt/images/rhel7.1417882967
3. restart libvirtd then check guest source file
$ service libvirtd restart && virsh domblklist rhel7
Redirecting to /bin/systemctl restart libvirtd.service
Target Source
------------------------------------------------
vda /var/lib/libvirt/images/rhel7.img
This was first reported by Eric Blake
http://www.redhat.com/archives/libvir-list/2014-December/msg00369.html
Signed-off-by: Shanzhi Yu <shyu@redhat.com>
The virDomainDefParse* and virDomainDefFormat* methods both
accept the VIR_DOMAIN_XML_* flags defined in the public API,
along with a set of other VIR_DOMAIN_XML_INTERNAL_* flags
defined in domain_conf.c.
This is seriously confusing & error prone for a number of
reasons:
- VIR_DOMAIN_XML_SECURE, VIR_DOMAIN_XML_MIGRATABLE and
VIR_DOMAIN_XML_UPDATE_CPU are only relevant for the
formatting operation
- Some of the VIR_DOMAIN_XML_INTERNAL_* flags only apply
to parse or to format, but not both.
This patch cleanly separates out the flags. There are two
distint VIR_DOMAIN_DEF_PARSE_* and VIR_DOMAIN_DEF_FORMAT_*
flags that are used by the corresponding methods. The
VIR_DOMAIN_XML_* flags received via public API calls must
be converted to the VIR_DOMAIN_DEF_FORMAT_* flags where
needed.
The various calls to virDomainDefParse which hardcoded the
use of the VIR_DOMAIN_XML_INACTIVE flag change to use the
VIR_DOMAIN_DEF_PARSE_INACTIVE flag.
https://bugzilla.redhat.com/show_bug.cgi?id=1135339 documents some
confusing behavior when a user tries to start an inactive block
commit in a second connection while there is already an on-going
active commit from a first connection. Eventually, qemu will
support multiple simultaneous block jobs, but as of now, it does
not; furthermore, libvirt also needs an overhaul before we can
support simultaneous jobs. So, the best way to avoid confusing
ourselves is to quit relying on qemu to tell us about the situation
(where we risk getting in weird states) and instead forbid a
duplicate block commit ourselves.
Note that we are still relying on qemu to diagnose attempts to
interrupt an inactive commit (since we only track XML of an active
commit), but as inactive commit is less confusing for libvirt to
manage, there is less that can go wrong by leaving that detection
up to qemu.
* src/qemu/qemu_driver.c (qemuDomainBlockCommit): Hoist check for
active commit to occur earlier outside of conditions.
Signed-off-by: Eric Blake <eblake@redhat.com>
QEMU supports feature specification with -cpu host and we just skip
using that. Since QEMU developers themselves would like to use this
feature, this patch modifies the code to work.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1178850
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
The default value should be 16 MiB instead of 8 MiB. Only really old
version of upstream QEMU used the 8 MiB as default for vga framebuffer.
Without this change if you update your libvirt where we introduced the
"vgamem" attribute for QXL video device the value will be set to 8 MiB,
but previously your guest had 16 MiB because we didn't pass any value to
QEMU command line which means QEMU used its own 16 MiB as default.
This will affect all users with guest's display resolution higher than
1920x1080.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
In one of my previous commits (311b4a67) I've tried to allow to
pass regular system pages to <hugepages>. However, there was a
little bug that wasn't caught. If domain has guest NUMA topology
defined, qemuBuildNumaArgStr() function takes care of generating
corresponding command line. The hugepages backing for guest NUMA
nodes is handled there too. And here comes the bug: the hugepages
setting from XML is stored in KiB internally, however, the system
pages size was queried and stored in Bytes. So the check whether
these two are equal was failing even if it shouldn't.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
In commit 540c339a2535ec30d79e5ef84d8f50a17bc60723 the whole domain
reference counting was refactored in the qemu driver. Domain jobs now
don't need to reference the domain object as they now expect the
reference from the calling function.
However, the patch forgot to remove the unref call in case we exit the
monitor when we were acquiring a nested job. This caused the daemon to
crash on a subsequent access to the domain object once we've done an
operation requiring a nested job for a monitor access.
An easy reproducer case:
1) Start a vm with qcow disks
2) virsh snapshot-create-as DOMNAME
3) virsh dumpxml DOMNAME
4) daemon crashes in a semi-random spot while accessing a now-removed VM
object.
Fortunately, the commit wasn't released yet, so there are no security
implications.
Reported-by: Shanzi Yu <shyu@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1177723
When setting new bandwidth limits via
virDomainSetInterfaceParameters, the old ones are cleared first.
However, if setting the new ones fails, the old are already gone
and interface is left in inconsistent state. Therefore, right
before failing we ought to try to restore the old bandwidth.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1178652
We will get a warning when we have a guest in paused
status (caused by kernel panic) and restart libvirtd,
warning message like this:
Qemu reported unknown VM status: 'guest-panicked'
and this seems because we set a wrong status name in
qemu_monitor.c, and from qemu qapi-schema.json file
we know this status should named 'guest-panicked'.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Add the possibility to have more than one IP address configured for a
domain network interface. IP addresses can also have a prefix to define
the corresponding netmask.
There is one problem that causes various errors in the daemon. When
domain is waiting for a job, it is unlocked while waiting on the
condition. However, if that domain is for example transient and being
removed in another API (e.g. cancelling incoming migration), it get's
unref'd. If the first call, that was waiting, fails to get the job, it
unref's the domain object, and because it was the last reference, it
causes clearing of the whole domain object. However, when finishing the
call, the domain must be unlocked, but there is no way for the API to
know whether it was cleaned or not (unless there is some ugly temporary
variable, but let's scratch that).
The root cause is that our APIs don't ref the objects they are using and
all use the implicit reference that the object has when it is in the
domain list. That reference can be removed when the API is waiting for
a job. And because each domain doesn't do its ref'ing, it results in
the ugly checking of the return value of virObjectUnref() that we have
everywhere.
This patch changes qemuDomObjFromDomain() to ref the domain (using
virDomainObjListFindByUUIDRef()) and adds qemuDomObjEndAPI() which
should be the only function in which the return value of
virObjectUnref() is checked. This makes all reference counting
deterministic and makes the code a bit clearer.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Although QMP returns info about vCPU threads in TCG mode, the
data it returns is mostly lies. Only the first vCPU has a valid
thread_id returned. The thread_id given for the other vCPUs is
in fact the main emulator thread. All vCPUs actually run under
the same thread in TCG mode.
Our vCPU pinning code is not at all able to cope with this
so if you try to set CPU affinity per-vCPU you end up with
wierd errors
error: Failed to start domain instance-00000007
error: cannot set CPU affinity on process 24365: Invalid argument
Since few people will care about the performance of TCG with
strict CPU pinning, lets just disable that for now, so we get
a clear error message
error: Failed to start domain instance-00000007
error: Requested operation is not valid: cpu affinity is not supported
The code assumes that def->vcpus == nvcpupids, so when we setup
fake CPU pids for old QEMU with nvcpupids == 1, we cause the
later code to read off the end of the array. This has fun results
like sche_setaffinity(0, ...) which changes libvirtd's own CPU
affinity, or even better sched_setaffinity($RANDOM, ...) which
changes the affinity of a random OS process.
Libvirt BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1175397
QEMU BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1170093
In qemu there are two interesting arguments:
1) -numa to create a guest NUMA node
2) -object memory-backend-{ram,file} to tell qemu which memory
region on which host's NUMA node it should allocate the guest
memory from.
Combining these two together we can instruct qemu to create a
guest NUMA node that is tied to a host NUMA node. And it works
just fine. However, depending on machine type used, there might
be some issued during migration when OVMF is enabled (see QEMU
BZ). While this truly is a QEMU bug, we can help avoiding it. The
problem lies within the memory backend objects somewhere. Having
said that, fix on our side consists on putting those objects on
the command line if and only if needed. For instance, while
previously we would construct this (in all ways correct) command
line:
-object memory-backend-ram,size=256M,id=ram-node0 \
-numa node,nodeid=0,cpus=0,memdev=ram-node0
now we create just:
-numa node,nodeid=0,cpus=0,mem=256
because the backend object is obviously not tied to any specific
host NUMA node.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Coverity flagged commit 0282ca45 as introducing a memory leak;
in all my refactoring to make capacity probing conditional on
whether the image is non-raw, I missed deleting the unconditional
probe.
* src/qemu/qemu_driver.c (qemuStorageLimitsRefresh): Drop
redundant assignment.
Signed-off-by: Eric Blake <eblake@redhat.com>