https://bugzilla.redhat.com/show_bug.cgi?id=1228007
When attaching a scsi volume lun via the attach-device --config or
--persistent options, there was no translation of the source pool
like there was for the live path; thus the attempt to modify the config
would fail since not enough was known about the disk.
When getting block device I/O tuning data there is no check for whether
QEMU supports such options, so the call fails in
qemuMonitorGetBlockIoThrottle() when getting the particular throttle
data. Try reporting a better error when blkdeviotune is not
supported.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1224053
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
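A minimal sketch of the capability check this implies, assuming the QEMU_CAPS_DRIVE_IOTUNE bit and the usual qemu driver error helpers (placement is illustrative):

if (!virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_DRIVE_IOTUNE)) {
    virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
                   _("block I/O throttling is not supported by this "
                     "QEMU binary"));
    goto endjob;
}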
In the pre-NUMA ages pinning a vCPU to all pCPUs was equal to deleting
the pinning info. Now it does not entirely work that way. Pinning a vCPU
to all pCPUs might be a desired operation. Additionally, removing the
pinning will result in using the default pinning information at the
next boot, which might differ from pinning to all pCPUs.
This patch removes the false assumption that we should remove the
pinning after pinning to all pCPUs and tweaks the documentation for
virsh.
A later patch will implement a new flag for the virDomainPinVcpuFlags
API that will allow removing the pinning in a sane way.
While we probably won't see machines with more than 65536 CPUs for a
while, let's store the CPU count as an integer so that we can avoid quite
a lot of overflow checks in our code.
Since the returned structure uses "unsigned long" for memory sizes, add a
few overflow checks to notify the user in case we are not able to
represent the given values.
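An illustrative sketch of such a check, assuming the internal size is tracked as an unsigned long long (KiB) while the public struct field is an unsigned long:

if (mem > ULONG_MAX) {   /* 'mem' is a hypothetical unsigned long long */
    virReportError(VIR_ERR_OVERFLOW,
                   _("memory size %llu cannot be represented in the "
                     "returned structure"), mem);
    return -1;
}
info->memory = mem;      /* now known to fit */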
When qemu does not support the balloon event, the current memory size
needs to be queried. Since there are two places that implement the same
logic, split it out into a function and reuse it.
Refactor the function to return the bitmap instead of an integer, and
rework the inner workings so that they make more sense.
This patch also fixes a possible segfault on old systems that was
introduced by commit:
commit f1a43a8e41
Author: Hu Tao <hutao@cn.fujitsu.com>
Date: Fri Sep 14 15:46:59 2012 +0800
use virBitmap to store cpu affinity info
Store the emulator pinning CPU mask as a pure virBitmap rather than a
virDomainPinDef, since it stores only the bitmap, and refactor
qemuDomainPinEmulator to do the same operations in a much saner way.
As a side effect virDomainEmulatorPinAdd and virDomainEmulatorPinDel can
be removed since they don't add any value.
The QMP command, like the interrupt reinjection logic it's connected
to, is only implemented in QEMU when TARGET_I386 is defined, so
checking for its availability on any other architecture is pointless.
On the other hand, when we're on x86, we should still make sure that
rtc-reset-reinjection is available and refuse to set the time
otherwise.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1211938
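A hedged sketch of the resulting logic; the capability name and helpers follow libvirt of that era, but the exact placement is an assumption:

/* only x86 targets implement rtc-reset-reinjection, so only
 * require the command there */
if (ARCH_IS_X86(vm->def->os.arch) &&
    !virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_RTC_RESET_REINJECTION)) {
    virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s",
                   _("cannot set time: qemu doesn't support "
                     "rtc-reset-reinjection command"));
    goto endjob;
}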
Since commit bcd9a564b6 virDomainNumatuneGetMode returns the value
via a pointer rather than in the return value. The change triggered
problems on platforms where the compiler decides to use a data type of
a size different from integer at the point where we typecast it.
Work around the issue by using an intermediate variable of the correct
type that gets cast back by the default typecasting rules.
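Roughly, the workaround looks like this (a sketch; the surrounding variable and field names are illustrative):

virDomainNumatuneMemMode mode;   /* intermediate of the correct type */

if (virDomainNumatuneGetMode(def->numa, -1, &mode) < 0)
    goto cleanup;

params->value.i = mode;   /* cast back by the default conversion rules */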
Most virDomainDiskIndexByName callers do not care about the index; what
they really want is a disk def pointer.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
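The shape of the wrapper this suggests, as a sketch (libvirt's actual helper may differ in detail):

virDomainDiskDefPtr
virDomainDiskByName(virDomainDefPtr def,
                    const char *name,
                    bool allow_ambiguous)
{
    int idx = virDomainDiskIndexByName(def, name, allow_ambiguous);

    return idx < 0 ? NULL : def->disks[idx];
}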
So far, we are not reporting if numatune was even defined. The
value of zero is blindly returned (which maps onto
VIR_DOMAIN_NUMATUNE_MEM_STRICT). Unfortunately, we are making
decisions based on this value. Instead, we should not only return
the correct value, but report to the caller if the value is valid
at all.
For better viewing of this patch use '-w'.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=976387
For a domain configured to use the host cdrom, we should taint the domain
due to problems encountered when the host and guest both try to control the tray.
Since af2a1f0587,
qemuDomainGetNumaParameters() returns an invalid value for a running
guest. The problem is that it is getting the information from cgroups,
but the parent cgroup is being left alone since the mentioned commit.
Since the running guest's XML is in sync with cgroups, there is no need
to look into cgroups (unless someone changes the configuration behind
libvirt's back). Returning the info from the definition fixes a bug and
is also a cleanup.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1221047
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
For some reason a union (_virNodeDevCapData) that had only been
declared inside the toplevel struct virNodeDevCapsDef was being used
as an argument to functions all over the place. Since it was only a
union, the "type" attribute wasn't necessarily sent with it. While
this works, it just seems wrong.
This patch creates a toplevel typedef for virNodeDevCapData and
virNodeDevCapDataPtr, making it a struct that has the type attribute
as a member, along with an anonymous union of everything that used to
be in union _virNodeDevCapData. This way we only have to change the
following:
s/union _virNodeDevCapData */virNodeDevCapDataPtr /
and
s/caps->type/caps->data.type/
This will make me feel less guilty when adding functions that need a
pointer to one of these.
https://bugzilla.redhat.com/show_bug.cgi?id=1220809
When cold-plugging an RNG device, if something fails in
qemuDomainAssignAddresses we will double free the RNG device.
Once a device is plugged into the domain, we should set the
device pointer to NULL to fix this issue.
...
5 0x00007fb7d180ac8a in virFree at util/viralloc.c:582
6 0x00007fb7d1895cdd in virDomainRNGDefFree at conf/domain_conf.c:19786
7 0x00007fb7d1895d99 in virDomainDeviceDefFree at conf/domain_conf.c:2022
8 0x00007fb7b92b8baf in qemuDomainAttachDeviceFlags at qemu/qemu_driver.c:8785
9 0x00007fb7d190c5d7 in virDomainAttachDeviceFlags at libvirt-domain.c:8488
10 0x00007fb7d23af9d2 in remoteDispatchDomainAttachDeviceFlags at remote_dispatch.h:2842
...
Signed-off-by: Luyao Huang <lhuang@redhat.com>
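A sketch of the fix in the attach path; the case label and helper name follow the existing hotplug code, but treat the exact shape as illustrative:

case VIR_DOMAIN_DEVICE_RNG:
    ret = qemuDomainAttachRNGDevice(driver, vm, dev->data.rng);
    if (ret == 0)
        dev->data.rng = NULL;   /* domain owns it now; avoid double free */
    break;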
Since libvirt doesn't query qemu for the new balloon size right away,
add code that tweaks the current balloon size statistic
until qemu reports the new size using the event.
Use the new domain list collection helpers to avoid going through
virDomainPtrs.
This additionally implements filtering when called through the
API that accepts domain list filters.
Extend it to a universal helper used for clearing lists of any objects.
Note that the argument type is specifically void * to allow implicit
typecasting.
Additionally add a helper that works on non-NULL terminated arrays once
we know the length.
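A sketch of the two helpers as described, assuming virObject-based list entries:

void
virObjectListFree(void *list)   /* NULL-terminated array */
{
    virObjectPtr *next;

    if (!list)
        return;

    for (next = (virObjectPtr *) list; *next; next++)
        virObjectUnref(*next);

    VIR_FREE(list);
}

void
virObjectListFreeCount(void *list,
                       size_t count)
{
    size_t i;

    if (!list)
        return;

    for (i = 0; i < count; i++)
        virObjectUnref(((virObjectPtr *) list)[i]);

    VIR_FREE(list);
}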
https://bugzilla.redhat.com/show_bug.cgi?id=890648
So, imagine you've issued an API call that involves the guest agent. For
instance, you want to query the guest's IP addresses. So the API acquires
QUERY_JOB, locks the guest agent and issues the agent command.
However, for some reason, the guest agent replies to the initial ping
correctly, but then crashes tragically while executing the real command
(in this case guest-network-get-interfaces). Since the initial ping went
well, libvirt thinks the guest agent is accessible and awaits a reply to
the real command. But it will never come; what will come is a monitor
event. Our handler (processSerialChangedEvent) will try to acquire
MODIFY_JOB, which will obviously fail because the other thread that's
executing the API already holds a job. So the event handler exits
early, and the QUERY_JOB is never released nor ended.
The way to solve this is to put a flag somewhere in the monitor
internals. The flag is called @running and agent commands are issued
iff the flag is set. The flag itself is set when we connect to the
agent socket, and unset whenever we see a DISCONNECT event from the
agent. Moreover, we must wake up all the threads waiting for the
agent. This is done by signalling the condition they're waiting on.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Running shutdown with mode agent on a shutoff domain gives a cryptic
error message:
virsh # shutdown --mode agent gentoo
error: Failed to shutdown domain gentoo
error: Guest agent is not responding: QEMU guest agent is not connected
After this patch, the error is more clear:
virsh # shutdown --mode agent gentoo
error: Failed to shutdown domain gentoo
error: Requested operation is not valid: domain is not running
Reported-by: Martin Kletzander <mkletzan@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Coverity points out that qemuMonitorGetAllBlockStatsInfo could return a
-1 and thus not fill in 'stats' (leaving it NULL). Then the call to
qemuMonitorBlockStatsUpdateCapacity will dereference it.
Now that we have macros for exclusive flags and flag requirements, we can
use them to clean up the code for setvcpus and error out for all wrong
flag combinations.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
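A hedged sketch of what the validation can now look like; the specific flag combinations shown are illustrative, not the exact set the patch enforces:

virCheckFlags(VIR_DOMAIN_VCPU_LIVE | VIR_DOMAIN_VCPU_CONFIG |
              VIR_DOMAIN_VCPU_MAXIMUM | VIR_DOMAIN_VCPU_GUEST, -1);

VIR_EXCLUSIVE_FLAGS_RET(VIR_DOMAIN_VCPU_LIVE, VIR_DOMAIN_VCPU_CONFIG, -1);
VIR_EXCLUSIVE_FLAGS_RET(VIR_DOMAIN_VCPU_MAXIMUM, VIR_DOMAIN_VCPU_GUEST, -1);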
Since the qemu capabilities are not initialized for offline VMs, the
caller might get a suboptimal error message:
$ virsh blockjob VM PATH --bandwidth 1
error: unsupported configuration: block jobs not supported with this QEMU binary
Move the checks after we make sure that the VM is alive.
The !modern code path needs to call qemuBlockJobEventProcess directly;
the modern code path will call it via qemuBlockJobSyncWait.
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
We will want to use synchronous block jobs from qemu_migration as well,
so split this function out into a new source file.
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
The documentation states that for shallow block copy the image has to
have the same guest-visible content as the backing file of the current
image if the file is being reused. This condition can also be achieved
with a raw file (or a qcow without a backing file), so remove the
condition that would disallow it.
(This patch additionally fixes crash described in
https://bugzilla.redhat.com/show_bug.cgi?id=1215569 )
Rather than have a separate routine to parse the alias of an iothread
returned from qemu in order to get the iothread_id value, parse the alias
when returning and just return the iothread_id in qemuMonitorIOThreadInfoPtr.
This set of patches removes the function, changes the "char *name" to
"unsigned int" and handles all the fallout.
Coverity notes that the switch() used to check 'connected' values has
two DEADCODE paths (_DEFAULT & _LAST). Since 'connected' is a boolean
it can only be one or the other (CONNECTED or DISCONNECTED), so it just
seems pointless to use a switch to get "all" values. Convert to if-else.
Add qemuDomainAddIOThread and qemuDomainDelIOThread in order to add or
remove an IOThread to/from the host either for live or config options.
The implementation for the 'live' option will use the iothreadpids list
in order to make decisions, while the 'config' option will use the
iothreadids list. Additionally, for deletion each may have to adjust
the iothreadpin list.
IOThreads are implemented by QMP objects; the code makes use of the existing
qemuMonitorAddObject and qemuMonitorDelObject APIs.
Signed-off-by: John Ferlan <jferlan@redhat.com>
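A sketch of the live add path using the existing monitor API (error-path labels are illustrative):

qemuDomainObjEnterMonitor(driver, vm);

if (qemuMonitorAddObject(priv->mon, "iothread", alias, NULL) < 0)
    goto exit_monitor;

if (qemuDomainObjExitMonitor(driver, vm) < 0)
    goto cleanup;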
Remove the iothreadspin array from cputune and replace it with a cpumask
to be stored in the iothreadids list.
Adjust the test output because our printing now goes in order of the
iothreadids list.
Add 'thread_id' to the virDomainIOThreadIDDef as a means to store the
'thread_id' as returned from the live qemu monitor data.
Remove the iothreadpids list from _qemuDomainObjPrivate and replace with
the new iothreadids 'thread_id' element.
Rather than use the default numbering scheme of 1..number of iothreads
defined for the domain, use the iothreadids list for the iothread_id.
Since the iothreadids list keeps track of the iothread_ids, these are
now used in the many places where a for loop would previously "know"
that the ID was "+ 1" from the array element.
The new tests ensure usage of the <iothreadid> values for an exact number
of iothreads and the usage of a smaller number of <iothreadid> values than
iothreads that exist (and usage of the default numbering scheme).
If a user hot-attaches the guest agent channel, libvirt would ignore it
until the restart of libvirtd or shutdown/destroy and start of the VM
itself.
This patch adds code that opens or closes the guest agent connection
according to the state of the guest agent channel, in response to
connect/disconnect events.
To allow opening the channel from the event handler, qemuConnectAgent
needed to be exported.
Every API that grabs a domain object to work on should
reference it to make sure it won't disappear in the meantime.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This is basically turning qemuDomObjEndAPI into a more general
function. Other drivers which get a reference to domain objects may
benefit from this function too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
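The generalized helper then reads roughly like this (a sketch):

void
virDomainObjEndAPI(virDomainObjPtr *vm)
{
    if (!*vm)
        return;

    virObjectUnlock(*vm);
    virObjectUnref(*vm);
    *vm = NULL;
}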
Just as b8e25c35d7 did, we
fall back to the ACPI method when the guest agent is unresponsive
in qemuDomainReboot().
Signed-off-by: YueWenyuan <yuewenyuan@huawei.com>
Signed-off-by: Zhang Bo <oscar.zhangbo@huawei.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Because packets going through the egress from a bridge (where our
bandwidth limiting takes place) have no information about which
interface they came from, the QoS rules that we create instead
use the source MAC address of the packets to make their decisions
about which QDisc the packet should be in.
One flaw in this is that when a guest changed the MAC address it
used, packets from the guest would no longer be put into the
correct QDisc, but would instead be put in an "unprivileged"
class, resulting in the bandwidth "floor" (minimum guaranteed)
being no longer honored.
Now that libvirt has infrastructure to capture and respond to
RX_FILTER_CHANGE events from qemu (sent whenever a guest
interface modifies its MAC address, among other things), we can
notice when a guest MAC address changes, and update the QoS rules
accordingly, so that bandwidth floor is honored even after a
guest MAC address change.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
qemuDomainSetMemoryFlags() would allow setting the initial memory greater
than the <maxMemory> field. While such a configuration would not work, as
memory hotplug requires NUMA to be enabled and the
qemuDomainSetMemoryFlags() API does not work on NUMA guests, this just
fixes a corner case.
The fix is still worthwhile though, as the bug allows inducing an invalid
configuration and making the VM vanish on libvirt restart.
Additionally, this tweaks the error message to be more accurate.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
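A sketch of the added validation, assuming the maximum is tracked in the definition's mem.max_memory field and 'newmem' is the requested size:

if (newmem > vm->def->mem.max_memory) {
    virReportError(VIR_ERR_INVALID_ARG, "%s",
                   _("cannot set initial memory size greater than "
                     "the maximum memory size"));
    goto endjob;
}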
This needs to be specified in way too many places for a simple validation
check. The ostype/arch/virttype validation checks later in
DomainDefParseXML should catch most of the cases that this was covering.
https://bugzilla.redhat.com/show_bug.cgi?id=1209948
So we have this bug. The virConnectGetDomainCapabilities() API
performs a couple of checks before it produces any result. One of
the checks is whether the architecture requested by the user can be run by
the binary (again, user-provided). However, the check is pretty
dumb. It merely compares whether the default binary architecture
matches the one provided by the user. However, a qemu binary can run
multiple architectures. For instance, qemu-system-ppc64 can run:
ppc, ppcle, ppc64, ppc64le and ppcemb. The default is ppc64, so
if the user requested something else, like ppc64le, the check would
have failed without an obvious reason.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When a qemu domain is to be rebooted from outside, at the libvirt
level it looks like a regular shutdown. To really restart the
domain, libvirt needs to issue the reset command on the monitor once
the SHUTDOWN event appears. So, in order to differentiate a bare
shutdown and a reboot, libvirt uses a variable within the domain private
data. It's called fakeReboot. When the reboot API is called, the
variable is set, but when the shutdown API is called it must be
cleared out. But it was not for every possible case. So if a user
called virDomainReboot(), and there was no ACPI daemon running
inside the guest (so the guest didn't initiate the shutdown sequence),
and then virDomainShutdown(mode=agent) was called, a bad thing
happened: we remembered the fakeReboot flag and instead of shutting
the domain down, we just rebooted it.
Signed-off-by: Zhang Bo <oscar.zhangbo@huawei.com>
Signed-off-by: Wang Yufei <james.wangyufei@huawei.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
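The fix boils down to clearing the flag on every shutdown path, agent mode included; a sketch, with helper names following the qemu driver but the exact placement assumed:

if (useAgent) {
    qemuDomainSetFakeReboot(driver, vm, false);   /* previously skipped */
    ret = qemuAgentShutdown(priv->agent, QEMU_AGENT_SHUTDOWN_POWERDOWN);
}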
Rather than erroring out, make the best attempt to retrieve other data if
disks are inaccessible or missing. The failure will still be logged
though.
Since the bulk stats API is called on multiple domains, an error like
this makes the API unusable. This regression was introduced by commit
596a137134
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1209394
After setting memory parameters for a running domain, the change needs to
be saved to the live XML, otherwise it will disappear after restarting libvirtd.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1211548
Signed-off-by: Shanzhi Yu <shyu@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Apparently for Xen-devel 'index' is a global and causes a build failure,
so just use the shortened 'idx' instead to avoid the conflict.
Signed-off-by: John Ferlan <jferlan@redhat.com>
QEMU does not abandon the mirror. The job carries on in the synchronised
phase and it might be either pivoted again or cancelled. The commit
hints that the described behavior was happening in a downstream version.
If the command returns false there are two possible options:
1) qemu did not reach the point where it would ask the block job to
pivot
2) pivoting failed in the actual qemu coroutine
If either of those happens, we return failure and reset the
condition that waits for the block job to complete. This makes the API
fail, but in the case where qemu actually abandons the mirror the fact
is notified via the event and handled asynchronously.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1202704
qemuDomainBlockJobImpl became an unmaintainable mess over the years of
adding new stuff to it. This patch starts splitting individual
functions out of it until it can be killed entirely.
In bulk this will add lines of code rather than delete them, but the
trade-off is maintainability.
Previously we checked that the vcpu we are trying to set is in range of
the number of threads presented by qemu. The problem is that if the VM
is offline the count is 0. Since the condition subtracted 1 from the
count, the number would overflow and the check would never trigger.
Change the conditions to more sensible ones with specific error
messages.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1208434
This patch adds checks for empty bitmaps right after the calls to
virBitmapParse. These only include spots where set APIs are called and
where the domain's XML is parsed.
Also, it partially reverts commit 983f5a which added a check for
the invalid nodeset "0,^0" into the virBitmapParse function. That change
broke the logic, as an empty bitmap should not cause an error.
https://bugzilla.redhat.com/show_bug.cgi?id=1210545
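The pattern added at the call sites looks roughly like this (a sketch; virBitmapParse is shown with its separator argument from that era):

if (virBitmapParse(nodeset, '\0', &bitmap, VIR_DOMAIN_CPUMASK_LEN) < 0)
    goto cleanup;

if (virBitmapIsAllClear(bitmap)) {
    virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
                   _("Invalid value of 'nodeset': %s"), nodeset);
    goto cleanup;
}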
Future IOThread setting patches would copy the code anyway, so create
and generalize the adding of pindef for the vcpu and the pinning of the
thread into their own APIs.
Future IOThread setting patches would copy the code anyway, so create
and generalize the deletion of the cgroup and pindef for the vcpu into its own API.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Future IOThread setting patches would copy the code anyway, so create
and generalize adding the vcpu to a cgroup into its own API.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Support for drive-reopen was never present in the upstream code so we
don't need to pause the VM when doing the block pivot. Kill all the
code related to this semi-upstream artifact.
Currently we check qemuCaps before starting the block job. But qemuCaps
isn't available on a stopped domain, which means we get a misleading
error message in this case:
# virsh domstate example
shut off
# virsh blockjob example vda
error: unsupported configuration: block jobs not supported with this QEMU binary
Move the qemuCaps check into the block job so that we are guaranteed the
domain is running.
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
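A sketch of the reordered checks inside the job; the capability bit shown is one plausible example:

if (!virDomainObjIsActive(vm)) {
    virReportError(VIR_ERR_OPERATION_INVALID, "%s",
                   _("domain is not running"));
    goto endjob;
}

if (!virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_BLOCKJOB_ASYNC)) {
    virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
                   _("block jobs not supported with this QEMU binary"));
    goto endjob;
}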
https://bugzilla.redhat.com/show_bug.cgi?id=1206479
As described in virDomainBlockCopy() parameters description, the
VIR_DOMAIN_BLOCK_COPY_GRANULARITY parameter may require the value to
have some specific attributes (e.g. be a power of two or fall within a
certain range). And in qemu's case, a power of two is required. However, our
code does not check that and lets the qemu operation fail. Moreover, the
virsh man page is not as exact as it could be in this respect.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
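The power-of-two check itself is a one-liner (sketch):

if (granularity & (granularity - 1)) {
    virReportError(VIR_ERR_INVALID_ARG, "%s",
                   _("granularity must be power of 2"));
    goto endjob;
}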
virDomainHasDiskMirror() currently detects only jobs that add the mirror
elements. Since some operations like migration are interlocked by
existing block jobs on the given domain, the check needs to be
extended to cover regular jobs too.
This patch renames virDomainHasDiskMirror to virDomainHasDiskBlockjob
and adds an argument that allows selecting whether it returns true only for
block copy jobs, as those interlock making the domain persistent.
The other two uses trigger on any block job type.
Signed-off-by: Shanzhi Yu <shyu@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
If any disk of a VM was involved in a (copy) block job, we refused to do
a snapshot. Since not only copy jobs interlock snapshots, and the
interlocking is applicable to individual disks only, we can make the
check in a more individual fashion and interlock all block job types
supported by libvirt.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1203628
In the order of appearance:
* MAX_LISTEN - never used
added by 23ad665c (qemud) and addec57 (lock daemon)
* NEXT_FREE_CLASS_ID - never used, added by 07d1b6b
* virLockError - never used, added by eb8268a4
* OPENVZ_MAX_ARG, CMDBUF_LEN, CMDOP_LEN
unused since the removal of ADD_ARG_LIT in d8b31306
* QEMU_NB_PER_CPU_STAT_PARAM - unused since 897808e
* QEMU_CMD_PROMPT, QEMU_PASSWD_PROMPT - unused since 1dc10a7
* TEST_MODEL_WORDSIZE - unused since c25c18f7
* TEMPDIR - never used, added by 714bef5
* NSIG - workaround for old headers
added by commit 60ed1d2
unused since virExec was moved by commit 02e8691
* DO_TEST_PARSE - never used, added by 9afa006
* DIFF_MSEC, GETTIMEOFDAY - unused since eee6eb6
When the synchronous pivot option is selected, libvirt would not update
the backing chain until the job was exited. Some applications then
received invalid data as their job serialized first.
This patch removes polling to wait for the ABORT/PIVOT job completion
and replaces it with a condition. If a synchronous operation is
requested the update of the XML is executed in the job of the caller of
the synchronous request. Otherwise the monitor event callback uses a
separate worker to update the backing chain with a new job.
This is a regression since 1a92c71910
When the ABORT job is finished synchronously you get the following call
stack:
#0 qemuBlockJobEventProcess
#1 qemuDomainBlockJobImpl
#2 qemuDomainBlockJobAbort
#3 virDomainBlockJobAbort
Previously, or when using the _ASYNC flag, you'd get:
#0 qemuBlockJobEventProcess
#1 processBlockJobEvent
#2 qemuProcessEventHandler
#3 virThreadPoolWorker
Later on I'll be adding a condition that will allow synchronising a
SYNC block job abort. The approach will require this code to be called
from two different places, so it has to be extracted into a helper.
Commit 1a92c719 moved the code that handles block job events to a different
function that is executed in a separate thread. The caller of
processBlockJob handles locking and unlocking of @vm, so we should
not do it in the function itself.
The block copy API takes the speed in bytes/s rather than the MiB/s that was
the prior approach in virDomainBlockRebase. We correctly converted the
speed to bytes/s in the old API, but we still called the common helper
virDomainBlockCopyCommon with the unadjusted variable.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1207122
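A sketch of the conversion that was missing, including the overflow guard such a shift needs (variable names illustrative):

unsigned long long speed = bandwidth;   /* MiB/s from the old API */

if (speed > ULLONG_MAX >> 20) {
    virReportError(VIR_ERR_OVERFLOW,
                   _("bandwidth must be less than %llu"),
                   ULLONG_MAX >> 20);
    goto cleanup;
}
speed <<= 20;   /* now bytes/s, as the common helper expects */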
When getting info on NUMA parameters for a domain,
virCgroupGetCpusetMems() may be called. However, as of 43b67f2e
the call is guarded by a check whether the memory controller is present.
Even though it may not be instantly obvious, NUMA parameters are
stored under the cpuset controller. Therefore the check needs to look
like this:
if (!virCgroupHasController(priv->cgroup,
                            VIR_CGROUP_CONTROLLER_CPUSET) ||
    virCgroupGetCpusetMems(priv->cgroup, &nodeset) < 0) {
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Blockcopy to a non-file destination is not supported according to the code,
but a 'goto endjob' is missing after checking the destination.
This leads to calling drive-mirror with wrong parameters.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1206406
Signed-off-by: Shanzhi Yu <shyu@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
We don't have to modify cpuset.mems on hosts without NUMA. This also
fixes an error message that you get instead of success if you try to
update vcpus of a guest on a host without NUMA:
error: internal error: NUMA isn't available on this host
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
We should call virDomainLiveConfigHelperMethod ASAP because this
function translates VIR_DOMAIN_AFFECT_CURRENT into VIR_DOMAIN_AFFECT_LIVE
or VIR_DOMAIN_AFFECT_CONFIG. All other additional checks for those two
flags should consider that the user gave us VIR_DOMAIN_AFFECT_CURRENT.
Remove the unnecessary check whether the domain is live in the case of
VIR_DOMAIN_VCPU_GUEST because this check is done by
virDomainLiveConfigHelperMethod.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
While debugging the support for responding to qemu RX_FILTER_CHANGED
events, I had changed the "ignoring this event" log message from
VIR_DEBUG to VIR_WARN, but forgot to change it back before
pushing. Since many guest OSes make enough changes to multicast lists
and/or promiscuous mode settings to trigger this message, it's
starting to show up as a red herring in bug reports.
Add a few helpers that allow operating on memory device definitions
in the domain config and use them to implement memory device coldplug in
the qemu driver.
This patch adds code that parses and formats configuration for memory
devices.
A simple configuration would be:
<memory model='dimm'>
<target>
<size unit='KiB'>524287</size>
<node>0</node>
</target>
</memory>
A complete configuration of a memory device:
<memory model='dimm'>
<source>
<pagesize unit='KiB'>4096</pagesize>
<nodemask>1-3</nodemask>
</source>
<target>
<size unit='KiB'>524287</size>
<node>1</node>
</target>
</memory>
This patch preemptively forbids use of the <memory> device in individual
drivers so the users are warned right away that the device is not
supported.
Wikipedia's list of common misspellings [1] has a machine-readable
version. This patch fixes those misspellings mentioned in the list
which don't have multiple right variants (as e.g. "accension", which can
be both "accession" and "ascension"), such misspellings are left
untouched. The list of changes was manually re-checked for false
positives.
[1] https://en.wikipedia.org/wiki/Wikipedia:Lists_of_common_misspellings/For_machines
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Issue #1 - A call to virBitmapNew did not check if the allocation
failed, which could lead to a NULL dereference.
Issue #2 - When deleting the pin entries from the config file, the
code loops from the number of elements down to the "new" vcpu count;
however, the pin id values are numbered 0..n-1, not 1..n, so the "first"
pin attempt would never work. Luckily the check was for whether the
incoming 'n' (vcpu id) matched the entry in the array from 0..arraysize
rather than a dereference of the 'n'th entry.
The function needs a pointer to the network to get the list of DHCP
leases. The pointer is obtained via virNetworkLookupByName(), which
requires callers to free the returned network once it is no longer needed.
Otherwise it's leaked.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
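A sketch of the corrected lifecycle ('netname' is an illustrative variable):

virNetworkPtr network = NULL;

if (!(network = virNetworkLookupByName(conn, netname)))
    goto cleanup;

/* ... fetch and process the DHCP leases ... */

cleanup:
    if (network)
        virNetworkFree(network);   /* previously leaked */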
https://bugzilla.redhat.com/show_bug.cgi?id=1199182 documents that
after a series of disk snapshots into existing destination images,
followed by active commits of the top image, it is possible for
qemu 2.2 and earlier to end up tracking a different name for the
image than what it would have had when opening the chain afresh.
That is, when starting with the chain 'a <- b <- c', the name
associated with 'b' is how it was spelled in the metadata of 'c',
but when starting with 'a', taking two snapshots into 'a <- b <- c',
then committing 'c' back into 'b', the name associated with 'b' is
now the name used when taking the first snapshot.
Sadly, older qemu doesn't know how to treat different spellings of
the same filename as identical files (it uses strcmp() instead of
checking for the same inode), which means libvirt's attempt to
commit an image using solely the names learned from qcow2 metadata
fails with a cryptic:
error: internal error: unable to execute QEMU command 'block-commit': Top image file /tmp/images/c/../b/b not found
even though the file exists. Trying to teach libvirt the rules on
which name qemu will expect is not worth the effort (besides, we'd
have to remember it across libvirtd restarts, and track whether a
file was opened via metadata or via snapshot creation for a given
qemu process); it is easier to just always directly ask qemu what
string it expects to see in the first place.
As a safety valve, we validate that any name returned by qemu
still maps to the same local file as we have tracked it, so that
a compromised qemu cannot accidentally cause us to act on an
incorrect file.
* src/qemu/qemu_monitor.h (qemuMonitorDiskNameLookup): New
prototype.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONDiskNameLookup):
Likewise.
* src/qemu/qemu_monitor.c (qemuMonitorDiskNameLookup): New function.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONDiskNameLookup)
(qemuMonitorJSONDiskNameLookupOne): Likewise.
* src/qemu/qemu_driver.c (qemuDomainBlockCommit)
(qemuDomainBlockJobImpl): Use it.
Signed-off-by: Eric Blake <eblake@redhat.com>
Only selected fields from the disk source were copied when cold-updating
the source of a CDROM drive. When such a drive was backed by a network file,
this resulted in corruption of the definition:
<disk type='network' device='cdrom'>
<driver name='qemu' type='raw' cache='none'/>
<source protocol='gluster' name='gluster-vol1(null)'>
<host name='localhost'/>
</source>
<target dev='vdc' bus='virtio'/>
<readonly/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/>
</disk>
Update the whole source instead of cherry-picking elements.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1166024
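A sketch of the whole-source swap, with the ownership transfer shown explicitly (field names follow the device definition, exact placement assumed):

virStorageSourceFree(disk->src);
disk->src = dev->data.disk->src;    /* take the whole new source */
dev->data.disk->src = NULL;         /* ownership transferred */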
By querying the qemu guest agent with the QMP command
"guest-network-get-interfaces" and converting the received JSON
output to structured objects.
Although "ifconfig" is deprecated, IP aliases created by "ifconfig"
are supported by this API. The legacy syntax of an IP alias is:
"<ifname>:<alias-name>". Since we want all aliases to be clubbed
under parent interface, simply stripping ":<alias-name>" suffices.
Note that IP aliases formed by "ip" aren't visible to "ifconfig",
and aliases created by "ip" do not have any specific name. But
we are lucky, as qemu guest agent detects aliases created by both.
src/qemu/qemu_agent.h:
* Define qemuAgentGetInterfaces
src/qemu/qemu_agent.c:
* Implement qemuAgentGetInterfaces
src/qemu/qemu_driver.c:
* New function qemuGetDHCPInterfaces
* New function qemuDomainInterfaceAddresses
src/remote_protocol-structs:
* Define new structs
tests/qemuagenttest.c:
* Add new test: testQemuAgentGetInterfaces
Test cases for IP aliases, 0 or multiple ipv4/ipv6 address(es)
Signed-off-by: Nehal J Wani <nehaljw.kkd1@gmail.com>
Patch 51f9f03a4c introduces a regression
where, if a blockCommit operation fails, the disk is still marked as being
part of a block job but can't be unmarked later.
As pointed out by jtomko in his review of the IOThreads pinning code:
http://www.redhat.com/archives/libvir-list/2015-March/msg00495.html
there are some comments sprinkled in indicating IOThreads were using
the same structure as the VcpuPin code...
This is the first patch of a few that will change the virDomainVcpuPin*
structures and code to just virDomainPin* - starting with the data
structure naming...
During his review of the iothreads pin setting code, Pavel noted that
there was a potential memory leak with respect to how newVcpuPin
is handled: the goto endjob's in failure paths would not free
the memory. For reference, see:
http://www.redhat.com/archives/libvir-list/2015-March/msg00415.html
As there are two possible approaches to define a domain's memory size -
one used with legacy, non-NUMA VMs, configured in the <memory> element,
and a per-node based approach on NUMA machines - the user needs to make
sure that both are specified correctly in the NUMA case.
To avoid this burden on the user I'd like to replace the NUMA case with
automatic totaling of the memory size. To achieve this I need to replace
direct access to the virDomainMemtune's 'max_balloon' field with
two separate getters depending on the desired size.
The two sizes are needed as:
1) Startup memory size doesn't include memory modules in some
hypervisors.
2) After startup these count as the usable memory size.
Note that the comments for the functions are future-aware and document
state that will be present after a few later patches.
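The two getters, sketched as declarations; the names match the description's intent, but treat the exact signatures as assumptions:

/* startup size: may exclude memory modules on some hypervisors */
unsigned long long virDomainDefGetMemoryInitial(const virDomainDef *def);

/* usable size after startup: includes memory modules */
unsigned long long virDomainDefGetMemoryActual(virDomainDefPtr def);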
Surprisingly we did not grab a VM job when a block job finished, and we'd
happily rewrite the backing chain data. This made it possible to crash
libvirt when two backing chain updates were queued tightly, among other badness.
To fix it, add yet another handler to the helper thread that handles
monitor events that require a job.
Now that qemuDomainBlocksStatsGather provides the functionality of both
qemuMonitorGetBlockStatsParamsNumber and qemuMonitorGetBlockStatsInfo, we
can reuse it and kill a lot of code.
Additionally, as a bonus, qemuDomainBlockStatsFlags will now support
summary statistics, so add a statement to the virsh man page about that.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1142636
In the LXC driver, if the disk path is not provided the API returns
total statistics for all disks of the domain. With the new text monitor
implementation this can be now done in the qemu driver too.
Add code that will total the stats for all disks if the path is not
provided.
Extract the code to look up the disk alias and return the block stats
struct so that it can be reused later in qemuDomainBlockStatsFlags.
The function uses qemuMonitorGetAllBlockStatsInfo instead of
qemuMonitorGetBlockStatsInfo.
By adding a call to virBitmapToData and a check of its return value in the
IOThreads code, my Coverity checker lets me know that qemuDomainHelperGetVcpus
also needs to check the status...
Depending on the flags passed, either attempt to return the active/live
IOThread data for the domain or the config data.
The active/live path will call into the Monitor in order to get the
IOThread data and then correlate the thread_id's returned from the
monitor to the currently running system/threads in order to ascertain
the affinity for each iothread_id.
The config path will map each of the configured IOThreads and return
any configured iothreadspin data.
Signed-off-by: John Ferlan <jferlan@redhat.com>
There was a mess in the way we store unlimited values for memory
limits and how we handled values provided by the user. Internally there
were two possible ways to store an unlimited value: as a 0 value or as
VIR_DOMAIN_MEMORY_PARAM_UNLIMITED. Because we chose to store memory
limits as unsigned long long, we cannot use -1 to represent unlimited.
It's much easier for us to say that everything greater than
VIR_DOMAIN_MEMORY_PARAM_UNLIMITED means unlimited and leave 0 as a valid
value, despite that it makes no sense to set a limit to 0.
Remove the unnecessary function virCompareLimitUlong. The test update
prevents 0 from being misused as unlimited in the future.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1146539
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
When the domain's source disk type is network and the source protocol is rbd
or sheepdog, the 'if().. break' ends the current case, which leads to
skipping the check whether the driver type is raw or qcow2. Libvirt would
thus allow creating an internal snapshot for a running domain with a raw
format disk backed by rbd storage.
While both protocols support internal snapshots of the disk, qemu is not
able to use them as it requires some place to store the memory image. The
check that the disk is backed by a qcow2 image needs to be executed
always.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1179533
Signed-off-by: Shanzhi Yu <shyu@redhat.com>
Previously, when a domain would get stuck in a domain job due to a
programming mistake, we'd report the following control state:
$ virsh domcontrol domain
occupied (1424343406.150s)
The timestamp is invalid as the monitor was not entered for that domain.
We can use that to detect that the domain has an active job and report a
better error instead:
$ virsh domcontrol domain
error: internal (locking) error
We have two different places that need to be updated when touching
the code for allocating SPICE ports. Add a bool option to the
'qemuProcessSPICEAllocatePorts' function to switch between real and fake
allocation so we can also use this function in qemu_driver to generate
the native domain definition.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
The problem here was that when opening a channel, we were checking
whether the given channel is an alias (which can't be NULL for a running
domain) or its name, which can be NULL (for example with spicevmc). For
such a domain qemuDomainOpenChannel() made the daemon crash.
STREQ_NULLABLE() is safe to use since the code in question is wrapped in
"if (name)" and is more readable, so use that instead of checking for
a non-NULL "vm->def->channels[i]->target.name".
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
A NUMA-enabled guest configuration explicitly specifies memory sizes for
individual nodes. Allowing the virDomainSetMemoryFlags API (and friends)
to change the total doesn't make sense as the individual node configs
are not updated in that case.
Forbid use of the API in case NUMA is specified.
Commit f7afeddc added code to report to systemd an array of interface
indexes for all tap devices used by a guest. Unfortunately it not only
didn't add code to report the ifindexes for macvtap interfaces
(interface type='direct') or the tap devices used by type='ethernet',
it ended up sending "-1" as the ifindex for each macvtap or hostdev
interface. This resulted in a failure to start any domain that had a
macvtap or hostdev interface (or actually any type other than
"network" or "bridge").
This patch does the following with the nicindexes array:
1) Modify qemuBuildInterfaceCommandLine() to only fill in the
nicindexes array if given a non-NULL pointer to an array (and modify
the test jig calls to the function to send NULL). This is because
there are tests in the test suite that have type='ethernet' and still
have an ifname specified, but that device of course doesn't actually
exist on the test system, so attempts to call virNetDevGetIndex() will
fail.
2) Even then, only add an entry to the nicindexes array for
appropriate types, and do so for all appropriate types ("network",
"bridge", and "direct"), but only if the ifname is known (since that
is required to call virNetDevGetIndex()).
https://bugzilla.redhat.com/show_bug.cgi?id=1183869
Soo. you've successfully started yourself a domain. And since you want
to use it on your host exclusively you are confident enough to
passthrough the host CPU model, like this:
<cpu mode='host-passthrough'/>
Then, after a while, you want to save the domain into a file (e.g.
virsh save dom dom.save). And here comes the trouble. The file consists
of two parts: the libvirt header (containing the domain XML among other
things), and qemu migration data. Now, the domain XML in the header is
formatted using special flags (VIR_DOMAIN_XML_SECURE |
VIR_DOMAIN_XML_UPDATE_CPU | VIR_DOMAIN_XML_INACTIVE |
VIR_DOMAIN_XML_MIGRATABLE).
Then, on your way back from the bar, you think of changing something
in the XML in the saved file (we have a command for it after all), say
listen address for graphics console. So you successfully type in the
command:
virsh save-image-edit dom.save
Change all the bits, and exit the editor. But instead of success
you're left with a sad error message:
error: unsupported configuration: Target CPU model <null> does not
match source Pentium Pro
Sigh. Digging into the code you see the lines where we check for ABI
stability. The new XML you've produced is compared with the old one
from the saved file to see if qemu ABI will break or not. Wait, what?
We are using different flags to parse the XML you've provided so we
were just lucky it worked in some cases? Yep, that's right.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
As virDomainNumatuneSet no longer allocates the virDomainNuma object,
it's not necessary to pass a pointer to a pointer to store
the object, as it will not change any longer.
While touching the parameter definitions I've also changed the name of
the parameter to "numa".
https://bugzilla.redhat.com/show_bug.cgi?id=1126762
Commit 43b67f introduced a deadlock when we use numatune
to change NUMA settings of a vm in session mode.
Jump to endjob instead of cleanup.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
The qemuDomainHelperGetVcpus code attempted to report an error when the
vcpupids info was NULL. Unfortunately, earlier code would clamp the
value of 'maxinfo' to 0 when nvcpupids was 0, so the error reporting
would end up being skipped.
This led to 'virsh vcpuinfo <dom>' just returning an empty list
instead of giving the user a clear error.
In the event we're falling into the code that tries to create the file
in a forked environment (VIR_FILE_OPEN_FORK), we pass different mode bits,
but those are never set because virFileOpenForceOwnerMode checks
whether the OPEN_FORCE_MODE bit is set before attempting to change the mode.
Since this is a special case it seems reasonable to set u+rw,g+rw,o
Export the required helpers and add backend code to hotplug RNG devices.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Some code paths have special logic depending on the page size
reported by sysconf, which in turn affects the test results.
We must mock this so tests always have a consistent page size.
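A minimal sketch of such a mock, assuming an LD_PRELOAD-style shim rather than libvirt's own mock macros:

#define _GNU_SOURCE
#include <unistd.h>
#include <dlfcn.h>

long
sysconf(int name)
{
    static long (*real_sysconf)(int);

    if (name == _SC_PAGESIZE)
        return 4096;   /* fixed page size keeps test output stable */

    if (!real_sysconf)
        real_sysconf = (long (*)(int)) dlsym(RTLD_NEXT, "sysconf");
    return real_sysconf(name);
}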
This patch enables synchronization of the host macvtap
device options with the guest device's options in response to the
NIC_RX_FILTER_CHANGED event.
The following device options will be synchronized:
* PROMISC
* MULTICAST
* ALLMULTI
Signed-off-by: Tony Krowiak <akrowiak@linux.vnet.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1158034
If we're expecting to create a file somewhere and that fails for some
reason during qemuOpenFileAs, then we unlink the path we're attempting
to create leaving no way to determine what the "existing" privileges,
protections, or labels are that caused the failure (open, change owner
and group, change mode, etc.).
Furthermore, if we fall into the path where we'll be opening / creating
the file using VIR_FILE_OPEN_FORK, we need to first unlink/delete the file
we created in the first path; otherwise, the attempt by the child process
to open as some specific user:group may fail because the file was already
created using nfsnobody:nfsnobody. Again, if we didn't create the file we
don't want to blindly delete what already exists. Thus, a second reason for
the original check to set need_unlink to false when we find the file with
CREAT set, but already existing.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Commit id '540c339a', which fixed issues with reference counting and transient
domains, moved the qemuDomainObjEndAsyncJob call prior to the attempt to
restart the guest CPUs, resulting in an error:
error: Failed to save domain rhel70 to /tmp/pl/rhel70.save
error: internal error: unexpected async job 3
when (ret != 0) - e.g. the error path from qemuDomainSaveMemory.
This patch will adjust the logic to call the EndAsyncJob only after
we've tried to restart the guest CPUs. It also needs to adjust the
test for qemuDomainRemoveInactive to add the ret == 0 condition.
Additionally, if we get to endjob: because of some error earlier, then
we need to save that error in the event the CPU restart logic fails.
We don't want to return the error from CPU restart failure, rather we
want to return the error from the failed save that caused us to fall
into the retry to start the CPU logic.
Signed-off-by: John Ferlan <jferlan@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1183890
When we try to update the XML in a saved image file, we will clear the
graphics passwd settings, because we do not pass VIR_DOMAIN_XML_SECURE
to qemuDomainDefCopy, so qemuDomainDefFormatBuf won't format the passwd.
Add the VIR_DOMAIN_XML_SECURE flag when we call qemuDomainDefCopy
in qemuDomainSaveImageUpdateDef.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
For stateless, client side drivers, it is never correct to
probe for secondary drivers. It is only ever appropriate to
use the secondary driver that is associated with the
hypervisor in question. As a result the ESX & HyperV drivers
have both been forced to do hacks where they register no-op
drivers for the ones they don't implement.
For stateful, server side drivers, we always just want to
use the same built-in shared driver. The exception is
virtualbox which is really a stateless driver and so wants
to use its own server side secondary drivers. To deal with
this virtualbox has to be built as 3 separate loadable
modules to allow registration to work in the right order.
This can all be simplified by introducing a new struct
recording the precise set of secondary drivers each
hypervisor driver wants:
struct _virConnectDriver {
    virHypervisorDriverPtr hypervisorDriver;
    virInterfaceDriverPtr interfaceDriver;
    virNetworkDriverPtr networkDriver;
    virNodeDeviceDriverPtr nodeDeviceDriver;
    virNWFilterDriverPtr nwfilterDriver;
    virSecretDriverPtr secretDriver;
    virStorageDriverPtr storageDriver;
};
Instead of registering the hypervisor driver, we now
just register a virConnectDriver instead. This allows
us to remove all probing of secondary drivers. Once we
have chosen the primary driver, we immediately know the
correct secondary drivers to use.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
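Registration for a stateful driver then becomes a sketch like the following; names follow the qemu driver, and the exact registration signature should be treated as an assumption:

static virConnectDriver qemuConnectDriver = {
    .hypervisorDriver = &qemuHypervisorDriver,
    /* secondary drivers left NULL: the shared built-in ones are used */
};

int
qemuRegister(void)
{
    return virRegisterConnectDriver(&qemuConnectDriver, true);
}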
The code modifies the domain configuration but doesn't take a MODIFY
type job to do so.
This patch also fixes a few very long lines of code around the touched
parts.
The ACL check didn't check the VIR_DOMAIN_XML_SECURE flag and the
appropriate permission for it. Found via code inspection while fixing
permissions for save images.
Depending on the context, either error out if the domain
has disappeared in the meantime, or just ignore the value
to allow marking the function as ATTRIBUTE_RETURN_CHECK.
https://bugzilla.redhat.com/show_bug.cgi?id=1161024
In the device type-specific functions, exit early
if the domain has disappeared, because the cleanup
should have been done by qemuProcessStop.
Check the return value in processDeviceDeletedEvent
and qemuProcessUpdateDevices.
Skip audit and removing the device from live def because
it has already been cleaned up.
The virDomainDefineXMLFlags and virDomainCreateXML APIs both
gain new flags allowing them to be told to validate XML.
This updates all the drivers to turn on validation in the
XML parser when the flags are set.
Make a local copy of the disk alias instead of pointing
to the domain definition, which might get freed if
the domain dies while we're in monitor.
Also exit early if that happens.
Exit the monitor right after we've done with it to get
the virDomainObjPtr lock back, otherwise we might be accessing
vm->def while it's being cleaned up by qemuProcessStop.
If the domain crashed while we were in the monitor, exit
early instead of changing vm->def which is now the persistent
definition.
Commit e3435caf fixed hot-plugging of vcpus with strict memory pinning
on NUMA hosts, but unfortunately it also broke updating the number of vcpus
for offline guests using our API.
The issue is that we try to create a cpu cgroup for a non-running guest,
which fails as there are no cgroups for that domain. We should create
cgroups and update cpuset.mems only if we are hot-plugging.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
When creating an inactive external snapshot, virDomainSaveConfig needs to
be called after updating the disk definitions; otherwise the new snapshot
file definitions in the XML will be lost after restarting libvirtd.
Reproduce steps:
1. prepare a shut off guest
$ virsh domstate rhel7 && virsh domblklist rhel7
shut off
Target Source
------------------------------------------------
vda /var/lib/libvirt/images/rhel7.img
2. create external disk snapshot
$ virsh snapshot-create rhel7 --disk-only && virsh domblklist rhel7
Domain snapshot 1417882967 created
Target Source
------------------------------------------------
vda /var/lib/libvirt/images/rhel7.1417882967
3. restart libvirtd then check guest source file
$ service libvirtd restart && virsh domblklist rhel7
Redirecting to /bin/systemctl restart libvirtd.service
Target Source
------------------------------------------------
vda /var/lib/libvirt/images/rhel7.img
This was first reported by Eric Blake
http://www.redhat.com/archives/libvir-list/2014-December/msg00369.html
Signed-off-by: Shanzhi Yu <shyu@redhat.com>
The virDomainDefParse* and virDomainDefFormat* methods both
accept the VIR_DOMAIN_XML_* flags defined in the public API,
along with a set of other VIR_DOMAIN_XML_INTERNAL_* flags
defined in domain_conf.c.
This is seriously confusing & error prone for a number of
reasons:
- VIR_DOMAIN_XML_SECURE, VIR_DOMAIN_XML_MIGRATABLE and
VIR_DOMAIN_XML_UPDATE_CPU are only relevant for the
formatting operation
- Some of the VIR_DOMAIN_XML_INTERNAL_* flags only apply
to parse or to format, but not both.
This patch cleanly separates out the flags. There are two
distinct sets of VIR_DOMAIN_DEF_PARSE_* and VIR_DOMAIN_DEF_FORMAT_*
flags that are used by the corresponding methods. The
VIR_DOMAIN_XML_* flags received via public API calls must
be converted to the VIR_DOMAIN_DEF_FORMAT_* flags where
needed.
The various calls to virDomainDefParse which hardcoded the
use of the VIR_DOMAIN_XML_INACTIVE flag change to use the
VIR_DOMAIN_DEF_PARSE_INACTIVE flag.
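The conversion from public to internal flags can be sketched as follows (the helper name is illustrative):

static unsigned int
virDomainDefFormatConvertXMLFlags(unsigned int flags)
{
    unsigned int formatFlags = 0;

    if (flags & VIR_DOMAIN_XML_SECURE)
        formatFlags |= VIR_DOMAIN_DEF_FORMAT_SECURE;
    if (flags & VIR_DOMAIN_XML_UPDATE_CPU)
        formatFlags |= VIR_DOMAIN_DEF_FORMAT_UPDATE_CPU;
    if (flags & VIR_DOMAIN_XML_INACTIVE)
        formatFlags |= VIR_DOMAIN_DEF_FORMAT_INACTIVE;
    if (flags & VIR_DOMAIN_XML_MIGRATABLE)
        formatFlags |= VIR_DOMAIN_DEF_FORMAT_MIGRATABLE;

    return formatFlags;
}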
https://bugzilla.redhat.com/show_bug.cgi?id=1135339 documents some
confusing behavior when a user tries to start an inactive block
commit in a second connection while there is already an on-going
active commit from a first connection. Eventually, qemu will
support multiple simultaneous block jobs, but as of now, it does
not; furthermore, libvirt also needs an overhaul before we can
support simultaneous jobs. So, the best way to avoid confusing
ourselves is to quit relying on qemu to tell us about the situation
(where we risk getting in weird states) and instead forbid a
duplicate block commit ourselves.
Note that we are still relying on qemu to diagnose attempts to
interrupt an inactive commit (since we only track XML of an active
commit), but as inactive commit is less confusing for libvirt to
manage, there is less that can go wrong by leaving that detection
up to qemu.
* src/qemu/qemu_driver.c (qemuDomainBlockCommit): Hoist check for
active commit to occur earlier outside of conditions.
Signed-off-by: Eric Blake <eblake@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1177723
When setting new bandwidth limits via
virDomainSetInterfaceParameters, the old ones are cleared first.
However, if setting the new ones fails, the old are already gone
and interface is left in inconsistent state. Therefore, right
before failing we ought to try to restore the old bandwidth.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Add the possibility to have more than one IP address configured for a
domain network interface. IP addresses can also have a prefix to define
the corresponding netmask.
There is one problem that causes various errors in the daemon. When a
domain is waiting for a job, it is unlocked while waiting on the
condition. However, if that domain is for example transient and being
removed in another API (e.g. cancelling incoming migration), it gets
unref'd. If the first call, that was waiting, fails to get the job, it
unrefs the domain object, and because it was the last reference, it
causes clearing of the whole domain object. However, when finishing the
call, the domain must be unlocked, but there is no way for the API to
know whether it was cleaned up or not (unless there is some ugly temporary
variable, but let's scratch that).
The root cause is that our APIs don't ref the objects they are using and
all use the implicit reference that the object has when it is in the
domain list. That reference can be removed when the API is waiting for
a job. And because each API doesn't do its own ref'ing, it results in
the ugly checking of the return value of virObjectUnref() that we have
everywhere.
This patch changes qemuDomObjFromDomain() to ref the domain (using
virDomainObjListFindByUUIDRef()) and adds qemuDomObjEndAPI() which
should be the only function in which the return value of
virObjectUnref() is checked. This makes all reference counting
deterministic and makes the code a bit clearer.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Coverity flagged commit 0282ca45 as introducing a memory leak;
in all my refactoring to make capacity probing conditional on
whether the image is non-raw, I missed deleting the unconditional
probe.
* src/qemu/qemu_driver.c (qemuStorageLimitsRefresh): Drop
redundant assignment.
Signed-off-by: Eric Blake <eblake@redhat.com>
Wire up backing chain recursion. For the first time, it is now
possible to get libvirt to expose that qemu tracks read statistics
on backing files, as well as report maximum extent written on a
backing file during a block-commit operation.
For a running domain, where one of the two images has a backing
file, I see the traditional output:
$ virsh domstats --block testvm2
Domain: 'testvm2'
block.count=2
block.0.name=vda
block.0.path=/tmp/wrapper.qcow2
block.0.rd.reqs=1
block.0.rd.bytes=512
block.0.rd.times=28858
block.0.wr.reqs=0
block.0.wr.bytes=0
block.0.wr.times=0
block.0.fl.reqs=0
block.0.fl.times=0
block.0.allocation=0
block.0.capacity=1310720000
block.0.physical=200704
block.1.name=vdb
block.1.path=/dev/sda7
block.1.rd.reqs=0
block.1.rd.bytes=0
block.1.rd.times=0
block.1.wr.reqs=0
block.1.wr.bytes=0
block.1.wr.times=0
block.1.fl.reqs=0
block.1.fl.times=0
block.1.allocation=0
block.1.capacity=1310720000
vs. the new output:
$ virsh domstats --block --backing testvm2
Domain: 'testvm2'
block.count=3
block.0.name=vda
block.0.path=/tmp/wrapper.qcow2
block.0.rd.reqs=1
block.0.rd.bytes=512
block.0.rd.times=28858
block.0.wr.reqs=0
block.0.wr.bytes=0
block.0.wr.times=0
block.0.fl.reqs=0
block.0.fl.times=0
block.0.allocation=0
block.0.capacity=1310720000
block.0.physical=200704
block.1.name=vda
block.1.path=/dev/sda6
block.1.backingIndex=1
block.1.rd.reqs=0
block.1.rd.bytes=0
block.1.rd.times=0
block.1.wr.reqs=0
block.1.wr.bytes=0
block.1.wr.times=0
block.1.fl.reqs=0
block.1.fl.times=0
block.1.allocation=327680
block.1.capacity=786432000
block.2.name=vdb
block.2.path=/dev/sda7
block.2.rd.reqs=0
block.2.rd.bytes=0
block.2.rd.times=0
block.2.wr.reqs=0
block.2.wr.bytes=0
block.2.wr.times=0
block.2.fl.reqs=0
block.2.fl.times=0
block.2.allocation=0
block.2.capacity=1310720000
I may later do a patch that trims the output to avoid 0 stats,
particularly for backing files (which are more likely to have
0 stats, at least for write statistics when no block-commit
is performed). Also, I still plan to expose physical size
information (qemu doesn't expose it yet, so it requires a stat,
and for block devices, a further open/seek operation). But
this patch is good enough without worrying about that yet.
* src/qemu/qemu_driver.c (QEMU_DOMAIN_STATS_BACKING): New internal
enum bit.
(qemuConnectGetAllDomainStats): Recognize new user flag, and pass
details to...
(qemuDomainGetStatsBlock): ...here, where we can do longer recursion.
(qemuDomainGetStatsOneBlock): Output new field.
Signed-off-by: Eric Blake <eblake@redhat.com>
In order to report stats on backing chains, we need to separate
the output of stats for one block from how we traverse blocks.
* src/qemu/qemu_driver.c (qemuDomainGetStatsBlock): Split...
(qemuDomainGetStatsOneBlock): ...into new helper.
Signed-off-by: Eric Blake <eblake@redhat.com>
A coming patch will make it optionally possible to list backing
chain block stats; in this mode of operation, block.count is no
longer the number of <disks> in the domain, but the number of
blocks in the array being reported. We still want block.count
listed first, but rather than iterate the tree twice (once to
count, and once to list stats), it's easier to just touch things
up after the fact.
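A hedged sketch of the shape of that touch-up (variable names here are
illustrative, not the exact driver code):

/* Reserve the block.count slot first, fill it in after traversal. */
int count_idx = record->nparams;
if (virTypedParamsAddUInt(&record->params, &record->nparams, maxparams,
                          "block.count", 0) < 0)
    goto cleanup;
/* ...emit stats for each visited block, recursing if requested... */
record->params[count_idx].value.ui = visited;  /* touch up the count */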
* src/qemu/qemu_driver.c (qemuDomainGetStatsBlock): Compute count
after the fact.
Signed-off-by: Eric Blake <eblake@redhat.com>
The prior refactoring can now be put to use. With the same domain
as the earlier commit 7b49926 (one qcow2 disk and an empty
cdrom drive):
$ virsh domstats --block foo
Domain: 'foo'
block.count=2
block.0.name=hda
block.0.path=/var/lib/libvirt/images/foo.qcow2
block.0.allocation=1309614080
block.0.capacity=42949672960
block.0.physical=1309671424
block.1.name=hdc
* src/qemu/qemu_driver.c (qemuDomainGetStatsBlock): Use
qemuStorageLimitsRefresh to report offline statistics.
Signed-off-by: Eric Blake <eblake@redhat.com>
Create a helper function that can be reused for gathering block
info from virDomainListGetStats.
* src/qemu/qemu_driver.c (qemuDomainGetBlockInfo): Split guts...
(qemuStorageLimitsRefresh): ...into new helper function.
Signed-off-by: Eric Blake <eblake@redhat.com>
The documentation for virDomainBlockInfo was confusing: it stated
that 'physical' was the size of the container, then gave an example
of it being the amount of storage used by a sparse file (that is,
for a sparse raw image on a regular file, the wording implied
capacity==physical, while allocation was smaller; but the example
instead claimed physical==allocation). Since we use 'physical' for
the last offset of a block device, we should do likewise for
regular files.
Furthermore, the example claimed that for a qcow2 regular file,
allocation==physical. At the time the code was first written,
this was true (qcow2 files were allocated sequentially, and were
never sparse, so the last sector written happened to also match
the disk space occupied); but modern qemu does much better and
can punch holes for a qcow2 with allocation < physical.
Basically, after this patch, the three fields are now reliably
mapped as:
'capacity' - how much storage the guest can see (equal to
physical for raw images, determined by image metadata otherwise)
'allocation' - how much storage the image occupies (similar to
what 'du' would report)
'physical' - the last offset of the image (similar to what 'ls'
would report)
'capacity' can be larger than 'physical' (such as for a qcow2
image that does not vary much from a backing file) or smaller
(such as for a qcow2 file with lots of internal snapshots).
Likewise, 'allocation' can be (slightly) larger than 'physical'
(such as counting the tail of cluster allocations required to
round a file size up to filesystem granularity) or smaller
(for a sparse file). A block-resize operation changes capacity
(which, for raw images, also changes physical); many non-raw
images automatically grow physical and allocation as necessary
when starting with an allocation smaller than capacity; and even
when capacity and physical stay unchanged, allocation can change
when converting sectors from holes to data or back.
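For a raw image stored in a regular file, the distinction is easy to
see with a self-contained example ('capacity' would additionally need
the image metadata, e.g. the qcow2 header, so it is omitted here):

#include <stdio.h>
#include <sys/stat.h>

int main(int argc, char **argv)
{
    struct stat st;
    if (argc < 2 || stat(argv[1], &st) < 0)
        return 1;
    /* 'physical': last offset of the file, what 'ls' reports */
    unsigned long long physical = st.st_size;
    /* 'allocation': blocks actually occupied, what 'du' reports */
    unsigned long long allocation = st.st_blocks * 512ULL;
    printf("physical=%llu allocation=%llu sparse=%s\n",
           physical, allocation, allocation < physical ? "yes" : "no");
    return 0;
}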
Note that this does not change semantics for qcow2 images stored
on block devices; there, we still rely on qemu to report the
highest written extent for allocation. So using this API to
track when to extend a block device because a qcow2 image is
about to exceed a threshold will not see any changes.
Also, note that virStorageVolInfo is unfortunately limited to
just 'capacity' and 'allocation' (we can't expand it to add
'physical', although we can expand the XML to add it there);
historically, that struct's 'allocation' value has reported
file size for qcow2 files (what this patch terms 'physical'
for a domain block device), but disk usage for raw files (what
this patch terms 'allocation'). So follow-up patches will be
needed to make storage volumes report the same allocation
values and get at physical values, where those differ.
* include/libvirt/libvirt-domain.h (_virDomainBlockInfo): Tweak
documentation to match saner definition.
* src/qemu/qemu_driver.c (qemuDomainGetBlockInfo): For regular
files, physical size is capacity, not allocation.
Signed-off-by: Eric Blake <eblake@redhat.com>
Ultimately, we want to avoid read()ing a file while qemu is running.
We still have to open() block devices to determine their physical
size, but that is safer. This patch rearranges code to group
together all code that reads the image, to make it easier for later
patches to skip the metadata collection when possible.
* src/qemu/qemu_driver.c (qemuDomainGetBlockInfo): Check for empty
disk up front. Place metadata reading next to use.
Signed-off-by: Eric Blake <eblake@redhat.com>
A future patch will allow recursion into backing chains when
collecting block stats. This patch should not change behavior,
but merely moves out the common code that will be reused once
recursion is enabled, and adds the parameter that will turn on
recursion.
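Sketched, the signature change looks like this (prototypes
abbreviated; the authoritative versions are in the headers listed
below):

/* Callers pass false for now, so behavior is unchanged until the
 * recursion is actually wired up. */
int qemuMonitorGetAllBlockStatsInfo(qemuMonitorPtr mon,
                                    virHashTablePtr *ret_stats,
                                    bool backingChain);
int qemuMonitorBlockStatsUpdateCapacity(qemuMonitorPtr mon,
                                        virHashTablePtr stats,
                                        bool backingChain);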
* src/qemu/qemu_monitor.h (qemuMonitorGetAllBlockStatsInfo)
(qemuMonitorBlockStatsUpdateCapacity): Add recursion parameter,
although it is ignored for now.
* src/qemu/qemu_monitor.c (qemuMonitorGetAllBlockStatsInfo)
(qemuMonitorBlockStatsUpdateCapacity): Likewise.
* src/qemu/qemu_monitor_json.h
(qemuMonitorJSONGetAllBlockStatsInfo)
(qemuMonitorJSONBlockStatsUpdateCapacity): Likewise.
* src/qemu/qemu_monitor_json.c
(qemuMonitorJSONGetAllBlockStatsInfo)
(qemuMonitorJSONBlockStatsUpdateCapacity): Add parameter, and
split...
(qemuMonitorJSONGetOneBlockStatsInfo)
(qemuMonitorJSONBlockStatsUpdateCapacityOne): ...into helpers.
(qemuMonitorJSONGetBlockStatsInfo): Update caller.
* src/qemu/qemu_driver.c (qemuDomainGetStatsBlock): Update caller.
* src/qemu/qemu_migration.c (qemuMigrationCookieAddNBD): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
Right now, grabbing blockinfo always calls stat on the disk, then
opens the image to determine the capacity, using a throw-away
virStorageSourcePtr. This has a couple of drawbacks:
1. We are calling stat and opening a file on every invocation of
the API. However, there are cases where the stats should NOT be
changing between successive calls (if a domain is running, no
one should be changing the physical size of a block device or raw
image behind our backs; capacity of read-only files should not
be changing; and we are the gateway to the block-resize command
to know when the capacity of read-write files should be changing).
True, we still have to use stat in some cases (a sparse raw file
changes allocation if it is read-write and the amount of holes is
changing, and a read-write qcow2 image stored in a file changes
physical size if it was not fully pre-allocated). But for
read-only images, even this should be something we can remember
from the previous time, rather than repeating every call.
2. We want to enhance the power of virDomainListGetStats, by
sharing code. But we already have a virStorageSourcePtr for
each disk, and it would be easier to reuse the common structure
than to have to worry about the one-off virDomainBlockInfoPtr.
While this patch does not optimize reuse of information in point
1, it does get us closer to being able to do so, by updating a
structure that survives between consecutive calls.
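A minimal model of the layout change (the real struct carries many
more members; the name below is illustrative):

/* Field order mirrors the public virDomainBlockInfo struct, so the
 * copy into block info is mechanical. */
struct virStorageSourceSizesModel {
    unsigned long long capacity;    /* logical size in bytes */
    unsigned long long allocation;  /* host storage occupied, in bytes */
    unsigned long long physical;    /* new: last offset of the image */
};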
* src/util/virstoragefile.h (_virStorageSource): Add physical, to
mirror virDomainBlockInfo; rearrange fields to match public struct.
(virStorageSourceCopy): Copy the new field.
* src/qemu/qemu_driver.c (qemuDomainGetBlockInfo): Store into
storage source, then copy to block info.
Signed-off-by: Eric Blake <eblake@redhat.com>
In order for a future patch to virDomainListGetStats to reuse
some code for determining disk usage of offline domains, we
need to make it easier to pull out part of the guts of grabbing
blockinfo. The current implementation grabs a job fairly late
in the game, while getstats will already own a job; reordering
things so that the job is always grabbed up front in both
functions will make it easier to pull out the common code.
This patch results in grabbing a job in cases where one was not
previously needed, but as it is a query job, it should not be
noticeably slower.
This patch touches the same code as the fix for CVE-2014-6458
(commit b799259); in that patch, we avoided hotplug changing
a disk reference during the time of obtaining a monitor lock
by copying all data we needed and no longer referencing disk;
this patch goes the other way and ensures that by holding the
job, the disk cannot be changed so we no longer need to worry
about the disk being invalidated across the monitor lock.
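Sketched, the new ordering (the labels and the disk-lookup helper are
illustrative):

/* Grab the job first; the disk def then cannot change under us. */
if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_QUERY) < 0)
    goto cleanup;
if (!(disk = virDomainDiskByName(vm->def, path, false)))
    goto endjob;   /* lookup now happens safely under the job */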
* src/qemu/qemu_driver.c (qemuDomainGetBlockInfo): Rearrange job
control to be outside of disk information.
Signed-off-by: Eric Blake <eblake@redhat.com>
Commit e3435caf added cleanup code to qemuDomainSetVcpusFlags() that was
not supposed to reset the error. The usual procedure was followed,
saving the error to a temporary variable, but the error was never
freed; it was leaked instead.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
When hot-plugging a VCPU into the guest, kvm needs to allocate some data
from the DMA zone, which might be in a memory node that's not allowed in
cpuset.mems. This is basically the same problem that existed when
starting the domain, the one commit 7e72ac7878
exists to fix. This patch just extends that fix to hotplugging as well.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1161540
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Instead of setting the value of cpuset.mems once when the domain starts
and then re-calculating the value every time we need to change the child
cgroup values, leave the cgroup alone and rather set the child data
every time there is new cgroup created. We don't leave any task in the
parent group anyway. This will ease both current and future code.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1174154
When we use attach-device to add a hostdev or chr device that has an
iSCSI address or similar (e.g. a guest agent channel, a subsys iSCSI
disk...), we find there is no base controller for the newly attached
device. Sometimes this leaves the guest unable to start after such
devices are added (although it will start on the second attempt).
Signed-off-by: Luyao Huang <lhuang@redhat.com>
We can change the VNC password by using the virDomainUpdateDeviceFlags
API with the live flag, but it can't be changed with the config flag;
the error below is reported instead.
error: Operation not supported: persistent update of device 'graphics' is not supported
This patch supports the graphics arguments changed with config flag.
Signed-off-by: Wang Rui <moon.wangrui@huawei.com>
A logic bug in qemuConnectGetAllDomainStats makes the code mark the
monitor as available when qemuDomainObjBeginJob fails, instead of when
it succeeds, as the correct flow requires.
This patch fixes the check and updates the code documentation
accordingly.
Broken by commit 57023c0a3a.
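The inverted check, sketched (using the HAVE_JOB flag as a stand-in
for the actual bookkeeping):

/* Before (broken): flag set when the job could NOT be started */
if (qemuDomainObjBeginJob(driver, dom, QEMU_JOB_QUERY) < 0)
    domflags |= QEMU_DOMAIN_STATS_HAVE_JOB;
/* After (fixed): flag set only once the job is actually held */
if (qemuDomainObjBeginJob(driver, dom, QEMU_JOB_QUERY) == 0)
    domflags |= QEMU_DOMAIN_STATS_HAVE_JOB;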
Signed-off-by: Francesco Romani <fromani@redhat.com>
When the user doesn't have read access to one of the requested
domains, the for loop could exit abruptly, or continue and overwrite a
pointer that still pointed to a locked object.
This patch fixes two issues at once. The first is that domflags might
have had QEMU_DOMAIN_STATS_HAVE_JOB set even when no job was started
(fixed by doing domflags |= QEMU_DOMAIN_STATS_HAVE_JOB only when the
job was acquired, and by clearing domflags at the start of every loop
iteration). The second is that the domain was kept locked when
virConnectGetAllDomainStatsCheckACL() failed and the loop continued
instead of ending. Adding a simple virObjectUnlock() and clearing the
pointer ought to do.
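Sketched (the lookup helper is a stand-in), the corrected loop body:

for (i = 0; i < ndoms; i++) {
    domflags = 0;                        /* issue 1: reset every pass */
    if (!(dom = lookupDomain(doms[i])))  /* returns a locked object */
        continue;
    if (!virConnectGetAllDomainStatsCheckACL(conn, dom->def)) {
        virObjectUnlock(dom);            /* issue 2: release the lock */
        dom = NULL;                      /* and drop the stale pointer */
        continue;
    }
    if (qemuDomainObjBeginJob(driver, dom, QEMU_JOB_QUERY) == 0)
        domflags |= QEMU_DOMAIN_STATS_HAVE_JOB;
    /* collect stats, then end the job and unlock as appropriate */
}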
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Avoid leaving the domain locked on a failed ACL check in
qemuDomainMigratePerform() and qemuDomainMigrateFinish2().
Introduced in commit abf75aea24 (Add ACL checks into the QEMU driver).
I'm about to make block stats optionally more complex to cover
backing chains, where block.count will no longer equal the number
of <disks> for a domain. For these reasons, it is nicer if the
statistics output includes the source path (for local files).
This patch doesn't add anything for network disks, although we
may decide to add that later.
With this patch, I now see the following for the same domain as
in the previous patch (one qcow2 file, and an empty cdrom drive):
$ virsh domstats --block foo
Domain: 'foo'
block.count=2
block.0.name=hda
block.0.path=/var/lib/libvirt/images/foo.qcow2
block.1.name=hdc
* src/libvirt-domain.c (virConnectGetAllDomainStats): Document
new field.
* tools/virsh.pod (domstats): Document new field.
* src/qemu/qemu_driver.c (qemuDomainGetStatsBlock): Return the new
stat for local files/block devices.
(QEMU_ADD_NAME_PARAM): Add parameter.
(qemuDomainGetStatsInterface): Update caller.
Signed-off-by: Eric Blake <eblake@redhat.com>
I noticed that for an offline domain, 'virsh domstats --block $dom'
was producing just the domain name, with no stats. But the older
'virsh domblkinfo' works just fine on offline domains. This patch
starts to get us closer, by at least reporting the disk names for
an offline domain.
With this patch, I now see the following for an offline domain
with one qcow2 disk and an empty cdrom drive:
$ virsh domstats --block foo
Domain: 'foo'
block.count=2
block.0.name=hda
block.1.name=hdc
* src/qemu/qemu_driver.c (qemuDomainGetStatsBlock): Don't short-circuit
output of block name.
Signed-off-by: Eric Blake <eblake@redhat.com>
qemuDomainGetStatsBlock() could leak a stats hash table if it
encountered OOM while populating the virTypedParameters.
Oddly, the fix doesn't even touch qemuDomainGetStatsBlock :)
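The fix, sketched in the style of the driver's macros (not the
verbatim code): jump to the caller's cleanup label instead of
returning, so the caller still frees its hash table:

#define QEMU_ADD_COUNT_PARAM(record, maxparams, type, count)           \
do {                                                                   \
    char param_name[VIR_TYPED_PARAM_FIELD_LENGTH];                     \
    snprintf(param_name, sizeof(param_name), "%s.count", type);        \
    if (virTypedParamsAddUInt(&(record)->params, &(record)->nparams,   \
                              (maxparams), param_name, (count)) < 0)   \
        goto cleanup;  /* was a bare 'return -1', leaking the table */ \
} while (0)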
* src/qemu/qemu_driver.c (QEMU_ADD_COUNT_PARAM)
(QEMU_ADD_NAME_PARAM): Don't return early.
(qemuDomainGetStatsInterface): Adjust caller.
Signed-off-by: Eric Blake <eblake@redhat.com>
When attempting to create an internal system checkpoint with a
passthrough device, qemu will report the following error:
error: operation failed: Error -22 while writing VM
This patch calls the function that checks whether migration is
possible with the given VM, and thus improves the error to:
error: Requested operation is not valid: domain has assigned non-USB host devices
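Sketched, the added guard (the helper name matches the qemu driver,
but the exact argument list here is illustrative):

/* Refuse the internal checkpoint whenever migration would be refused,
 * e.g. with assigned non-USB host devices; the helper reports the
 * friendlier error itself. */
if (!qemuMigrationIsAllowed(driver, vm, vm->def, false, false))
    goto cleanup;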
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=874418#c19
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
If someone removes the blockcopy storage file while the job is still
in the mirroring phase and then requests a blockjob abort using pivot,
the virsh command freezes. This is not an issue with older qemu
versions, which did not support asynchronous jobs (the kind we prefer
by default).
As we have reached the mirroring phase successfully, polling the
monitor for blockjob info always returns 1 and the loop never ends.
This fix introduces a check of the qemuDomainBlockPivot return code,
skipping the asynchronous waiting completely when an error occurred
and asynchronous waiting was the preferred method.
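A sketch of the control-flow change (the pivot call's argument list is
illustrative):

ret = qemuDomainBlockPivot(driver, vm, device, disk);
if (ret < 0 && async)
    goto cleanup;  /* error already reported: never enter the wait loop */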
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1139567
Reconnecting to the VM is a possibly long-running job spawned in a
separate thread. We should reload the snapshot defs and managedsave
state prior to spawning that thread, to avoid blocking daemon startup,
which would otherwise serialize on the VM lock.
Also, the reloading code would violate the domain job held while
reconnecting, as the loader functions don't create jobs.
Since virDomainSnapshotFree will call virObjectUnref anyway, let's just use
that directly so as to avoid the possibility that we inadvertently clear out
a pending error message when using the public API.
Since virDomainFree will call virObjectUnref anyway, let's just use that
directly so as to avoid the possibility that we inadvertently clear out
a pending error message when using the public API.
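In both cases the change is mechanical; sketched:

/* Public API entry points reset the last error on entry, so inside
 * the daemon prefer the bare unref: */
virObjectUnref(snap);   /* instead of virDomainSnapshotFree(snap) */
virObjectUnref(dom);    /* instead of virDomainFree(dom) */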
There is a race condition between the fopen and fscanf calls
in qemuGetProcessInfo. If fopen succeeds, there is a small
possibility that the file no longer exists before reading from it.
Now, if either the fopen or fscanf call fails, the function behaves
just as if only fopen had failed.
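A self-contained sketch of the hardened pattern (the real function
parses many more fields from /proc/<pid>/stat; the format string is
simplified here):

#include <stdio.h>

static void
readProcValue(int pid, unsigned long long *val)
{
    char path[64];
    FILE *f;

    snprintf(path, sizeof(path), "/proc/%d/stat", pid);
    /* The file can vanish between open and read: treat a short
     * fscanf() exactly like a failed fopen(). */
    if (!(f = fopen(path, "r")) || fscanf(f, "%llu", val) != 1)
        *val = 0;
    if (f)
        fclose(f);
}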
Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1169055
Signed-off-by: Eric Blake <eblake@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1160084
As of b6d4dad11b (1.2.5) we have been trying to track the FSFreeze
status of the guest. Even though I've tried to fix a couple of corner
cases (6ea54769ba), it occurred to me just recently that the approach
is broken by design. Firstly, there are many other ways to talk to
qemu-ga (even through libvirt, e.g. qemu-agent-command) through which
filesystems can be thawed without libvirt noticing. Moreover, there
are plenty of ways to thaw filesystems without even qemu-ga noticing
(yes, qemu-ga keeps internal track of the FSFreeze status). So,
instead of keeping track ourselves, or asking qemu-ga for stale state,
it's best to let qemu-ga deal with it (and possibly let the guest
kernel propagate an error).
Moreover, there was one bug in the previous approach: if the fsfreeze
command failed, we executed fsthaw subsequently. So issuing
domfsfreeze in virsh gave the following result:
virsh # domfsfreeze gentoo
Froze 1 filesystem(s)
virsh # domfsfreeze gentoo
error: Unable to freeze filesystems
error: internal error: unable to execute QEMU agent command 'guest-fsfreeze-freeze': The command guest-fsfreeze-freeze has been disabled for this instance
virsh # domfsfreeze gentoo
Froze 1 filesystem(s)
virsh # domfsfreeze gentoo
error: Unable to freeze filesystems
error: internal error: unable to execute QEMU agent command 'guest-fsfreeze-freeze': The command guest-fsfreeze-freeze has been disabled for this instance
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Get the list of mounted filesystems, which contains hardware info
about the disks and their controllers, from QEMU guest agent 2.2+.
Then convert the hardware info to the corresponding device aliases for
the disks.
Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
Newer qemu added an event that is emitted when a virtio serial channel
is opened in the guest OS. This allows us to update the state of the
port in the output-only XML element.
This patch implements the monitor callbacks and necessary handlers to
update the state in the definition.