https://bugzilla.redhat.com/show_bug.cgi?id=1198645
Once upon a time, there was a little domain. And the domain was pinned
onto a NUMA node and hasn't fully allocated its memory:
<memory unit='KiB'>2355200</memory>
<currentMemory unit='KiB'>1048576</currentMemory>
<numatune>
<memory mode='strict' nodeset='0'/>
</numatune>
Oh little me, said the domain, what will I do with so little memory.
If I only had a few megabytes more. But the old admin noticed the
whimpering, barely audible to untrained human ear. And good admin he
was, he gave the domain yet more memory. But the old NUMA topology
witch forbade to allocate more memory on the node zero. So he
decided to allocate it on a different node:
virsh # numatune little_domain --nodeset 0-1
virsh # setmem little_domain 2355200
The little domain was happy. For a while. Until bad, sharp teeth
shaped creature came. Every process in the system was afraid of him.
The OOM Killer they called him. Oh no, he's after the little domain.
There's no escape.
Do you kids know why? Because when the little domain was born, her
father, Libvirt, called numa_set_membind(). So even if the admin
allowed her to allocate memory from other nodes in the cgroups, the
membind() forbid it.
So what's the lesson? Libvirt should rely on cgroups, whenever
possible and use numa_set_membind() as the last ditch effort.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This new internal API checks if given CGroup controller is
available. It is going to be needed later when we need to make a
decision whether pin domain memory onto NUMA nodes using cpuset
CGroup controller or using numa_set_membind().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Currently we check qemuCaps before starting the block job. But qemuCaps
isn't available on a stopped domain, which means we get a misleading
error message in this case:
# virsh domstate example
shut off
# virsh blockjob example vda
error: unsupported configuration: block jobs not supported with this QEMU binary
Move the qemuCaps check into the block job so that we are guaranteed the
domain is running.
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
qemuMigrationCookieAddNBD is usually called from within an async
MIGRATION_OUT or MIGRATION_IN job, so it needs to start a nested job.
(The one exception is during the Begin phase when change protection
isn't enabled, but qemuDomainObjEnterMonitorAsync will behave the same
as qemuDomainObjEnterMonitor in this case.)
This bug was encountered with a libvirt client that repeatedly queries
the disk mirroring block job info during a migration. If one of these
queries occurs just as the Perform migration cookie is baked, libvirt
crashes.
Relevant logs are as follows:
6701: warning : qemuDomainObjEnterMonitorInternal:1544 : This thread seems to be the async job owner; entering monitor without asking for a nested job is dangerous
[1] 6701: info : qemuMonitorSend:972 : QEMU_MONITOR_SEND_MSG: mon=0x7fefdc004700 msg={"execute":"query-block","id":"libvirt-629"}
[2] 6699: info : qemuMonitorIOWrite:503 : QEMU_MONITOR_IO_WRITE: mon=0x7fefdc004700 buf={"execute":"query-block","id":"libvirt-629"}
[3] 6704: info : qemuMonitorSend:972 : QEMU_MONITOR_SEND_MSG: mon=0x7fefdc004700 msg={"execute":"query-block-jobs","id":"libvirt-630"}
[4] 6699: info : qemuMonitorJSONIOProcessLine:203 : QEMU_MONITOR_RECV_REPLY: mon=0x7fefdc004700 reply={"return": [...], "id": "libvirt-629"}
6699: error : qemuMonitorJSONIOProcessLine:211 : internal error: Unexpected JSON reply '{"return": [...], "id": "libvirt-629"}'
At [1] qemuMonitorBlockStatsUpdateCapacity sends its request, then waits
on mon->notify. At [2] the request is written out to the monitor socket.
At [3] qemuMonitorBlockJobInfo sends its request, and also waits on
mon->notify. The reply from the first request is received at [4].
However, qemuMonitorJSONIOProcessLine is not expecting this reply since
the second request hadn't completed sending. The reply is dropped and an
error is returned.
qemuMonitorIO signals mon->notify twice during its error handling,
waking up both of the threads waiting on it. One of them clears mon->msg
as it exits qemuMonitorSend; the other crashes:
qemuMonitorSend (mon=0x7fefdc004700, msg=<value optimized out>) at qemu/qemu_monitor.c:975
975 while (!mon->msg->finished) {
(gdb) print mon->msg
$1 = (qemuMonitorMessagePtr) 0x0
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
In order to change an existing domain we delete all existing devices and add
new from scratch. In case of network devices we should also delete corresponding
virtual networks (if any) before removing actual devices from xml. In the patch,
we do it by extending prlsdkDoApplyConfig with a new parameter, which stands for
old xml, and calling prlsdkDelNet every time old xml is specified.
Signed-off-by: Maxim Nestratov <mnestratov@parallels.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The close callbacks hash are keyed by a UUID-string, but
virCloseCallbacksRun was attempting to remove them by raw UUID. This
patch ensures the callback entries are removed by UUID-string as well.
This bug caused problems when guest migrations were abnormally aborted:
# timeout --signal KILL 1 \
virsh migrate example qemu+tls://remote/system \
--verbose --compressed --live --auto-converge \
--abort-on-error --unsafe --persistent \
--undefinesource --copy-storage-all --xml example.xml
Killed
# virsh migrate example qemu+tls://remote/system \
--verbose --compressed --live --auto-converge \
--abort-on-error --unsafe --persistent \
--undefinesource --copy-storage-all --xml example.xml
error: Requested operation is not valid: domain 'example' is not being migrated
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
If a VM migration is aborted, a disk mirror may be failed by QEMU before
libvirt has a chance to cancel it. The disk->mirrorState remains at
_ABORT in this case, and this breaks subsequent mirrorings of that disk.
We should instead check the mirrorState directly and transition to _NONE
if it is already aborted. Do the check *after* aborting the block job in
QEMU to avoid a race.
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
If virCloseCallbacksSet fails, qemuMigrationBegin must return NULL to
indicate an error occurred.
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
The destination libvirt daemon in a migration may segfault if the client
disconnects immediately after the migration has begun:
# virsh -c qemu+tls://remote/system list --all
Id Name State
----------------------------------------------------
...
# timeout --signal KILL 1 \
virsh migrate example qemu+tls://remote/system \
--verbose --compressed --live --auto-converge \
--abort-on-error --unsafe --persistent \
--undefinesource --copy-storage-all --xml example.xml
Killed
# virsh -c qemu+tls://remote/system list --all
error: failed to connect to the hypervisor
error: unable to connect to server at 'remote:16514': Connection refused
The crash is in:
1531 void
1532 qemuDomainObjEndJob(virQEMUDriverPtr driver, virDomainObjPtr obj)
1533 {
1534 qemuDomainObjPrivatePtr priv = obj->privateData;
1535 qemuDomainJob job = priv->job.active;
1536
1537 priv->jobs_queued--;
Backtrace:
#0 at qemuDomainObjEndJob at qemu/qemu_domain.c:1537
#1 in qemuDomainRemoveInactive at qemu/qemu_domain.c:2497
#2 in qemuProcessAutoDestroy at qemu/qemu_process.c:5646
#3 in virCloseCallbacksRun at util/virclosecallbacks.c:350
#4 in qemuConnectClose at qemu/qemu_driver.c:1154
...
qemuDomainRemoveInactive calls virDomainObjListRemove, which in this
case is holding the last remaining reference to the domain.
qemuDomainRemoveInactive then calls qemuDomainObjEndJob, but the domain
object has been freed and poisoned by then.
This patch bumps the domain's refcount until qemuDomainRemoveInactive
has completed. We also ensure qemuProcessAutoDestroy does not return the
domain to virCloseCallbacksRun to be unlocked in this case. There is
similar logic in bhyveProcessAutoDestroy and lxcProcessAutoDestroy
(which call virDomainObjListRemove directly).
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
==19015== 968 (416 direct, 552 indirect) bytes in 1 blocks are definitely lost in loss record 999 of 1,049
==19015== at 0x4C2C070: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==19015== by 0x52ADF14: virAllocVar (viralloc.c:560)
==19015== by 0x5302FD1: virObjectNew (virobject.c:193)
==19015== by 0x1DD9401E: virQEMUDriverConfigNew (qemu_conf.c:164)
==19015== by 0x1DDDF65D: qemuStateInitialize (qemu_driver.c:666)
==19015== by 0x53E0823: virStateInitialize (libvirt.c:777)
==19015== by 0x11E067: daemonRunStateInit (libvirtd.c:905)
==19015== by 0x53201AD: virThreadHelper (virthread.c:206)
==19015== by 0xA1EE1F2: start_thread (in /lib64/libpthread-2.19.so)
==19015== by 0xA4EFC8C: clone (in /lib64/libc-2.19.so)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
==19015== 8 bytes in 1 blocks are definitely lost in loss record 34 of 1,049
==19015== at 0x4C29F80: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==19015== by 0x4C2C32F: realloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==19015== by 0x52AD888: virReallocN (viralloc.c:245)
==19015== by 0x52AD97E: virExpandN (viralloc.c:294)
==19015== by 0x52ADC51: virInsertElementsN (viralloc.c:436)
==19015== by 0x5335864: virDomainVirtioSerialAddrSetAddController (domain_addr.c:816)
==19015== by 0x53358E0: virDomainVirtioSerialAddrSetAddControllers (domain_addr.c:839)
==19015== by 0x1DD5513B: qemuDomainAssignVirtioSerialAddresses (qemu_command.c:1422)
==19015== by 0x1DD55A6E: qemuDomainAssignAddresses (qemu_command.c:1711)
==19015== by 0x1DDA5818: qemuProcessStart (qemu_process.c:4616)
==19015== by 0x1DDF1807: qemuDomainObjStart (qemu_driver.c:7265)
==19015== by 0x1DDF1A66: qemuDomainCreateWithFlags (qemu_driver.c:7320)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
==19015== 1,064 (656 direct, 408 indirect) bytes in 2 blocks are definitely lost in loss record 1,002 of 1,049
==19015== at 0x4C2C070: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==19015== by 0x52AD74B: virAlloc (viralloc.c:144)
==19015== by 0x52B47CA: virCgroupNew (vircgroup.c:1057)
==19015== by 0x52B53E5: virCgroupNewVcpu (vircgroup.c:1451)
==19015== by 0x1DD85A40: qemuSetupCgroupForVcpu (qemu_cgroup.c:1013)
==19015== by 0x1DDA66EA: qemuProcessStart (qemu_process.c:4844)
==19015== by 0x1DDF1807: qemuDomainObjStart (qemu_driver.c:7265)
==19015== by 0x1DDF1A66: qemuDomainCreateWithFlags (qemu_driver.c:7320)
==19015== by 0x1DDF1ACD: qemuDomainCreate (qemu_driver.c:7337)
==19015== by 0x53F87EA: virDomainCreate (libvirt-domain.c:6820)
==19015== by 0x12690A: remoteDispatchDomainCreate (remote_dispatch.h:3481)
==19015== by 0x126827: remoteDispatchDomainCreateHelper (remote_dispatch.h:3457)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The 'checkPool' callback was originally part of the storageDriverAutostart function,
but the pools need to be checked earlier during initialization phase,
otherwise we can't start a domain which mounts a volume after the
libvirtd daemon restarted. This is because qemuProcessReconnect is called
earlier than storageDriverAutostart. Therefore the 'checkPool' logic has been
moved to storagePoolUpdateAllState which is called inside storageDriverInitialize.
We also need a valid 'conn' reference to be able to execute 'refreshPool'
during initialization phase. Though it isn't available until storageDriverAutostart
all of our storage backends do ignore 'conn' pointer, except for RBD,
but RBD doesn't support 'checkPool' callback, so it's safe to pass
conn = NULL in this case.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1177733
This patch introduces new virStorageDriverState element stateDir.
Also adds necessary changes to storageStateInitialize, so that
directories initialization becomes more generic.
The nodedev-detach can report the name of the domain using the device
just the way nodedev-reattach does it.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
The error output of snapshot-revert should be more friendly.
There is no need to show virDomainRevertToSnapshot to user.
virReportError already includes __FUNCTION__ information in a
separate member of the struct, so repeating it in the message is
redundant and leads to situations where higher level code ends up
reporting the lower level name. We correctly converted the error
output making it more succinct and user-friendly.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1086726
Before this patch, when connected via vCenter, the free memory returned
was from the resorcePool (usually a cluster). This is in conflict with
e.g esxNodeGetInfo which always pulls info from the ESX host.
Since libvirt ESX driver works primarily with ESX hosts, this patch
changes esxNodeGetFreeMemory to pull that information from ESX host so
it's consistent with behavior of esxNodeGetInfo.
Introduce virStoragePoolSaveState to properly format the state XML in
the same manner as virStoragePoolDefFormat, except for adding a
<poolstate> ... </poolstate> around the definition. This is similar to
virNetworkObjFormat used to save the live/active network information.
When modifying config/status XML, it might be handy to include some
additional XML elements (e.g. <poolstate>). In order to do so,
introduce new formatting function virStoragePoolDefFormatBuf and make
virStoragePoolDefFormat call it.
Recent testing on large memory systems revealed a bug in the Xen xl
tool's freemem() function. When autoballooning is enabled, freemem()
is used to ensure enough memory is available to start a domain,
ballooning dom0 if necessary. When ballooning large amounts of memory
from dom0, freemem() would exceed its self-imposed wait time and
return an error. Meanwhile, dom0 continued to balloon. Starting the
domain later, after sufficient memory was ballooned from dom0, would
succeed. The libvirt implementation in libxlDomainFreeMem() suffers
the same bug since it is modeled after freemem().
In the end, the best place to fix the bug on the Xen side was to
slightly change the behavior of libxl_wait_for_memory_target().
Instead of failing after caller-provided wait_sec, the function now
blocks as long as dom0 memory ballooning is progressing. It will return
failure only when more memory is needed to reach the target and wait_sec
have expired with no progress being made. See xen.git commit fd3aa246.
There was a dicussion on how this would affect other libxl apps like
libvirt
http://lists.xen.org/archives/html/xen-devel/2015-03/msg00739.html
If libvirt containing this patch was build against a Xen containing
the old libxl_wait_for_memory_target() behavior, libxlDomainFreeMem()
will fail after 30 sec and domain creation will be terminated.
Without this patch and with old libxl_wait_for_memory_target() behavior,
libxlDomainFreeMem() does not succeed after 30 sec, but returns success
anyway. Domain creation continues resulting in all sorts of fun stuff
like cpu soft lockups in the guest OS. It was decided to properly fix
libxl_wait_for_memory_target(), and if anything improve the default
behavior of apps using the freemem reference impl in xl.
xl was patched to accommodate the change in libxl_wait_for_memory_target()
with xen.git commit 883b30a0. This patch does the same in the libxl
driver. While at it, I changed the logic to essentially match
freemem() in $xensrc/tools/libxl/xl_cmdimpl.c. It was a bit cleaner
IMO and will make it easier to spot future, potentially interesting
divergences.
Dependant is flagged as wrong in US dictionary (only valid in UK
dictionary, and even then, it has only the financial sense and not the
inter-relatedness sense that we are more prone to be wanting throughout
code).
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
'virPCIDeviceList' is actually an array. Removing one element makes the
rest of the element move.
Use while loop, increase index only when not virPCIDeviceListDel(pcidevs, dev)
Signed-off-by: Huanle Han <hanxueluo@gmail.com>
Instead of always using controller 0 and incrementing port number,
respect the maximum port numbers of controllers and use all of them.
Ports for virtio consoles are quietly reserved, but not formatted
(neither in XML nor on QEMU command line).
Also rejects duplicate virtio-serial addresses.
https://bugzilla.redhat.com/show_bug.cgi?id=890606https://bugzilla.redhat.com/show_bug.cgi?id=1076708
Test changes:
* virtio-auto.args
Filling out the port when just the controller is specified.
switched from using
maxport + 1
to:
first free port on the controller
* virtio-autoassign.args
Filling out the address when no <address> is specified.
Started using all the controllers instead of 0, also discards
the bus value.
* xml -> xml output of virtio-auto
The port assignment is no longer done as a part of XML parsing,
so the unspecified values stay 0.
Create a sorted array of virtio-serial controllers.
Each of the elements contains the controller index
and a bitmap of available ports.
Buses are not tracked, because they aren't supported by QEMU.
If the call to virStorageBackendISCSIGetHostNumber failed, we set
retval = -1, but yet still called virStorageBackendSCSIFindLUs.
Need to add a goto cleanup - while at it, adjust the logic to
initialize retval to -1 and only changed to 0 (zero) on success.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Don't supercede the error message virStorageBackendSCSIFindLUs as the
message such as "error: Failed to find LUs on host 60: ..." is not overly
clear as to what the real problem might be.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Make XML definition saving more generic by moving the common code into
virStoragePoolSaveXML and leave case specific code to
PoolSave{Status,Config,...} functions.
In order to be able to use 'checkPool' inside functions which do not
have any connection reference, 'conn' attribute needs to be discarded
from the checkPool's signature, since it's not used by any storage backend
anyway.
https://bugzilla.redhat.com/show_bug.cgi?id=1206479
As described in virDomainBlockCopy() parameters description, the
VIR_DOMAIN_BLOCK_COPY_GRANULARITY parameter may require the value to
have some specific attributes (e.g. be a power of two or fall within a
certain range). And in qemu, a power of two is required. However, our
code does not check that and let qemu operation fail. Moreover, the
virsh man page is not as exact as it could be in this respect.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When we shutdown/reboot a guest using agent-mode, if the guest itself blocks infinitely,
libvirt would block in qemuAgentShutdown() forever.
Thus, we set a timeout for shutdown/reboot, from our experience, 60 seconds would be fine.
Signed-off-by: Zhang Bo <oscar.zhangbo@huawei.com>
Signed-off-by: Wang Yufei <james.wangyufei@huawei.com>
virDomainHasDiskMirror() currently detects only jobs that add the mirror
elements. Since some operations like migration are interlocked by
existing block jobs on the given domain the check needs to be
instrumented to check regular jobs too.
This patch renames virDomainHasDiskMirror to virDomainHasDiskBlockjob
and adds an argument that allows to select that it returns true only for
block copy jobs as those interlock making the domain persistent.
Other two uses trigger on any block job type.
Signed-off-by: Shanzhi Yu <shyu@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
If any disk of a VM was involved in a (copy) block job we refused to do
a snapshot. As not only copy jobs interlock snapshots and the
interlocking is applicable to individual disks only we can make the
check in a more individual fashion and interlock all block job types
supported by libvirt.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1203628
In the order of appearance:
* MAX_LISTEN - never used
added by 23ad665c (qemud) and addec57 (lock daemon)
* NEXT_FREE_CLASS_ID - never used, added by 07d1b6b
* virLockError - never used, added by eb8268a4
* OPENVZ_MAX_ARG, CMDBUF_LEN, CMDOP_LEN
unused since the removal of ADD_ARG_LIT in d8b31306
* QEMU_NB_PER_CPU_STAT_PARAM - unused since 897808e
* QEMU_CMD_PROMPT, QEMU_PASSWD_PROMPT - unused since 1dc10a7
* TEST_MODEL_WORDSIZE - unused since c25c18f7
* TEMPDIR - never used, added by 714bef5
* NSIG - workaround around old headers
added by commit 60ed1d2
unused since virExec was moved by commit 02e8691
* DO_TEST_PARSE - never used, added by 9afa006
* DIFF_MSEC, GETTIMEOFDAY - unused since eee6eb6
Two places would call to qemuPrepareCpumap() with priv->autoNodeset to
convert it to a cpuset. Remove the function and use the prepared cpuset
automatically.
When the default cpuset or automatic numa placement is used libvirt
would place the whole parent cgroup in the specified cpuset. This then
disallowed to re-pin the vcpus to a different cpu.
This patch pins only the vcpu threads to the default cpuset and thus
allows to re-pin them later.
The following config would fail to start:
<domain type='kvm'>
...
<vcpu placement='static' cpuset='0-1' current='2'>4</vcpu>
<cputune>
<vcpupin vcpu='0' cpuset='2-3'/>
...
This is a regression since a39f69d2b.
When the synchronous pivot option is selected, libvirt would not update
the backing chain until the job was exitted. Some applications then
received invalid data as their job serialized first.
This patch removes polling to wait for the ABORT/PIVOT job completion
and replaces it with a condition. If a synchronous operation is
requested the update of the XML is executed in the job of the caller of
the synchronous request. Otherwise the monitor event callback uses a
separate worker to update the backing chain with a new job.
This is a regression since 1a92c71910
When the ABORT job is finished synchronously you get the following call
stack:
#0 qemuBlockJobEventProcess
#1 qemuDomainBlockJobImpl
#2 qemuDomainBlockJobAbort
#3 virDomainBlockJobAbort
While previously or while using the _ASYNC flag you'd get:
#0 qemuBlockJobEventProcess
#1 processBlockJobEvent
#2 qemuProcessEventHandler
#3 virThreadPoolWorker
Later on I'll be adding a condition that will allow to synchronise a
SYNC block job abort. The approach will require this code to be called
from two different places so it has to be extracted into a helper.
Commit 1a92c719 moved code to handle block job events to a different
function that is executed in a separate thread. The caller of
processBlockJob handles locking and unlocking of @vm, so the we should
not do it in the function itself.
The block copy API takes the speed in bytes/s rather than MiB/s that was
the prior approach in virDomainBlockRebase. We correctly converted the
speed to bytes/s in the old API but we still called the common helper
virDomainBlockCopyCommon with the unadjusted variable.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1207122
When getting info on NUMA parameters for domain,
virCgroupGetCpusetMems() may be called. However, as of 43b67f2e
the call is guarded by check if memory controller is present.
Even though it may be not obvious instantly, NUMA parameters are
stored under cpuset controller. Therefore the check needs to look
like this:
if (!virCgroupHasController(priv->cgroup,
VIR_CGROUP_CONTROLLER_CPUSET) ||
virCgroupGetCpusetMems(priv->cgroup, &nodeset) < 0) {
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Throughout our code, the virCgroupController enum is used in two ways.
First as an index to an array of cgroup controllers:
struct virCgroup {
char *path;
struct virCgroupController controllers[VIR_CGROUP_CONTROLLER_LAST];
};
Second way is that when calling virCgroupNew() a bitmask of the enum
items can be passed to selectively detect only some controllers. For
instance:
int
virCgroupNewVcpu(virCgroupPtr domain,
int vcpuid,
bool create,
virCgroupPtr *group)
{
...
controllers = ((1 << VIR_CGROUP_CONTROLLER_CPU) |
(1 << VIR_CGROUP_CONTROLLER_CPUACCT) |
(1 << VIR_CGROUP_CONTROLLER_CPUSET));
if (virCgroupNew(-1, name, domain, controllers, group) < 0)
goto cleanup;
}
Even though it's highly unlikely that so many new controllers will be
invented so that we would overflow when constructing the bitmask, it
doesn't hurt to check at compile time either.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When creating new internal representation of cgroups, all passed
arguments are logged. Well, except for two: pid and pointer for
return value. Lets log them too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The function has no argument named @name rather than @path
instead. The comment is, however, referring to @name while it
should have been referring to @path really.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Commit id '2dbfa716' exposed virCgroupDetectMountsFromFile, but did not
add the corresponding entry in the "#else /* !VIR_CGROUP_SUPPORTED */"
section of the module.
Commit id 'ba1dfc5' added virCgroupSetCpusetMemoryMigrate and
virCgroupGetCpusetMemoryMigrate, but did not add the corresponding
entry points into the "#else /* !VIR_CGROUP_SUPPORTED */" section
Commint 0473b45cc introduced new function virNetlinkDelLink, but in
it's counterpart for non-linux platform there should be ATTRIBUTE_UNUSED
instead of ATTRIBUTE_UNSUPPORTED.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Blockcopy to non-file destination is not supported according the code,
but a 'goto endjob' is missed after checking the destination.
This leads to calling drive-mirror with wrong parameters.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1206406
Signed-off-by: Shanzhi Yu <shyu@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Current libvirt can only handle up to 1023 bytes when it
reads Linux sysfs topology/thread_siblings. This isn't enough for
Linux distributions that support a large value. This patch fixes
the problem by using VIR_ALLOC()/VIR_FREE(), instead of using a
fixed-size (1024) local char array. In the meanwhile
SYSFS_THREAD_SIBLINGS_LIST_LENGTH_MAX is increased to 8192 which
should be large enough for a foreseeable future.
Signed-off-by: Wei Huang <wei@redhat.com>
Just as it is possible to delete a bridge device with the netlink
RTM_DELLINK message, one can be created with the RTM_NEWLINK
message. Because of differences in the format of the message, it's not
as straightforward as with virNetlinkDelLink() to create a single
utility function that can be used to create any type of interface, so
the new netlink version of virNetDevBridgeCreate() does its own
construction of the netlink message and calls virNetlinkCommand()
itself.
This doesn't provide any extra functionality, just provides symmetry
with the previous commit.
NB: We *could* alter the API of virNetDevBridgeCreate() to take a MAC
address, and directly program that mac address into the bridge (by
adding an IFLA_ADDRESS attribute, as is done in
virNetDevMacVLanCreate()) rather than separately creating the "dummy
tap" (e.g. virbr0-nic) to maintain a fixed mac address on the bridge,
but the commit history of virnetdevbridge.c shows that the presence of
this dummy tap is essential in some older versions of the kernel
(between 2.6.39 and 3.1 or 3.2, possibly?) to proper operation of IPv6
DAD, and I don't want to take the chance of breaking something that I
don't have the time/setup to test (my RHEL6 box is at kernel
2.6.32-544, and the next lowest kernel I have is 3.17)
https://bugzilla.redhat.com/show_bug.cgi?id=1125755
reported that a stray bridge device was left on the system when a
libvirt network failed to start due to an illegal iptables rule caused
by bad config. Apparently the reason this was happening was that
NetworkManager was noticing immediately when the bridge device was
created and automatically setting it IFF_UP. libvirt would then try to
setup the iptables rules, get an error back, and since libvirt had
never IFF_UPed the bridge, it didn't expect that it needed to set it
~IFF_UP before deleting it during the cleanup process. But the
ioctl(SIOCBRDELBR) ioctl will fail to delete a bridge if it is IFF_UP.
Since that bug was reported, NetworkManager has gotten a bit more
polite in this respect, but just in case something similar happens in
the future, this patch switches to using the netlink RTM_DELLINK
message to delete the bridge - unlike SIOCBRDELBR, it will delete the
requested bridge no matter what the setting of IFF_UP.
These two functions are identical, so no sense in having the
duplication. I resisted the temptation to replace calls to
virNetDevMacVLanDelete() with calls to virNetlinkDelLink() just in
case some mythical future platform has macvtap devices that aren't
managed with netlink (or in case we some day need to do more than just
tell the kernel to delete the device).
libvirt has always used the netlink RTM_DELLINK message to delete
macvtap/macvlan devices, but it can actually be used to delete other
types of network devices, such as bonds and bridges. This patch makes
virNetDevMacVLanDelete() available as a generic function so it can
intelligibly be called to delete these other types of interfaces.
Starting a qemu VM with a memory module that has the base address
specified results in the following error:
error: internal error: early end of file from monitor: possible problem:
2015-03-26T03:45:52.338891Z qemu-kvm: -device pc-dimm,node=0,memdev=memdimm0,
id=dimm0,slot=0,base=4294967296: Property '.base' not found
The correct property name for the base address is 'addr'.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Because of the microcode update to Haswell/Broadwell CPUs, existing
domains using these CPUs may fail to start even though they used to run
just fine. To help users solve this issue we try to suggest switching to
-noTSX variant of the CPU model:
virsh # start cd
error: Failed to start domain cd
error: unsupported configuration: guest and host CPU are not
compatible: Host CPU does not provide required features: rtm, hle;
try using 'Haswell-noTSX' CPU model
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
QEMU 2.3 adds these new models to cover Haswell and Broadwell CPUs with
updated microcode. Luckily, they also reverted former the machine type
specific changes to existing models. And since these changes were never
released, we don't need to hack around them in libvirt.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Use the virStorageSourceIsEmpty helper to determine whether the drive
source is empty rather than checking for src->path. This will fix start
of VM with empty network cdrom that would not report any error.
The function that formats the string for network drives would return
error code but did not set the error message when called on storage
source with VIR_STORAGE_NET_PROTOCOL_LAST or _NONE.
Report an error in this case if it would ever be called in that way.
In some circumstances where the build tree differs from the source,
libvirt's compile will try to create the symlink for cpu_map.xml before
creating the directory $(abs_builddir)/cpu:
'src/cpu/cpu_map.xml': No such file or directory'
Do not create the symlink, it is no longer needed after
commit e562e82f
Load CPU map from builddir when run uninstalled
Signed-off-by: Amy Fong <amy.fong@windriver.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Recently we've fixed a bug where the status XML could not be parsed as
the parser used absolute path XPath queries. This test enhancement tests
all XML files used in the qemu-xml-2-xml test as a part of a status XML
snippet to see whether they are parsed correctly. The status XML-2-XML is
currently tested in 223 cases with this patch.
The current auto-indentation buffer code applies indentation only on
complete strings. To allow adding a string containing newlines and
having it properly indented this patch adds virBufferAddStr.
While this thread is cleaning up the client and connection objects:
#2 virFileReadAll (path=0x7f28780012b0 "/proc/1319/stat", maxlen=maxlen@entry=1024, buf=buf@entry=0x7f289c60fc40) at util/virfile.c:1287
#3 0x00007f28adbb1539 in virProcessGetStartTime (pid=<optimized out>, timestamp=timestamp@entry=0x7f289c60fc98) at util/virprocess.c:838
#4 0x00007f28adb91981 in virIdentityGetSystem () at util/viridentity.c:151
#5 0x00007f28ae73f17c in remoteClientFreeFunc (data=<optimized out>) at remote.c:1131
#6 0x00007f28adcb7f33 in virNetServerClientDispose (obj=0x7f28aecad180) at rpc/virnetserverclient.c:858
#7 0x00007f28adba8eeb in virObjectUnref (anyobj=<optimized out>) at util/virobject.c:265
#8 0x00007f28ae74ad05 in virNetServerHandleJob (jobOpaque=<optimized out>, opaque=0x7f28aec93ff0) at rpc/virnetserver.c:205
#9 0x00007f28adbbef4e in virThreadPoolWorker (opaque=opaque@entry=0x7f28aec88030) at util/virthreadpool.c:145
In stack frame #6 the client->identity object got unref'd, but the code
that removes the event callbacks in frame #5 did not run yet as we are
trying to obtain the system identity (frames #4, #3, #2).
In other thead:
#0 virObjectUnref (anyobj=anyobj@entry=0x7f288c162c60) at util/virobject.c:264
klass = 0xdeadbeef
obj = 0x7f288c162c60
#1 0x00007f28ae71c709 in remoteRelayDomainEventCheckACL (client=<optimized out>, conn=<optimized out>, dom=dom@entry=0x7f28aecaafc0) at remote.c:164
#2 0x00007f28ae71fc83 in remoteRelayDomainEventTrayChange (conn=<optimized out>, dom=0x7f28aecaafc0, ... ) at remote.c:717
#3 0x00007f28adc04e53 in virDomainEventDispatchDefaultFunc (conn=0x7f287c0009a0, event=0x7f28aecab1a0, ...) at conf/domain_event.c:1455
#4 0x00007f28adc03831 in virObjectEventStateDispatchCallbacks (callbacks=<optimized out>, ....) at conf/object_event.c:724
#5 virObjectEventStateQueueDispatch (callbacks=0x7f288c083730, queue=0x7fff51f90030, state=0x7f288c18da20) at conf/object_event.c:738
#6 virObjectEventStateFlush (state=0x7f288c18da20) at conf/object_event.c:816
#7 virObjectEventTimer (timer=<optimized out>, opaque=0x7f288c18da20) at conf/object_event.c:562
#8 0x00007f28adb859cd in virEventPollDispatchTimeouts () at util/vireventpoll.c:459
Frame #0 is unrefing an invalid identity object while frame #2 hints
that the client is still dispatching the event.
For untrimmed backtrace see the bugzilla attachment.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1203030
Don't unref the old identity unless we set the new one correctly and
unref the new one on failure to set it so that we don't leak any
references or use invalid pointers.
While adding tests for status XML parsing and formatting I've noticed
that the device alias list is leaked.
==763001== 81 (48 direct, 33 indirect) bytes in 1 blocks are definitely lost in loss record 414 of 514
==763001== at 0x4C2B8F0: calloc (vg_replace_malloc.c:623)
==763001== by 0x6ACF70F: virAllocN (viralloc.c:191)
==763001== by 0x447B64: qemuDomainObjPrivateXMLParse (qemu_domain.c:727)
==763001== by 0x6B848F9: virDomainObjParseXML (domain_conf.c:15491)
==763001== by 0x6B84CAC: virDomainObjParseNode (domain_conf.c:15608)
When starting a VM with hotpluggable memory devices the user may specify
an invalid source NUMA node. Libvirt would pass through the error from
qemu:
# virsh start test3
error: Failed to start domain test3
error: internal error: process exited while connecting to monitor:
2015-03-25T01:12:17.205913Z qemu-kvm: -object memory-backend-ram,id=memdimm0
,size=536870912,host-nodes=1-3,policy=bind: cannot bind memory to host NUMA nodes:
Invalid argument
This patch adds a check that allows to report better error:
# virsh start test3
error: Failed to start domain test3
error: configuration unsupported: NUMA node 1 is unavailable
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Commit 95695388 introduced new util/virthreadjob.c/h files but the
makefile has type that breaks rpm build.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
This is very helpful when we want to log and report why we could not
acquire a state change lock. Reporting what job keeps it locked helps
with understanding the issue. Moreover, after calling
virDomainGetControlInfo, it's possible to tell whether libvirt is just
stuck somewhere within the API (or it just forgot to cleanup the job) or
whether libvirt is waiting for QEMU to reply.
The error message will look like the following:
# virsh resume cd
error: Failed to resume domain cd
error: Timed out during operation: cannot acquire state change lock
(held by remoteDispatchDomainSuspend)
https://bugzilla.redhat.com/show_bug.cgi?id=853839
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Every thread created as a worker thread within a pool gets a name
according to virThreadPoolJobFunc name.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Automatically assign a job to every thread created by virThreadCreate.
The name of the virThreadFunc function passed to virThreadCreate is used
as the job or worker name in case no name is explicitly passed.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
We want all threads to be set as workers or to have a job assigned to
them, which can easily be achieved in virThreadCreate wrapper to
pthread_create. Let's make sure we always use the wrapper.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Each thread can use a thread local variable to keep the name of a job
which is currently running in the job.
The virThreadJobSetWorker API is supposed to be called once by any
thread which is used as a worker, i.e., it is waiting in a pool, woken
up to do a job, and returned back to the pool.
The virThreadJobSet/virThreadJobClear APIs are to be called at the
beginning/end of each job.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Although needed in the Xen 4.1 libxl days, there is no longer any
benefit to having per-domain libxl_ctx. On the contrary, their use
makes the code unecessarily complicated and prone to deadlocks under
load. As suggested by the libxl maintainers, use a single libxl_ctx
as a handle to libxl instead of per-domain ctx's.
One downside to using a single libxl_ctx is there are no longer
per-domain log files for log messages emitted by libxl. Messages
for all domains will be sent to /var/log/libvirt/libxl/libxl-driver.log.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
libxlDomainFreeMem() is only used in libxl_domain.c and thus should
be declared static. While at it, change the signature to take a
libxl_ctx instead of libxlDomainObjPrivatePtr, since only the
libxl_ctx is needed.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
This function now only enables domain death events. Simply call
libxl_evenable_domain_death() instead of an unnecessary wrapper.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Informing libxl how to handle its child proceses should be done once
during driver initialization, not once for each domain-specific
libxl_ctx object. The related libxl documentation in
$xen-src/tools/libxl/libxl_event.h even mentions that "it is best to
call this at initialisation".
Signed-off-by: Jim Fehlig <jfehlig@suse.com>