The close callbacks hash are keyed by a UUID-string, but
virCloseCallbacksRun was attempting to remove them by raw UUID. This
patch ensures the callback entries are removed by UUID-string as well.
This bug caused problems when guest migrations were abnormally aborted:
# timeout --signal KILL 1 \
virsh migrate example qemu+tls://remote/system \
--verbose --compressed --live --auto-converge \
--abort-on-error --unsafe --persistent \
--undefinesource --copy-storage-all --xml example.xml
Killed
# virsh migrate example qemu+tls://remote/system \
--verbose --compressed --live --auto-converge \
--abort-on-error --unsafe --persistent \
--undefinesource --copy-storage-all --xml example.xml
error: Requested operation is not valid: domain 'example' is not being migrated
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
If a VM migration is aborted, a disk mirror may be failed by QEMU before
libvirt has a chance to cancel it. The disk->mirrorState remains at
_ABORT in this case, and this breaks subsequent mirrorings of that disk.
We should instead check the mirrorState directly and transition to _NONE
if it is already aborted. Do the check *after* aborting the block job in
QEMU to avoid a race.
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
If virCloseCallbacksSet fails, qemuMigrationBegin must return NULL to
indicate an error occurred.
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
The destination libvirt daemon in a migration may segfault if the client
disconnects immediately after the migration has begun:
# virsh -c qemu+tls://remote/system list --all
Id Name State
----------------------------------------------------
...
# timeout --signal KILL 1 \
virsh migrate example qemu+tls://remote/system \
--verbose --compressed --live --auto-converge \
--abort-on-error --unsafe --persistent \
--undefinesource --copy-storage-all --xml example.xml
Killed
# virsh -c qemu+tls://remote/system list --all
error: failed to connect to the hypervisor
error: unable to connect to server at 'remote:16514': Connection refused
The crash is in:
1531 void
1532 qemuDomainObjEndJob(virQEMUDriverPtr driver, virDomainObjPtr obj)
1533 {
1534 qemuDomainObjPrivatePtr priv = obj->privateData;
1535 qemuDomainJob job = priv->job.active;
1536
1537 priv->jobs_queued--;
Backtrace:
#0 at qemuDomainObjEndJob at qemu/qemu_domain.c:1537
#1 in qemuDomainRemoveInactive at qemu/qemu_domain.c:2497
#2 in qemuProcessAutoDestroy at qemu/qemu_process.c:5646
#3 in virCloseCallbacksRun at util/virclosecallbacks.c:350
#4 in qemuConnectClose at qemu/qemu_driver.c:1154
...
qemuDomainRemoveInactive calls virDomainObjListRemove, which in this
case is holding the last remaining reference to the domain.
qemuDomainRemoveInactive then calls qemuDomainObjEndJob, but the domain
object has been freed and poisoned by then.
This patch bumps the domain's refcount until qemuDomainRemoveInactive
has completed. We also ensure qemuProcessAutoDestroy does not return the
domain to virCloseCallbacksRun to be unlocked in this case. There is
similar logic in bhyveProcessAutoDestroy and lxcProcessAutoDestroy
(which call virDomainObjListRemove directly).
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
==19015== 968 (416 direct, 552 indirect) bytes in 1 blocks are definitely lost in loss record 999 of 1,049
==19015== at 0x4C2C070: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==19015== by 0x52ADF14: virAllocVar (viralloc.c:560)
==19015== by 0x5302FD1: virObjectNew (virobject.c:193)
==19015== by 0x1DD9401E: virQEMUDriverConfigNew (qemu_conf.c:164)
==19015== by 0x1DDDF65D: qemuStateInitialize (qemu_driver.c:666)
==19015== by 0x53E0823: virStateInitialize (libvirt.c:777)
==19015== by 0x11E067: daemonRunStateInit (libvirtd.c:905)
==19015== by 0x53201AD: virThreadHelper (virthread.c:206)
==19015== by 0xA1EE1F2: start_thread (in /lib64/libpthread-2.19.so)
==19015== by 0xA4EFC8C: clone (in /lib64/libc-2.19.so)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
==19015== 8 bytes in 1 blocks are definitely lost in loss record 34 of 1,049
==19015== at 0x4C29F80: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==19015== by 0x4C2C32F: realloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==19015== by 0x52AD888: virReallocN (viralloc.c:245)
==19015== by 0x52AD97E: virExpandN (viralloc.c:294)
==19015== by 0x52ADC51: virInsertElementsN (viralloc.c:436)
==19015== by 0x5335864: virDomainVirtioSerialAddrSetAddController (domain_addr.c:816)
==19015== by 0x53358E0: virDomainVirtioSerialAddrSetAddControllers (domain_addr.c:839)
==19015== by 0x1DD5513B: qemuDomainAssignVirtioSerialAddresses (qemu_command.c:1422)
==19015== by 0x1DD55A6E: qemuDomainAssignAddresses (qemu_command.c:1711)
==19015== by 0x1DDA5818: qemuProcessStart (qemu_process.c:4616)
==19015== by 0x1DDF1807: qemuDomainObjStart (qemu_driver.c:7265)
==19015== by 0x1DDF1A66: qemuDomainCreateWithFlags (qemu_driver.c:7320)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
==19015== 1,064 (656 direct, 408 indirect) bytes in 2 blocks are definitely lost in loss record 1,002 of 1,049
==19015== at 0x4C2C070: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==19015== by 0x52AD74B: virAlloc (viralloc.c:144)
==19015== by 0x52B47CA: virCgroupNew (vircgroup.c:1057)
==19015== by 0x52B53E5: virCgroupNewVcpu (vircgroup.c:1451)
==19015== by 0x1DD85A40: qemuSetupCgroupForVcpu (qemu_cgroup.c:1013)
==19015== by 0x1DDA66EA: qemuProcessStart (qemu_process.c:4844)
==19015== by 0x1DDF1807: qemuDomainObjStart (qemu_driver.c:7265)
==19015== by 0x1DDF1A66: qemuDomainCreateWithFlags (qemu_driver.c:7320)
==19015== by 0x1DDF1ACD: qemuDomainCreate (qemu_driver.c:7337)
==19015== by 0x53F87EA: virDomainCreate (libvirt-domain.c:6820)
==19015== by 0x12690A: remoteDispatchDomainCreate (remote_dispatch.h:3481)
==19015== by 0x126827: remoteDispatchDomainCreateHelper (remote_dispatch.h:3457)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The 'checkPool' callback was originally part of the storageDriverAutostart function,
but the pools need to be checked earlier during initialization phase,
otherwise we can't start a domain which mounts a volume after the
libvirtd daemon restarted. This is because qemuProcessReconnect is called
earlier than storageDriverAutostart. Therefore the 'checkPool' logic has been
moved to storagePoolUpdateAllState which is called inside storageDriverInitialize.
We also need a valid 'conn' reference to be able to execute 'refreshPool'
during initialization phase. Though it isn't available until storageDriverAutostart
all of our storage backends do ignore 'conn' pointer, except for RBD,
but RBD doesn't support 'checkPool' callback, so it's safe to pass
conn = NULL in this case.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1177733
This patch introduces new virStorageDriverState element stateDir.
Also adds necessary changes to storageStateInitialize, so that
directories initialization becomes more generic.
The nodedev-detach can report the name of the domain using the device
just the way nodedev-reattach does it.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
$ sudo virsh change-media f19 hdc /mnt/data/devel/media/Fedora-16-x86_64-Live-KDE.iso
succeeded to complete action update on media
Change the message to:
Successfully {inserted,ejected,changed} media.
https://bugzilla.redhat.com/show_bug.cgi?id=967946
The error output of snapshot-revert should be more friendly.
There is no need to show virDomainRevertToSnapshot to user.
virReportError already includes __FUNCTION__ information in a
separate member of the struct, so repeating it in the message is
redundant and leads to situations where higher level code ends up
reporting the lower level name. We correctly converted the error
output making it more succinct and user-friendly.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1086726
In virsh we have two printing functions: vshPrint() which prints a
string onto stdout and vshPrintExtra() which does not print anything
if virsh is run in quiet mode. Usually, the former is used to print
actual results, while the latter to print strings like table headers
and other formatting stuff. However, in cmdDomIfAddr we have
mistakenly used vshPrintExtra even for actual data. After this patch,
the output should look like the following:
# virsh -q domifaddr test3 --source agent
lo 00:00:00:00:00:00 ipv4 127.0.0.1/8
- - ipv6 ::1/128
ens8 52:54:00:1a:cb:3f ipv6 fe80::5054:ff:fe1a:cb3f/64
virbr0 52:54:00:db:51:e7 ipv4 192.168.122.1/24
virbr0-nic 52:54:00:db:51:e7 N/A N/A
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Before this patch, when connected via vCenter, the free memory returned
was from the resorcePool (usually a cluster). This is in conflict with
e.g esxNodeGetInfo which always pulls info from the ESX host.
Since libvirt ESX driver works primarily with ESX hosts, this patch
changes esxNodeGetFreeMemory to pull that information from ESX host so
it's consistent with behavior of esxNodeGetInfo.
Introduce virStoragePoolSaveState to properly format the state XML in
the same manner as virStoragePoolDefFormat, except for adding a
<poolstate> ... </poolstate> around the definition. This is similar to
virNetworkObjFormat used to save the live/active network information.
When modifying config/status XML, it might be handy to include some
additional XML elements (e.g. <poolstate>). In order to do so,
introduce new formatting function virStoragePoolDefFormatBuf and make
virStoragePoolDefFormat call it.
Recent testing on large memory systems revealed a bug in the Xen xl
tool's freemem() function. When autoballooning is enabled, freemem()
is used to ensure enough memory is available to start a domain,
ballooning dom0 if necessary. When ballooning large amounts of memory
from dom0, freemem() would exceed its self-imposed wait time and
return an error. Meanwhile, dom0 continued to balloon. Starting the
domain later, after sufficient memory was ballooned from dom0, would
succeed. The libvirt implementation in libxlDomainFreeMem() suffers
the same bug since it is modeled after freemem().
In the end, the best place to fix the bug on the Xen side was to
slightly change the behavior of libxl_wait_for_memory_target().
Instead of failing after caller-provided wait_sec, the function now
blocks as long as dom0 memory ballooning is progressing. It will return
failure only when more memory is needed to reach the target and wait_sec
have expired with no progress being made. See xen.git commit fd3aa246.
There was a dicussion on how this would affect other libxl apps like
libvirt
http://lists.xen.org/archives/html/xen-devel/2015-03/msg00739.html
If libvirt containing this patch was build against a Xen containing
the old libxl_wait_for_memory_target() behavior, libxlDomainFreeMem()
will fail after 30 sec and domain creation will be terminated.
Without this patch and with old libxl_wait_for_memory_target() behavior,
libxlDomainFreeMem() does not succeed after 30 sec, but returns success
anyway. Domain creation continues resulting in all sorts of fun stuff
like cpu soft lockups in the guest OS. It was decided to properly fix
libxl_wait_for_memory_target(), and if anything improve the default
behavior of apps using the freemem reference impl in xl.
xl was patched to accommodate the change in libxl_wait_for_memory_target()
with xen.git commit 883b30a0. This patch does the same in the libxl
driver. While at it, I changed the logic to essentially match
freemem() in $xensrc/tools/libxl/xl_cmdimpl.c. It was a bit cleaner
IMO and will make it easier to spot future, potentially interesting
divergences.
Dependant is flagged as wrong in US dictionary (only valid in UK
dictionary, and even then, it has only the financial sense and not the
inter-relatedness sense that we are more prone to be wanting throughout
code).
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
'virPCIDeviceList' is actually an array. Removing one element makes the
rest of the element move.
Use while loop, increase index only when not virPCIDeviceListDel(pcidevs, dev)
Signed-off-by: Huanle Han <hanxueluo@gmail.com>
Commit cd5dc30 added this test, but it fails if
LIBXL_HAVE_BUILDINFO_USBDEVICE_LIST is not defined:
6) Xen XM-2-XML Format fullvirt-multiusb
... libvirt: error : unsupported configuration: multiple USB
devices not supported
FAILED
Instead of always using controller 0 and incrementing port number,
respect the maximum port numbers of controllers and use all of them.
Ports for virtio consoles are quietly reserved, but not formatted
(neither in XML nor on QEMU command line).
Also rejects duplicate virtio-serial addresses.
https://bugzilla.redhat.com/show_bug.cgi?id=890606https://bugzilla.redhat.com/show_bug.cgi?id=1076708
Test changes:
* virtio-auto.args
Filling out the port when just the controller is specified.
switched from using
maxport + 1
to:
first free port on the controller
* virtio-autoassign.args
Filling out the address when no <address> is specified.
Started using all the controllers instead of 0, also discards
the bus value.
* xml -> xml output of virtio-auto
The port assignment is no longer done as a part of XML parsing,
so the unspecified values stay 0.
Create a sorted array of virtio-serial controllers.
Each of the elements contains the controller index
and a bitmap of available ports.
Buses are not tracked, because they aren't supported by QEMU.
If the call to virStorageBackendISCSIGetHostNumber failed, we set
retval = -1, but yet still called virStorageBackendSCSIFindLUs.
Need to add a goto cleanup - while at it, adjust the logic to
initialize retval to -1 and only changed to 0 (zero) on success.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Don't supercede the error message virStorageBackendSCSIFindLUs as the
message such as "error: Failed to find LUs on host 60: ..." is not overly
clear as to what the real problem might be.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Make XML definition saving more generic by moving the common code into
virStoragePoolSaveXML and leave case specific code to
PoolSave{Status,Config,...} functions.
In order to be able to use 'checkPool' inside functions which do not
have any connection reference, 'conn' attribute needs to be discarded
from the checkPool's signature, since it's not used by any storage backend
anyway.
https://bugzilla.redhat.com/show_bug.cgi?id=1206479
As described in virDomainBlockCopy() parameters description, the
VIR_DOMAIN_BLOCK_COPY_GRANULARITY parameter may require the value to
have some specific attributes (e.g. be a power of two or fall within a
certain range). And in qemu, a power of two is required. However, our
code does not check that and let qemu operation fail. Moreover, the
virsh man page is not as exact as it could be in this respect.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The 7c3c7f217e and f5c2d6 commits introduced a nodeinfo test.
In order to do that, some parts of sysfs had to be copied.
However, sysfs is full of symlinks, so during copying some
symlinks broke. Remove them, as on different systems they can
point to different files or be broken. At the same time, we don't
need all files added in those commits. For instance we don't care
about 'uevent' files, 'power' folders, and others.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When we shutdown/reboot a guest using agent-mode, if the guest itself blocks infinitely,
libvirt would block in qemuAgentShutdown() forever.
Thus, we set a timeout for shutdown/reboot, from our experience, 60 seconds would be fine.
Signed-off-by: Zhang Bo <oscar.zhangbo@huawei.com>
Signed-off-by: Wang Yufei <james.wangyufei@huawei.com>
virDomainHasDiskMirror() currently detects only jobs that add the mirror
elements. Since some operations like migration are interlocked by
existing block jobs on the given domain the check needs to be
instrumented to check regular jobs too.
This patch renames virDomainHasDiskMirror to virDomainHasDiskBlockjob
and adds an argument that allows to select that it returns true only for
block copy jobs as those interlock making the domain persistent.
Other two uses trigger on any block job type.
Signed-off-by: Shanzhi Yu <shyu@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
If any disk of a VM was involved in a (copy) block job we refused to do
a snapshot. As not only copy jobs interlock snapshots and the
interlocking is applicable to individual disks only we can make the
check in a more individual fashion and interlock all block job types
supported by libvirt.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1203628
In the order of appearance:
* MAX_LISTEN - never used
added by 23ad665c (qemud) and addec57 (lock daemon)
* NEXT_FREE_CLASS_ID - never used, added by 07d1b6b
* virLockError - never used, added by eb8268a4
* OPENVZ_MAX_ARG, CMDBUF_LEN, CMDOP_LEN
unused since the removal of ADD_ARG_LIT in d8b31306
* QEMU_NB_PER_CPU_STAT_PARAM - unused since 897808e
* QEMU_CMD_PROMPT, QEMU_PASSWD_PROMPT - unused since 1dc10a7
* TEST_MODEL_WORDSIZE - unused since c25c18f7
* TEMPDIR - never used, added by 714bef5
* NSIG - workaround around old headers
added by commit 60ed1d2
unused since virExec was moved by commit 02e8691
* DO_TEST_PARSE - never used, added by 9afa006
* DIFF_MSEC, GETTIMEOFDAY - unused since eee6eb6