If an async job run on a domain will stop the domain at the end of the
job, a concurrently run query job can hang in qemu monitor and nothing
can be done with that domain from this point on. An attempt to start
such domain results in "Timed out during operation: cannot acquire state
change lock" error.
However, quite a few things have to happen at the right time... There
must be an async job running which stops a domain at the end. This race
was reported with dump --crash but other similar jobs, such as
(managed)save and migration, should be able to trigger this bug as well.
While this async job is processing its last monitor command, that is a
query-migrate to which qemu replies with status "completed", a new
libvirt API that results in a query job must arrive and stay waiting
until the query-migrate command finishes. Once query-migrate is done but
before the async job closes qemu monitor while stopping the domain, the
other thread needs to wake up and call qemuMonitorSend to send its
command to qemu. Before qemu gets a chance to respond to this command,
the async job needs to close the monitor. At this point, the query job
thread is waiting for a condition that no-one will ever signal so it
never finishes the job.
* src/qemu/qemu_hostdev.c (qemuDomainReAttachHostdevDevices):
pciDeviceListFree(pcidevs) in the end free()s the device even if
it's in use by other domain, which can cause a race.
How to reproduce:
<script>
virsh nodedev-dettach pci_0000_00_19_0
virsh start test
virsh attach-device test hostdev.xml
virsh start test2
for i in {1..5}; do
echo "[ -- ${i}th time --]"
virsh nodedev-reattach pci_0000_00_19_0
done
echo "clean up"
virsh destroy test
virsh nodedev-reattach pci_0000_00_19_0
</script>
Device pci_0000_00_19_0 dettached
Domain test started
Device attached successfully
error: Failed to start domain test2
error: Requested operation is not valid: PCI device 0000:00:19.0 is in use by domain test
[ -- 1th time --]
Device pci_0000_00_19_0 re-attached
[ -- 2th time --]
Device pci_0000_00_19_0 re-attached
[ -- 3th time --]
Device pci_0000_00_19_0 re-attached
[ -- 4th time --]
Device pci_0000_00_19_0 re-attached
[ -- 5th time --]
Device pci_0000_00_19_0 re-attached
clean up
Domain test destroyed
Device pci_0000_00_19_0 re-attached
The patch also fixes another problem, there won't be error like
"qemuDomainReAttachHostdevDevices: Not reattaching active
device 0000:00:19.0" in daemon log if some device is in active.
As pciResetDevice and pciReattachDevice won't be called for
the device anymore. This is sensible as we already reported
error when preparing the device if it's active. Blindly trying
to pciResetDevice & pciReattachDevice on the device and getting
an error is just redundant.
This patch fixes two problems:
1) The device will be reattached to host even if it's not
managed, as there is a "pciDeviceSetManaged".
2) The device won't be reattached to host with original
driver properly. As it doesn't honor the device original
properties which are maintained by driver->activePciHostdevs.
Commit d336dbdb tried to refactor sanlock to avoid building it
on RHEL for architectures where it is not available, but used
the wrong conditional.
* libvirt.spec.in (with_sanlock): Use %ifarch, not %ifnarch.
I was wondering why 'virsh edit' didn't support the same
'--inactive' option as 'virsh dumpxml'; reading the source
code showed that --inactive was already implied, and that
the only way to alter a running guest rather than affecting
next boot is by hot-plugging individual devices, or by
something complex like saving the guest and modifying the
save image.
* tools/virsh.pod (define, edit): Mention behavior when guest is
already running.
Commit f2013c9dd1ce468b8620ee35c232a93ef7026fb0 added implementation of
virDomainSnapshotListChildrenNames override export, but registration of
the newly exported function was not added.
*python/libvirt-override.c: - register export of function
This chunk of code below repeated in several functions, factor it into
a helper method virDomainLiveConfigHelperMethod to eliminate duplicated code
based on Eric and Adam's suggestion. I have tested it for all the
relevant APIs changed.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Lei Li <lilei@linux.vnet.ibm.com>
If parsing of arguments failed, virsh did silently exit returning and
error state, but not specifying the possible problem.
* tools/virsh: cmdNodesuspend: - error handling added
Detected by valgrind. Leak introduced in commit 82ff25e.
* tests/nodeinfotest.c: avoid memory leak on nodeinfo test case.
* how to reproduce?
% cd tests && valgrind -v --leak-check=full ./nodeinfotest
* actual valgrind result:
==22147== 65 bytes in 1 blocks are definitely lost in loss record 14 of 29
==22147== at 0x4A0610F: realloc (vg_replace_malloc.c:525)
==22147== by 0x330D6FED94: __vasprintf_chk (in /lib64/libc-2.12.so)
==22147== by 0x426697: virVasprintf (stdio2.h:199)
==22147== by 0x426757: virAsprintf (util.c:1695)
==22147== by 0x41585F: linuxTestNodeInfo (nodeinfotest.c:108)
==22147== by 0x416B21: virtTestRun (testutils.c:141)
==22147== by 0x4157EA: mymain (nodeinfotest.c:140)
==22147== by 0x416217: virtTestMain (testutils.c:696)
==22147== by 0x330D61ECDC: (below main) (in /lib64/libc-2.12.so)
==22147==
==22147== LEAK SUMMARY:
==22147== definitely lost: 65 bytes in 1 blocks
==22147== indirectly lost: 0 bytes in 0 blocks
==22147== possibly lost: 0 bytes in 0 blocks
==22147== still reachable: 126,126 bytes in 1,341 blocks
Signed-off-by: Alex Jia <ajia@redhat.com>
If the vol object is newly created, it increases the volumes count,
but doesn't decrease the volumes count when do cleanup. It can
cause libvirtd to crash when one trying to free the volume objects
like:
for (i = 0; i < pool->volumes.count; i++)
virStorageVolDefFree(pool->volumes.objs[i]);
It's more reliable if we add the newly created vol object in the
end.
Improve the documentation of what forms a valid <address> element,
since these elements appear in numerous devices.
* docs/formatdomain.html.in (elementsAddress): New section.
(elementsControllers, elementsUSB, elementsNICS, elementsInput)
(elementsHub, elementsCharChannel, elementsSound): Refer to it.
Commit 4d9e51f6 fixed a 'make uninstall' failure, but failed
to follow other conventions already present in src/Makefile.am.
In particular, we prefer MKDIR_P over mkdir -p, and should
have a matching rmdir during uninstall for every directory
created during install (the idea being that uninstall in a
DESTDIR should be clean, while installation in the final
system should not fail with non-empty directories left behind).
* tools/Makefile.am (install-sysconfig, install-initscript)
(install-systemd): Use MKDIR_P.
(uninstall-sysconfig, uninstall-initscript, uninstall-systemd):
Also remove directories.
* daemon/Makefile.am (install-data-local, install-data-polkit)
(install-logrotate, install-sysconfig, install-sysctl)
(install-init-redhat, install-init-upstart, install-init-systemd)
(install-data-sasl): Use MKDIR_P.
(uninstall-data-polkit, uninstall-sysconfig, uninstall-sysctl)
(uninstall-init-redhat, uninstall-init-upstart)
(uninstall-init-systemd): Also remove directory.
(uninstall-logrotate): New rule.
(uninstall-local): Add uninstall-logrotate.
When destroying a domain qemuDomainDestroy kills its qemu process and
starts a new job, which means it unlocks the domain object and locks it
again after some time. Although the object is usually unlocked for a
pretty short time, chances are another thread processing an EOF event on
qemu monitor is able to lock the object first and does all the cleanup
by itself. This leads to wrong shutoff reason and lifecycle event detail
and virDomainDestroy API incorrectly reporting failure to destroy an
inactive domain.
Reported by Charlie Smurthwaite.
Current "-ay | -an" has problems on pool starting/refreshing if
the volumes are clustered. Rommer has posted a patch to list 2
months ago.
https://www.redhat.com/archives/libvir-list/2011-October/msg01116.html
But IMO we shouldn't skip the inactived vols. So this is a squashed
patch by Rommer.
Signed-off-by: Rommer <rommer@active.by>
Network disks don't have paths to be resolved or files to be checked
for ownership. ee3efc41e6233e625aa03003bf3127319ccd546f checked this
for some image label functions, but was partially reverted in a
refactor. This finishes adding the check to each security driver's
set and restore label methods for images.
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Make uninstall currently fails with the following message:
rmdir /etc/sasl2/
rmdir: failed to remove `/etc/sasl2/': Directory not empty
That's fine (correct in fact) so force the command to return success
with || :
One of the xml tests in the test suite was created using a
now-deprecated qemu machine type ("fedora-13", which was only ever
valid for Fedora builds of qemu). Although strictly speaking it's not
necessary to replace it with an actual supported qemu machine type
(since the xml in question is never actually sent to qemu), this patch
changes it to the actually-supported "pc-0.13" just for general
tidiness. (Also, on some Fedora builds which contain a special patch
to rid the world of "fedora-13", having it mentioned in the test suite
will cause make check to fail.)
This patch addresses https://bugzilla.redhat.com/show_bug.cgi?id=760442
When a network has any forward type other than route, nat or none, the
network configuration should be done completely external to libvirt -
libvirt only uses these types to allow configuring guests in a manner
that isn't tied to a specific host (all the host-specific information,
in particular interface names, port profile data, and bandwidth
configuration is in the network definition, and the guest
configuration only references it).
Due to a bug in the bridge network driver, libvirt was adding iptables
rules for networks with forward type='bridge' etc. any time libvirtd
was restarted while one of these networks was active.
This patch eliminates that error by only "reloading" iptables rules if
forward type is route, nat, or none.
Currently qemuDomainAssignPCIAddresses() is called to assign addresses
to PCI devices.
We need to do something similar for devices with spapr-vio addresses.
So create one place where address assignment will be done, that is
qemuDomainAssignAddresses().
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
For the PPC64 pseries machine type we need to add address information
for the spapr-vty device.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
In QEMU PPC64 we have a network device called "spapr-vlan". We can specify
this using the existing syntax for network devices, however libvirt
currently rejects "spapr-vlan" in virDomainNetDefParseXML() because of
the "-". Fix the code to accept "-".
* src/conf/domain_conf.c (virDomainNetDefParseXML): Allow '-' in
model name, and be more efficient.
* docs/schemas/domaincommon.rng: Limit valid model names to match code.
Based on a patch by Michael Ellerman.
Detected by valgrind. Leak introduced in commit 88a993b:
* tools/virsh.c: fix memory leak on cmdDomblklist.
* how to reproduce?
% valgrind -v --leak-check=full virsh domblklist <domain name>
* actual valgrind result:
==6573== 1,836 bytes in 1 blocks are definitely lost in loss record 110 of 124
==6573== at 0x4A05FDE: malloc (vg_replace_malloc.c:236)
==6573== by 0x330D71497D: xdr_string (in /lib64/libc-2.12.so)
==6573== by 0x4D26CED: xdr_remote_nonnull_string (remote_protocol.c:30)
==6573== by 0x4D28138: xdr_remote_domain_get_xml_desc_ret (remote_protocol.c:1418)
==6573== by 0x4D3C0C2: virNetMessageDecodePayload (virnetmessage.c:382)
==6573== by 0x4D3279F: virNetClientProgramCall (virnetclientprogram.c:382)
==6573== by 0x4D0D50B: callWithFD (remote_driver.c:4339)
==6573== by 0x4D0D5AB: call (remote_driver.c:4360)
==6573== by 0x4D16EAF: remoteDomainGetXMLDesc (remote_client_bodies.h:861)
==6573== by 0x4CF9F4F: virDomainGetXMLDesc (libvirt.c:4098)
==6573== by 0x4154D9: cmdDomblklist (virsh.c:1722)
==6573== by 0x4149E2: vshCommandRun (virsh.c:16365)
==6573==
==6573== 46,009 (352 direct, 45,657 indirect) bytes in 1 blocks are definitely lost in loss record 123 of 124
==6573== at 0x4A05FDE: malloc (vg_replace_malloc.c:236)
==6573== by 0x3318286DC6: xmlXPathNewContext (in /usr/lib64/libxml2.so.2.7.6)
==6573== by 0x4C79AE2: virXMLParseHelper (xml.c:779)
==6573== by 0x415512: cmdDomblklist (virsh.c:1726)
==6573== by 0x4149E2: vshCommandRun (virsh.c:16365)
==6573== by 0x427743: main (virsh.c:17867)
==6573==
==6573== LEAK SUMMARY:
==6573== definitely lost: 2,188 bytes in 2 blocks
==6573== indirectly lost: 45,657 bytes in 332 blocks
==6573== possibly lost: 0 bytes in 0 blocks
==6573== still reachable: 128,034 bytes in 1,364 blocks
==6573== suppressed: 0 bytes in 0 blocks
Signed-off-by: Alex Jia <ajia@redhat.com>
When parsing ppc64 models on an x86 host an out-of-memory error message is displayed due
to it checking for retcpus being NULL. Fix this by removing the check whether retcpus is NULL
since we will realloc into this variable.
Also in the X86 model parser display the OOM error at the location where it happens.
Fix memory leak:
==27534== 24 bytes in 1 blocks are definitely lost in loss record 207 of 530
==27534== at 0x4A05E46: malloc (vg_replace_malloc.c:195)
==27534== by 0x38EC26EC37: vasprintf (in /lib64/libc-2.13.so)
==27534== by 0x4E998E6: virVasprintf (util.c:1677)
==27534== by 0x4E999F1: virAsprintf (util.c:1695)
==27534== by 0x4F1EAAC: nodeGetInfo (nodeinfo.c:593)
==27534== by 0x47948F: qemuCapsInitCPU (qemu_capabilities.c:855)
==27534== by 0x4796B1: qemuCapsInit (qemu_capabilities.c:915)
==27534== by 0x456550: qemuCreateCapabilities (qemu_driver.c:245)
==27534== by 0x4578C4: qemudStartup (qemu_driver.c:580)
==27534== by 0x4F20886: virStateInitialize (libvirt.c:852)
==27534== by 0x420E55: daemonRunStateInit (libvirtd.c:1156)
==27534== by 0x4E94C56: virThreadHelper (threads-pthread.c:157)
Mark this leaked variable as const char * when it is passed into another
function.
Pool creates new workers dynamically. However, it is possible
for a pool to have no workers. If we want to free that pool,
we don't want to wait on quit condition as it will never be
signaled.
Add support for newly supported Intel cpu features. Newly supported
flags are: pclmuldq, dtes64, smx, fma, pdcm, movbe, xsave, osxsave and
avx. This adds support for Intel's Sandy Bridge platform.
Reported by Alex Jia <ajia@redhat.com>. Function cmdDomIfGetLink did not
set a success return value on success path.
Signed-off-by: Alex Jia<ajia@redhat.com>
A preparatory patch for DHCP snooping where we want to be able to
differentiate between a VM's interface using the tuple of
<VM UUID, Interface MAC address>. We assume that MAC addresses could
possibly be re-used between different networks (VLANs) thus do not only
want to rely on the MAC address to identify an interface.
At the current 'final destination' in virNWFilterInstantiate I am leaving
the vmuuid parameter as ATTRIBUTE_UNUSED until the DHCP snooping patches arrive.
(we may not post the DHCP snooping patches for 0.9.9, though)
Mostly this is a pretty trivial patch. On the lowest layers, in lxc_driver
and uml_conf, I am passing the virDomainDefPtr around until I am passing
only the VM's uuid into the NWFilter calls.
This patch cleans up return codes in the nwfilter subsystem.
Some functions in nwfilter_conf.c (validators and formatters) are
keeping their bool return for now and I am converting their return
code to true/false.
All other functions now have failure return codes of -1 and success
of 0.
[I searched for all occurences of ' 1;' and checked all 'if ' and
adapted where needed. After that I did a grep for 'NWFilter' in the source
tree.]
Detected by valgrind. Leak introduced in commit dc675f3:
* tools/virsh.c: fix memory leak on cmdDomIfGetLink.
* how to reproduce?
% valgrind -v --leak-check=full virsh domif-getlink <domain name> 0
* actual valgrind result:
==13102== 18 bytes in 1 blocks are definitely lost in loss record 9 of 47
==13102== at 0x4A05FDE: malloc (vg_replace_malloc.c:236)
==13102== by 0x322A6A67DD: xmlStrndup (in /usr/lib64/libxml2.so.2.7.6)
==13102== by 0x414892: cmdDomIfGetLink (virsh.c:1538)
==13102== by 0x4136A2: vshCommandRun (virsh.c:16363)
==13102== by 0x4253FB: main (virsh.c:17865)
==13102==
==13102== LEAK SUMMARY:
==13102== definitely lost: 18 bytes in 1 blocks
==13102== indirectly lost: 0 bytes in 0 blocks
==13102== possibly lost: 0 bytes in 0 blocks
==13102== still reachable: 127,888 bytes in 1,361 blocks
==13102== suppressed: 0 bytes in 0 blocks
Signed-off-by: Alex Jia <ajia@redhat.com>
Detected by valgrind. Leak introduced in commit e9bd9a0:
* tools/virsh.c: fix memory leak on cmdBlkdeviotune.
* how to reproduce?
% valgrind -v --leak-check=full virsh blkdeviotune <domain name> <block device>
* actual valgrind result:
==12759== 576 bytes in 1 blocks are definitely lost in loss record 18 of 29
==12759== at 0x4A04A28: calloc (vg_replace_malloc.c:467)
==12759== by 0x42134E: _vshCalloc.clone.2 (virsh.c:422)
==12759== by 0x4217CB: cmdBlkdeviotune (virsh.c:6364)
==12759== by 0x4136A2: vshCommandRun (virsh.c:16363)
==12759== by 0x4253FB: main (virsh.c:17865)
==12759==
==12759== LEAK SUMMARY:
==12759== definitely lost: 576 bytes in 1 blocks
==12759== indirectly lost: 0 bytes in 0 blocks
==12759== possibly lost: 0 bytes in 0 blocks
==12759== still reachable: 126,964 bytes in 1,342 blocks
==12759== suppressed: 0 bytes in 0 blocks
Signed-off-by: Alex Jia <ajia@redhat.com>
Jiri Denemark reported an instance of bootstrapping libvirt
failing when run inside a sandbox, traced to rpm trying to
access /var/ which was not permitted by the sandbox.
Alex Jia reported that 0.9.8-rc1 failed to bootstrap if patch(1)
is not installed.
* bootstrap.conf (buildreq): Avoid rpm call if python-config
exists. Also, require patch, in case we have gnulib-local diffs.
In some error situations, the function testDomainRestoreFlags() could
unlock the test driver mutex without first locking it. This patch
moves the lock operation earlier, so that it occurs before any
potential jump down to the unlock call.
I found this problem while auditing the test driver lock usage to
determine the cause of a hang while running the following test:
cd tests; while true; do printf x; ./undefine; done
This patch *does not* solve that problem, but we now understand its
actual source, and danpb is working on a patch.
https://bugzilla.redhat.com/show_bug.cgi?id=738725
Commit ecd8725 tried to silence a spurious warning on the initial
libvirt install, and commit ba6cbb1 tried to fix up the logic to the
correct Fedora version, but the warning was still present due to a
logic bug: since %{fedora} and %{rhel} are never simulatanously
set, then 0%{rhel} <= 6 made the %if always true. Checking for
minimum versions (via >=) is okay, but checking for maximum versions
(via <=) requires a prerequisite test that the platform being tested
is non-zero.
Also fix a bogus setting of with_libxl (although we previously
hard-code with_libxl to 0 for rhel earlier in the file, so this
was not as severe a bug).
* libvirt.spec.in (with_cgconfig): Don't enable cgconfig on F16.
Over time, Fedora and RHEL RPMs have often backported upstream
patches that touched configure.ac and/or Makefile.am; this
necessitates rerunning the autotools for the patch to be effective.
Making this a one-liner spec tweak will make it easier for future
backports to pull patches without having to find all the places
to touch to properly use the autotools. Meanwhile, there have been
historical instances where an update in the autotools caused FTBFS
situations, so this is not on by default.
* libvirt.spec.in (enable_autotools): New variable, default off.
(BuildRequires): Conditionally add autotools.
(%build): Conditionally use them before configure.
* mingw32-libvirt.spec.in: Likewise.
The installation rules for the libvirt-guests.service were
totally broken
- Installing in the wrong location
- The location was not overridable
- The install-systemd rule was not invoked anywhere
- The install-systemd rule was not invoking install-initscript
which it depends on
- The installed service file lacked a .service extension
* tools/Makefile.am: Fix install of libvirt-guests.service
The %makeinstall macro does not set DESTDIR, instead of explicitly
prefixes %{buildroot} onto all paths. Thus we need to do the same
when setting the systemd unit dir
* libvirt.spec.in: Prefix %{buildroot} onto %{unitdir}
ppc64 as new arch type and pseries as new machine type are
added under <os> ... </os>.
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Prerna Saxena <prerna@linux.vnet.ibm.com>
assumptions from generic code.
This implements the minimal set of changes needed in libvirt to launch a
PowerPC-KVM based guest.
It removes x86-specific assumptions about choice of serial driver backend
from generic qemu guest commandline generation code.
It also restricts the ACPI capability to be available for an x86 or
x86_64 domain.
This is not a complete solution -- it still does not guarantee libvirt
the capability to flag non-supported options in guest XML. (Eg, an ACPI
specification in a PowerPC guest XML will still get processed, even
though qemu-system-ppc64 does not support it while qemu-system-x86_64 does.)
This drawback exists because libvirt falls back on qemu to query supported
features, and qemu '-h' blindly lists all capabilities -- irrespective
of whether they are available while emulating a given architecture or not.
The long-term solution would be for qemu to list out capabilities based
on architecture and platform -- so that libvirt can cleanly make out what
devices are supported on an arch (say 'ppc64') and platform (say, 'mac99').
Signed-off-by: Prerna Saxena <prerna@linux.vnet.ibm.com>
This enables libvirt to select the correct qemu binary (qemu-system-ppc64)
for a guest vm based on arch 'ppc64'.
Also, libvirt is enabled to correctly parse the list of supported PowerPC
CPUs, generated by running 'qemu-system-ppc64 -cpu ?'
Signed-off-by: Prerna Saxena <prerna@linux.vnet.ibm.com>
Acked-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
to proc/cpuinfo
This patch creates a new sysfs hierarchy under
tests/nodeinfodata/linux-nodeinfo-sysfs-test-1.
Output files and /proc/cpuinfo files are also respectively added for
both x86 and ppc64.
Signed-off-by: Prerna Saxena <prerna@linux.vnet.ibm.com>
/proc/cpuinfo
Libvirt at present depends on /proc/cpuinfo to gather host
details such as CPUs, cores, threads, etc. This is an architecture-
dependent approach. An alternative is to use 'Sysfs', which provides
a platform-agnostic interface to parse host CPU topology.
Signed-off-by: Prerna Saxena <prerna@linux.vnet.ibm.com>