To avoid code duplication between snapshot configuration code that
parses the disk source too we need to split out this code that will be
reused later on.
This patch tries to be code movement, some aspects of this function will
be refactored later.
If the managedsave image is corrupted, e.g. the XML part is, we fail to
parse it and throw an error, e.g.:
error: Failed to start domain jms8
error: XML error: missing security model when using multiple labels
This is okay, as we can't really start the machine and avoid undefined
qemu behaviour. On the other hand, the error message doesn't give a
clue to users what should they do. The consensus here would be to thrown
a warning to logs saying "Hey, you've got a corrupted file".
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1027096
If there's the following snippet in the domain XML, the domain will be
lost upon the daemon restart (if the domain is started prior restart):
<seclabel type='dynamic' relabel='yes'/>
The problem is, the 'label', 'imagelabel' and 'baselabel' are parsed
whenever the VIR_DOMAIN_XML_INACTIVE is *not* present or the label is
static. The latter is not our case, obviously. So, when libvirtd starts
up, it finds domain state xml and parse it. During parsing, many XML
flags are enabled but VIR_DOMAIN_XML_INACTIVE. Hence, our parser tries
to extract 'label', 'imagelabel' and 'baselabel' from the XML which
fails for model='none'. Err, this model - even though not specified in
XML - can be taken from qemu wide config file: /etc/libvirtd/qemu.conf.
However, in order to know we are dealing with model='none' the code in
question must be moved forward a bit. Then a new check must be
introduced. This is what the first two chunks are doing.
But this alone is not sufficient. The domain state XML won't contain the
model attribute without slight modification. The model should be
inserted into the XML even if equal to 'none' and the state XML is being
generated - what if the origin (the @security_driver variable in
qemu.conf) changes during libvirtd restarts?
At the end, a test to catch this scenario is introduced.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
In virDomainRestoreFlags with VIR_DOMAIN_SAVE_BYPASS_CACHE, it risks
slowing restores from NFS, but not saves to NFS.
Signed-off-by: Wang Yufei <james.wangyufei@huawei.com>
This patch moves some code in the qemuDomainAttachSCSIDisk
function. The check for the existence of a PCI address assigned
to the SCSI controller was moved in order to be executed only
when needed. The PCI address of a controller is not necessary
if QEMU_CAPS_DEVICE is supported.
This fixes issues with the hotplug of SCSI disks on pseries guests.
This resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=1025397
When virPCIGetVirtualFunctions created the list of an SRIOV Physical
Function's (PF) Virtual Functions (VF), it had assumed that the order
of "virtfn*" links returned by readdir() from the PF's sysfs directory
was already in the correct order. Experience has shown that this is
not always the case - it can be in alphabetical order (which would
e.g. place virtfn11 before virtfn2) or even some seemingly random
order (see the example in the bugzilla report)
This results in 1) incorrect assumptions made by consumers of the
output of the virt_functions list of virsh nodedev-dumpxml, and 2)
setting MAC address and vlan tag on the wrong VF (since libvirt uses
netlink to set mac address and vlan tag, netlink requires the VF#, and
the function virPCIGetVirtualFunctionIndex() returns the wrong index
due to the improperly ordered VF list).
The solution provided by this patch is for virPCIGetVirtualFunctions
to no longer scan the entire device directory in its natural order,
but instead to check for links individually by name "virtfn%d" where
%d starts at 0 and increases with each success. Since VFs are created
contiguously by the kernel, this will guarantee that all VFs are
found, and placed in the arry in the correct order.
One note of use to the uninitiated is that VIR_APPEND_ELEMENT always
either increments *num_virtual_functions or fails, so no this isn't an
endless loop.
(NB: the SRIOV_* defines at the top of virpci.c were removed
because they are unnecessary and/or not used.)
When adding support for Q35 guests, the code to assign a PCI address
to the primary video card was moved into Q35 and i440fx(PIIX3)
specific functions, but no fallback was kept for other machine types
that might have a video card.
This patch remedies that by assigning a PCI address to the primary
video card if it does not have any kind of address. In particular,
this fixes issues with pseries guests.
Signed-off-by: Vitor de Lima <vitor.lima@eldorado.org.br>
Signed-off-by: Laine Stump <laine@laine.org>
When starting a VM the qemu process may filter out some requested
features of a domain as it's not supported either by the host or by
qemu. Libvirt didn't check if this happened which might end up in
changing of the guest ABI when migrating.
The proof of concept implementation adds the check for the recently
introduced kvm_pv_unhalt cpuid feature bit. This feature depends on both
qemu and host kernel support and thus increase the possibility of guest
ABI breakage.
The linux kernel recently added support for paravirtual spinlock
handling to avoid performance regressions on overcomitted hosts. This
feature needs to be turned in the hypervisor so that the guest OS is
notified about the possible support.
This patch adds a new feature "paravirt-spinlock" to the XML and
supporting code to enable the "kvm_pv_unhalt" pseudo CPU feature in
qemu.
https://bugzilla.redhat.com/show_bug.cgi?id=1008989
Currently we were storing domain feature flags in a bit field as the
they were either enabled or disabled. New features such as paravirtual
spinlocks however can be tri-state as the default option may depend on
hypervisor version.
To allow storing tri-state feature state in the same place instead of
having to declare dedicated variables for each feature this patch
refactors the bit field to an array.
Some of the emulator features are presented in the <features> element in
the domain XML although they are virtual CPUID feature bits when
presented to the guest. To avoid confusing the users with these
features, as they are not configurable via the <cpu> element, this patch
adds an internal array where those can be stored privately instead of
exposing them in the XML.
Additionaly KVM feature bits are added as example usage of this code.
The qemu monitor supports retrieval of actual CPUID bits presented to
the guest using QMP monitor. Add APIs to extract these information and
tests for them.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
The CPUID functions were stored in multiple arrays according to a
specified prefix of those. This made it very hard to add another prefix
to store KVM CPUID features (0x40000000). Instead of hardcoding a third
array this patch changes the approach used:
The code is refactored to use a single array where the CPUID functions
are stored ordered by the cpuid function so that they don't depend on
the specific prefix and don't waste memory. The code is also less
complex using this approach. A trateoff to this is the change from O(N)
complexity to O(N^2) in x86DataAdd and x86DataSubtract. The rest of the
functions were already using O(N^2) algorithms.
vol-clone reports out of memory error with disk type on ppc64.
Currently, wbytes is defined as size_t type (8 bytes), but
args's value in ioctl(fd, args..) in kernel is int (4 bytes).
This makes wbytes 2^32 times larger, causing an out of memory error.
This patch changes size_t to int to synchronize with kernel.
[1] https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/block/ioctl.c?id=5e01dc7b#n363
[2] https://lkml.org/lkml/2013/11/1/620
Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Since 86d90b3a (yes, my patch; again) we are supporting NBD storage
migration. However, on error recovery path we got the steps reversed.
The correct order is: return NBD port to the virPortAllocator and then
either unlock the vm or remove it from the driver. Not vice versa.
==11192== Invalid write of size 4
==11192== at 0x11488559: qemuMigrationPrepareAny (qemu_migration.c:2459)
==11192== by 0x11488EA6: qemuMigrationPrepareDirect (qemu_migration.c:2652)
==11192== by 0x114D1509: qemuDomainMigratePrepare3Params (qemu_driver.c:10332)
==11192== by 0x519075D: virDomainMigratePrepare3Params (libvirt.c:7290)
==11192== by 0x1502DA: remoteDispatchDomainMigratePrepare3Params (remote.c:4798)
==11192== by 0x12DECA: remoteDispatchDomainMigratePrepare3ParamsHelper (remote_dispatch.h:5741)
==11192== by 0x5212127: virNetServerProgramDispatchCall (virnetserverprogram.c:435)
==11192== by 0x5211C86: virNetServerProgramDispatch (virnetserverprogram.c:305)
==11192== by 0x520A8FD: virNetServerProcessMsg (virnetserver.c:165)
==11192== by 0x520A9E1: virNetServerHandleJob (virnetserver.c:186)
==11192== by 0x50DA78F: virThreadPoolWorker (virthreadpool.c:144)
==11192== by 0x50DA11C: virThreadHelper (virthreadpthread.c:161)
==11192== Address 0x1368baa0 is 576 bytes inside a block of size 688 free'd
==11192== at 0x4A07F5C: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==11192== by 0x5079A2F: virFree (viralloc.c:580)
==11192== by 0x11456C34: qemuDomainObjPrivateFree (qemu_domain.c:267)
==11192== by 0x50F41B4: virDomainObjDispose (domain_conf.c:2034)
==11192== by 0x50C2991: virObjectUnref (virobject.c:262)
==11192== by 0x50F4CFC: virDomainObjListRemove (domain_conf.c:2361)
==11192== by 0x1145C125: qemuDomainRemoveInactive (qemu_domain.c:2087)
==11192== by 0x11488520: qemuMigrationPrepareAny (qemu_migration.c:2456)
==11192== by 0x11488EA6: qemuMigrationPrepareDirect (qemu_migration.c:2652)
==11192== by 0x114D1509: qemuDomainMigratePrepare3Params (qemu_driver.c:10332)
==11192== by 0x519075D: virDomainMigratePrepare3Params (libvirt.c:7290)
==11192== by 0x1502DA: remoteDispatchDomainMigratePrepare3Params (remote.c:4798)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
One of my previous patches (c7ac2519b7) did try to fix the issue when
domain dies too soon during migration. However, this clumsy approach was
missing removal of qemuProcessHandleMonitorDestroy resulting in double
unrefing of mon->vm and hence producing the daemon crash:
==11843== Invalid read of size 4
==11843== at 0x50C28C5: virObjectUnref (virobject.c:255)
==11843== by 0x1148F7DB: qemuMonitorDispose (qemu_monitor.c:258)
==11843== by 0x50C2991: virObjectUnref (virobject.c:262)
==11843== by 0x50C2D13: virObjectFreeCallback (virobject.c:388)
==11843== by 0x509C37B: virEventPollCleanupHandles (vireventpoll.c:583)
==11843== by 0x509C711: virEventPollRunOnce (vireventpoll.c:652)
==11843== by 0x509A620: virEventRunDefaultImpl (virevent.c:274)
==11843== by 0x520D21C: virNetServerRun (virnetserver.c:1112)
==11843== by 0x11F368: main (libvirtd.c:1513)
==11843== Address 0x13b88864 is 4 bytes inside a block of size 136 free'd
==11843== at 0x4A07F5C: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==11843== by 0x5079A2F: virFree (viralloc.c:580)
==11843== by 0x50C29E3: virObjectUnref (virobject.c:270)
==11843== by 0x114770E4: qemuProcessHandleMonitorDestroy (qemu_process.c:1103)
==11843== by 0x1148F7CB: qemuMonitorDispose (qemu_monitor.c:257)
==11843== by 0x50C2991: virObjectUnref (virobject.c:262)
==11843== by 0x50C2D13: virObjectFreeCallback (virobject.c:388)
==11843== by 0x509C37B: virEventPollCleanupHandles (vireventpoll.c:583)
==11843== by 0x509C711: virEventPollRunOnce (vireventpoll.c:652)
==11843== by 0x509A620: virEventRunDefaultImpl (virevent.c:274)
==11843== by 0x520D21C: virNetServerRun (virnetserver.c:1112)
==11843== by 0x11F368: main (libvirtd.c:1513)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
So far we are checking if qemu supports 'nbd-server-start'. This,
however, makes no sense on the source as nbd-server-* is used on the
destination. On the source the 'drive-mirror' is used instead.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
For some strange reason virDomainDiskSourcePoolDefParse accessed def of
the disk and allocated the pool object in it. To avoid the need to carry
over the disk definition object, refactor this function to return the
allocated object instead.
Commit b0f8546 broke the build on mingw, by exposing code that
had Linux-specific dependencies but which was previously protected
by libnuma ifdef guards:
make[3]: Entering directory `/home/eblake/libvirt-tmp/build/src'
CC libvirt_driver_la-nodeinfo.lo
../../src/nodeinfo.c: In function 'virNodeGetSiblingsList':
../../src/nodeinfo.c:1543:30: error: 'SYSFS_THREAD_SIBLINGS_LIST_LENGTH_MAX' undeclared (first use in this function)
if (virFileReadAll(path, SYSFS_THREAD_SIBLINGS_LIST_LENGTH_MAX, &buf) < 0)
^
../../src/nodeinfo.c:1543:30: note: each undeclared identifier is reported only once for each function it appears in
../../src/nodeinfo.c: In function 'virNodeCapsFillCPUInfo':
../../src/nodeinfo.c:1562:5: error: implicit declaration of function 'virNodeGetCpuValue' [-Werror=implicit-function-declaration]
if ((tmp = virNodeGetCpuValue(SYSFS_CPU_PATH, cpu_id,
^
../../src/nodeinfo.c:1562:5: error: nested extern declaration of 'virNodeGetCpuValue' [-Werror=nested-externs]
../../src/nodeinfo.c:1562:35: error: 'SYSFS_CPU_PATH' undeclared (first use in this function)
if ((tmp = virNodeGetCpuValue(SYSFS_CPU_PATH, cpu_id,
^
cc1: all warnings being treated as errors
* src/nodeinfo.c (virNodeCapsFillCPUInfo): Make conditional.
(virNodeGetSiblingsList): Move into #ifdef linux block.
Signed-off-by: Eric Blake <eblake@redhat.com>
This gets rid of another stat() per volume, as well as cutting
bytes read in half, when populating the volumes of a directory
pool during a pool refresh. Not to mention that it provides an
interface that can let gluster pools also probe file types.
* src/util/virstoragefile.h (virStorageFileProbeFormatFromFD):
Delete.
(virStorageFileProbeFormatFromBuf): New prototype.
(VIR_STORAGE_MAX_HEADER): New constant, based on...
* src/util/virstoragefile.c (STORAGE_MAX_HEAD): ...old name.
(vmdk4GetBackingStore, virStorageFileGetMetadataInternal)
(virStorageFileProbeFormat): Adjust clients.
(virStorageFileProbeFormatFromFD): Delete.
(virStorageFileProbeFormatFromBuf): Export.
* src/storage/storage_backend_fs.c (virStorageBackendProbeTarget):
Adjust client.
* src/libvirt_private.syms (virstoragefile.h): Adjust exports.
Signed-off-by: Eric Blake <eblake@redhat.com>
Future patches will want to learn metadata about a file using
a buffer that was already parsed in order to probe the file's
format. Rather than reopening and re-reading the file, it makes
sense to separate getting file contents from actually parsing
those contents.
* src/util/virstoragefile.c (virStorageFileGetMetadataFromBuf)
(virStorageFileGetMetadataFromFDInternal): New functions.
(virStorageFileGetMetadataInternal): Hoist fstat() and read() into
callers.
(virStorageFileGetMetadataFromFD)
(virStorageFileGetMetadataRecurse): Rework clients.
* src/util/virstoragefile.h (virStorageFileGetMetadataFromBuf):
New prototype.
* src/libvirt_private.syms (virstoragefile.h): Export it.
Signed-off-by: Eric Blake <eblake@redhat.com>
We are calling fstat() at least twice per storage volume in
a directory storage pool; this is rather wasteful. Refactoring
this is also a step towards making code reusable for gluster,
where gluster can provide struct stat but cannot use fstat().
* src/storage/storage_backend.h
(virStorageBackendVolOpenCheckMode)
(virStorageBackendUpdateVolTargetInfoFD): Update signature.
* src/storage/storage_backend.c
(virStorageBackendVolOpenCheckMode): Pass stat results back.
(virStorageBackendUpdateVolTargetInfoFD): Use existing stats.
(virStorageBackendVolOpen, virStorageBackendUpdateVolTargetInfo):
Update callers.
* src/storage/storage_backend_fs.c (virStorageBackendProbeTarget):
Likewise.
* src/storage/storage_backend_scsi.c
(virStorageBackendSCSIUpdateVolTargetInfo): Likewise.
* src/storage/storage_backend_mpath.c
(virStorageBackendMpathUpdateVolTargetInfo): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
Our backing file chain code was not very robust to an ill-timed
EINTR, which could lead to a short read causing us to randomly
treat metadata differently than usual. But the existing
virFileReadLimFD forces an error if we don't read the entire
file, even though we only care about the header of the file.
So add a new virFile function that does what we want.
* src/util/virfile.h (virFileReadHeaderFD): New prototype.
* src/util/virfile.c (virFileReadHeaderFD): New function.
* src/libvirt_private.syms (virfile.h): Export it.
* src/util/virstoragefile.c (virStorageFileGetMetadataInternal)
(virStorageFileProbeFormatFromFD): Use it.
Signed-off-by: Eric Blake <eblake@redhat.com>
'unsigned char *' makes sense if you are doing math on bytes and
don't want to worry about wraparound from a signed 'char'; but
since all we are doing is memcmp() or virReadBufInt*[LB]E(), which
are both safe on either type of char, and since read() prefers to
operate on 'char *', it's simpler to avoid casts by just typing
things as 'char *' from the get-go. [Technically, read can
operate on an 'unsigned char *' thanks to the C rule that any
pointer can be implicitly converted to 'char *' for legacy K&R
compatibility; but where this patch saves us is if we try to use
virfile.h functions that take 'char **' in order to allocate the
buffer, where the compiler would barf on type mismatch.]
* src/util/virstoragefile.c (FileTypeInfo): Avoid unsigned char.
(cowGetBackingStore, qcow2GetBackingStoreFormat)
(qcowXGetBackingStore, qcow1GetBackingStore)
(qcow2GetBackingStore, vmdk4GetBackingStore, qedGetBackingStore)
(virStorageFileMatchesMagic, virStorageFileMatchesVersion)
(virStorageFileProbeFormatFromBuf, qcow2GetFeatures)
(virStorageFileGetMetadataInternal)
(virStorageFileProbeFormatFromFD): Simplify clients.
Signed-off-by: Eric Blake <eblake@redhat.com>
Since the 90139a62 commit the error is copied into mon->lastError but
it's never freed from there.
==31989== 395 bytes in 1 blocks are definitely lost in loss record 877 of 978
==31989== at 0x4A06C2B: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==31989== by 0x7EAF129: strdup (in /lib64/libc-2.15.so)
==31989== by 0x50D586C: virStrdup (virstring.c:554)
==31989== by 0x50976C1: virCopyError (virerror.c:191)
==31989== by 0x5097A35: virCopyLastError (virerror.c:312)
==31989== by 0x114909A9: qemuMonitorIO (qemu_monitor.c:690)
==31989== by 0x509BEDE: virEventPollDispatchHandles (vireventpoll.c:501)
==31989== by 0x509C701: virEventPollRunOnce (vireventpoll.c:648)
==31989== by 0x509A620: virEventRunDefaultImpl (virevent.c:274)
==31989== by 0x520D21C: virNetServerRun (virnetserver.c:1112)
==31989== by 0x11F368: main (libvirtd.c:1513)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
If there's a migration cancelled, the bitmap of migration port should be
cleaned up too.
Signed-off-by: Zeng Junliang <zengjunliang@huawei.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
This resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1020135
If networkAllocateActualDevice() had failed due to a pool of hostdev
or direct devices being depleted, the calling function could still
call networkReleaseActualDevice() as part of its cleanup, and that
function would then unconditionally decrement the connections count
for the network, even though it hadn't been incremented (due to
failure of allocate). This *was* necessary because the .actual member
of the netdef was allocated with a "lazy" algorithm, only being
created if there was a need to store data there (e.g. if a device was
allocated from a pool, or bandwidth was allocated for the device), so
there was no simple way for networkReleaseActualDevice() to tell if
something really had been allocated (i.e. if "connections++" had been
executed).
This patch changes networkAllocateDevice() to *always* allocate an
actual device for any netdef of type='network', even if it isn't
needed for any other reason. This has no ill effects anywhere else in
the code (except for using a small amount of memory), and
networkReleaseActualDevice() can then determine if there was a
previous successful allocate by checking for .actual != NULL (if not,
it skips the "connections--").
Needed for architectures that don't use gcc atomic ops but pthread. This
fixes the armel build that otherwise breaks like:
CCLD virt-login-shell
../src/.libs/libvirt-setuid-rpc-client.a(libvirt_setuid_rpc_client_la-virobject.o): In function `virClassNew':
/«PKGBUILDDIR»/debian/build/src/../../../src/util/virobject.c:150: undefined reference to `virAtomicLock'
../src/.libs/libvirt-setuid-rpc-client.a(libvirt_setuid_rpc_client_la-virobject.o): In function `virObjectNew':
/«PKGBUILDDIR»/debian/build/src/../../../src/util/virobject.c:202: undefined reference to `virAtomicLock'
../src/.libs/libvirt-setuid-rpc-client.a(libvirt_setuid_rpc_client_la-virobject.o): In function `virObjectUnref':
/«PKGBUILDDIR»/debian/build/src/../../../src/util/virobject.c:274: undefined reference to `virAtomicLock'
../src/.libs/libvirt-setuid-rpc-client.a(libvirt_setuid_rpc_client_la-virobject.o): In function `virObjectRef':
/«PKGBUILDDIR»/debian/build/src/../../../src/util/virobject.c:295: undefined reference to `virAtomicLock'
collect2: error: ld returned 1 exit status
See https://buildd.debian.org/status/fetch.php?pkg=libvirt&arch=armel&ver=1.1.4-1&stamp=1383588268
A qcow2 file with a backing file of 'gluster://host/vol/file' should
not try to look for a directory named './gluster:/' in the file system.
* src/util/virstoragefile.c (virBackingStoreIsFile): Broaden check
to include all protocols.
Signed-off-by: Eric Blake <eblake@redhat.com>
Coverity complains that the call to virPCIDeviceDetach() in
qemuPrepareHostdevPCIDevices() doesn't check status return like
other calls. Seems this just was lurking until a recent change
to this module resulted in Coverity looking harder and finding
the issue. Introduced by 'a4efb2e33' when function was called
'pciReAttachDevice()'
Just added a ignore_value() since it doesn't appear to matter
if the call fails since we're on a failure path already.
Currently the LXC container tries to skip selinux/securityfs
mounts if the directory does not exist in the filesystem,
or if SELinux is disabled.
The former check is flawed because the /sys/fs/selinux
or /sys/kernel/securityfs directories may exist in sysfs
even if the mount type is disabled. Instead of just doing
an access() check, use an virFileIsMounted() to see if
the FS is actually present in the host OS. This also
avoids the need to check is_selinux_enabled().
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Some mounts must be skipped if running inside a user namespace,
since the kernel forbids their use. Instead of strcmp'ing the
filesystem type in the body of the loop, set an explicit flag
in the lxcBasicMounts table.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently the lxcBasicMounts array has separate entries for
most mounts, to reflect that we must do a separate mount
operation to make mounts read-only. Remove the duplicate
entries and instead set the MS_RDONLY flag against the main
entry. Then change lxcContainerMountBasicFS to look for the
MS_RDONLY flag, mask it out & do a separate bind mount.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The 'srcpath' variable is initialized from 'mnt->src' and never
changed thereafter. Some places continue to use 'mnt->src' and
others use 'srcpath'. Remove the pointless 'srcpath' variable
and use 'mnt->src' everywhere.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The virLXCBasicMountInfo struct contains a 'char *opts'
field passed onto the mount() syscall. Every entry in the
list sets this to NULL though, so it can be removed to
simplify life.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add a function for efficiently checking if a path is a filesystem
mount point.
NB will not work for bind mounts, only true filesystem mounts.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1018897
If a PCI deivce is not binded to any driver (e.g. there's yet no PCI
driver in the linux kernel) but still users want to passthru the device
we fail the whole operation as we fail to resolve the 'driver' link
under the PCI device sysfs tree. Obviously, this is not a fatal error
and it shouldn't be error at all.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Until now the map was loaded from the XML definition file every time a
operation on the flags was requested. With the introduciton of one shot
initializers we can store the definition forever (as it will never
change) instead of parsing it over and over again.
This makes virCPUx86DataAddCPUID, virCPUx86DataFree, and
virCPUx86MakeData available for direct usage outside of cpu driver in
tests and the new qemu monitor that will request the actual CPU
definition from a running qemu instance.
The function was called in a single place only and was reporting errors
that were later ignored. Use the virNumaGetNodeMemory helper to get the
size of the memory in the NUMA node and remove the helper
aa0f099 introduced a strict error checking for getsockopt and it
revealed that getting a peer credential of a socket on FreeBSD
didn't work. Libvirtd hits the error:
error : virNetSocketGetUNIXIdentity:1198 : Failed to get valid
client socket identity groups
SOL_SOCKET (0xffff) was used as a level of getsockopt for
LOCAL_PEERCRED, however, it was wrong. 0 is correct as well as
Mac OS X.
So for LOCAL_PEERCRED our options are SOL_LOCAL (if defined) or
0 on Mac OS X and FreeBSD. According to the fact, the patch
simplifies the code by removing ifdef __APPLE__.
I tested the patch on FreeBSD 8.4, 9.2 and 10.0-BETA1.
Signed-off-by: Ryota Ozaki <ozaki.ryota@gmail.com>
With Mac OS X 10.9, xdrproc_t is no longer defined as:
typedef bool_t (*xdrproc_t)(XDR *, ...);
but instead as:
typdef bool_t (*xdrproc_t)(XDR *, void *, unsigned int);
For reference, Linux systems typically define it as:
typedef bool_t (*xdrproc_t)(XDR *, void *, ...);
The rationale explained in the header is that using a vararg is
incorrect and has a potential to change the ABI slightly do to compiler
optimizations taken and the undefined behavior. They decided
to specify the exact number of parameters and for compatibility with old
code decided to make the signature require 3 arguments. The third
argument is ignored for cases that its not used and its recommended to
supply a 0.
Rather than casting the virBitmap pointer to uint8_t* and then using
the structure contents as a byte array, use the virBitmap API to determine
the bitmap size and test each bit.
Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
in recently xen commit: 7051d5c8, there is a api changes in
libxl_domain_create_restore.
Author: Andrew Cooper <andrew.cooper3@citrix.com>
Date: Thu Oct 10 12:23:10 2013 +0100
tools/migrate: Fix regression when migrating from older version of Xen
use the macro LIBXL_HAVE_DOMAIN_CREATE_RESTORE_PARAMS in libxl.h
in order to make libvirt could compile with old and new xen.
the params checkpointed_stream is useful if libvirt libxl driver
support migration. for new, set it as zero.
Signed-off-by: Bamvor Jian Zhang <bjzhang@suse.com>
When starting a transient VM the first thing done is to check
for duplicates. The check looks if there are any running VMs
with the matching name/uuid. It explicitly allows there to
be inactive VMs, so that a persistent VM can be temporarily
booted with a different config.
There is a race condition, however, where 2 or more clients
try to create the same transient VM. The first client will
cause a virDomainObjPtr to be added to the domain list, and
it is inactive at this stage. The second client may then
come along and see this inactive VM, and mistake it for a
persistent VM.
If the first VM fails to start its transient guest for any
reason, then it'll remove the virDomainObjPtr from the list.
The second client now has a virDomainObjPtr that it can try
to boot, which libvirt no longer has a record of. The result
can be a running QEMU process that is orphaned.
It was also, however, possible for the virDomainObjPtr to be
completely free'd which will cause libvirtd to crash in some
scenarios.
The fix is to only allow an existing inactive VM if it is
marked as persistent.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Thie patch fixes the segfault:
error : nodeStateInitialize:658 : DBus not available,
disabling HAL driver: internal error: Unable to get DBus
system bus connection: Failed to connect to socket
/var/run/dbus/system_bus_socket: No such file or directory
error : nodeStateInitialize:719 : ?:
Caught Segmentation violation dumping internal log buffer:
This segfault occurs at the below VIR_ERROR:
failure:
if (dbus_error_is_set(&err)) {
VIR_ERROR(_("%s: %s"), err.name, err.message);
When virDBusGetSystemBus fails, the code jumps to the above failure
path. However, the err variable is not correctly initialized
before calling virDBusGetSystemBus. As a result, dbus_error_is_set
may pass over the uninitialized err variable whose name or
message may point to somewhere unknown memory region, which
causes a segfault on VIR_ERROR.
The new code initializes the err variable before calling
virDBusGetSystemBus.
Signed-off-by: Ryota Ozaki <ozaki.ryota@gmail.com>
Include reference of the VM object pointer and name in debug
logs for QEMU start/stop functions. Also make sure we log the
PID that we started, since it isn't available elsewhere in the
logs.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
In debugging a recent oVirt/libvirt race condition, I was very
frustrated by lack of logging in the job enter/exit code. This
patch adds some key data which would have been useful in by
debugging attempts.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Using size_t counts will let us use VIR_APPEND_ELEMENT and friends.
* src/conf/storage_conf.h (_virStoragePoolObjList)
(_virStorageVolDefList): Track list sizes with size_t.
* src/storage/storage_backend_rbd.c
(virStorageBackendRBDRefreshPool): Fix type fallout.
Signed-off-by: Eric Blake <eblake@redhat.com>
To make it easier to forbid future attempts at a confusing typedef
name ending in Ptr that isn't actually a pointer, insist that we
follow our preferred style of 'typedef foo *fooPtr'.
* cfg.mk (sc_forbid_const_pointer_typedef): Enforce consistent
style, to prevent issue fixed in previous storage patch.
* src/conf/capabilities.h (virCapsPtr): Fix offender.
* src/security/security_stack.c (virSecurityStackItemPtr):
Likewise.
* tests/qemucapabilitiestest.c (testQemuDataPtr): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
The following sequence
1. Define a persistent QMEU guest
2. Start the QEMU guest
3. Stop libvirtd
4. Kill the QEMU process
5. Start libvirtd
6. List persistent guests
At the last step, the previously running persistent guest
will be missing. This is because of a race condition in the
QEMU driver startup code. It does
1. Load all VM state files
2. Spawn thread to reconnect to each VM
3. Load all VM config files
Only at the end of step 3, does the 'virDomainObjPtr' get
marked as "persistent". There is therefore a window where
the thread reconnecting to the VM will remove the persistent
VM from the list.
The easy fix is to simply switch the order of steps 2 & 3.
In addition to this though, we must only attempt to reconnect
to a VM which had a non-zero PID loaded from its state file.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The 'error' cleanup block in qemuProcessReconnect() had a
'return' statement in the middle of it. This caused a leak
of virConnectPtr & virQEMUDriverConfigPtr instances. This
was identified because netcf recently started checking its
refcount in libvirtd shutdown:
netcfStateCleanup:109 : internal error: Attempt to close netcf state driver with open connections
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
virDomainObjListLoadAllConfigs sets dom->persistent after
having released its lock on the domain object. This exposes
a possible race condition.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
To ensure proper processing by virGetUserID() and virGetGroupID()
of a uid/gid add a "+" prior to the uid/gid to denote it's really
a uid/gid for the label.
The rbd code had a confusing typedef ending in Ptr that was not
actually a pointer, which made the rest of the code harder to
read. This fixes things to actually pass by pointer rather than
by copy.
* src/storage/storage_backend_rbd.c (virStorageBackendStatePtr):
Fix typedef.
(virStorageBackendRBDOpenRADOSConn)
(virStorageBackendRBDCloseRADOSConn)
(volStorageBackendRBDRefreshVolInfo)
(virStorageBackendRBDRefreshPool, virStorageBackendRBDDeleteVol)
(virStorageBackendRBDCreateVol, virStorageBackendRBDRefreshVol)
(virStorageBackendRBDResizeVol): Fix fallout.
Signed-off-by: Eric Blake <eblake@redhat.com>
When adding an automatically allocated port to a well-formed migration
URI, keep it well-formed:
tcp://1.2.3.4/ -> tcp://1.2.3.4/:12345 # wrong
tcp://1.2.3.4/ -> tcp://1.2.3.4:12345/ # fixed
tcp://1.2.3.4 -> tcp://1.2.3.4:12345 # still works
tcp:1.2.3.4 -> tcp:1.2.3.4:12345 # still works (old syntax)
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
Expand the "secmodel" XML fragment of "host" with a sequence of
baselabel's which describe the default security context used by
libvirt with a specific security model and virtualization type:
<secmodel>
<model>selinux</model>
<doi>0</doi>
<baselabel type='kvm'>system_u:system_r:svirt_t:s0</baselabel>
<baselabel type='qemu'>system_u:system_r:svirt_tcg_t:s0</baselabel>
</secmodel>
<secmodel>
<model>dac</model>
<doi>0</doi>
<baselabel type='kvm'>107:107</baselabel>
<baselabel type='qemu'>107:107</baselabel>
</secmodel>
"baselabel" is driver-specific information, e.g. in the DAC security
model, it indicates USER_ID:GROUP_ID.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Merge the functions 'virSecurityDACSetUser' and
'virSecurityDACSetGroup' into 'virSecurityDACSetUserAndGroup'.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
The lxcContainerSetID() method prints a misleading log
message about setting the uid/gid when no ID map is
present in the XML config. Skip the debug message in
this case.
Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com>
Commit '922b7fda' resulted in two DEADCODE warnings from Coverity in
remoteDispatchAuthPolkit and virAccessDriverPolkitFormatProcess.
Commit '604ae657' modified the daemon.c code to remove the deadcode
issue, but did not do so for viracessdriverpolkit.c. This just mimics
the same changes
According to the following valgrind output, there seems to be a
invalid limit for the iterator (captured on Fedora 19):
==3945== Invalid read of size 1
==3945== at 0x1E1FA410: libxlVmStart (libxl_driver.c:475)
==3945== by 0x1E1FAD9A: libxlDomainCreateWithFlags (libxl_driver.c:2633)
==3945== by 0x5187D46: virDomainCreate (libvirt.c:9439)
==3945== by 0x13BAA6: remoteDispatchDomainCreateHelper (remote_dispatch.h:2910)
==3945== by 0x51DE5B9: virNetServerProgramDispatch (virnetserverprogram.c:435)
==3945== by 0x51D93E7: virNetServerHandleJob (virnetserver.c:165)
==3945== by 0x50F5BF4: virThreadPoolWorker (virthreadpool.c:144)
==3945== by 0x50F5670: virThreadHelper (virthreadpthread.c:161)
==3945== by 0x8046C52: start_thread (pthread_create.c:308)
==3945== by 0x8758E1C: clone (clone.S:113)
==3945== Address 0x23424d81 is 0 bytes after a block of size 1 alloc'd
==3945== at 0x4A08121: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==3945== by 0x50B1F8C: virAllocN (viralloc.c:189)
==3945== by 0x1E1FA3CA: libxlVmStart (libxl_driver.c:468)
==3945== by 0x1E1FAD9A: libxlDomainCreateWithFlags (libxl_driver.c:2633)
==3945== by 0x5187D46: virDomainCreate (libvirt.c:9439)
==3945== by 0x13BAA6: remoteDispatchDomainCreateHelper (remote_dispatch.h:2910)
==3945== by 0x51DE5B9: virNetServerProgramDispatch (virnetserverprogram.c:435)
==3945== by 0x51D93E7: virNetServerHandleJob (virnetserver.c:165)
==3945== by 0x50F5BF4: virThreadPoolWorker (virthreadpool.c:144)
==3945== by 0x50F5670: virThreadHelper (virthreadpthread.c:161)
==3945== by 0x8046C52: start_thread (pthread_create.c:308)
==3945== by 0x8758E1C: clone (clone.S:113)
==3945==
Related: https://bugzilla.redhat.com/show_bug.cgi?id=1013045
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Currently, we ignore whether dnsmasqCapsRefresh succeeds or fails. We
shouldn't do that as we may generate wrong dnsmasq command line (what
is done just a few lines below).
Signed-off-by: Hongwei Bi <hwbi2008@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
While LOCAL_PEERCRED on the BSDs does not return the pid information of
the peer, Mac OS X 10.8 added LOCAL_PEERPID to retrieve the pid so we
should use that when its available to get that information.
There are still two places where we are using 1bit width unsigned
integer to store a boolean. There's no real need for this and these
occurrences can be replaced with 'bool'.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
After commit 3e2f27e1, I've noticed build failures of virt-login-shell
when libapparmor-devel is installed on the build host
CCLD virt-login-shell
../src/.libs/libvirt-setuid-rpc-client.a(libvirt_setuid_rpc_client_la-vircommand.o):
In function `virExec':
/home/jfehlig/virt/upstream/libvirt/src/util/vircommand.c:653: undefined
reference to `aa_change_profile'
collect2: error: ld returned 1 exit status
I was about to commit an easy fix under the build-breaker rule
(build-fix-1.patch), but thought to extend the notion of SECDRIVER_LIBS
to SECDRIVER_CFLAGS, and use both throughout src/Makefile.am where it
makes sense (build-fix-2.patch).
Should I just stick with the simple fix, or is something along the lines
of patch 2 preferred?
Regards,
Jim
>From a0f35945f3127ab70d051101037e821b1759b4bb Mon Sep 17 00:00:00 2001
From: Jim Fehlig <jfehlig@suse.com>
Date: Mon, 21 Oct 2013 15:30:02 -0600
Subject: [PATCH] build: fix virt-login-shell build with apparmor
With libapparmor-devel installed, virt-login-shell fails to link
CCLD virt-login-shell
../src/.libs/libvirt-setuid-rpc-client.a(libvirt_setuid_rpc_client_la-vircommand.o): In function `virExec':
/home/jfehlig/virt/upstream/libvirt/src/util/vircommand.c:653: undefined reference to `aa_change_profile'
collect2: error: ld returned 1 exit status
Fix by linking libvirt_setuid_rpc_client with previously determined
SECDRIVER_LIBS in src/Makefile.am. While at it, introduce SECDRIVER_CFLAGS
and use both throughout src/Makefile.am where it makes sense.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
HW_PHYSMEM is available on Mac OS X as well as FreeBSD, however,
its resulting value for Mac OS X is 32 bits. Mac OS X provides
HW_MEMSIZE that is 64 bits version of HW_PHYSMEM. We have to use it.
I tested the patch on Mac OS X 10.6.8, 10.7.4, 10.8.5 and FreeBSD 9.2.
Signed-off-by: Ryota Ozaki <ozaki.ryota@gmail.com>
This patch (and the two patches that precede it) resolve:
https://bugzilla.redhat.com/show_bug.cgi?id=1005682
When libvirt was changed to delay the final cleanup of device removal
until the qemu process had signaled it with a DEVICE_DELETED event for
that device, the hostdev removal function
(qemuDomainRemoveHostDevice()) was written to properly handle the
removal of a hostdev that was actually an SRIOV virtual function
(defined with <interface type='hostdev'>). However, the function used
to search for a device matching the alias name provided in the
DEVICE_DELETED message (virDomainDefFindDevice()) would search through
the list of netdevs before hostdevs, so qemuDomainRemoveHostDevice()
was never called; instead the netdev function,
qemuDomainRemoveNetDevice() (which *doesn't* properly cleanup after
removal of <interface type='hostdev'>), was called.
(As a reminder - each <interface type='hostdev'> results in a
virDomainNetDef which contains a virDomainHostdevDef having a parent
type of VIR_DOMAIN_DEVICE_NET, and parent.data.net pointing back to
the virDomainNetDef; both Defs point to the same device info object
(and the info contains the device's "alias", which is used by qemu to
identify the device). The virDomainHostdevDef is added to the domain's
hostdevs list *and* the virDomainNetDef is added to the domain's nets
list, so searching either list for a particular alias will yield a
positive result.)
This function modifies the qemuDomainRemoveNetDevice() to short
circuit itself and call qemu DomainRemoveHostDevice() instead when the
actual device is a VIR_DOMAIN_NET_TYPE_HOSTDEV (similar logic to what
is done in the higher level qemuDomainDetachNetDevice())
Note that even if virDomainDefFindDevice() changes in the future so
that it finds the hostdev entry first, the current code will continue
to work properly.
This function was called in three places, and in each the call was
qualified by a slightly different conditional. In reality, this
function should only be called for a hostdev if all of the following
are true:
1) mode='subsystem'
2) type='pci'
3) there is a parent device definition which is an <interface>
(VIR_DOMAIN_DEVICE_NET)
We can simplify the callers and make them more consistent by checking
these conditions at the top ov qemuDomainHostdevNetConfigRestore and
returning 0 if one of them isn't satisfied.
The location of the call to qemuDomainHostdevNetConfigRestore() has
also been changed in the hot-plug case - it is moved into the caller
of its previous location (i.e. from qemuDomainRemovePCIHostDevice() to
qemuDomainRemoveHostDevice()). This was done to be more consistent
about which functions pay attention to whether or not this is one of
the special <interface> hostdevs or just a normal hostdev -
qemuDomainRemoveHostDevice() already contained a call to
networkReleaseActualDevice() and virDomainNetDefFree(), so it makes
sense for it to also handle the resetting of the device's MAC address
and vlan tag (which is what's done by
qemuDomainHostdevNetConfigRestore()).
Avoid people introducing security flaws in their apps by
forbidding the use of libvirt.so in setuid programs, with
a check in virInitialize.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Most of the usage of getuid()/getgid() is in cases where we are
considering what privileges we have. As such the code should be
using the effective IDs, not real IDs.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
We already have stubs for getuid, geteuid, getgid but
not for getegid. Something in gnulib already does a
check for it during configure, so we already have the
HAVE_GETEGID macro defined.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
We don't want setuid programs automatically spawning libvirtd,
so disable any use of autostart when setuid.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
We don't know enough about quality of external libraries used
for non-UNIX transports, nor do we want to spawn external
commands when setuid. Restrict to the bare minimum which is
UNIX transport for local usage. Users shouldn't need to be
running setuid if connecting to remote hypervisors in any
case.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The use of getenv is typically insecure, and we want people
to use our wrappers, to force them to think about setuid
needs.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Unconditional use of getenv is not secure in setuid env.
While not all libvirt code runs in a setuid env (since
much of it only exists inside libvirtd) this is not always
clear to developers. So make all the code paranoid, even
if it only ever runs inside libvirtd.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>