The commit 1b854c76 introduced a new function 'virPolkitCheckAuth' and
in the #else section when you don't have polkit all attributes should be
follwed by ATTRIBUTE_UNUSED.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Spawning the pkcheck program every time a permission check is
required is hugely expensive on CPU. The pkcheck program is just
a dumb wrapper for the DBus API, so rewrite the code to use the
DBus API directly. This also simplifies error handling a bit.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Update virNetServerClientCreateIdentity and virIdentityGetSystem
to use the new typesafe APIs for setting identity attributes
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
There are now two places in libvirt which use polkit. Currently
they use pkexec, which is set to be replaced by direct DBus API
calls. Add a common API which they will both be able to use for
this purpose.
No tests are added at this time, since the impl will be gutted
in favour of a DBus API call shortly.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Clean up all _virDomainInterfaceStats.
Signed-off-by: Wang Yufei <james.wangyufei@huawei.com>
Signed-off-by: Wang Rui <moon.wangrui@huawei.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
virStorageSourceInitChainElement initializes a new storage chain element
for use as a new disk source. If the new element doesn't contain the
driver name, copy it from the old source.
This fixes issue where a disk would forget the driver after a snapshot.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1140984
Currently, the setns() wrapper is supported only for x86_64 and i686
which leaves us failing to build on other platforms like arm, aarch64
and so on. This means, that the wrapper needs to be extended to those
platforms and make to fail on runtime not compile time.
The syscall numbers for other platforms was fetched using this
command:
kernel.git $ git grep "define.*__NR_setns" | grep -e arm -e powerpc -e s390
arch/arm/include/uapi/asm/unistd.h:#define __NR_setns (__NR_SYSCALL_BASE+375)
arch/arm64/include/asm/unistd32.h:#define __NR_setns 375
arch/powerpc/include/uapi/asm/unistd.h:#define __NR_setns 350
arch/s390/include/uapi/asm/unistd.h:#define __NR_setns 339
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The backing store string location offset 0 determines that the file
isn't present. The string size shouldn't be then checked:
from qemu.git/docs/specs/qcow2.txt
== Header ==
The first cluster of a qcow2 image contains the file header:
Byte 0 - 3: magic
QCOW magic string ("QFI\xfb")
4 - 7: version
Version number (valid values are 2 and 3)
8 - 15: backing_file_offset
Offset into the image file at which the backing file name
is stored (NB: The string is not null terminated). 0 if the
image doesn't have a backing file.
16 - 19: backing_file_size
Length of the backing file name in bytes. Must not be
longer than 1023 bytes. Undefined if the image doesn't have
a backing file. ^^^^^^^^^
This patch intentionally leaves the backing file string size check in
place in case a malformatted file would be presented to libvirt. Also
according to the docs the string size is maximum 1023 bytes, thus this
patch adds a check to verify that.
I was also able to verify that the check was done the same way in the
legacy qcow fromat (in qemu's code).
Coverity complains that because of how 'offset' is initialized to
0 (zero), the resulting math and comparison on rem is pointless.
According to the origin commit id '3ec128989', the code is a
replacement for gmtime(), but without the localtime() or GMT
calculations - so just remove this code and add a comment
indicating the removal
With the virGetGroupList() change in place - Coverity further complains
that if we fail to virFork(), the groups will be leaked - which aha seems
to be the case. Adjust the logic to save off the -errno, free the groups,
and then return the value we saved
Signed-off-by: John Ferlan <jferlan@redhat.com>
This ends up being a very bizarre false positive. With an assist from
eblake, the claim is that mgetgroups() could return a -1 value, but yet
still have a groups buffer allocated, yet the example shown doesn't
seem to prove that.
Rather than fret about it, by adding a well placed sa_assert() on the
returned *list value we can "assure" ourselves that the mgetgroups()
failure path won't signal this condition.
Signed-off-by: John Ferlan <jferlan@redhat.com>
To express empty drive we historically use storage source with empty
path. Unfortunately NBD disks may be declared without a path.
Add a helper to wrap this logic.
Test suites using the port allocator don't want to have different
behaviour depending on whether a port is in use on the host. Add
a VIR_PORT_ALLOCATOR_SKIP_BIND_CHECK which test suites can use
to skip the bind() test. The port allocator will thus only track
ports in use by the test suite process itself. This is fine when
using the port allocator to generate guest configs which won't
actually be launched
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Perhaps a false positive, but since Coverity doesn't understand the
relationship between the 'count' and the 'strings', rather than leave
the chance the on input 'strings' is NULL and causes a deref - just
check for it and return
Signed-off-by: John Ferlan <jferlan@redhat.com>
Adjust the parentheses in/for the waitpid loops; otherwise, Coverity
points out:
(1) Event assignment: Assigning: "waitret" = "waitpid(pid, &status, 0) == -1"
(2) Event between: At condition "waitret == -1", the value of "waitret"
must be between 0 and 1.
(3) Event dead_error_condition: The condition "waitret == -1" cannot
be true.
(4) Event dead_error_begin: Execution cannot reach this statement:
"ret = -*__errno_location();".
Signed-off-by: John Ferlan <jferlan@redhat.com>
From time to time weird bugreports occur on the list, e.g [1].
Even though the kernel supports setns syscall, there's an older
glibc in the system that misses a wrapper over the syscall.
Hence, after the configure phase we think there's no setns
support in the system, which is obviously wrong. On the other
hand, we can't rely on linux distributions to provide newer glibc
soon. Therefore we need to introduce the wrapper on or own.
1: https://www.redhat.com/archives/libvir-list/2014-September/msg00492.html
Signed-off-by: Stephan Sachse <ste.sachse@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
virProcessTranslateStatus is used on error paths that should not spoil
the returned error. As the errors are ignored, use the quiet versions of
virAsprintf to create the message.
Our style overwhelmingly uses hanging braces (the open brace
hangs at the end of the compound condition, rather than on
its own line), with the primary exception of the top level function
body. Fix the few remaining outliers, before adding a syntax
check in a later patch.
* src/interface/interface_backend_netcf.c (netcfStateReload)
(netcfInterfaceClose, netcf_to_vir_err): Correct use of { in
compound statement.
* src/conf/domain_conf.c (virDomainHostdevDefFormatSubsys)
(virDomainHostdevDefFormatCaps): Likewise.
* src/network/bridge_driver.c (networkAllocateActualDevice):
Likewise.
* src/util/virfile.c (virBuildPathInternal): Likewise.
* src/util/virnetdev.c (virNetDevGetVirtualFunctions): Likewise.
* src/util/virnetdevmacvlan.c
(virNetDevMacVLanVPortProfileCallback): Likewise.
* src/util/virtypedparam.c (virTypedParameterAssign): Likewise.
* src/util/virutil.c (virGetWin32DirectoryRoot)
(virFileWaitForDevices): Likewise.
* src/vbox/vbox_common.c (vboxDumpNetwork): Likewise.
* tests/seclabeltest.c (main): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
Commit 0e1a1a8c introduced umask for virCommand, but the variables
used emit a warning on older compilers about shadowed global
declaration.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Add umask to _virCommand, allow user to set umask to command.
Set umask(002) to qemu process to overwrite the default umask
of 022 set by many distros, so that unix sockets created for
virtio-serial has expected permissions.
Fix problem reported here:
https://sourceware.org/bugzilla/show_bug.cgi?id=13078#c11https://bugzilla.novell.com/show_bug.cgi?id=888166
To use virtio-serial device, unix socket created for chardev with
default umask(022) has insufficient permissions.
e.g.:
-device virtio-serial \
-chardev socket,path=/tmp/foo,server,nowait,id=foo \
-device virtserialport,chardev=foo,name=org.fedoraproject.port.0
srwxr-xr-x 1 qemu qemu 0 21. Jul 14:19 /tmp/somefile.sock
Other users in the same group (like real user, test engines, etc)
cannot write to this socket.
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Currently, there is one flag passed in during macvtap creation
(withTap) -- Let's convert this field to an unsigned int flag
field for future expansion.
Signed-off-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
That sets a new flag, but that flag does mean the child will get
LISTEN_FDS and LISTEN_PID environment variables properly set and
passed FDs reordered so that it corresponds with LISTEN_FDS (they must
start right after STDERR_FILENO).
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Since not only systemd can do this (we'll be doing it as well few
patches later), change 'systemd' to 'caller' and fix LISTEN_FDS to
LISTEN_PID where applicable.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
In case the host has 2 or more NUMA nodes, we fetch CPU map for each
node. However, we need to free the CPU map in between loops:
==29513== 96 (72 direct, 24 indirect) bytes in 3 blocks are definitely lost in loss record 951 of 1,264
==29513== at 0x4C2A700: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==29513== by 0x52AD24B: virAlloc (viralloc.c:144)
==29513== by 0x52AF0E6: virBitmapNew (virbitmap.c:78)
==29513== by 0x52FB720: virNumaGetNodeCPUs (virnuma.c:294)
==29513== by 0x53C700B: nodeCapsInitNUMA (nodeinfo.c:1886)
==29513== by 0x11759708: vboxCapsInit (vbox_common.c:398)
==29513== by 0x11759CC4: vboxConnectOpen (vbox_common.c:514)
==29513== by 0x53C965F: do_open (libvirt.c:1147)
==29513== by 0x53C9EBC: virConnectOpen (libvirt.c:1317)
==29513== by 0x142905: remoteDispatchConnectOpen (remote.c:1215)
==29513== by 0x126ADF: remoteDispatchConnectOpenHelper (remote_dispatch.h:2346)
==29513== by 0x5453D21: virNetServerProgramDispatchCall (virnetserverprogram.c:437)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Otherwise we fail like
libvirt version: 1.2.7, package: 6 (root 2014-08-08-16:09:22 bogon)
virAuditOpen:62 : Unable to initialize audit layer: Protocol not supported
virFileGetDefaultHugepageSize:2958 : internal error: Unable to parse /proc/meminfo
virStateInitialize:749 : Initialization of QEMU state driver failed: internal error: Unable to parse /proc/meminfo
daemonRunStateInit:922 : Driver state initialization failed
if the data can't be determined.
Reference: http://bugs.debian.org/757609
This fixes compilation on kFreeBSD which otherwise fails like
CC util/libvirt_util_la-virprocess.lo
In file included from /usr/include/sys/cpuset.h:35:0,
from util/virprocess.c:43:
/usr/include/sys/_cpuset.h:49:43: error: 'NBBY' undeclared here (not in
a function)
long __bits[howmany(CPU_SETSIZE, _NCPUBITS)];
^
In file included from util/virprocess.c:43:0:
/usr/include/sys/cpuset.h:215:12: error: unknown type name 'cpusetid_t'
int cpuset(cpusetid_t *);
^
/usr/include/sys/cpuset.h:216:30: error: expected ')' before 'id_t'
int cpuset_setid(cpuwhich_t, id_t, cpusetid_t);
^
/usr/include/sys/cpuset.h:217:42: error: expected ')' before 'id_t'
int cpuset_getid(cpulevel_t, cpuwhich_t, id_t, cpusetid_t *);
^
/usr/include/sys/cpuset.h:218:48: error: expected ')' before 'id_t'
int cpuset_getaffinity(cpulevel_t, cpuwhich_t, id_t, size_t, cpuset_t
*);
^
/usr/include/sys/cpuset.h:219:48: error: expected ')' before 'id_t'
int cpuset_setaffinity(cpulevel_t, cpuwhich_t, id_t, size_t, const
cpuset_t *);
And it's the correct usage as documented in
http://www.freebsd.org/cgi/man.cgi?query=cpuset_setid
Also change the #ifdef HAVE_BSH_CPU_AFFINITY to #if for consistency.
Valgrind caught a memory leak:
==2018== 9 bytes in 1 blocks are definitely lost in loss record 143 of 927
==2018== at 0x4A0645D: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==2018== by 0x8C42369: strdup (strdup.c:42)
==2018== by 0x50EACC9: virStrdup (virstring.c:676)
==2018== by 0x50E79E5: virStorageSourceCopy (virstoragefile.c:1845)
==2018== by 0x20A3FAA7: qemuDomainBlockCommit (qemu_driver.c:15620)
==2018== by 0x51DC6B2: virDomainBlockCommit (libvirt.c:20092)
I traced it to the fact that blockcopy and blockcommit end up
reparsing a backing chain on pivot, but the chain parsing code
doesn't gracefully handle the case where the backing file is
already known.
I'm not exactly sure when this was introduced, but suspect that the
refactoring in commit 9944b71 and friends that moved towards probing
in-place rather than into a temporary structure are part of the cause.
* src/util/virstoragefile.c (virStorageFileGetMetadataInternal):
Don't leak any prior value.
Signed-off-by: Eric Blake <eblake@redhat.com>
Since commit be0782e1 we are parsing /proc/meminfo to find out the
default huge page size. However, if the host we are running at does
not support any huge pages (e.g. CONFIG_HUGETLB_PAGE is turned off),
we will not successfully parse the meminfo file and hence the whole
qemu driver init process fails. Moreover, the default huge page size
is needed if and only if there's at least one hugetlbfs mount point.
So the fix consists of moving the virFileGetDefaultHugepageSize
function call after the first hugetlbfs mount point is found.
With this fix, we fail to start with one or more hugetlbfs mounts and
malformed meminfo file, but that's expected (how can one mount
hugetlbfs without kernel supporting huge pages?). Workaround in that
case is to umount all the hugetlbfs mounts.
Reported-by: Jim Fehlig <jfehlig@suse.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Otherwise this beautiful error would be overwritten when
the function is called with a really high rate number:
2014-07-28 12:51:47.920+0000: 2304: error : virCommandWait:2399 :
internal error: Child process (/sbin/tc class add dev vnet0 parent 1:
classid 1:1 htb rate 4294968kbps) unexpected exit status 1: Illegal "rate"
Usage: ... qdisc add ... htb [default N] [r2q N]
default minor id of class to which unclassified packets are sent {0}
r2q DRR quantums are computed as rate in Bps/r2q {10}
debug string of 16 numbers each 0-3 {0}
... class add ... htb rate R1 [burst B1] [mpu B] [overhead O]
[prio P] [slot S] [pslot PS]
[ceil R2] [cburst B2] [mtu MTU] [quantum Q]
rate rate allocated to this class (class can still borrow)
burst max bytes burst which can be accumulated during idle period {computed}
mpu minimum packet size used in rate computations
overhead per-packet size overhead used in rate computations
linklay adapting to a linklayer e.g. atm
ceil definite upper class rate (no borrows) {rate}
cburst burst but for ceil {computed}
mtu max packet size we create rate map for {1600}
prio priority of leaf; lowe
https://bugzilla.redhat.com/show_bug.cgi?id=1043735
Cygwin has getifaddrs(), but not AF_LINK, leading to:
util/virstats.c: In function 'virNetInterfaceStats':
util/virstats.c:138:41: error: 'AF_LINK' undeclared (first use in this function)
if (ifa->ifa_addr->sa_family != AF_LINK)
...
* src/util/virstats.c (virNetInterfaceStats): Only use getifaddrs
if AF_LINK is present.
Signed-off-by: Eric Blake <eblake@redhat.com>
This should iterate over mount tab and search for hugetlbfs among with
looking for the default value of huge pages.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Leak introduced in commit 16ebf10f (v1.2.6), detected by valgrind:
==9816== 216 (96 direct, 120 indirect) bytes in 6 blocks are definitely lost in loss record 665 of 821
==9816== at 0x4A081D4: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==9816== by 0x50836FB: virAlloc (viralloc.c:144)
==9816== by 0x1DBDBE27: udevProcessPCI (node_device_udev.c:546)
==9816== by 0x1DBDD79D: udevGetDeviceDetails (node_device_udev.c:1293)
* src/util/virpci.h (virPCIEDeviceInfoFree): New prototype.
* src/util/virpci.c (virPCIEDeviceInfoFree): New function.
* src/conf/node_device_conf.c (virNodeDevCapsDefFree): Clear
pci_express under pci case.
(virNodeDevCapPCIDevParseXML): Avoid leak.
* src/node_device/node_device_udev.c (udevProcessPCI): Likewise.
* src/libvirt_private.syms (virpci.h): Export it.
Signed-off-by: Eric Blake <eblake@redhat.com>
Finding virPCIE* code is more intuitive if located in virpci.h
instead of node_device_conf.h.
* src/conf/node_device_conf.h (virPCIELinkSpeed, virPCIELink)
(virPCIEDeviceInfo): Move...
* src/util/virpci.h: ...here.
* src/conf/node_device_conf.c (virPCIELinkSpeed): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
virTimeFieldsThenRaw will never return negative result, so I clean up
the related meaningless judgements to make it better.
Signed-off-by: James <james.wangyufei@huawei.com>
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Create the structures and API's to hold and manage the iSCSI host device.
This extends the 'scsi_host' definitions added in commit id '5c811dce'.
A future patch will add the XML parsing, but that code requires some
infrastructure to be in place first in order to handle the differences
between a 'scsi_host' and an 'iSCSI host' device.
Split virDomainHostdevSubsysSCSI further. In preparation for having
either SCSI or iSCSI data, create a union in virDomainHostdevSubsysSCSI
to contain just a virDomainHostdevSubsysSCSIHost to describe the
'scsi_host' host device
Added <capabilities> in the <features> section of LXC domains
configuration. This section can contain elements named after the
capabilities like:
<mknod state="on"/>, keep CAP_MKNOD capability
<sys_chroot state="off"/> drop CAP_SYS_CHROOT capability
Users can restrict or give more capabilities than the default using
this mechanism.
Under ./configure --without-numactl but with numactl-devel installed,
the build fails with:
../../src/util/virnuma.c: In function 'virNumaNodeIsAvailable':
../../src/util/virnuma.c:407:5: error: implicit declaration of function 'numa_bitmask_isbitset' [-Werror=implicit-function-declaration]
return numa_bitmask_isbitset(numa_nodes_ptr, node);
^
and other failures, all because the configure results for particular
functions were used without regard to whether libnuma was even being
linked in.
* src/util/virnuma.c (virNumaGetPages): Fix message typo.
(virNumaNodeIsAvailable): Correct build when not using numactl.
Signed-off-by: Eric Blake <eblake@redhat.com>
Commit ef48a1b introduced virFindSCSIHostByPCI for Linux and
a stub for other platforms that returns -1 while the function
should return 'char *', so use 'return NULL' instead.
Commit fbd91d4 introduced virReadSCSIUniqueId with the third
argument 'int *result', however the stub for non-Linux patform
uses 'unsigned int *result', so change it to 'int *result'.
Pushed under the build breaker rule.
Introduce a new function to parse the provided scsi_host parent address
and unique_id value in order to find the /sys/class/scsi_host directory
which will allow a stable SCSI host address
Add a test to scsihosttest to lookup the host# name by using the PCI address
and unique_id value
Introduce a new function to read the current scsi_host entry and return
the value found in the 'unique_id' file.
Add a 'scsihosttest' test (similar to the fchosttest, but incorporating some
of the concepts of the mocked pci test library) in order to read the
unique_id file like would be found in the /sys/class/scsi_host tree.
We do so in the vast majority of places, so there's no problem of adding
the attribute to enforce it by the complier and fix a few leftover
places.
This was originally pointed out by Coverity as a recent change triggered
it's warning that our code checked the vast majority of returns from
virStrToLong_ui.
There's no need to use it since we have this shiny functions
that even checks for conversion and overflow errors.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1091866
Add a new boolean 'sparse'. This will be used by the logical backend
storage driver to determine whether the target volume is sparse or not
(also known by a snapshot or thin logical volume). Although setting sparse
to true at creation could be seen as duplicitous to setting during
virStorageBackendLogicalMakeVol() in case there are ever other code paths
between Create and FindLVs that need to know about the volume be sparse.
Use the 'sparse' in a new virStorageBackendLogicalVolWipe() to decide whether
to attempt to wipe the logical volume or not. For now, I have found no
means to wipe the volume without writing to it. Writing to the sparse
volume causes it to be filled. A sparse logical volume is not completely
writeable as there exists metadata which if overwritten will cause the
sparse lv to go INACTIVE which means pool-refresh will not find it.
Access to whatever lvm uses to manage data blocks is not provided by
any API I could find.
There were numerous places where numatune configuration (and thus
domain config as well) was changed in different ways. On some
places this even resulted in persistent domain definition not to be
stable (it would change with daemon's restart).
In order to uniformly change how numatune config is dealt with, all
the internals are now accessible directly only in numatune_conf.c and
outside this file accessors must be used.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Since there was already public virDomainNumatune*, I changed the
private virNumaTune to match the same, so all the uses are unified and
public API is kept:
s/vir\(Domain\)\?Numa[tT]une/virDomainNumatune/g
then shrunk long lines, and mainly functions, that were created after
that:
sed -i 's/virDomainNumatuneMemPlacementMode/virDomainNumatunePlacement/g'
And to cope with the enum name, I haad to change the constants as
well:
s/VIR_NUMA_TUNE_MEM_PLACEMENT_MODE/VIR_DOMAIN_NUMATUNE_PLACEMENT/g
Last thing I did was at least a little shortening of already long
name:
s/virDomainNumatuneDef/virDomainNumatune/g
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
There are many places with numatune-related code that should be put
into special numatune_conf and this patch creates a basis for that.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Now that we've finally fixed all the violators, it's time to
enforce that any pointer to a const object is never freed (it
is aliasing some other memory, where the non-const original
should be freed instead). Alas, the code still needs a normal
vs. Coverity version, but at least we are still guaranteeing
that the macro call evaluates its argument exactly once.
I verified that we still get the following compiler warnings,
which in turn halts the build thanks to -Werror on gcc (hmm,
gcc 4.8.3's placement of the ^ for ?: type mismatch is a bit
off, but that's not our problem):
int oops1 = 0;
VIR_FREE(oops1);
const char *oops2 = NULL;
VIR_FREE(oops2);
struct blah { int dummy; } oops3;
VIR_FREE(oops3);
util/virauthconfig.c:159:35: error: pointer/integer type mismatch in conditional expression [-Werror]
VIR_FREE(oops1);
^
util/virauthconfig.c:161:5: error: passing argument 1 of 'virFree' discards 'const' qualifier from pointer target type [-Werror]
VIR_FREE(oops2);
^
In file included from util/virauthconfig.c:28:0:
util/viralloc.h:79:6: note: expected 'void *' but argument is of type 'const void *'
void virFree(void *ptrptr) ATTRIBUTE_NONNULL(1);
^
util/virauthconfig.c:163:35: error: type mismatch in conditional expression
VIR_FREE(oops3);
^
* src/util/viralloc.h (VIR_FREE): No longer cast away const.
* src/xenapi/xenapi_utils.c (xenSessionFree): Work around bogus
header.
Signed-off-by: Eric Blake <eblake@redhat.com>
Add 'nocow' to storage volume xml so that user can have an option
to set NOCOW flag to the newly created volume. It's useful on btrfs
file system to enhance performance.
Btrfs has low performance when hosting VM images, even more when the guest
in those VM are also using btrfs as file system. One way to mitigate this
bad performance is to turn off COW attributes on VM files. Generally, there
are two ways to turn off COW on btrfs: a) by mounting fs with nodatacow,
then all newly created files will be NOCOW. b) per file. Add the NOCOW file
attribute. It could only be done to empty or new files.
This patch tries the second way, according to 'nocow' option, it could set
NOCOW flag per file:
for raw file images, handle 'nocow' in libvirt code; for non-raw file images,
pass 'nocow=on' option to qemu-img, and let qemu-img to handle that (requires
qemu-img version >= 2.1).
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Rename linuxDomainInterfaceStats to virNetInterfaceStats in order
to allow adding platform specific implementations without
making consumer worrying about specific implementation to be used.
Also, rename util/virstatslinux.c to util/virstats.c so placing
other platform specific implementations into this file don't
look unexpected from the file name.
If the openvswitch service is stopped, and is followed by destroying a
VM, the openvswitch bridge translates into a state where it doesn't
recover the port configuration. While it successfully fetches data
from the internal DB, since the corresponding virtual interface does
not exists anymore the whole recovery process fails leaving restarted
VM with inability to connect to the bridge. The following set of
commands will trigger the problem:
virsh start vm
service openvswitch-switch stop
virsh destroy vm
service openvswitch-switch start
virsh start vm
Signed-off-by: Chunhe Li <lichunhe@huawei.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This negation in names of boolean variables is driving me insane. The
code is much more readable if we drop the 'no-' prefix. Well, at least
for me.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The commit referenced above changed function arguments of
virStorageFileGetMetadataFromBuf() but didn't tweak the
ATTRIBUTE_NONNULL tied to them. This was caught by coverity as it
actually obeys them. We disabled them for GCC and thus it didn't show
up.
Additionally in commit 3ea661deea I passed
NULL to the backingFormat argument which was also marked as nonnull. Use
a dummy int's address when the argument isn't supplied so that the code
doesn't need to change much.
When dispatching events from the event loop, the array of registered
handles is searched to see what handles happened an event on. However,
the array is searched in weird way: the check for the array boundaries
is at the end, so we may touch the elements after the end of the
array:
==10434== Invalid read of size 4
==10434== at 0x52D06B6: virEventPollDispatchHandles (vireventpoll.c:486)
==10434== by 0x52D10E4: virEventPollRunOnce (vireventpoll.c:660)
==10434== by 0x52CF207: virEventRunDefaultImpl (virevent.c:308)
==10434== by 0x1639D1: virNetServerRun (virnetserver.c:1139)
==10434== by 0x1220DC: main (libvirtd.c:1507)
==10434== Address 0xc11ff04 is 4 bytes after a block of size 960 alloc'd
==10434== at 0x4C2CA5E: realloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==10434== by 0x52AD378: virReallocN (viralloc.c:245)
==10434== by 0x52AD46E: virExpandN (viralloc.c:294)
==10434== by 0x52AD5B1: virResizeN (viralloc.c:352)
==10434== by 0x52CF2EC: virEventPollAddHandle (vireventpoll.c:116)
==10434== by 0x52CEF5B: virEventAddHandle (virevent.c:78)
==10434== by 0x11F69A90: nodeStateInitialize (node_device_udev.c:1797)
==10434== by 0x53C3C89: virStateInitialize (libvirt.c:743)
==10434== by 0x120563: daemonRunStateInit (libvirtd.c:919)
==10434== by 0x5317719: virThreadHelper (virthread.c:197)
==10434== by 0x8376F39: start_thread (in /lib64/libpthread-2.17.so)
==10434== by 0x8A7F9FC: clone (in /lib64/libc-2.17.so)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Commit a48f445100 introduced a helper
function to convert cgroup device mode to string. The function was only
conditionally compiled on platforms that support cgroup. This broke the
build when attempting to export the symbol:
CCLD libvirt.la
Cannot export virCgroupGetDevicePermsString: symbol not defined
Move the function out of the ifdef, as it doesn't really depend on the
cgroup code being present.
Cgroups code uses VIR_CGROUP_DEVICE_* flags to specify the mode but in
the end it needs to be converted to a string. Add a helper to do it and
use it in the cgroup code before introducing it into the rest of the
code.
When discovering a disk backing chain the parent disk's metadata need to
be populated into the guest images so that each piece of the backing
chain contains a copy of those. This will allow us to refactor the
security driver so that it will not need to carry around the original
disk definition.
We are going to modify storage source chains in place. Add a helper that
will copy relevant information such as security labels to the new
element if that doesn't contain it.
In the future we might need to track state of individual images. Move
the readonly and shared flags to the virStorageSource struct so that we
can keep them in a per-image basis.
The qemu block info function relied on working with local storage. Break
this assumption by adding support for remote volumes. Unfortunately we
still need to take a hybrid approach as some of the operations require a
filedescriptor.
Previously you'd get:
$ virsh domblkinfo gl vda
error: cannot stat file '/img10': Bad file descriptor
Now you get some stats:
$ virsh domblkinfo gl vda
Capacity: 10485760
Allocation: 197120
Physical: 197120
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1110198
To allow reusing this function in the qemu driver we need to allow
specifying the storage format. Also separate return of the backing store
path now isn't necessary.
There's a lot of places where we skip doing actions based on the
locality of given storage type. The usual pattern is to skip it if:
virStorageSourceGetActualType(src) == VIR_STORAGE_TYPE_NETWORK
Add a simple helper to simplify the pattern to
virStorageSourceIsLocalStorage(src)
Replace the inline "auth" struct in virStorageSource with a pointer
to a virStorageAuthDefPtr and utilize between the domain_conf, qemu_conf,
and qemu_command sources for finding the auth data for a domain disk
Introduce virStorageAuthDef and friends. Future patches will merge/utilize
their view of storage source/pool auth/secret definitions.
New API's include:
virStorageAuthDefParse: Parse the "<auth/>" XML data for either the
domain disk or storage pool returning a
virStorageAuthDefPtr
virStorageAuthDefCopy: Copy a virStorageAuthDefPtr - to be used by
the qemuTranslateDiskSourcePoolAuth when it
copies storage pool auth data into domain
disk auth data
virStorageAuthDefFormat: Common output of the "<auth" in the domain
disk or storage pool XML
virStorageAuthDefFree: Free memory associated with virStorageAuthDef
Subsequent patches will utilize the new functions for the domain disk and
storage pools.
Future work in the hostdev pass through can then make use of common data
structures and code.
Replace:
if (virBufferError(&buf)) {
virBufferFreeAndReset(&buf);
virReportOOMError();
...
}
with:
if (virBufferCheckError(&buf) < 0)
...
This should not be a functional change (unless some callers
misused the virBuffer APIs - a different error would be reported
then)
Check if the buffer is in error state and report an error if it is.
This replaces the pattern:
if (virBufferError(buf)) {
virReportOOMError();
goto cleanup;
}
with:
if (virBufferCheckError(buf) < 0)
goto cleanup;
Document typical buffer usage to favor this.
Also remove the redundant FreeAndReset - if an error has
been set via virBufferSetError, the content is already freed.
virFileReadAll already logs an error. If reading the 'speed' file
fails with EINVAL, we log an error even though we ignore it. If it
fails with other errors, we log two errors.
Use virFileReadAllQuiet - ignore EINVAL and report just one error
in other cases.
Fixes this error on libvirtd startup:
2014-06-30 12:47:14.583+0000: 20971: error : virFileReadAll:1297 :
Failed to read file '/sys/class/net/wlan0/speed': Invalid argument
When CPU comparison APIs return VIR_CPU_COMPARE_INCOMPATIBLE, the caller
has no clue why the CPU is considered incompatible with host CPU. And in
some cases, it would be nice to be able to get such info in a client
rather than having to look in logs.
To achieve this, the APIs can be told to return VIR_ERR_CPU_INCOMPATIBLE
error for incompatible CPUs and the reason will be described in the
associated error message.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The parent directory doesn't necessarily need to be stored after we
don't mangle the path stored in the image. Remove it and tweak the code
to avoid using it.
Store backing chain paths as non-canonical. The canonicalization step
will be already taken. This will allow to avoid storing unnecessary
amounts of data.
Now that we store only relative names in virStorageSource's member
relPath the backingRelative member is obsolete. Remove it and adapt the
code to the removal.
Due to various refactors and compatibility with the virstoragetest the
relPath field of the virStorageSource structure was always filled either
with the relative name or the full path in case of absolutely backed
storage. Return its original purpose to store only the relative name of
the disk if it is backed relatively and tweak the tests.
This patch introduces a function that will allow us to resolve a
relative difference between two elements of a disk backing chain. This
function will be used to allow relative block commit and block pull
where we need to specify the new relative name of the image to qemu.
This patch also adds unit tests for the function to verify that it works
correctly.
virPortAllocatorSetUsed permits to set a port as already used and
prevent the port allocator to use it without any attempt to bind it.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
If we are running on a system that is not capable of huge pages (e.g.
because the kernel is not configured that way) we still try to open
"/sys/kernel/mm/hugepages/" which however does not exist. We should
be tolerant to this specific use case.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
On the Linux kernel, if huge pages are allocated the size they cut off
from memory is accounted under the 'MemUsed' in the meminfo file.
However, we want the sum to be subtracted from 'MemTotal'. This patch
implements this feature. After this change, we can enable reporting
of the ordinary system pages in the capability XML:
<capabilities>
<host>
<uuid>01281cda-f352-cb11-a9db-e905fe22010c</uuid>
<cpu>
<arch>x86_64</arch>
<model>Haswell</model>
<vendor>Intel</vendor>
<topology sockets='1' cores='1' threads='1'/>
<feature/>
<pages unit='KiB' size='4'/>
<pages unit='KiB' size='2048'/>
<pages unit='KiB' size='1048576'/>
</cpu>
<power_management/>
<migration_features/>
<topology>
<cells num='4'>
<cell id='0'>
<memory unit='KiB'>4048248</memory>
<pages unit='KiB' size='4'>748382</pages>
<pages unit='KiB' size='2048'>3</pages>
<pages unit='KiB' size='1048576'>1</pages>
<distances/>
<cpus num='1'>
<cpu id='0' socket_id='0' core_id='0' siblings='0'/>
</cpus>
</cell>
...
</cells>
</topology>
</host>
</capabilities>
You can see the beautiful thing about this: if you sum up all the
<pages/> you'll get <memory/>.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Introduce a common function that will take a callback to resolve links
that will be used to canonicalize paths on various storage systems and
add extensive tests.
To free string lists with some strings stolen from the middle we need to
walk the complete array. Introduce a new helper that takes the string
list size to free such string lists.
virNumaGetPages calls closedir(dir) in cleanup and dir could
be NULL if we jump there from the failed opendir() call.
While it's not harmful on Linux, FreeBSD libc crashes [1], so
make sure that dir is not NULL before calling closedir.
1: http://lists.freebsd.org/pipermail/freebsd-standards/2014-January/002704.html
One of previous commits (e6258a33) tried to build the huge page code
only on Linux since it's Linux centric indeed. But it failed miserably
as it used 'WITH_LINUX' which is an automake conditional not a gcc
one. In the sources we need to use __linux__.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Only three other callers possibly call closedir on a NULL argument.
Even though these probably won't be used on FreeBSD where this crashes,
let's be nice and only call closedir on an actual directory stream.
==== Invalid write of size 4
==== at 0x52E678C: virNumaGetDistances (virnuma.c:479)
==== by 0x5396890: nodeCapsInitNUMA (nodeinfo.c:1796)
==== by 0x203C2B: virQEMUCapsInit (qemu_capabilities.c:960)
==== Address 0xe10a1e0 is 0 bytes after a block of size 0 alloc'd
==== at 0x4C2A6D0: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==== by 0x52A10D6: virAllocN (viralloc.c:191)
==== by 0x52E674D: virNumaGetDistances (virnuma.c:470)
==== by 0x5396890: nodeCapsInitNUMA (nodeinfo.c:1796)
==== by 0x203C2B: virQEMUCapsInit (qemu_capabilities.c:960)
The hugepage sizing and counting code gathers the information from sysfs
and thus isn't portable. Stub it out for non-Linux so that we can report
a better error. This patch also avoids calling sysinfo() on Mingw where
it isn't supported.
So far, we are doing compile time decisions on which architecture is
used. However, for testing purposes it's much easier if we pass host
architecture as parameter and then let the function decide which code
snippet for extracting host CPU info will be used.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The image labels are stored in the virStorageSource struct. Convert the
virDomainDiskDefGetSecurityLabelDef helper not to use the full disk def
and move it appropriately.
For future work we need two functions that fetches total number of
pages and number of free pages for given NUMA node and page size
(virNumaGetPageInfo()).
Then we need to learn pages of what sizes are supported on given node
(virNumaGetPages()).
Note that system page size is disabled at the moment as there's one
issue connected. If you have a NUMA node with huge pages allocated the
kernel would return the normal size of memory for that node. It
basically ignores the fact that huge pages steal size from the system
memory. Until we resolve this, it's safer to not confuse users and
hence not report any system pages yet.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Not on all hosts the set of NUMA nodes IDs is continuous. This is
critical, because our code currently assumes the set doesn't contain
holes. For instance in nodeGetFreeMemory() we can see the following
pattern:
if ((max_node = virNumaGetMaxNode()) < 0)
return 0;
for (n = 0; n <= max_node; n++) {
...
}
while it should be something like this:
if ((max_node = virNumaGetMaxNode()) < 0)
return 0;
for (n = 0; n <= max_node; n++) {
if (!virNumaNodeIsAvailable(n))
continue;
...
}
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
These functions will handle PCIe devices and their link capabilities
to query some info about it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The block commit code looks for an explicit base file relative
to the discovered top file; so for a chain of:
base <- snap1 <- snap2 <- snap3
and a command of:
virsh blockcommit $dom vda --base snap2 --top snap1
we got a sane message (here from libvirt 1.0.5):
error: invalid argument: could not find base 'snap2' below 'snap1' in chain for 'vda'
Meanwhile, recent refactoring has slightly reduced the quality of the
libvirt error messages, by losing the phrase 'below xyz':
error: invalid argument: could not find image 'snap2' in chain for 'snap3'
But we had a one-off, where we were not excluding the top file
itself in searching for the base; thankfully qemu still reports
the error, but the quality is worse:
virsh blockcommit $dom vda --base snap2 --top snap2
error: internal error unable to execute QEMU command 'block-commit': Base '/snap2' not found
Fix the one-off in blockcommit by changing the semantics of name
lookup - if a starting point is specified, then the result must
be below that point, rather than including that point. The only
other call to chain lookup was blockpull code, which was already
forcing the lookup to omit the active layer and only needs a
tweak to use the new semantics.
This also fixes the bug exposed in the testsuite, where when doing
a lookup pinned to an intermediate point in the chain, we were
unable to return the name of the parent also in the chain.
* src/util/virstoragefile.c (virStorageFileChainLookup): Change
semantics for non-NULL startFrom.
* src/qemu/qemu_driver.c (qemuDomainBlockJobImpl): Adjust caller,
to keep existing semantics.
* tests/virstoragetest.c (mymain): Adjust to expose new semantics.
Signed-off-by: Eric Blake <eblake@redhat.com>
The kernel's more broken than one would think. Various drivers report
various (usually spurious) values if the interface is in other state
than 'up' . While on some we experience -EINVAL when read()-ing the
speed sysfs file, with other drivers we might get anything from 0 to
UINT_MAX. If that's the case it's better to not report link speed.
Well, the interface is not up anyway.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
If we're compiling on non-Linux platform, the virNetDevGetLinkInfo()
is a dummy function which barely logs debug message that getting link
info is not supported. However, while the debug message was prepared
for printing the interface name too, I actually forgot to pass the
variable which resulted in build error on platforms like mingw or
FreeBSD.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The purpose of this function is to fetch link state
and link speed for given NIC name from the SYSFS.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Jim Fehlig reported a regression found by libvirt-TCK tests:
> ~ # perl /usr/share/libvirt-tck/tests/qemu/100-disk-encryption.t
...
> ok 4 - defined persistent domain config
> # Starting inactive domain config
> libvirt error code: 1, message: internal error: unable to execute QEMU command
> 'cont': 'drive-ide0-0-1'
> (/var/cache/libvirt-tck/300-disk-encryption/demo.qcow2) is encrypted
Commit 2279d560 converted a boolean into a pointer with the intent of
transferring that pointer out of a temporary object into the caller's
data structure. The temporary structure meant that meta->encryption
was always NULL on entry, so we could get away with blindly allocating
the pointer when the header said so. But later, commit 8823272d
tweaked things to do backing chain detection in-place, rather than via
a temporary object; this has the net result that meta->encryption can
be non-NULL on entry. Not only did this turn the latent behavior into
a memory leak, it is also a behavior regression: blindly allocating a
new pointer wipes out what secrets we already knew about the chain,
making it impossible to restart the domain.
Of course, no one in their right mind should be relying on qcow2
encryption - it is fundamentally flawed. And sadly, the TCK tests
don't get run often enough, and this shows that our virstoragetest
does not exercise encrypted images at all. Otherwise, we could
have avoided a release containing this regression.
* src/util/virstoragefile.c (virStorageFileGetMetadataInternal):
Don't nuke an already-existing encryption.
Signed-off-by: Eric Blake <eblake@redhat.com>
This simplifies the usage in {libxl,qemu}DomainGetNumaParameters
and it's needed for consistent error reporting in virBitmapFormat.
Also remove the forgotten ATTRIBUTE_NONNULL marker.
On some systems, libnuma can be present but it's so ancient that
it misses some symbols that virNumaGetDistances() needs. To be
more precise: numa_bitmask_isbitset() and numa_nodes_ptr are the
symbols in question. Fortunately, they were both introduced in
the same release so it's sufficient for us to check for only one
of them. And the winner is numa_bitmask_isbitset().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
In case the libvirt is built without numactl support, we're
missing the virNumaGetDistances() stub so the linking fails:
CCLD libvirt_lxc
libvirt_lxc-nodeinfo.o: In function `virNodeCapsGetSiblingInfo':
/home/zippy/tmp/libvirt.git/src/nodeinfo.c:1763: undefined reference to `virNumaGetDistances'
collect2: error: ld returned 1 exit status
make[3]: *** [libvirt_lxc] Error 1
The issue was introduced in 77c830d8c4.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The API gets a NUMA node and find distances to other nodes. The
distances are returned in an array. If an item X within the array
equals to value of zero, then there's no such node as X.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
To allow using the array manipulation macros on the arrays returned by
virStringSplit we need to know the count of the elements in the array.
Modify virStringSplit to return this value, rename it and add a helper
with the old name so that we don't need to update all the code.
Add parsers for relative and absolute backing names for local and remote
storage files.
This parser parses relative paths as relative to their parents and
absolute paths according to the protocol or local access.
For remote storage volumes, all URI based backing file names are
supported and for the qemu colon syntax the NBD protocol is supported.
Use virStorageFileReadHeader() to read headers of storage files possibly
on remote storage to retrieve the image metadata.
The backend information is now parsed by
virStorageFileGetMetadataInternal which is now exported from the util
source and virStorageFileGetMetadataFromFDInternal now doesn't need to
be exported.
My future work will modify the metadata crawler function to use the
storage driver file APIs to access the files instead of accessing them
directly so that we will be able to request the metadata for remote
files too. To avoid linking the storage driver to every helper file
using the utils code, the backing chain traversal function needs to be
moved to the storage driver source.
Additionally the virt-aa-helper and virstoragetest programs need to be
linked with the storage driver as a result of this change.
In 9dd02965 the virNumaGetNodeMemory was introduced, however the
comment describing the function mentions virNumaGetNodeMemorySize.
And there's one typo in virNumaIsAvailable() description.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
For guests backed by gluster volumes (or other network storage) we don't
fill the backing chain (see qemuDomainDetermineDiskChain). This leaves
the "relPath" field of the top image NULL. This causes a crash in
virStorageFileChainLookup() when looking up a backing element for such a
disk.
Since I'm working on adding support for network storage and one of the
steps will make the "relPath" field optional let's use STREQ_NULLABLE
instead of STREQ in virStorageFileChainLookup() to avoid the problem.
The original version of virTimeLocalOffsetFromUTC() would fail for
certain times of the day if daylight savings time was active. This
could most easily be seen by uncommenting the TEST_LOCALOFFSET() cases
that include a DST setting.
After a lot of experimenting, I found that the way to solve it in
almost all test cases is to set tm_isdst = -1 in the struct tm prior
to calling mktime(). Once this is done, the correct offset is returned
for all test cases at all times except the two hours just after
00:00:00 Jan 1 UTC - during that time, any timezone that is *behind*
UTC, and that is supposed to always be in DST will not have DST
accounted for in its offset.
I believe that the code of virTimeLocalOffsetFromUTC() actually is
correct for all cases, but the problem still encountered is due to our
inability to come up with a TZ string that properly forces DST to
*always* be active. Since a modfication of the (currently fixed)
expected result data to account for this would necessarily use the
same functions that we're trying to test, I've instead just made the
test program conditionally bypass the problematic cases if the current
date is either December 31 or January 1. This way we get maximum
testing during 363 days of the year, but don't get false failures on
Dec 31 and Jan 1.
Add argument to return backing file format of a file probed by
virStorageFileGetMetadataFromFD so that it can be used in place of
virStorageFileGetMetadataFromBuf.
Since there isn't a single libc API to get this value, this patch
supplies one which gets the value by grabbing current time, then
converting that into a struct tm with gmtime_r(), then back to a
time_t using mktime.
The returned value is the difference between UTC and localtime in
seconds. If localtime is ahead of UTC (east) the offset will be a
positive number, and if localtime is behind UTC (west) the offset will
be negative.
This function should be POSIX-compliant, and is threadsafe, but not
async signal safe. If it was ever necessary to know this value in a
child process, we could cache it with a one-time init function when
libvirtd starts, then just supply the cached value, but that
complexity isn't needed for current usage; that would also have the
problem that it might not be accurate after a local daylight savings
boundary.
(If it weren't for DST, we could simply replace this entire function
with "-timezone"; timezone contains the offset of the current timezone
(negated from what we want) but doesn't account for DST. And in spite
of being guaranteed by POSIX, it isn't available on older versions of
mingw.)
Signed-off-by: Eric Blake <eblake@redhat.com>
Currently the protocol type with index 0 was NBD which made it hard to
distinguish whether the protocol type was actually assigned. Add a new
protocol type with index 0 to distinguish it explicitly.
The gluster volume name was previously stored as part of the source path
string. This is unfortunate when we want to do operations on the path as
the volume is used separately.
Parse and store the volume name separately for gluster storage volumes
and use the newly stored variable appropriately.
If you trigger bug 1033369, we get the error message:
error from service: Invalid argument
Which is a bit too generic to pinpoint what is actually failing. This
changes it to:
error from service: CreateMachine: Invalid argument
Acked-by: Eric Blake <eblake@redhat.com>
This is the only callsite.
We drop use of localerror.name here, because it's not actually useful
to us: rather than the parameter name which received an invalid value
(which was assumed), it's actually the the dbus errno equivalent.
Just use the error string.
Acked-by: Eric Blake <eblake@redhat.com>
Inspired by a simpler patch from "Wangrui (K) <moon.wangrui@huawei.com>".
A submitted patch pointed out that virNetlinkCommand() was doing an
improper typecast of the return value from nl_recv() (int to
unsigned), causing it to miss error returns, and that even after
remedying that problem, virNetlinkCommand() was calling VIR_FREE() on
the pointer returned from nl_recv() (*resp) even if nl_recv() had
returned an error, and that in this case the pointer was verifiably
invalid, as it was pointing to memory that had been allocated by
libnl, but then freed prior to returning the error.
While reviewing this patch, I noticed several other problems with this
seemingly simple function (at least one of them as serious as the
problem being reported/fixed by the aforementioned patch), and decided
they all deserved to be fixed. Here is the list:
1) The return value from nl_recv() must be assigned to an int (rather
than unsigned int) in order to detect failure.
2) When nl_recv() returns an error or 0, the contents of *resp is
invalid, and should be simply set to 0, *not* VIR_FREE()'d.
3) When nl_recv() returns 0, errno is not set, so the logged error
message should not reference errno (it *is* an error though).
4) The first error return from virNetlinkCommand returns -EINVAL,
incorrectly implying that the caller can expect the return value to
be of the "-errno" variety, which is not true in any other case.
5) The 2nd error return returns directly with garbage in *resp. While
the caller should never use *resp in this case, it's still good
practice to set it to NULL.
6) For the next 5 (!!) error conditions, *resp will contain garbage,
and virNetlinkCommand() will goto it's cleanup code which will
VIR_FREE(*resp), almost surely leading to a segfault.
In addition to fixing these 6 problems, this patch also makes the
following two changes to make the function conform more closely to the
style of other libvirt code:
1) Change the handling of return code from "named rc and defaulted to
0, but changed to -1 on error" to the more common "named ret and
defaulted to -1, but changed to 0 on success".
2) Rename the "error" label to "cleanup", since the code that follows
is executed in success cases as well as failure.
This partially reverts commits b279e52f7 and ea18f8b2.
It turns out our code base is full of:
if ((struct.member = virBlahFromString(str)) < 0)
goto error;
Meanwhile, the C standard says it is up to the compiler whether
an enum is signed or unsigned when all of its declared values
happen to be positive. In my testing (Fedora 20, gcc 4.8.2),
the compiler picked signed, and nothing changed. But others
testing with gcc 4.7 got compiler warnings, because it picked
the enum to be unsigned, but no unsigned value is less than 0.
Even worse:
if ((struct.member = virBlahFromString(str)) <= 0)
goto error;
is silently compiled without warning, but incorrectly treats -1
from a bad parse as a large positive number with no warning; and
without the compiler's help to find these instances, it is a
nightmare to maintain correctly. We could force signed enums
with a dummy negative declaration in each enum, or cast the
result of virBlahFromString back to int after assigning to an
enum value, or use a temporary int for collecting results from
virBlahFromString, but those actions are all uglier than what we
were trying to cure by directly using enum types for struct
values in the first place. It's better off to just live with int
members, and use 'switch ((virFoo) struct.member)' where we want
the compiler to help, than to track down all the conversions from
string to enum and ensure they don't suffer from type problems.
* src/util/virstorageencryption.h: Revert back to int declarations
with comment about enum usage.
* src/util/virstoragefile.h: Likewise.
* src/conf/domain_conf.c: Restore back to casts in switches.
* src/qemu/qemu_driver.c: Likewise.
* src/qemu/qemu_command.c: Add cast rather than revert.
Signed-off-by: Eric Blake <eblake@redhat.com>
Add VIR_STORAGE_FILE_PLOOP format. This format is used
to store disk images for virtual machines in PCS and containers
in PCS, OpenVZ and also in Parallels Desktop for Mac.
This format is described on OpenVZ site -
https://openvz.org/Ploop (together with ploop devices). It
consists of XML descriptor and one or more image files: base
image and deltas. Format of the image files described here:
https://openvz.org/Ploop/format.
This patch only adds VIR_STORAGE_FILE_PLOOP constant, consequent
patches will use it in parallels driver.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
For internal structs, we might as well be type-safe and let the
compiler help us with less typing required on our part (getting
rid of casts is always nice). In trying to use enums directly,
I noticed two problems in virstoragefile.h that can't be fixed
without more invasive refactoring: virStorageSource.format is
used as more of a union of multiple enums in storage volume
code (so it has to remain an int), and virStorageSourcePoolDef
refers to pooltype whose enum is declared in src/conf, but where
src/util can't pull in headers from src/conf.
* src/util/virstoragefile.h (virStorageNetHostDef)
(virStorageSourcePoolDef, virStorageSource): Use enums instead of
int for fields of internal types.
* src/qemu/qemu_command.c (qemuParseCommandLine): Cover all values.
* src/conf/domain_conf.c (virDomainDiskSourceParse)
(virDomainDiskSourceFormat): Simplify clients.
* src/qemu/qemu_driver.c
(qemuDomainSnapshotCreateSingleDiskActive)
(qemuDomainSnapshotPrepareDiskExternalBackingInactive)
(qemuDomainSnapshotPrepareDiskExternalOverlayActive)
(qemuDomainSnapshotPrepareDiskInternal): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
The VIR_ENUM_DECL/VIR_ENUM_IMPL helper macros already append 'Type'
to the enum name being converted; it looks silly to have functions
with 'TypeType' in their name. Even though some of our enums have
to have a 'Type' suffix, the corresponding string conversion
functions do not.
* src/conf/secret_conf.h (VIR_ENUM_DECL): Rename virSecretUsageType.
* src/conf/storage_conf.h (VIR_ENUM_DECL): Rename
virStoragePoolAuthType, virStoragePoolSourceAdapterType,
virStoragePartedFsType.
* src/conf/domain_conf.c (virDomainDiskDefParseXML)
(virDomainFSDefParseXML, virDomainFSDefFormat): Update callers.
* src/conf/secret_conf.c (virSecretDefParseUsage)
(virSecretDefFormatUsage): Likewise.
* src/conf/storage_conf.c (virStoragePoolDefParseAuth)
(virStoragePoolDefParseSource, virStoragePoolSourceFormat):
Likewise.
* src/lxc/lxc_controller.c (virLXCControllerSetupLoopDevices):
Likewise.
* src/storage/storage_backend_disk.c
(virStorageBackendDiskPartFormat): Likewise.
* src/util/virstorageencryption.c (virStorageEncryptionSecretParse)
(virStorageEncryptionSecretFormat): Likewise.
* tools/virsh-secret.c (cmdSecretList): Likewise.
* src/libvirt_private.syms (secret_conf.h, storage_conf.h): Export
corrected names.
Signed-off-by: Eric Blake <eblake@redhat.com>
Continuing the work of consistent enum cleanups; this time in
virstorageencryption.h.
* src/util/virstorageencryption.h (virStorageEncryptionFormat):
Convert to typedef, renaming to avoid collision with function.
(virStorageEncryptionSecret, virStorageEncryption): Directly use
enums.
Signed-off-by: Eric Blake <eblake@redhat.com>
In "src/conf/" there are many enumeration (enum) declarations.
Similar to the recent cleanup to "src/util" directory, it's
better to use a typedef for variable types, function types and
other usages. Other enumeration and folders will be changed to
typedef's in the future. Most of the files changed in this
commit are related to storage (storage_conf) enums.
Signed-off-by: Julio Faracco <jcfaracco@gmail.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
If the XML_PARSE_NOENT flag is passed to libxml2, then any
entities in the input document will be fully expanded. This
allows the user to read arbitrary files on the host machine
by creating an entity pointing to a local file. Removing
the XML_PARSE_NOENT flag means that any entities are left
unchanged by the parser, or expanded to "" by the XPath
APIs.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Commit 1b14c44 broke the build on FreeBSD by changing
the signature of a few functions without updating the
corresponding stubs that are used when WITH_MACVTAP
or WITH_VIRTUALPORT is not defined.
In "src/util/" there are many enumeration (enum) declarations.
Sometimes, it's better using a typedef for variable types,
function types and other usages. Other enumeration will be
changed to typedef's in the future.
Signed-off-by: Julio Faracco <jcfaracco@gmail.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
All callers of virStorageFileGetMetadataFromBuf were first calling
virStorageFileProbeFormatFromBuf, to learn what format to pass in.
But this function is already wired to do the exact same probe if
the incoming format is VIR_STORAGE_FILE_AUTO, so it's simpler to
just refactor the probing into the central function.
* src/util/virstoragefile.h (virStorageFileGetMetadataFromBuf):
Drop parameter.
(virStorageFileProbeFormatFromBuf): Drop declaration.
* src/util/virstoragefile.c (virStorageFileGetMetadataFromBuf):
Do probe here instead of in callers.
(virStorageFileProbeFormatFromBuf): Make static.
* src/libvirt_private.syms (virstoragefile.h): Drop function.
* src/storage/storage_backend_fs.c (virStorageBackendProbeTarget):
Update caller.
* src/storage/storage_backend_gluster.c
(virStorageBackendGlusterRefreshVol): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
Commit f22b7899 stumbled across a difference between 32-bit and
64-bit platforms when parsing "-1" as an int. Now that we've
fixed that difference, it's time to fix the testsuite.
* src/util/virstoragefile.c (virStorageFileParseChainIndex):
Require a positive index.
Signed-off-by: Eric Blake <eblake@redhat.com>
strtoul() is required to parse negative numbers as their
twos-complement positive counterpart. But sometimes we want
to reject negative numbers. Add new functions to do this.
The 'p' suffix is a mnemonic for 'positive' (technically it
also parses 0, but 'non-negative' doesn't lend itself to a
nice one-letter suffix).
* src/util/virstring.h (virStrToLong_uip, virStrToLong_ulp)
(virStrToLong_ullp): New prototypes.
* src/util/virstring.c (virStrToLong_uip, virStrToLong_ulp)
(virStrToLong_ullp): New functions.
* src/libvirt_private.syms (virstring.h): Export them.
* tests/virstringtest.c (testStringToLong): Test them.
Signed-off-by: Eric Blake <eblake@redhat.com>
Commit f22b7899 called to light a long-standing latent bug: the
behavior of virStrToLong_ui was different on 32-bit platforms
than on 64-bit platforms. Curse you, C type promotion and
narrowing rules, and strtoul specification. POSIX says that for
a 32-bit long, strtol handles only 2^32 values [LONG_MIN to
LONG_MAX] while strtoul handles 2^33 - 1 values [-ULONG_MAX to
ULONG_MAX] with twos-complement wraparound for negatives. Thus,
parsing -1 as unsigned long produces ULONG_MAX, rather than a
range error. We WANT[1] this same shortcut for turning -1 into
UINT_MAX when parsing to int; and get it for free with 32-bit
long. But with 64-bit long, ULONG_MAX is outside the range
of int and we were rejecting it as invalid; meanwhile, we were
silently treating -18446744073709551615 as 1 even though it
textually exceeds INT_MIN. Too bad there's not a strtoui() in
libc that does guaranteed parsing to int, regardless of the size
of long.
The bug has been latent since 2007, introduced by Jim Meyering
in commit 5d25419 in the attempt to eradicate unsafe use of
strto[u]l when parsing ints and longs. How embarrassing that we
are only discovering it now - so I'm adding a testsuite to ensure
that it covers all the corner cases we care about.
[1] Ideally, we really want the caller to be able to choose whether
to allow negative numbers to wrap around to their 2s-complement
counterpart, as in strtoul, or to force a stricter input range
of [0 to UINT_MAX] by rejecting negative signs; this will be added
in a later patch for all three int types.
This patch is tested on both 32- and 64-bit; the enhanced
virstringtest passes on both platforms, while virstoragetest now
reliably fails on both platforms instead of just 32-bit platforms.
That test will be fixed later.
* src/util/virstring.c (virStrToLong_ui): Ensure same behavior
regardless of platform long size.
* tests/virstringtest.c (testStringToLong): New function.
(mymain): Comprehensively test string to long parsing.
Signed-off-by: Eric Blake <eblake@redhat.com>
To avoid memory leak of the "backingStoreRaw" field when reparsing
backing chains a new function is being introduced by this patch that
shall be used to clear backing store information.
The memory leak was introduced in commit 8823272d41.
Commit 5c43e2e introduced a NULL deref if there is a failure
in virStorageFileGetMetadataInternal.
* src/util/virstoragefile.c (virStorageFileGetMetadataFromBuf):
Fix error handling.
Signed-off-by: Eric Blake <eblake@redhat.com>
SO_REUSEADDR on Windows is actually akin to SO_REUSEPORT
on Linux/BSD. ie it allows 2 apps to listen to the same
port at once. Thus we must not set it on Win32 platforms
See http://msdn.microsoft.com/en-us/library/windows/desktop/ms740621.aspx
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Now that all clients have been adjusted, ensure that no future
misuse of readdir is introduced into the code base.
* cfg.mk (sc_prohibit_readdir): New rule.
* src/util/virfile.c (virDirRead): Exempt the wrapper.
Signed-off-by: Eric Blake <eblake@redhat.com>
In making the conversion to the new API, I fixed a couple bugs:
virSCSIDeviceGetSgName would leak memory if a directory
unexpectedly contained multiple entries;
virNetDevTapGetRealDeviceName could report a spurious error
from a stale errno inherited before starting the readdir search.
The decision on whether to store the result of virDirRead into
a variable is based on whether the end of the loop falls through
to cleanup code automatically. In some cases, we have loops that
are documented to return NULL on failure, and which raise an
error on most failure paths but not in the case where the directory
was unexpectedly empty; it may be worth a followup patch to
explicitly report an error if readdir was successful but the
directory was empty, so that a NULL return always has an error set.
* src/util/vircgroup.c (virCgroupRemoveRecursively): Use new
interface.
(virCgroupKillRecursiveInternal, virCgroupSetOwner): Report
readdir failures.
* src/util/virfile.c (virFileLoopDeviceOpenSearch)
(virFileNBDDeviceFindUnused, virFileDeleteTree): Use new
interface.
* src/util/virnetdevtap.c (virNetDevTapGetRealDeviceName):
Properly check readdir errors.
* src/util/virpci.c (virPCIDeviceIterDevices)
(virPCIDeviceFileIterate, virPCIGetNetName): Report readdir
failures.
(virPCIDeviceAddressIOMMUGroupIterate): Use new interface.
* src/util/virscsi.c (virSCSIDeviceGetSgName): Report readdir
failures, and avoid memory leak.
(virSCSIDeviceGetDevName): Report readdir failures.
* src/util/virusb.c (virUSBDeviceSearch): Report readdir
failures.
* src/util/virutil.c (virGetFCHostNameByWWN)
(virFindFCHostCapableVport): Report readdir failures.
Signed-off-by: Eric Blake <eblake@redhat.com>
Introduce a wrapper for readdir. This helps us make sure that we always
set errno before calling readdir and it will make sure errors are
properly logged.
Signed-off-by: Natanael Copa <ncopa@alpinelinux.org>
Signed-off-by: Eric Blake <eblake@redhat.com>
The virFirewallAddRuleFull method originally had a single
compulsory virFirewallQueryCallback parameter. During dev
work though the ignoreErrors parameter was added and the
callback parameter made optional. The ATTRIBUTE_NONNULL
annotation was never removed though.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Convert the virebtables.{c,h} files to use the new virFirewall
APIs for changing ebtables rules.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Update the iptablesXXXX methods so that instead of directly
executing iptables commands, they populate rules in an
instance of virFirewallPtr. The bridge driver can thus
construct the ruleset and then invoke it in one operation
having rollback handled automatically.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The network and nwfilter drivers both have a need to update
firewall rules. The currently share no code for interacting
with iptables / firewalld. The nwfilter driver is fairly
tied to the concept of creating shell scripts to execute
which makes it very hard to port to talk to firewalld via
DBus APIs.
This patch introduces a virFirewallPtr object which is able
to represent a complete sequence of rule changes, with the
ability to have multiple transactional checkpoints with
rollbacks. By formally separating the definition of the rules
to be applied from the mechanism used to apply them, it is
also possible to write a firewall engine that uses firewalld
DBus APIs natively instead of via the slow firewalld-cmd.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Instead of hardcoding LIBEXECDIR as the location of the libvirt_iohelper
binary, use virFileFindResource to optionally find it in the current
build directory.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add virFileFindResource which will try to locate files
in the local build tree if the calling binary (eg libvirtd or
test suite) is being run from the build tree. The corresponding
virFileActivateDirOverride should be called at startup passing
in argv[0]. This will be examined for evidence of libtool magic
binary prefix / sub-directory in order to activate the override.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Each backing store of a given disk is associated with a unique index
(which is also formatted in domain XML) for easier addressing of any
particular backing store. With this patch, any backing store can be
addressed by its disk target and the index. For example, "vdc[4]"
addresses the backing store with index equal to 4 of the disk identified
by "vdc" target. Such shorthand can be used in any API in place for a
backing file path:
virsh blockcommit domain vda --base vda[3] --top vda[2]
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
To avoid having the root of a backing chain present twice in the list we
need to invert the working of virStorageFileGetMetadataRecurse.
Until now the recursive worker created a new backing chain element from
the name and other information passed as arguments. This required us to
pass the data of the parent in a deconstructed way and the worker
created a new entry for the parent.
This patch converts this function so that it just fills in metadata
about the parent and creates a backing chain element from those. This
removes the duplication of the first element.
To avoid breaking the test suite, virstoragetest now calls a wrapper
that creates the parent structure explicitly and pre-fills it with the
test data with same function signature as previously used.