Libtool supports linking directly against .o files on some platforms
(such as Linux), which happens to be the only place where we are
actually doing that (for the dtrace-generated probes.o files). However,
it raises a big stink about the non-portability, even though we don't
attempt it on platforms where it would actually fail:
CCLD libvirt_driver_qemu.la
*** Warning: Linking the shared library libvirt_driver_qemu.la against
the non-libtool
*** objects libvirt_qemu_probes.o is not portable!
This shuts libtool up by creating a proper .lo file that matches
what libtool normally expects.
* src/Makefile.am (%_probes.lo): New rule.
(libvirt_probes.stp, libvirt_qemu_probes.stp): Simplify into...
(%_probes.stp): ...shorter rule.
(CLEANFILES): Clean new .lo files.
(libvirt_la_BUILT_LIBADD, libvirt_driver_qemu_la_LIBADD)
(libvirt_lxc_LDADD, virt_aa_helper_LDADD): Link against .lo file.
* tests/Makefile.am (PROBES_O, qemu_LDADDS): Likewise.
If vdsm is installed and configured in Fedora 17, we add the following
items into qemu.conf:
spice_tls=1
spice_tls_x509_cert_dir="/etc/pki/vdsm/libvirt-spice"
However, after this changes, augtool cannot identify qemu.conf anymore.
Add a VIR_ERR_DOMAIN_LAST sentinel for virErrorDomain and
replace the virErrorDomainName function by a VIR_ENUM_IMPL
In the process the naming of error domains is sanitized
* src/util/virterror.c: Use VIR_ENUM_IMPL for converting
error domains to strings
* include/libvirt/virterror.h: Add VIR_ERR_DOMAIN_LAST
The libvirt_private.syms file exports virNetlinkEventServiceLocalPid
so there needs to be a no-op stub for Win32 to avoid linker errors
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
* daemon/libvirtd.c: Set custom driver module dir if the current
binary name is 'lt-libvirtd' (indicating execution directly
from GIT checkout)
* src/driver.c, src/driver.h, src/libvirt_driver_modules.syms: Add
virDriverModuleInitialize to allow driver module location to
be changed
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When building as driver modules, it is not possible for the QEMU
driver module to reference the DTrace/SystemTAP probes linked into
the main libvirt.so. Thus we need to move the QEMU probes into a
separate file 'libvirt_qemu_probes.d'. Also rename the existing
file from 'probes.d' to 'libvirt_probes.d' while we're at it
* daemon/Makefile.am, src/internal.h: Include libvirt_probes.h
instead of probes.h
* src/Makefile.am: Add rules for libvirt_qemu_probes.d
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor_json.c,
src/qemu/qemu_monitor_text.c: Include libvirt_qemu_probes.h
* src/libvirt_probes.d: Rename from probes.d
* src/libvirt_qemu_probes.d: QEMU specific probes formerly
in probes.d
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Since we have drivers which depend on each other (ie QEMU/LXC
depend on the network driver APIs), we need to use RTLD_GLOBAL
instead of RTLD_LOCAL. While this pollutes the calling binary
with many more symbols, this is no worse than if we directly
link to the drivers, and this only applies to libvirtd
* src/driver.c: s/RTLD_LOCAL/RTLD_GLOBAL/
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Only libvirt_driver_storage.la links to libblkid currently. If
we are running in a scenario with driver modules, LXC must
directly link to it, since it can't assume the storage driver
is present
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The libvirt_test.la library was introduced to allow test suites
to reference internal-only symbols. These days, nearly every
symbol we care about is in src/libvirt_private.syms, so there
is no need for libvirt_test.la to continue to exist
* src/Makefile.am: Delete libvirt_test.la & add new .syms files
* src/libvirt_private.syms: Export symbols needed by test suite
* tests/Makefile.am: Link to libvirt_test.la. Ensure LXC tests link
to network_driver.la
* src/libvirt_esx.syms, src/libvirt_openvz.syms: Add exports needed
by test suite
libvirt_driver_nodedev.la should not link against either
libvirt_util.la or gnulib.la, since libvirt.so brings
in those deps.
* src/Makefile.am: Fix broken linkage of libvirt_driver_nodedev.la
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The driver modules all use symbols which are defined in libvirt.so.
Thus for loading of modules to work, the binary that libvirt.so
is linked to must export its symbols back to modules. If the
libvirt.so itself is dlopen()d then the RTLD_GLOBAL flag must
be set. Unfortunately few, if any, programming languages use
the RTLD_GLOBAL flag when loading modules :-( This means is it
not practical to use driver modules for any libvirt client side
drivers (OpenVZ, VMWare, Hyper-V, Remote client, test).
This patch changes the build process so only server side drivers
are built as modules (Xen, QEMU, LXC, UML)
* daemon/libvirtd.c: Add missing load of 'interface' driver
* src/Makefile.am: Only build server side drivers as modules
* src/libvirt.c: Don't load any driver modules
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
and use it for virDomainParseMemory. This allows to parse arbitrary
scaled value, not only memory related values as needed for the
filesystem limits code following later in this series.
This reverts commit b1e374a7ac, which was
rather bad since I failed to consider all sides of the issue. The main
things I didn't consider properly are:
- a thread which sends a non-blocking call waits for the thread with
the buck to process the call
- the code doesn't expect non-blocking calls to remain in the queue
unless they were already partially sent
Thus, the reverted patch actually breaks more than what it fixes and
clients (which may even be libvirtd during p2p migrations) will likely
end up in a deadlock.
The pciDevice structure corresponding to the device being hot-unplugged
was freed after it was "stolen" from activeList. The pointer was still
used for eg-inactive list. This patch removes the free of the structure
and frees it only if reset fails on the device.
I'm tired of writing:
bool sep = false;
while (...) {
if (sep)
virBufferAddChar(buf, ',');
sep = true;
virBufferAdd(buf, str);
}
This makes it easier, allowing one to write:
while (...)
virBufferAsprintf(buf, "%s,", str);
virBufferTrim(buf, ",", -1);
to trim any remaining comma.
* src/util/buf.h (virBufferTrim): Declare.
* src/util/buf.c (virBufferTrim): New function.
* tests/virbuftest.c (testBufTrim): Test it.
This patch adds support for a new storage backend with RBD support.
RBD is the RADOS Block Device and is part of the Ceph distributed storage
system.
It comes in two flavours: Qemu-RBD and Kernel RBD, this storage backend only
supports Qemu-RBD, thus limiting the use of this storage driver to Qemu only.
To function this backend relies on librbd and librados being present on the
local system.
The backend also supports Cephx authentication for safe authentication with
the Ceph cluster.
For storing credentials it uses the built-in secret mechanism of libvirt.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
When the last reference to a virConnectPtr is released by
libvirtd, it was possible for a deadlock to occur in the
virDomainEventState functions. The virDomainEventStatePtr
holds a reference on virConnectPtr for each registered
callback. When removing a callback, the virUnrefConnect
function is run. If this causes the last reference on the
virConnectPtr to be released, then virReleaseConnect can
be run, which in turns calls qemudClose. This function has
a call to virDomainEventStateDeregisterConn which is intended
to remove all callbacks associated with the virConnectPtr
instance. This will try to grab a lock on virDomainEventState
but this lock is already held. Deadlock ensues
Thread 1 (Thread 0x7fcbb526a840 (LWP 23185)):
Since each callback associated with a virConnectPtr holds a
reference on virConnectPtr, it is impossible for the qemudClose
method to be invoked while any callbacks are still registered.
Thus the call to virDomainEventStateDeregisterConn must in fact
be a no-op. Thus it is possible to just remove all trace of
virDomainEventStateDeregisterConn and avoid the deadlock.
* src/conf/domain_event.c, src/conf/domain_event.h,
src/libvirt_private.syms: Delete virDomainEventStateDeregisterConn
* src/libxl/libxl_driver.c, src/lxc/lxc_driver.c,
src/qemu/qemu_driver.c, src/uml/uml_driver.c: Remove
calls to virDomainEventStateDeregisterConn
This patch adds support for the recent ipset iptables extension
to libvirt's nwfilter subsystem. Ipset allows to maintain 'sets'
of IP addresses, ports and other packet parameters and allows for
faster lookup (in the order of O(1) vs. O(n)) and rule evaluation
to achieve higher throughput than what can be achieved with
individual iptables rules.
On the command line iptables supports ipset using
iptables ... -m set --match-set <ipset name> <flags> -j ...
where 'ipset name' is the name of a previously created ipset and
flags is a comma-separated list of up to 6 flags. Flags use 'src' and 'dst'
for selecting IP addresses, ports etc. from the source or
destination part of a packet. So a concrete example may look like this:
iptables -A INPUT -m set --match-set test src,src -j ACCEPT
Since ipset management is quite complex, the idea was to leave ipset
management outside of libvirt but still allow users to reference an ipset.
The user would have to make sure the ipset is available once the VM is
started so that the iptables rule(s) referencing the ipset can be created.
Using XML to describe an ipset in an nwfilter rule would then look as
follows:
<rule action='accept' direction='in'>
<all ipset='test' ipsetflags='src,src'/>
</rule>
The two parameters on the command line are also the two distinct XML attributes
'ipset' and 'ipsetflags'.
FYI: Here is the man page for ipset:
https://ipset.netfilter.org/ipset.man.html
Regards,
Stefan
We were being lazy - virnetlink.c was getting uint32_t as a
side-effect from glibc 2.14's <unistd.h>, but older glibc 2.11
does not provide uint32_t from <unistd.h>. In fact, POSIX states
that <unistd.h> need only provide intptr_t, not all of <stdint.h>,
so the bug really is ours. Reported by Jonathan Alescio.
* src/util/virnetlink.h: Include <stdint.h>.
This involves setting the cpuacct cgroup to a per-vcpu granularity,
as well as summing the each vcpu accounting into a common array.
Now that we are reading more than one cgroup file, we double-check
that cpus weren't hot-plugged between reads to invalidate our
summing.
Signed-off-by: Eric Blake <eblake@redhat.com>
If qemuPrepareHostdevUSBDevices fail it will roll back devices added
to the driver list of used devices. However, if it may fail because
the device is being used already. But then again - with roll back.
Therefore don't try to remove a usb device manually if the function
fail. Although, we want to remove the device if any operation
performed afterwards fail.
Allow the logging APIs to be called with a va_list for format
args, instead of requiring var-args usage.
* src/util/logging.h, src/util/logging.c: Add virLogVMessage
The use of readlink() in lxc_container.c is intentional; we don't
want an absolute pathname there.
* src/util/cgroup.h (VIR_CGROUP_SYSFS_MOUNT): Indent properly.
* cfg.mk (exclude_file_name_regexp--sc_prohibit_readlink): Add
exemption.
One of our latest USB device handling patches
05abd1507d introduced a regression.
That is, we first create a temporary list of all USB devices that
are to be used by domain just starting up. Then we iterate over and
check if a device from the list is in the global list of currently
assigned devices (activeUsbHostdevs). If not, we add it there and
continue with next iteration then. But if a device from temporary
list is either taken already or adding to the activeUsbHostdevs fails,
we remove all devices in temp list from the activeUsbHostdevs list.
Therefore, if a device is already taken we remove it from
activeUsbHostdevs even if we should not. Thus, next time we allow
the device to be assigned to another domain.
Most versions of libselinux do not contain the function
selinux_lxc_contexts_path() that the security driver
recently started using for LXC. We must add a conditional
check for it in configure and then disable the LXC security
driver for builds where libselinux lacks this function.
* configure.ac: Check for selinux_lxc_contexts_path
* src/security/security_selinux.c: Disable LXC security
if selinux_lxc_contexts_path() is missing
Normal practice is for cgroups controllers to be mounted at
/sys/fs/cgroup. When setting up a container, /sys is mounted
with a new sysfs instance, thus we must re-mount all the
cgroups controllers. The complexity is that we must mount
them in the same layout as the host OS. ie if 'cpu' and 'cpuacct'
were mounted at the same location in the host we must preserve
this in the container. Also if any controllers are co-located
we must setup symlinks from the individual controller name to
the co-located mount-point
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Both /proc and /sys may have sub-mounts in them from the host
OS. We must explicitly unmount them all before mounting the
new instance over that location. If we don't then /proc/mounts
will show the sub-mounts as existing, even though nothing will
be able to access them, due to the over-mount.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
If the LXC config has a filesystem
<filesystem>
<source dir='/'/>
<target dir='/'/>
</filesystem>
then there is no need to go down the pivot root codepath.
We can simply use the existing root as needed.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently to make sysfs readonly, we remount the existing
instance and then bind it readonly. Unfortunately this means
sysfs is still showing device objects wrt the host OS namespace.
We need it to reflect the container namespace, so we must mount
a completely new instance of it. Do the same for selinuxfs since
there is no benefit to bind mounting & this lets us simplify
the code.
* src/lxc/lxc_container.c: Mount fresh sysfs instance
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Instead of hardcoding use of SELinux contexts in the LXC driver,
switch over to using the official security driver API.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Some security drivers require special options to be passed to
the mount system call. Add a security driver API for handling
this data.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The SELinux policy for LXC uses a different configuration file
than the traditional svirt one. Thus we need to load
/etc/selinux/targeted/contexts/lxc_contexts which contains
something like this:
process = "system_u:system_r:svirt_lxc_net_t:s0"
file = "system_u:object_r:svirt_lxc_file_t:s0"
content = "system_u:object_r:virt_var_lib_t:s0"
cleverly designed to be parsable by virConfPtr
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently the SELinux driver stores its state in a set of global
variables. This switches it to use a private data struct instead.
This will enable different instances to have their own data.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The AppArmour driver does not currently have support for LXC
so ensure that when probing, it claims to be disabled
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
To allow the security drivers to apply different configuration
information per hypervisor, pass the virtualization driver name
into the security manager constructor.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Thanks to this new option we are now able to use modern CPU models (such
as Westmere) defined in external configuration file.
The qemu-1.1{,-device} data files for qemuhelptest are filled in with
qemu-1.1-rc2 output for now. I will update those files with real
qemu-1.1 output once it is released.
The uhci1, uhci2, uhci3 companion controllers for ehci1 must
have a master start port set. Since this value is predictable
we should set it automatically if the app does not supply it
Currently each USB2 companion controller gets put on a separate
PCI slot. Not only is this wasteful of PCI slots, but it is not
in compliance with the spec for USB2 controllers. The master
echi1 and all companion controllers should be in the same slot,
with echi1 in function 7, and uhci1-3 in functions 0-2 respectively.
* src/qemu/qemu_command.c: Special case handling of USB2 controllers
to apply correct pci slot assignment
* tests/qemuxml2argvdata/qemuxml2argv-usb-ich9-ehci-addr.args,
tests/qemuxml2argvdata/qemuxml2argv-usb-ich9-ehci-addr.xml: Expand
test to cover automatic slot assignment
The virDomainDeviceInfoIsSet API was only checking if an
address or alias was set in the struct. Thus if only a
rom bar setting / filename, boot index, or USB master
value was set, they could be accidentally dropped when
formatting XML
Callers of virGetUser{Config,Runtime,Cache}Directory all
append further path component. We should not be
adding a trailing slash in the return path otherwise we
get paths containing '//'
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Sometimes it is useful to see the callpath for log messages.
This change enhances the log filter syntax so that stack traces
can be show by setting '1:+NAME' instead of '1:NAME'.
This results in output like:
2012-05-09 14:18:45.136+0000: 13314: debug : virInitialize:414 : register drivers
/home/berrange/src/virt/libvirt/src/.libs/libvirt.so.0(virInitialize+0xd6)[0x7f89188ebe86]
/home/berrange/src/virt/libvirt/tools/.libs/lt-virsh[0x431921]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x3a21e21735]
/home/berrange/src/virt/libvirt/tools/.libs/lt-virsh[0x40a279]
2012-05-09 14:18:45.136+0000: 13314: debug : virRegisterDriver:775 : driver=0x7f8918d02760 name=Test
/home/berrange/src/virt/libvirt/src/.libs/libvirt.so.0(virRegisterDriver+0x6b)[0x7f89188ec717]
/home/berrange/src/virt/libvirt/src/.libs/libvirt.so.0(+0x11b3ad)[0x7f891891e3ad]
/home/berrange/src/virt/libvirt/src/.libs/libvirt.so.0(virInitialize+0xf3)[0x7f89188ebea3]
/home/berrange/src/virt/libvirt/tools/.libs/lt-virsh[0x431921]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x3a21e21735]
/home/berrange/src/virt/libvirt/tools/.libs/lt-virsh[0x40a279]
* docs/logging.html.in: Document new syntax
* configure.ac: Check for execinfo.h
* src/util/logging.c, src/util/logging.h: Add support for
stack traces
* tests/testutils.c: Adapt to API change
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The current unprivileged user libvirtd sockets are in the abstract
namespace. This has a number of problems
- You can't connect to them remotely using the nc/ssh tunnel
- This is not portable for OS-X, BSD & probably others
- Parent directory permissions don't apply
"Instead of developing one CPU with 12 cores, the Magny Cours is
actually two 6 core “Bulldozer” CPUs combined in to one package"
I.e, each package has two NUMA nodes, and the two numa nodes share
the same core ID set (0-6), which means parsing the cores number
from sysfs doesn't work in this case.
And the wrong CPU number could cause three problems for libvirt:
1) performance lost
A domain without "cpuset" or "placement='auto'" (to drive numad)
specified will be only pinned to part of the CPUs.
2) domain can be started
If a domain uses numad, and the advisory nodeset returned from
numad contains node which exceeds the range of wrong total CPU
number. The domain will fail to start, as the bitmask passed to
sched_setaffinity could be fully filled with zero.
3) wrong CPU number affects lots of stuffs.
E.g. for command "virsh vcpuinfo", "virsh vcpupin", it will always
output with the truncated CPU list.
For more details:
https://www.redhat.com/archives/libvir-list/2012-May/msg00607.html
This patch is to fix the problem by parsing /proc/cpuinfo to get
the value of field "cpu cores", and use it as nodeinfo->cores if
it's greater than the cores number from sysfs.
Like for 'static' placement, when the memory policy mode is
'strict', set the memory policy by writing the advisory nodeset
returned from numad to cgroup file cpuset.mems,
On some of the NUMA platforms, the CPU index in each NUMA node
grows non-consecutive. While on other platforms, it can be inconsecutive,
E.g.
% numactl --hardware
available: 4 nodes (0-3)
node 0 cpus: 0 4 8 12 16 20 24 28
node 0 size: 131058 MB
node 0 free: 86531 MB
node 1 cpus: 1 5 9 13 17 21 25 29
node 1 size: 131072 MB
node 1 free: 127070 MB
node 2 cpus: 2 6 10 14 18 22 26 30
node 2 size: 131072 MB
node 2 free: 127758 MB
node 3 cpus: 3 7 11 15 19 23 27 31
node 3 size: 131072 MB
node 3 free: 127226 MB
node distances:
node 0 1 2 3
0: 10 20 20 20
1: 20 10 20 20
2: 20 20 10 20
3: 20 20 20 10
This patch is to fix the problem by using the CPU index in
caps->host.numaCell[i]->cpus[i] to set the bitmask instead of
assuming the CPU index of the NUMA nodes are always sequential.
For pseries guest, the default controller model is
ibmvscsi controller, this controller only can work
on spapr-vio address.
This patch is to assign spapr-vio address type to
ibmvscsi controller and correct vscsi test case.
Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
We had previously weakened our nodeinfotest in order to ignore parsed
node values, because the parse function was mistakenly relying on
host files. A better fix is to avoid using the numactl library, but
to instead parse the same files that numactl would read, all while
allowing the files to be relative to our choice of directory.
* src/nodeinfo.c (CPU_SYS_PATH, NODE_SYS_PATH): Replace with...
(SYSFS_SYSTEM_PATH): ...parent directory.
(linuxNodeInfoCPUPopulate): Check NUMA nodes from requested
directory (by inlining numactl code).
(nodeGetCPUmap, nodeGetMemoryStats): Adjust macro use.
* tests/nodeinfotest.c (linuxTestCompareFiles, linuxTestNodeInfo):
Update test to match.
We were wasting time to malloc a copy of a constant string, then
copy it into static storage, for every call to nodeGetInfo. At
least we were lucky that it was a constant source, and thus not
subject to even worse issues with one thread clobbering the static
storage while another was using it. This gets rid of the waste,
by passing the string through the stack instead, as well as renaming
internal functions to better match our conventions.
* src/nodeinfo.c (sysfs_path): Delete.
(get_cpu_value, count_thread_siblings, parse_socket): Add
parameter, and rename...
(virNodeGetCpuValue, virNodeCountThreadSiblings)
(virNodeParseSocket): ... into a common namespace.
(cpu_online, parse_core): Inline into callers.
(linuxNodeInfoCPUPopulate): Update caller.
(nodeGetInfo): Drop a useless malloc.
Commit cdce2f42d tried to silence a compiler warning on 32-bit builds,
but the gcc shipped with RHEL 5 is old enough that the type conversion
via multiplication by 1 was insufficient for the task.
* src/qemu/qemu_monitor.c (qemuMonitorBlockJob): Previous attempt
didn't get past all gcc versions.
As defined in:
http://standards.freedesktop.org/basedir-spec/basedir-spec-latest.html
This offers a number of advantages:
* Allows sharing a home directory between different machines, or
sessions (eg. using NFS)
* Cleanly separates cache, runtime (eg. sockets), or app data from
user settings
* Supports performing smart or selective migration of settings
between different OS versions
* Supports reseting settings without breaking things
* Makes it possible to clear cache data to make room when the disk
is filling up
* Allows us to write a robust and efficient backup solution
* Allows an admin flexibility to change where data and settings are stored
* Dramatically reduces the complexity and incoherence of the
system for administrators
Appending an item to a list transfers ownership of that item to the
list owner. But an error can occur in between item allocation and
appending it to the list. In this case the item has to be freed
explicitly. This was not done in some special cases resulting in
possible memory leaks.
Reported by Coverity.
This patch lifts the limit of calling thread detection code only on KVM
guests. With upstream qemu the thread mappings are reported also on
non-KVM machines.
QEMU adopted the thread_id information from the kvm branch.
To remain compatible with older upstream versions of qemu the check is
attempted but the failure to detect threads (or even run the monitor
command - on older versions without SMP support) is treated non-fatal
and the code reports one vCPU with pid of the hypervisor (in same
fashion this was done on non-KVM guests).
After a cpu hotplug the qemu driver did not refresh information about
virtual processors used by qemu and their corresponding threads. This
patch forces a re-detection as is done on start of QEMU.
This ensures that correct information is reported by the
virDomainGetVcpus API and "virsh vcpuinfo".
A failure to obtain the thread<->vcpu mapping is treated non-fatal and
the mapping is not updated in a case of failure as not all versions of
QEMU report this in the info cpus command.
This patch changes a switch statement into ifs when handling live vs.
configuration modifications getting rid of redundant code in case when
both live and persistent configuration gets changed.
when failing to attach another usb device to a domain for some reason
which has one use device attached before, the libvirtd crashed.
The crash is caused by null-pointer dereference error in invoking
usbDeviceListSteal passed in NULL value usb variable.
commit 05abd1507d introduces the bug.
Detected by valgrind. Leaks are introduced in commit 122fa379.
src/conf/storage_conf.c: fix memory leaks.
How to reproduce?
$ make && make -C tests check TESTS=storagepoolxml2xmltest
$ cd tests && valgrind -v --leak-check=full ./storagepoolxml2xmltest
actual result:
==28571== LEAK SUMMARY:
==28571== definitely lost: 40 bytes in 5 blocks
==28571== indirectly lost: 0 bytes in 0 blocks
==28571== possibly lost: 0 bytes in 0 blocks
==28571== still reachable: 1,054 bytes in 21 blocks
==28571== suppressed: 0 bytes in 0 blocks
Signed-off-by: Alex Jia <ajia@redhat.com>
No useful error was being reported when an invalid character device
target type is specified in the domainXML. E.g.
...
<console type="pty">
<source path="/dev/pts/2"/>
<target type="kvm" port="0"/>
</console>
...
resulted in
error: Failed to define domain from x.xml
error: An error occurred, but the cause is unknown
With this small patch, the error is more helpful
error: Failed to define domain from x.xml
error: XML error: unknown target type 'kvm' specified for character device
<vcpu> is not an optional node. The value for its 'placement'
actually always defaults to 'static' in the underlying codes.
(Even no 'cpuset' and 'placement' is specified, the domain
process will be pinned to all the available pCPUs).
Though numad will manage the memory allocation of task dynamically,
it wants management application (libvirt) to pre-set the memory
policy according to the advisory nodeset returned from querying numad,
(just like pre-bind CPU nodeset for domain process), and thus the
performance could benefit much more from it.
This patch introduces new XML tag 'placement', value 'auto' indicates
whether to set the memory policy with the advisory nodeset from numad,
and its value defaults to the value of <vcpu> placement, or 'static'
if 'nodeset' is specified. Example of the new XML tag's usage:
<numatune>
<memory placement='auto' mode='interleave'/>
</numatune>
Just like what current "numatune" does, the 'auto' numa memory policy
setting uses libnuma's API too.
If <vcpu> "placement" is "auto", and <numatune> is not specified
explicitly, a default <numatume> will be added with "placement"
set as "auto", and "mode" set as "strict".
The following XML can now fully drive numad:
1) <vcpu> placement is 'auto', no <numatune> is specified.
<vcpu placement='auto'>10</vcpu>
2) <vcpu> placement is 'auto', no 'placement' is specified for
<numatune>.
<vcpu placement='auto'>10</vcpu>
<numatune>
<memory mode='interleave'/>
</numatune>
And it's also able to control the CPU placement and memory policy
independently. e.g.
1) <vcpu> placement is 'auto', and <numatune> placement is 'static'
<vcpu placement='auto'>10</vcpu>
<numatune>
<memory mode='strict' nodeset='0-10,^7'/>
</numatune>
2) <vcpu> placement is 'static', and <numatune> placement is 'auto'
<vcpu placement='static' cpuset='0-24,^12'>10</vcpu>
<numatune>
<memory mode='interleave' placement='auto'/>
</numatume>
A follow up patch will change the XML formatting codes to always output
'placement' for <vcpu>, even it's 'static'.
It turns out that when cgroups are enabled, the use of a block device
for a snapshot target was failing with EPERM due to libvirt failing
to add the block device to the cgroup whitelist. See also
https://bugzilla.redhat.com/show_bug.cgi?id=810200
* src/qemu/qemu_driver.c
(qemuDomainSnapshotCreateSingleDiskActive)
(qemuDomainSnapshotUndoSingleDiskActive): Account for cgroup.
(qemuDomainSnapshotCreateDiskActive): Update caller.
qemu's behavior in this case is to change the spice server behavior to
require secure connection to any channel not otherwise specified as
being in plaintext mode. libvirt doesn't currently allow requesting this
(via plaintext-channel=<channel name>).
RHBZ: 819499
Signed-off-by: Alon Levy <alevy@redhat.com>
Until now, the nl_pid of the source address of every message sent by
virNetlinkCommand has been set to the value of getpid(). Most of the
time this doesn't matter, and in the one case where it does
(communication with lldpad), it previously was the proper thing to do,
because the netlink event service (which listens on a netlink socket
for unsolicited messages from lldpad) coincidentally always happened
to bind with a local nl_pid == getpid().
With the fix for:
https://bugzilla.redhat.com/show_bug.cgi?id=816465
that particular nl_pid is now effectively a reserved value, so the
netlink event service will always bind to something else
(coincidentally "getpid() + (1 << 22)", but it really could be
anything). The result is that communication between lldpad and
libvirtd is broken (lldpad gets a "disconnected" error when it tries
to send a directed message).
The solution to this problem caused by a solution, is to query the
netlink event service's nlhandle for its "local_port", and send that
as the source nl_pid (but only when sending to lldpad, of course - in
other cases we maintain the old behavior of sending getpid()).
There are two cases where a message is being directed at lldpad - one
in virNetDevLinkDump, and one in virNetDevVPortProfileOpSetLink.
The case of virNetDevVPortProfileOpSetLink is simplest to explain -
only if !nltarget_kernel, i.e. the message isn't targetted for the
kernel, is the dst_pid set (by calling
virNetDevVPortProfileGetLldpadPid()), so only in that case do we call
virNetlinkEventServiceLocalPid() to set src_pid.
For virNetDevLinkDump, it's a bit more complicated. The call to
virNetDevVPortProfileGetLldpadPid() was effectively up one level (in
virNetDevVPortProfileOpCommon), although obscured by an unnecessary
passing of a function pointer. This patch removes the function
pointer, and calls virNetDevVPortProfileGetLldpadPid() directly in
virNetDevVPortProfileOpCommon - if it's doing this, it knows that it
should also call virNetlinkEventServiceLocalPid() to set src_pid too;
then it just passes src_pid and dst_pid down to
virNetDevLinkDump. Since (src_pid == 0 && dst_pid == 0) implies that
the kernel is the destination, there is no longer any need to send
nltarget_kernel as an arg to virNetDevLinkDump, so it's been removed.
The disparity between src_pid being int and dst_pid being uint32_t may
be a bit disconcerting to some, but I didn't want to complicate
virNetlinkEventServiceLocalPid() by having status returned separately
from the value.
This value will be needed to set the src_pid when sending netlink
messages to lldpad. It is part of the solution to:
https://bugzilla.redhat.com/show_bug.cgi?id=816465
Note that libnl's port generation algorithm guarantees that the
nl_socket_get_local_port() will always be > 0 (since it is "getpid() +
(n << 22>" where n is always < 1024), so it is okay to cast the
uint32_t to int (thus allowing us to use -1 as an error sentinel).
Until now, virNetlinkCommand has assumed that the nl_pid in the source
address of outgoing netlink messages should always be the return value
of getpid(). In most cases it actually doesn't matter, but in the case
of communication with lldpad, lldpad saves this info and later uses it
to send netlink messages back to libvirt. A recent patch to fix Bug
816465 changed the order of the universe such that the netlink event
service socket is no longer bound with nl_pid == getpid(), so lldpad
could no longer send unsolicited messages to libvirtd. Adding src_pid
as an argument to virNetlinkCommand() is the first step in notifying
lldpad of the proper address of the netlink event service socket.
src/qemu/qemu_hostdev.c:
refactor qemuPrepareHostdevUSBDevices function, make it focus on
adding usb device to activeUsbHostdevs after check. After that,
the usb hotplug function qemuDomainAttachHostDevice also could use
it.
expand qemuPrepareHostUSBDevices to perform the usb search,
rollback on failure.
src/qemu/qemu_hotplug.c:
If there are multiple usb devices available with same vendorID and productID,
but with different value of "bus, device", we give an error to let user
use <address> to specify the desired one.
usbFindDevice():get usb device according to
idVendor, idProduct, bus, device
it is the exact match of the four parameters
usbFindDeviceByBus():get usb device according to bus, device
it returns only one usb device same as usbFindDevice
usbFindDeviceByVendor():get usb device according to idVendor,idProduct
it probably returns multiple usb devices.
usbDeviceSearch(): a helper function to do the actual search
When we added the default USB controller into domain XML, we efficiently
broke migration to older versions of libvirt that didn't support USB
controllers at all (0.9.4 and earlier) even for domains that don't use
anything that the older libvirt can't provide. We still want to present
the default USB controller in any XML seen by a user/app but we can
safely remove it from the domain XML used during migration. If we are
migrating to a new enough libvirt, it will add the controller XML back,
while older libvirt won't be confused with it although it will still
tell qemu to create the controller.
Similar approach can be used in the future whenever we find out we
always enabled some kind of device without properly advertising it in
domain XML.
Commit 4c82f09e added a capability check for qemu per-device io
throttling, but only applied it to domain startup. As mentioned
in the previous commit (98cec05), the user can still get an 'internal
error' message during a hotplug attempt, when the monitor command
doesn't exist. It is confusing to allow tuning on inactive domains
only to then be rejected when starting the domain.
* src/qemu/qemu_driver.c (qemuDomainSetBlockIoTune): Reject
offline tuning if online can't match it.
If you have a qemu build that lacks the blockio tune monitor command,
then this command:
$ virsh blkdeviotune rhel6u2 hda --total_bytes_sec 1000
error: Unable to change block I/O throttle
error: internal error Unexpected error
fails as expected (well, the error message is lousy), but the next
dumpxml shows that the domain was modified anyway. Worse, that means
if you save the domain then restore it, the restore will likely fail
due to throttling being unsupported, even though no throttling should
even be active because the monitor command failed in the first place.
* src/qemu/qemu_driver.c (qemuDomainSetBlockIoTune): Check for
error before making modification permanent.
These two functions are called from main() on all platforms, and
always return success on platforms that don't support libnl. They
still log an error message, though, which doesn't make sense - they
should just be NOPs on those platforms. (Per a suggestion during
review, I've turned the logs into debug messages rather than removing
them completely).
Error: STRING_NULL:
/libvirt/src/node_device/node_device_linux_sysfs.c:80:
string_null_argument: Function "saferead" does not terminate string "*buf".
/libvirt/src/util/util.c:101:
string_null_argument: Function "read" fills array "*buf" with a non-terminated string.
/libvirt/src/node_device/node_device_linux_sysfs.c:87:
string_null: Passing unterminated string "buf" to a function expecting a null-terminated string.
Error: STRING_NULL:
/libvirt/src/util/uuid.c:273:
string_null_argument: Function "getDMISystemUUID" does not terminate string "*dmiuuid".
/libvirt/src/util/uuid.c:241:
string_null_argument: Function "saferead" fills array "*uuid" with a non-terminated string.
/libvirt/src/util/util.c:101:
string_null_argument: Function "read" fills array "*buf" with a non-terminated string.
/libvirt/src/util/uuid.c:274:
string_null: Passing unterminated string "dmiuuid" to a function expecting a null-terminated string.
/libvirt/src/util/uuid.c:138:
var_assign_parm: Assigning: "cur" = "uuidstr". They now point to the same thing.
/libvirt/src/util/uuid.c:164:
string_null_sink_loop: Searching for null termination in an unterminated array "cur".
Error: RESOURCE_LEAK:
/libvirt/src/qemu/qemu_driver.c:6968:
alloc_fn: Calling allocation function "calloc".
/libvirt/src/qemu/qemu_driver.c:6968:
var_assign: Assigning: "nodeset" = storage returned from "calloc(1UL, 1UL)".
/libvirt/src/qemu/qemu_driver.c:6977:
noescape: Variable "nodeset" is not freed or pointed-to in function "virTypedParameterAssign".
/libvirt/src/qemu/qemu_driver.c:6997:
leaked_storage: Variable "nodeset" going out of scope leaks the storage it points to.
Error: RESOURCE_LEAK:
/libvirt/src/vmx/vmx.c:2431:
alloc_fn: Calling allocation function "calloc".
/libvirt/src/vmx/vmx.c:2431:
var_assign: Assigning: "networkName" = storage returned from "calloc(1UL, 1UL)".
/libvirt/src/vmx/vmx.c:2495:
leaked_storage: Variable "networkName" going out of scope leaks the storage it points to.
Error: RESOURCE_LEAK:
/builddir/build/BUILD/libvirt-0.9.10/src/nodeinfo.c:629: alloc_fn: Calling allocation function "fopen".
/builddir/build/BUILD/libvirt-0.9.10/src/nodeinfo.c:629: var_assign: Assigning: "cpuinfo" = storage returned from "fopen("/proc/cpuinfo", "r")".
/builddir/build/BUILD/libvirt-0.9.10/src/nodeinfo.c:638: leaked_storage: Variable "cpuinfo" going out of scope leaks the storage it points to.
Error: RESOURCE_LEAK:
/builddir/build/BUILD/libvirt-0.9.10/src/test/test_driver.c:1041: alloc_arg: Calling allocation function "virXPathNodeSet" on "devs".
/builddir/build/BUILD/libvirt-0.9.10/src/util/xml.c:621: alloc_arg: "virAllocN" allocates memory that is stored into "*list".
/builddir/build/BUILD/libvirt-0.9.10/src/util/memory.c:129: alloc_fn: Storage is returned from allocation function "calloc".
/builddir/build/BUILD/libvirt-0.9.10/src/util/memory.c:129: var_assign: Assigning: "*((void **)ptrptr)" = "calloc(count, size)".
/builddir/build/BUILD/libvirt-0.9.10/src/util/xml.c:625: noescape: Variable "*list" is not freed or pointed-to in function "memcpy".
/builddir/build/BUILD/libvirt-0.9.10/src/test/test_driver.c:1098: leaked_storage: Variable "devs" going out of scope leaks the storage it points to.
Coverity logs:
Error: RESOURCE_LEAK:
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xen_inotify.c:103: alloc_fn: Calling allocation function "xenDaemonLookupByUUID".
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xend_internal.c:2534: alloc_fn: Storage is returned from allocation function "virGetDomain".
/builddir/build/BUILD/libvirt-0.9.10/src/datatypes.c:191: alloc_arg: "virAlloc" allocates memory that is stored into "ret".
/builddir/build/BUILD/libvirt-0.9.10/src/util/memory.c:101: alloc_fn: Storage is returned from allocation function "calloc".
/builddir/build/BUILD/libvirt-0.9.10/src/util/memory.c:101: var_assign: Assigning: "*((void **)ptrptr)" = "calloc(1UL, size)".
/builddir/build/BUILD/libvirt-0.9.10/src/datatypes.c:210: return_alloc: Returning allocated memory "ret".
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xend_internal.c:2534: var_assign: Assigning: "ret" = "virGetDomain(conn, name, uuid)".
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xend_internal.c:2541: return_alloc: Returning allocated memory "ret".
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xen_inotify.c:103: var_assign: Assigning: "dom" = storage returned from "xenDaemonLookupByUUID(conn, rawuuid)".
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xen_inotify.c:126: leaked_storage: Variable "dom" going out of scope leaks the storage it points to.
Error: RESOURCE_LEAK:
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xen_hypervisor.c:2742: alloc_fn: Calling allocation function "fopen".
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xen_hypervisor.c:2742: var_assign: Assigning: "cpuinfo" = storage returned from "fopen("/proc/cpuinfo", "r")".
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xen_hypervisor.c:2763: noescape: Variable "cpuinfo" is not freed or pointed-to in function "xenHypervisorMakeCapabilitiesInternal".
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xen_hypervisor.c:2574:45: noescape: "xenHypervisorMakeCapabilitiesInternal" does not free or save its pointer parameter "cpuinfo".
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xen_hypervisor.c:2768: leaked_storage: Variable "cpuinfo" going out of scope leaks the storage it points to.
Error: RESOURCE_LEAK:
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xen_hypervisor.c:2752: alloc_fn: Calling allocation function "fopen".
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xen_hypervisor.c:2752: var_assign: Assigning: "capabilities" = storage returned from "fopen("/sys/hypervisor/properties/capabilities", "r")".
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xen_hypervisor.c:2763: noescape: Variable "capabilities" is not freed or pointed-to in function "xenHypervisorMakeCapabilitiesInternal".
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xen_hypervisor.c:2574:60: noescape: "xenHypervisorMakeCapabilitiesInternal" does not free or save its pointer parameter "capabilities".
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xen_hypervisor.c:2768: leaked_storage: Variable "capabilities" going out of scope leaks the storage it points to.
Coverity logs:
Error: RESOURCE_LEAK:
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:523: alloc_fn: Calling allocation function "fopen".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:523: var_assign: Assigning: "fd" = storage returned from "fopen(local_file, "rb")".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:540: noescape: Variable "fd" is not freed or pointed-to in function "fread".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:542: noescape: Variable "fd" is not freed or pointed-to in function "feof".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:575: leaked_storage: Variable "fd" going out of scope leaks the storage it points to.
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:585: leaked_storage: Variable "fd" going out of scope leaks the storage it points to.
Error: RESOURCE_LEAK:
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2088: alloc_fn: Calling allocation function "phypVolumeLookupByName".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2026: alloc_fn: Storage is returned from allocation function "virGetStorageVol".
/builddir/build/BUILD/libvirt-0.9.10/src/datatypes.c:724: alloc_arg: "virAlloc" allocates memory that is stored into "ret".
/builddir/build/BUILD/libvirt-0.9.10/src/util/memory.c:101: alloc_fn: Storage is returned from allocation function "calloc".
/builddir/build/BUILD/libvirt-0.9.10/src/util/memory.c:101: var_assign: Assigning: "*((void **)ptrptr)" = "calloc(1UL, size)".
/builddir/build/BUILD/libvirt-0.9.10/src/datatypes.c:753: return_alloc: Returning allocated memory "ret".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2026: var_assign: Assigning: "vol" = "virGetStorageVol(pool->conn, pool->name, volname, key)".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2030: return_alloc: Returning allocated memory "vol".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2088: leaked_storage: Failing to save storage allocated by "phypVolumeLookupByName(pool, voldef->name)" leaks it.
Error: RESOURCE_LEAK:
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2725: alloc_fn: Calling allocation function "phypGetStoragePoolLookUpByUUID".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2689: alloc_fn: Storage is returned from allocation function "virGetStoragePool".
/builddir/build/BUILD/libvirt-0.9.10/src/datatypes.c:592: alloc_arg: "virAlloc" allocates memory that is stored into "ret".
/builddir/build/BUILD/libvirt-0.9.10/src/util/memory.c:101: alloc_fn: Storage is returned from allocation function "calloc".
/builddir/build/BUILD/libvirt-0.9.10/src/util/memory.c:101: var_assign: Assigning: "*((void **)ptrptr)" = "calloc(1UL, size)".
/builddir/build/BUILD/libvirt-0.9.10/src/datatypes.c:610: return_alloc: Returning allocated memory "ret".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2689: var_assign: Assigning: "sp" = "virGetStoragePool(conn, pools[i], uuid)".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2694: return_alloc: Returning allocated memory "sp".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2725: leaked_storage: Failing to save storage allocated by "phypGetStoragePoolLookUpByUUID(conn, def->uuid)" leaks it.
Error: RESOURCE_LEAK:
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2719: alloc_fn: Calling allocation function "phypStoragePoolLookupByName".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2254: alloc_fn: Storage is returned from allocation function "virGetStoragePool".
/builddir/build/BUILD/libvirt-0.9.10/src/datatypes.c:592: alloc_arg: "virAlloc" allocates memory that is stored into "ret".
/builddir/build/BUILD/libvirt-0.9.10/src/util/memory.c:101: alloc_fn: Storage is returned from allocation function "calloc".
/builddir/build/BUILD/libvirt-0.9.10/src/util/memory.c:101: var_assign: Assigning: "*((void **)ptrptr)" = "calloc(1UL, size)".
/builddir/build/BUILD/libvirt-0.9.10/src/datatypes.c:610: return_alloc: Returning allocated memory "ret".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2254: return_alloc_fn: Directly returning storage allocated by "virGetStoragePool".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2719: leaked_storage: Failing to save storage allocated by "phypStoragePoolLookupByName(conn, def->name)" leaks it.
Error: RESOURCE_LEAK:
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2270: alloc_fn: Calling allocation function "phypStoragePoolLookupByName".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2254: alloc_fn: Storage is returned from allocation function "virGetStoragePool".
/builddir/build/BUILD/libvirt-0.9.10/src/datatypes.c:592: alloc_arg: "virAlloc" allocates memory that is stored into "ret".
/builddir/build/BUILD/libvirt-0.9.10/src/util/memory.c:101: alloc_fn: Storage is returned from allocation function "calloc".
/builddir/build/BUILD/libvirt-0.9.10/src/util/memory.c:101: var_assign: Assigning: "*((void **)ptrptr)" = "calloc(1UL, size)".
/builddir/build/BUILD/libvirt-0.9.10/src/datatypes.c:610: return_alloc: Returning allocated memory "ret".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2254: return_alloc_fn: Directly returning storage allocated by "virGetStoragePool".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2270: var_assign: Assigning: "sp" = storage returned from "phypStoragePoolLookupByName(vol->conn, vol->pool)".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2324: leaked_storage: Variable "sp" going out of scope leaks the storage it points to.
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2327: leaked_storage: Variable "sp" going out of scope leaks the storage it points t
On 32-bit platforms, gcc warns that the comparison between a long
and (ULLONG_MAX/1024/1024) is always false; throwing in a type
conversion shuts up the warning.
* src/qemu/qemu_monitor.c (qemuMonitorBlockJob): Shut gcc up.