This makes the driver fail with a clear error message in case of UUID
collisions (for example if somebody copied a container configuration
without updating the UUID) and also raises an error on other hash map
failures.
OpenVZ itself doesn't complain about duplicate UUIDs since this
parameter is only used by libvirt.
Commit 5e6ce1 moved down detection of the ACPI feature in
qemuParseCommandLine. However, when ACPI is detected, it clears
all feature flags in def->features to only set ACPI. This used to
be fine because this was the first place were def->features was set,
but after the move this is no longer necessarily true because this
block comes before the ACPI check:
if (strstr(def->emulator, "kvm")) {
def->virtType = VIR_DOMAIN_VIRT_KVM;
def->features |= (1 << VIR_DOMAIN_FEATURE_PAE);
}
Since def is allocated in qemuParseCommandLine using VIR_ALLOC, we
can always use |= when modifying def->features
The only useful translation of "%s" as a format string is "%s" (I
suppose you could claim "%1$s" is also valid, but why bother). So
it is not worth translating; fixing this exposes some instances
where we were failing to translate real error messages. This makes
the fix of commit 097da1ab more generic, as well as ensuring no
future regressions.
* cfg.mk (sc_prohibit_useless_translation): New rule.
* src/lxc/lxc_driver.c (lxcSetVcpuBWLive): Fix offender.
* src/openvz/openvz_conf.c (openvzReadFSConf): Likewise.
* src/qemu/qemu_cgroup.c (qemuSetupCgroupForVcpu): Likewise.
* src/qemu/qemu_driver.c (qemuSetVcpusBWLive): Likewise.
* src/xenapi/xenapi_utils.c (xenapiSessionErrorHandle): Likewise.
Instead of changing the existed virFileMakePath to accept mode
argument and modifying a pile of its uses, this patch introduces
virFileMakePathWithMode, and use it instead of mkdir() to create
the readline history dir.
Commit 122fa379de introduces option to
store more than one host entry in a storage pool source definition. That
commit causes a regression, where a check is added that only one host
entry should be present (that actualy is not present as the source
structure was just allocated and zeroed) instead of allocating memory
for the host entry.
As the storage pool sources are stored in a list of structs, the pointer
returned by virStoragePoolSourceListNewSource() shouldn't be freed as it
points in the middle of a memory block. This combined with a regression
that takes the error path every time on caused a double-free abort on
the src struct in question.
Previous commits added code to unmount the existing /proc,
/sys and /dev hierarchies on the root filesystem of the
container. This should only have been done if the container's
root filesystem was the same as the host's root. ie if
the root source is '/'. As it is, this causes LXC containersr
to fail to start if their root source is not '/'
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
In preparation for introducing a full RPC protocol for
libvirt_lxc, switch over to using the virNetServer APIs
for the monitor connection
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
While it is not currently used elsewhere in libvirt, the code
for finding a free loop device & associating a file with it
is not LXC specific. Move it into the viffile.{c,h} file where
potentially shared code is more commonly kept.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Move the cgroup object into virLXCControllerPtr and rename
all the setup methods to include 'Cgroup' in their name
if appropriate
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Move the monitor FDs into the virLXCControllerPtr object
removing the need for the 'struct lxcMonitor' object
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The virLXCControllerRun method is getting a little too large,
and about 50% of its code is related to setting up a /dev/pts
mount. Move the latter out into a dedicated method
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Move the security manager object into the virLXCControllerPtr
object. Also simplify the code creating it in the first place
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Move the list of loop device FDs into the virLXCControllerPtr
object and make sure that virLXCControllerStopInit will
close them all
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Turn 'struct lxc_console' into virLXCControllerConsolePtr and make it
a part of virLXCControllerPtr
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Keep a record of the init PID in the virLXCController object
and create a virLXCControllerStopInit method for killing this
process
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Move the veth device name state into the virLXCControllerPtr
object and stop passing it around. Also use size_t instead
of unsigned int for the array length parameters.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The LXC controller code is having to pass around an ever increasing
number of parameters between methods. To make the code more managable
introduce a virLXCControllerPtr to hold all this state, starting with
the container name and virDomainDefPtr object
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently the build of libvirt_lxc will cause recompilation
of all sources under src/util, src/conf, src/security and
more. Switch the libvirt_lxc process to link against the
libtool convenience libraries that are already built as
part of the main libvirt.os & libvirtd build process
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Refactor the RPC server dispatcher code so that if 'max_workers==0'
the entire server will run single threaded. This is useful for
use cases where there will only ever be 1 client connected
which serializes its requests
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The callback that is invoked when a new RPC client is
initialized does not have any opaque parameter. Add
one so that custom data can be passed into the callback
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Recently the Ceph project defaulted auth_supported from 'none' to 'cephx'.
When no auth information was set for Ceph disks this would lead to librados defaulting to
'cephx', but there would be no additional authorization information.
We now explicitly set auth_supported to none when passing down arguments to Qemu.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
This patch adds an internal function vmwareUpdateVMStatus to
update the real state of the domain. This function is used in
various places in the driver, in particular to detect when
the domain has been shut down by the user with the "halt"
command.
QEMU domains were marked as having managed save image even if they were
saved using the regular save. With this patch, domains are marked so
only when using managed save API.
QEMU (and librbd) flush the cache on the source before the
destination starts, and the destination does not read any
changeable data before that, so live migration with rbd caching
is safe.
This makes 'virsh migrate' work with rbd and caching without the
--unsafe flag.
Reported-by: Vladimir Bashkirtsev <vladimir@bashkirtsev.com>
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
virDomainBlockStatsFlags can't collect total_time_ns for read/write/flush
because of key typo when retriveing from qemu cmd result
Signed-off-by: lvroyce <lvroyce@linux.vnet.ibm.com>
Reported by Jason Helfman as a build-breaker on FreeBSD.
* src/conf/domain_conf.c (virDomainFSDefParseXML): Use POSIX
spelling.
* src/openvz/openvz_conf.c (openvzReadFSConf): Likewise.
Below patch fixes this coverity report:
/libvirt/src/conf/nwfilter_conf.c:382:
leaked_storage: Variable "varAccess" going out of scope leaks the storage it points to.
Since we are mounting a new /dev in the container, we must
remove any sub-mounts like /dev/shm, /dev/mqueue, etc,
otherwise they'll be recorded in /proc/mounts, but not be
accessible to applications.
Hello,
This is a patch to fix vm's outbound traffic control problem.
Currently, vm's outbound traffic control by libvirt doesn't go well.
This problem was previously discussed at libvir-list ML, however
it seems that there isn't still any answer to the problem.
http://www.redhat.com/archives/libvir-list/2011-August/msg00333.html
I measured Guest(with virtio-net) to Host TCP throughput with the
command "netperf -H".
Here are the outbound QoS parameters and the results.
outbound average rate[kilobytes/s] : Guest to Host throughput[Mbit/s]
======================================================================
1024 (8Mbit/s) : 4.56
2048 (16Mbit/s) : 3.29
4096 (32Mbit/s) : 3.35
8192 (64Mbit/s) : 3.95
16384 (128Mbit/s) : 4.08
32768 (256Mbit/s) : 3.94
65536 (512Mbit/s) : 3.23
The outbound traffic goes down unreasonably and is even not controled.
The cause of this problem is too large mtu value in "tc filter" command run by
libvirt. The command uses burst value to set mtu and the burst is equal to
average rate value if it's not set. This value is too large. For example
if the average rate is set to 1024 kilobytes/s, the mtu value is set to 1024
kilobytes. That's too large compared to the size of network packets.
Here libvirt applies tc ingress filter to Host's vnet(tun) device.
Tc ingress filter is implemented with TBF(Token Buckets Filter) algorithm. TBF
uses mtu value to calculate the amount of token consumed by each packet. With too
large mtu value, the token consumption rate is set too large. This leads to
token starvation and deterioration of TCP throughput.
Then, should we use the default mtu value 2 kilobytes?
The anser is No, because Guest with virtio-net device uses 65536 bytes
as mtu to transmit packets to Host, and the tc filter with the default mtu
value 2k drops packets whose size is larger than 2k. So, the most packets
is droped and again leads to deterioration of TCP throughput.
The appropriate mtu value is 65536 bytes which is equal to the maximum value
of network interface device defined in <linux/netdevice.h>. The value is
not so large that it causes token starvation and not so small that it
drops most packets.
Therefore this patch set the mtu value to 64kb(== 65535 bytes).
Again, here are the outbound QoS parameters and the TCP throughput with
the libvirt patched.
outbound average rate[kilobytes/s] : Guest to Host throughput[Mbit/s]
======================================================================
1024 (8Mbit/s) : 8.22
2048 (16Mbit/s) : 16.42
4096 (32Mbit/s) : 32.93
8192 (64Mbit/s) : 66.85
16384 (128Mbit/s) : 133.88
32768 (256Mbit/s) : 271.01
65536 (512Mbit/s) : 547.32
The outbound traffic conforms to the given limit.
Thank you,
Signed-off-by: Eiichi Tsukata <eiichi.tsukata.xh@hitachi.com>
If the user specified invalid protocol type in a network's SRV record
the error path ended up in freeing uninitialized pointers causing a
daemon crash.
*network_conf.c: virNetworkDNSSrvDefParseXML(): initialize local
variables
VirtualBox doesn't use the common virDomainObj implementation so this
patch adds a separate implementation using the VirtualBox API.
This driver implementation supports all currently defined flags. As
VirtualBox does not support transient guests, managed save images and
autostarting we assume all guests are persistent, don't have a managed
save image and are not autostarted. Filtering for existence of those
properities results in empty list.
mnt_fsname can not be the same, as we check the duplicate pool
sources earlier before, means it can't be the same pool, moreover,
a pool can't be started if it's already active anyway. So no reason
to act as success.
virConnectDomainEventRegisterAny() takes a domain as an argument.
So it should be possible to register the same event (be it
VIR_DOMAIN_EVENT_ID_LIFECYCLE for example) for two different domains.
That is, we need to take domain into account when searching for
duplicate event being already registered.
In order to retrieve some sysinfo data we need to parse /proc/sysinfo and
/proc/cpuinfo.
Signed-off-by: Thang Pham <thang.pham@us.ibm.com>
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
For the s390x architecture the sysfs core_id alone is not unique. As a
result it can happen that libvirt thinks there are less host CPUs available
than really present.
Currently, a logical CPU is equivalent to a core for s390x. We therefore
produce a fake core id from the CPU number.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Minimal CPU "parser" for s390 to avoid compile time warning.
Signed-off-by: Thang Pham <thang.pham@us.ibm.com>
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Adding CPU encoder/decoder for s390 to avoid runtime error messages.
Signed-off-by: Thang Pham <thang.pham@us.ibm.com>
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Starting a KVM guest on s390 fails immediately. This is because
"qemu --help" reports -no-acpi even for the s390(x) architecture but
-no-acpi isn't supported there.
Workaround is to remove QEMU_CAPS_NO_ACPI from the capability set
after the version/capability extraction.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
We used to prefix 'rbd:' to volume names, this is not necessary.
Qemu takes RBD devices in this way, like: qemu -drive rbd:pool/image
When attaching a network disk like RBD to a guest we however do not use this prefix.
Currently you can't map a RBD volume name directly to a domain without removing the prefix.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
If no 'listen' attribute or <listen> element is set in the
guest XML, the default driver configured listen address is
used. There is no way to client applications to determine
what this address is though. When starting the guest, we
should update the live XML to include this default listen
address
Storage is one of the last domains in libvirt where we don't fully
utilize inactive and live XML. Okay, it might be because we don't
have support for that. So implement such support. However, we need
to fallback when talking to old daemon which doesn't support this
new flag called VIR_STORAGE_XML_INACTIVE.
Currently, we share the idea of old & new def with domains. Users can
*-edit an object (domain, pool) which spawns a new internal
representation for them. This is referenced via
{domainObj,poolObj}->newDef [compared to ->def]. However, for pool we
were never overwriting def with newDef. This must be done on
pool-destroy (like we do analogically in domain detroy).
Currently libvirt-lxc checks to see if the destination exists and is a
directory. If it is not a directory then the mount fails. Since
libvirt-lxc can bind mount files on an inode, this patch is needed to
allow us to bind mount files on files. Currently we want to bind mount
on top of /etc/machine-id, and /etc/adjtime
If the destination of the mount point does not exists, it checks if the
src is a directory and then attempts to create a directory, otherwise it
creates an empty file for the destination. The code will then bind mount
over the destination.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Some GNULIB headers (eg unistd.h) will often need to include
winsock2.h for various symbols. There is a rule that winsock2.h
must be included before windows.h. This means that any file
which does
#ifdef WIN32
#include <windows.h>
#endif
#include <unistd.h>
is potentially broken. A simple rule is that /all/ includes of
windows.h must be matched with a preceding include of winsock2.h
regardless of whether unistd.h is used currently
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
A sanlock lease can be marked as shared (rather
than exclusive) using SANLK_RES_SHARED flag. This
adds support for that flag and ensures that in auto
disk mode, any shared disks use shared leases. This
also makes any read-only disks be completely
ignored.
These changes remove the need for the option
ignore_readonly_and_shared_disks
so that is removed
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently you can configure LXC to bind a host directory to
a guest directory, but not to bind a guest directory to a
guest directory. While the guest container init could do
this itself, allowing it in the libvirt XML means a stricter
SELinux policy can be written
Introduce a new syntax for filesystems to allow use of a RAM
filesystem
<filesystem type='ram'>
<source usage='10' units='MiB'/>
<target dir='/mnt'/>
</filesystem>
The usage units default to KiB to limit consumption of host memory.
* docs/formatdomain.html.in: Document new syntax
* docs/schemas/domaincommon.rng: Add new attributes
* src/conf/domain_conf.c: Parsing/formatting of RAM filesystems
* src/lxc/lxc_container.c: Mounting of RAM filesystems
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The test of ref count is not protected by lock, which is unsafe because
the ref count may have been changed by other threads during the test.
This patch fixes this.
When shutting down libvirtd, the virNetServer shutdown can deadlock
if there are in-flight jobs being handled by virNetServerHandleJob().
virNetServerFree() will acquire the virNetServer lock and call
virThreadPoolFree() to terminate the workers, waiting for the workers
to finish. But in-flight workers will attempt to acquire the
virNetServer lock, resulting in deadlock.
Fix the deadlock by unlocking the virNetServer lock before calling
virThreadPoolFree(). This is safe since the virNetServerPtr object
is ref-counted and only decrementing the ref count needs to be
protected. Additionally, there is no need to re-acquire the lock
after virThreadPoolFree() completes as all the workers have
terminated.
The lxc contoller eventually makes use of virRandomBits(), which was
segfaulting since virRandomInitialize() is never invoked.
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff554d560 in random_r () from /lib64/libc.so.6
(gdb) bt
0 0x00007ffff554d560 in random_r () from /lib64/libc.so.6
1 0x0000000000469eaa in virRandomBits (nbits=32) at util/virrandom.c:80
2 0x000000000045bf69 in virHashCreateFull (size=256,
dataFree=0x4aa2a2 <hashDataFree>, keyCode=0x45bd40 <virHashStrCode>,
keyEqual=0x45bdad <virHashStrEqual>, keyCopy=0x45bdfa <virHashStrCopy>,
keyFree=0x45be37 <virHashStrFree>) at util/virhash.c:134
3 0x000000000045c069 in virHashCreate (size=0, dataFree=0x4aa2a2 <hashDataFree>)
at util/virhash.c:164
4 0x00000000004aa562 in virNWFilterHashTableCreate (n=0)
at conf/nwfilter_params.c:686
5 0x00000000004aa95b in virNWFilterParseParamAttributes (cur=0x711d30)
at conf/nwfilter_params.c:793
6 0x0000000000481a7f in virDomainNetDefParseXML (caps=0x702c90, node=0x7116b0,
ctxt=0x7101b0, bootMap=0x0, flags=0) at conf/domain_conf.c:4589
7 0x000000000048cc36 in virDomainDefParseXML (caps=0x702c90, xml=0x710040,
root=0x7103b0, ctxt=0x7101b0, expectedVirtTypes=16, flags=0)
at conf/domain_conf.c:8658
8 0x000000000048f011 in virDomainDefParseNode (caps=0x702c90, xml=0x710040,
root=0x7103b0, expectedVirtTypes=16, flags=0) at conf/domain_conf.c:9360
9 0x000000000048ee30 in virDomainDefParse (xmlStr=0x0,
filename=0x702ae0 "/var/run/libvirt/lxc/x.xml", caps=0x702c90,
expectedVirtTypes=16, flags=0) at conf/domain_conf.c:9310
10 0x000000000048ef00 in virDomainDefParseFile (caps=0x702c90,
filename=0x702ae0 "/var/run/libvirt/lxc/x.xml", expectedVirtTypes=16, flags=0)
at conf/domain_conf.c:9332
11 0x0000000000425053 in main (argc=5, argv=0x7fffffffe2b8)
at lxc/lxc_controller.c:1773
The comment says:
/* Now create the final dir in the path with the uid/gid/mode
* requested in the config. If the dir already exists, just set
* the perms.
*/
However, virDirCreate is only invoked if the target path doesn't
exist yet (which is opposite with the comment), or the uid from
the config is not -1 (I don't understand why, think it's just
another mistake). And the result is the perms of the pool won't
be changed if one tries to build the pool with different perms
again.
Besides these logic error fix, if no uid and gid are specified in
the config, the practical used uid, gid are reflected.
The two new APIs are rather trivial; based on bits and pieces of
other existing APIs. But rather than blindly return 0 or 1 for
HasMetadata, I chose to first validate that the snapshot in
question in fact exists.
* src/esx/esx_driver.c (esxDomainSnapshotIsCurrent)
(esxDomainSnapshotHasMetadata): New functions.
* src/vbox/vbox_tmpl.c (vboxDomainSnapshotIsCurrent)
(vboxDomainSnapshotHasMetadata): Likewise.
Blindly returning success is misleading if the object no longer
exists; it is a bit better to check for existence up front before
returning information about that object. This pattern matches the
fact that most of our other APIs check for existence as a side
effect prior to getting at the real piece of information being
queried.
* src/esx/esx_driver.c (esxDomainIsUpdated, esxDomainIsPersistent):
Add existence checks.
* src/vbox/vbox_tmpl.c (vboxDomainIsPersistent)
(vboxDomainIsUpdated): Likewise.
This patch adds support for listing all domains into drivers that use
the common virDomainObj implementation: libxl, lxc, openvz, qemu, test,
uml, vmware.
For drivers that don't support managed save images the guests are
treated as if they had none, so filtering guests that do have such an
image on this driver succeeds and produces 0 results.
The two new functions are very similar to the existing functions;
just a matter of different arguments and a call to a different
helper function.
* src/qemu/qemu_driver.c (qemuDomainSnapshotListNames)
(qemuDomainSnapshotNum, qemuDomainSnapshotListChildrenNames)
(qemuDomainSnapshotNumChildren): Support new flags.
(qemuDomainListAllSnapshots): New functions.
Wraps the conversion from 'char *name' to virDomainSnapshotPtr in
a reusable manner.
* src/conf/virdomainlist.h (virDomainListSnapshots): New declaration.
* src/conf/virdomainlist.c (virDomainListSnapshots): Implement it.
* src/libvirt_private.syms (virdomainlist.h): Export it.
The generator doesn't handle lists of virDomainSnapshotPtr, so
this commit requires a bit more work than some RPC additions.
* src/remote/remote_protocol.x
(REMOTE_PROC_DOMAIN_LIST_ALL_SNAPSHOTS)
(REMOTE_PROC_DOMAIN_SNAPSHOT_LIST_ALL_CHILDREN): New RPC calls,
with corresponding structs.
* daemon/remote.c (remoteDispatchDomainListAllSnapshots)
(remoteDispatchDomainSnapshotListAllChildren): New functions.
* src/remote/remote_driver.c (remoteDomainListAllSnapshots)
(remoteDomainSnapshotListAllChildren): Likewise.
* src/remote_protocol-structs: Regenerate.
There was an inherent race between virDomainSnapshotNum() and
virDomainSnapshotListNames(), where an additional snapshot could
be created in the meantime, or where a snapshot could be deleted
before converting the name back to a virDomainSnapshotPtr. It
was also an awkward name: the function operates on domains, not
domain snapshots. virDomainSnapshotListChildrenNames() suffered
from the same inherent race, although its naming was nicer.
This patch makes things nicer by grabbing a snapshot list
atomically, in the format most useful to the user.
* include/libvirt/libvirt.h.in (virDomainListAllSnapshots)
(virDomainSnapshotListAllChildren): New declarations.
* src/libvirt.c (virDomainSnapshotListNames)
(virDomainSnapshotListChildrenNames): Add cross-references.
(virDomainListAllSnapshots, virDomainSnapshotListAllChildren):
New functions.
* src/libvirt_public.syms (LIBVIRT_0.9.13): Export them.
* src/driver.h (virDrvDomainListAllSnapshots)
(virDrvDomainSnapshotListAllChildren): New callbacks.
* python/generator.py (skip_function): Prepare for later
hand-written versions.
It turns out that one-bit filtering makes it hard to select the inverse
set, so it is easier to provide filtering groups. For back-compat,
omitting all bits within a group means the group is not used for
filtering, and by definition of a group (each snapshot matches exactly
one bit within the group, and the set of bits in the group covers all
snapshots), selecting all bits also makes the group useless.
Unfortunately, virDomainSnapshotListChildren defined the bit
VIR_DOMAIN_SNAPSHOT_LIST_DESCENDANTS as an expansion rather than a
filter, so we cannot make it part of a filter group, so that bit
(and its counterpart VIR_DOMAIN_SNAPSHOT_LIST_ROOTS for
virDomainSnapshotList) remains a single control bit.
* include/libvirt/libvirt.h.in (virDomainSnapshotListFlags): Add a
couple more flags.
* src/libvirt.c (virDomainSnapshotNum)
(virDomainSnapshotNumChildren): Document them.
(virDomainSnapshotListNames, virDomainSnapshotListChildrenNames):
Likewise, and add thread-safety caveats.
* src/conf/virdomainlist.h (VIR_DOMAIN_SNAPSHOT_FILTERS_*): New
convenience macros.
* src/conf/domain_conf.c (virDomainSnapshotObjListCopyNames)
(virDomainSnapshotObjListCount): Support the new flags.
Until now, it was possible to crash libvirtd when defining domain with
channel device with missing source element.
When creating new virDomainChrDef, target.port is set to -1, but
unfortunately it is an union with addresses that virDomainChrDefFree
tries to free in case the deviceType is channel. Having the port set
to -1 is intended, however the cleanest way to get around the problems
with the crash seems to be renumbering the VIR_DOMAIN_CHR_CHANNEL_
target types to cover new NONE type (with value 0) being the default
(no target type yet).
Macro virCheckNullArgGoto is supposed to check for NULL argument but
checks non-NULL instead.
Macro virCheckNonNullArgReturn reports error as if the argument should
be NULL when it shouldn't.
when lxcContainerIdentifyCGroups failed, the memory it allocated
has been freed, so we should not free this memory again in
lxcContainerSetupPivortRoot and lxcContainerSetupExtraMounts.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Another case where we can do the same amount of work with fewer
lines of redundant code, which will make adding new filters easier.
* src/conf/domain_conf.c (virDomainSnapshotNameData): Adjust
struct.
(virDomainSnapshotObjListCount): Delete, now taken care of...
(virDomainSnapshotObjListCopyNames): ...here.
(virDomainSnapshotObjListGetNames): Adjust caller to handle
counting.
(virDomainSnapshotObjListNum): Simplify.
Now that domain listing is a thin wrapper around child listing,
it's easier to have a common entry point. This restores the
hashForEach optimization lost in the previous patch when there
are no snapshots being filtered out of the entire list.
* src/conf/domain_conf.h (virDomainSnapshotObjListGetNames)
(virDomainSnapshotObjListNum): Add parameter.
(virDomainSnapshotObjListGetNamesFrom)
(virDomainSnapshotObjListNumFrom): Delete.
* src/libvirt_private.syms (domain_conf.h): Drop deleted functions.
* src/conf/domain_conf.c (virDomainSnapshotObjListGetNames):
Merge, and (re)add an optimization.
* src/qemu/qemu_driver.c (qemuDomainUndefineFlags)
(qemuDomainSnapshotListNames, qemuDomainSnapshotNum)
(qemuDomainSnapshotListChildrenNames)
(qemuDomainSnapshotNumChildren): Update callers.
* src/qemu/qemu_migration.c (qemuMigrationIsAllowed): Likewise.
* src/conf/virdomainlist.c (virDomainListPopulate): Likewise.
This idea was first suggested by Daniel Veillard here:
https://www.redhat.com/archives/libvir-list/2011-October/msg00353.html
Now that I am about to add more complexity to snapshot listing, it
makes sense to avoid code duplication and special casing for domain
listing (all snapshots) vs. snapshot listing (descendants); adding
a metaroot reduces the number of code lines by having the domain
listing turn into a descendant listing of the metaroot.
Note that this has one minor pessimization - if we are going to list
ALL snapshots without filtering, then virHashForeach is more efficient
than recursing through the child relationships; restoring that minor
optimization will occur in the next patch.
* src/conf/domain_conf.h (_virDomainSnapshotObj)
(_virDomainSnapshotObjList): Repurpose some fields.
(virDomainSnapshotDropParent): Drop unused parameter.
* src/conf/domain_conf.c (virDomainSnapshotObjListGetNames)
(virDomainSnapshotObjListCount): Simplify.
(virDomainSnapshotFindByName, virDomainSnapshotSetRelations)
(virDomainSnapshotDropParent): Match new field semantics.
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateXML)
(qemuDomainSnapshotReparentChildren, qemuDomainSnapshotDelete):
Adjust clients.
This patch adds common code to list domains in fashion used by
virListAllDomains with all currently supported flags. The header file
also contains macros that group filters together that are used to
shorten filter conditions.
This patch stores existence of the image in the object. At start of the
daemon the state is checked and then updated in key moments in domain
lifecycle.
This patch wires up the RPC protocol handlers for
virConnectListAllDomains(). The RPC generator has no support for the way
how virConnectListAllDomains() returns the results so the handler code
had to be done manually.
The new api is handled by REMOTE_PROC_CONNECT_LIST_ALL_DOMAINS, with
number 273 and marked with high priority.
This patch adds a new public api that lists domains. The new approach is
different from those used before. There are key points to this:
1) The list is acquired atomically and contains both active and inactive
domains (guests). This eliminates the need to call two different list
APIs, where the state might change in between the calls.
2) The returned list consists of virDomainPtrs instead of names or ID's
that have to be converted to virDomainPtrs anyways using separate calls
for each one of them. This is more convenient and saves hypervisor calls.
3) The returned list is auto-allocated. This saves a lot of hassle for
the users.
4) Built in support for filtering. The API call supports various
filtering flags that modify the output list according to user needs.
Available filter groups:
Domain status:
VIR_CONNECT_LIST_DOMAINS_ACTIVE, VIR_CONNECT_LIST_DOMAINS_INACTIVE
Domain persistence:
VIR_CONNECT_LIST_DOMAINS_PERSISTENT,
VIR_CONNECT_LIST_DOMAINS_TRANSIENT
Domain state:
VIR_CONNECT_LIST_DOMAINS_RUNNING, VIR_CONNECT_LIST_DOMAINS_PAUSED,
VIR_CONNECT_LIST_DOMAINS_SHUTOFF, VIR_CONNECT_LIST_DOMAINS_OTHER
Existence of managed save image:
VIR_CONNECT_LIST_DOMAINS_MANAGEDSAVE,
VIR_CONNECT_LIST_DOMAINS_NO_MANAGEDSAVE
Auto-start option:
VIR_CONNECT_LIST_DOMAINS_AUTOSTART,
VIR_CONNECT_LIST_DOMAINS_NO_AUTOSTART
Existence of snapshot:
VIR_CONNECT_LIST_DOMAINS_HAS_SNAPSHOT,
VIR_CONNECT_LIST_DOMAINS_NO_SNAPSHOT
5) The python binding returns a list of domain objects that is very neat
to work with.
The only problem with this approach is no support from code generators
so both RPC code and python bindings had to be written manually.
*include/libvirt/libvirt.h.in: - add API prototype
- clean up whitespace mistakes nearby
*python/generator.py: - inhibit generation of the bindings for the new
api
*src/driver.h: - add driver prototype
- clean up some whitespace mistakes nearby
*src/libvirt.c: - add public implementation
*src/libvirt_public.syms: - export the new symbol
when libvirt_lxc trigger oom error in lxcContainerGetSubtree
we should free the alloced memory for mounts.
so when lxcContainerGetSubtree failed,we should do some
memory cleanup in lxcContainerUnmountSubtree.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
we alloc the memory for format in lxcContainerMountDetectFilesystem
but without free it in lxcContainerMountFSBlockHelper.
this patch just call VIR_FREE to free it.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
With latest changes to qemu-ga success on some commands is not reported
anymore, e.g. guest-shutdown or guest-suspend-*. However, errors are
still being reported. Therefore, we need to find different source of
indication if operation was successful. Events.
Commit 6e769eba made it a runtime error if libvirt was compiled
without yajl support but targets a new enough qemu. But enough
users are hitting this on self-compiled libvirt that it is worth
erroring out at compilation time, rather than an obscure failure
when trying to use the built executable.
* configure.ac: If qemu is requested and -version works, require
yajl when qemu version is new enough.
* src/qemu/qemu_capabilities.c (qemuCapsComputeCmdFlags): Add
comment.
When libpcap is not available, the NWFilter driver provides a
no-op stub for the DHCP snooping initialization. This was
mistakenly returning '-1' instead of '0', so the entire driver
initialization failed
dump-guest-memory is a new dump mechanism, and it can work when the
guest uses host devices. This patch adds a API to use this new
monitor command.
We will always use json mode if qemu's version is >= 0.15, so I
don't implement the API for text mode.
This reverts
commit c16b4c43fc
Author: Daniel P. Berrange <berrange@redhat.com>
Date: Fri May 11 15:09:27 2012 +0100
Avoid LXC pivot root in the root source is still /
This commit broke setup of /dev, because the code which
deals with setting up a private /dev and /dev/pts only
works if you do a pivotroot.
The original intent of avoiding the pivot root was to
try and ensure the new root has a minimumal mount
tree. The better way todo this is to just unmount the
bits we don't want (ie old /proc & /sys subtrees.
So apply the logic from
commit c529b47a75
Author: Daniel P. Berrange <berrange@redhat.com>
Date: Fri May 11 11:35:28 2012 +0100
Trim /proc & /sys subtrees before mounting new instances
to the pivot_root codepath as well
Libvirt updates the configuration of SPICE server only when something
changes. This is unfortunate when the user wants to disconnect a
existing spice session when the connected attribute is already
"disconnect".
This patch modifies the conditions for calling the password updater to
be called when nothing changes, but the connected attribute is already
"disconnect".
While unescaping the commands the commands passed through to the monitor
function qemuMonitorUnescapeArg() initialized lenght of the input string
to strlen()+1 which is fine for alloc but not for iteration of the
string.
This patch fixes the off-by-one error and drops the pointless check for
a single trailing slash that is automaticaly handled by the default
branch of switch.
commit 52d064f42d added
VIR_NETWORK_XML_INACTIVE in order to allow suppressing the
auto-generated list of VFs in network definitions, and a --inactive
flag to virsh net-dumpxml to take advantage of the flag. However, it
missed out on two opportunities:
1) Use INACTIVE to get the current config of the network as it
exists on disk, rather than the currently active config.
2) Add INACTIVE to the flags used for the virsh net-edit command, so
that it won't include the forward-pool interfaces that were
autogenerated, and so that a re-edit of the network prior to
restarting it will show any other edits made since the last restart
of the network. (prior to this patch, if you edited a network a 2nd
time without restarting, all of the previous edits would magically
disappear).
In order to fit with the new #define-based generic edit function in
virsh.c, a new function vshNetworkGetXMLDesc() was added. This
function first tries to call virNetworkGetXMLDesc with the INACTIVE
flag added, then retries without if the first attempt fails (in the
manner expected when the server doesn't support it).
A core use case of the hook scripts is to be able to do things
to a guest's network configuration. It is possible to hook into
the 'start' operation for a QEMU guest which runs just before
the guest is started. The TAP devices will exist at this point,
but the QEMU process will not. It can be desirable to have a
'started' hook too, which runs once QEMU has started.
If libvirtd is restarted it will re-populate firewall rules,
but there is no QEMU hook to trigger for existing domains.
This is solved with a 'reconnect' hook.
Finally, if attaching to an external QEMU process there needs
to be an 'attach' hook script.
This all also applies to the LXC driver
* docs/hooks.html.in: Document new operations
* src/util/hooks.c, src/util/hooks.c: Add 'started', 'reconnect'
and 'attach' operations for QEMU. Add 'prepare', 'started',
'release' and 'reconnect' operations for LXC
* src/lxc/lxc_driver.c: Add hooks for 'prepare', 'started',
'release' and 'reconnect' operations
* src/qemu/qemu_process.c: Add hooks for 'started', 'reconnect'
and 'reconnect' operations
First 'poll' can't return EWOULDBLOCK, and second, we're checking errno
so far away from the poll() call that we've probably already trashed the
original errno value.
In addition to keepalive responses, we also need to send keepalive
requests from client IO loop to properly detect dead connection in case
a libvirt API is called from the main loop, which prevents any timers to
be called.
The previous commit removed the only usage of ``all'' parameter in
virKeepAliveStopInternal, which was actually the only reason for having
virKeepAliveStopInternal. This effectively reverts most of commit
6446a9e20c.
When a libvirt API is called from the main event loop (which seems to be
common in event-based glib apps), the client IO loop would properly
handle keepalive requests sent by a server but will not actually send
them because the main event loop is blocked with the API. This patch
gets rid of response timer and the thread which is processing keepalive
requests is also responsible for queueing responses for delivery.
Add virKeepAliveTimeout and virKeepAliveTrigger APIs that can be used to
set poll timeouts and trigger keepalive timer. virKeepAliveTrigger
checks if it is called to early and does nothing in that case.
The code that needs to be run every keepalive interval of inactivity was
only called from a timer and thus from the main event loop. We will need
to call the code directly from another place.
As non-blocking calls are no longer dropped, we don't really need to
care that much about their fate and wait for the thread with the buck
to process them. If another thread has the buck, we can just push a
non-blocking call to the queue and be done with it.
So far, we were dropping non-blocking calls whenever sending them would
block. In case a client is sending lots of stream calls (which are not
supposed to generate any reply), the assumption that having other calls
in a queue is sufficient to get a reply from the server doesn't work. I
tried to fix this in b1e374a7ac but
failed and reverted that commit.
With this patch, non-blocking calls are never dropped (unless the
connection is being closed) and will always be sent.
Normally, when every call has a thread associated with it, the thread
may get the buck and be in charge of sending all calls until its own
call is done. When we introduced non-blocking calls, we had to add
special handling of new non-blocking calls. This patch uses event loop
to send data if there is no thread to get the buck so that any
non-blocking calls left in the queue are properly sent without having to
handle them specially. It also avoids adding even more cruft to client
IO loop in the following patches.
With this change in, non-blocking calls may see unpredictable delays in
delivery when the client has no event loop registered. However, the only
non-blocking calls we have are keepalives and we already require event
loop for them, which makes this a non-issue until someone introduces new
non-blocking calls.
'make dist' was depending on *protocol-structs files, which are
stored in git but in turn depended on generated files. We still
want to ship the protocol-structs files, but by renaming the
tests to something not matching a file name, we separate 'make
check' (which depends on the generated file) from 'make dist'
(which only depends on the git files). After all, the tarball
should never depend on a generated file not stored in git.
I found one more case of a git file depending on a generated
file, in a bogus virkeycode.c listing; but at least this one
had no associated rules so it never broke 'make dist'.
Reported by Wen Congyang. Latent bug has been present since
commit 62dee6f, but only recently exposed by commit 7bff56a.
* src/Makefile.am ($(srcdir)/util/virkeycode.c): Drop useless
dependency.
(BUILT_SOURCES): ...and build virkeymaps.h sooner.
(PROTOCOL_STRUCTS): Rather than depend on the struct file...
(check-local): ...convert things into a phony target of...
(check-protocol): ...a new check.
($(srcdir)/remote_protocol-struct): Rename to isolate the distributed
file from the conditional test.
(PDWTAGS): Deal with rename. Swap to compare 'expected actual'.
Currently, if qemuProcessStart fail at some point, e.g. because
domain being started wants a PCI/USB device already assigned to
a different domain, we jump to cleanup label where qemuProcessStop
is performed. This unconditionally calls virSecurityManagerRestoreAllLabel
which is wrong because the other domain is still using those devices.
However, once we successfully label all devices/paths in
qemuProcessStart() from that point on, we have to perform a rollback
on failure - that is - we have to virSecurityManagerRestoreAllLabel.
The two APIs are rather trivial; based on bits and pieces of other
existing APIs. It leaves the door open for future extension to
qemu to report snapshots without metadata based on reading qcow2
internal snapshot names.
* src/qemu/qemu_driver.c (qemuDomainSnapshotIsCurrent)
(qemuDomainSnapshotHasMetadata): New functions.
Right now, starting from just a virDomainSnapshotPtr, and wanting to
know if it is the current snapshot for its respective domain, you have
to use virDomainSnapshotGetDomain(), then virDomainSnapshotCurrent(),
then compare the two names returned by virDomainSnapshotGetName().
It is a bit easier if we can directly query this information from the
snapshot itself.
Right now, it is possible to filter a snapshot listing based on
whether snapshots have metadata that would prevent domain deletion,
but the only way to learn if an individual snapshot has metadata is
to see if that snapshot appears in the list returned by a listing.
Additionally, I hope to expand the qemu driver in a future patch to
use qemu-img to reconstruct snapshot XML corresponding to internal
qcow2 snapshot names not otherwise tracked by libvirt (in part, so
that libvirt can guarantee that new snapshots are not created with
a name that would silently corrupt the existing portion of the qcow2
file); if I ever get that in, then it would no longer be an all-or-none
decision on whether snapshots have metadata, and becomes all the more
important to be able to directly determine that information from a
particular snapshot.
Other query functions (such as virDomainIsActive) do not have a flags
argument, but since virDomainHasCurrentSnapshot takes a flags argument,
I figured it was safer to provide a flags argument here as well.
* include/libvirt/libvirt.h.in (virDomainSnapshotIsCurrent)
(virDomainSnapshotHasMetadata): New declarations.
* src/libvirt.c (virDomainSnapshotIsCurrent)
(virDomainSnapshotHasMetadata): New functions.
* src/libvirt_public.syms (LIBVIRT_0.9.13): Export them.
* src/driver.h (virDrvDomainSnapshotIsCurrent)
(virDrvDomainSnapshotHasMetadata): New driver callbacks.
Right now, the only way to get at the contents of a virBuffer is
to destroy it. But there are cases in my upcoming patches where
peeking at the contents makes life easier. I suppose this does
open up the potential for bad code to dereference a stale pointer,
by disregarding the docs that the return value is invalid on the
next virBuf operation, but such is life.
* src/util/buf.h (virBufferCurrentContent): New declaration.
* src/util/buf.c (virBufferCurrentContent): Implement it.
* src/libvirt_private.syms (buf.h): Export it.
* tests/virbuftest.c (testBufAutoIndent): Test it.
My latest patch for RPC rework (a2c304f687) introduced a memory leak.
virNetMessageEncodeHeader() is calling VIR_ALLOC_N(msg->buffer, ...)
despite fact, that msg->buffer isn't VIR_FREE()'d on all paths calling
the function. Therefore, rather than injecting free statement switch to
VIR_REALLOC_N().
when do remount,the source and target should be the same
values specified in the initial mount() call.
So change fs->dst to src.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
virDomainSnapshotPtr has a refcount member, but no one was able
to use it. Furthermore, all of our other vir*Ptr objects have
a *Ref method to match their *Free method. Thankfully, this is
client-side only, so we can use this new function regardless of
how old the server side is! (I have future patches to virsh
that want to use it.)
* include/libvirt/libvirt.h.in (virDomainSnapshotRef): Declare.
* src/libvirt.c (virDomainSnapshotRef): Implement it.
* src/libvirt_public.syms (LIBVIRT_0.9.13): Export it.
When libvirtd forks off a new child, the child then calls virLogReset(),
which ends up closing file descriptors used as log outputs. However, we
recently started logging closed file descriptors, which means we need to
lock logging mutex which was already locked by virLogReset(). We don't
really want to log anything when we are in the process of closing log
outputs.
For pseries guest, spapr-vlan and spapr-vty is based
on spapr-vio address. According to model of network
device, the address type should be assigned automatically.
For serial device, serial pty device is recognized as
spapr-vty device, which is also on spapr-vio.
So this patch is to correct the address type of
spapr-vlan and spapr-vty, and build correct
command line of spapr-vty.
Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
Reviewed-by: Michael Ellerman<michaele@au1.ibm.com>
There is a theoretical problem of an extreme bug where we can get
into deadlock due to command handshaking. Thanks to a pair of pipes,
we have a situation where the parent thinks the child reported an
error and is waiting for a message from the child to explain the
error; but at the same time the child thinks it reported success
and is waiting for the parent to acknowledge the success; so both
processes are now blocked.
Thankfully, I don't think this deadlock is possible without at
least one other bug in the code, but I did see exactly that sort
of situation prior to commit da831af - I saw a backtrace where a
double close bug in the parent caused the parent to read from the
wrong fd and assume the child failed, even though the child really
sent success.
This potential deadlock is not quite like commit 858c247 (a deadlock
due to multiple readers on one pipe preventing a write from completing),
although the solution is similar - always close unused pipe fds before
blocking, rather than after.
* src/util/command.c (virCommandHandshakeWait): Close unused fds
sooner.
When libvirtd is started and there is an unusable/not-connectable
leftover from earlier started machine, it's more reasonable to say
that the machine "crashed" if we know it was started with
"-no-shutdown".
This patch fixes that and also changes the other result (when machine
was started without "-no-shutdown") to "unknown", because the previous
"failed" reason means (according to include/libvirt/libvirt.h.in:174),
that the machine failed to start.
Commit 7bff56a worked in an incremental build, but fails for a
fresh clone; apparently, if make sees both an actual file
spelling and an inference rule, only the exact spelling is used.
CCLD libvirt_driver_test.la
CC libvirt_driver_remote_la-remote_driver.lo
remote/remote_driver.c:4707:34: fatal error: remote_client_bodies.h: No such file or directory
compilation terminated.
BUILT_SOURCES to the rescue, instead of trying to mess with .lo
dependencies directly.
* src/Makefile.am (REMOTE_DRIVER_PREREQS, %remote_driver.lo): Drop...
(BUILT_SOURCES): ...and add here instead.
Commit 1c275e9a accidentally dropped the storage driver from
libvirtd, because it depended on a C preprocessor macro that
was not defined. Furthermore, if you do './configure
--without-storage-dir --with-storage-disk' or any other combination
where you explicitly build a subset of storage backends excluding
the dir backend, then the build is broken.
Based on analysis by Osier Yang.
* configure.ac (WITH_STORAGE): Define top-level conditional.
* src/Makefile.am (mod_LTLIBRARIES): Build driver even when
storage_dir is disabled.
* daemon/libvirtd.c: Pick up storage driver for any backend, not
just dir.
* daemon/Makefile.am (libvirtd_LDADD): Likewise.
Since we are allocating RPC buffer dynamically, we can increase limits
for max. size of RPC message and RPC string. This is needed to cover
some corner cases where libvirt is run on such huge machines that their
capabilities XML is 4 times bigger than our current limit. This leaves
users with inability to even connect.
Currently, we are allocating buffer for RPC messages statically.
This is not such pain when RPC limits are small. However, if we want
ever to increase those limits, we need to allocate buffer dynamically,
based on RPC message len (= the first 4 bytes). Therefore we will
decrease our mem usage in most cases and still be flexible enough in
corner cases.
We had a distributed file (remote_protocol.h, which in turn was
a prereq to remote_driver.c) depending on a generated file
(libvirt_probes.h), which is a no-no for a VPATH build from a
read-only source tree (no wonder 'make distcheck' tests precisely
that situation):
File `libvirt_driver_remote.la' does not exist.
File `libvirt_driver_remote_la-remote_driver.lo' does not exist.
Prerequisite `libvirt_probes.h' is newer than target `../../src/remote/remote_protocol.h'.
Must remake target `../../src/remote/remote_protocol.h'.
Invoking recipe from Makefile:7464 to update target `../../src/remote/remote_protocol.h'.
make[3]: Entering directory `/home/remote/eblake/libvirt-tmp2/build/libvirt-0.9.12/_build/src'
GEN ../../src/remote/remote_protocol.h
cannot create ../../src/remote/remote_protocol.h: Permission denied at ../../src/rpc/genprotocol.pl line 31.
make[3]: *** [../../src/remote/remote_protocol.h] Error 13
Rather than making distributed .c files depend on generated files, we
really want to ensure that compilation into .lo files is not attempted
until the generated files are present, done by this patch. Since there
were two different sets of conditionally generated files that both
feed the .lo file, I had to introduce a new variable REMOTE_DRIVER_PREREQS
to keep automake happy.
After that fix, the next issue was that make treats './foo' and 'foo'
differently in determining whether an implicit %foo rule is applicable,
with the result that locking/qemu-sanlock.conf wasn't properly being
built at the right times. Also, the output for using the .aug test
files was a bit verbose.
After fixing the src directory, the next error is related to the docs
directory, where the tarball is missing a stamp file and thus tries to
regenerate files that are already present:
GEN ../../docs/apibuild.py.stamp
Traceback (most recent call last):
File "../../docs/apibuild.py", line 2511, in <module>
rebuild("libvirt")
File "../../docs/apibuild.py", line 2495, in rebuild
builder.serialize()
File "../../docs/apibuild.py", line 2424, in serialize
output = open(filename, "w")
IOError: [Errno 13] Permission denied: '../../docs/libvirt-api.xml'
make[5]: *** [../../docs/apibuild.py.stamp] Error 1
and fixing that exposed another case of a distributed file (generated
html) depending on a built file (libvirt.h), but only when doing an
in-tree build, because of a file glob.
* src/Makefile.am ($(srcdir)/remote/remote_driver.c): Change...
(libvirt_driver_remote_la-remote_driver.lo): ...to the real
dependency.
($(builddir)/locking/%-sanlock.conf): Drop $(builddir), so that
rule gets run in time for test_libvirt_sanlock.aug.
(test_libvir*.aug): Cater to silent build.
(conf_DATA): Don't ship qemu-sanlock.conf in the tarball, since it
is trivial to regenerate.
* docs/Makefile.am (EXTRA_DIST): Ship our stamp file.
($(APIBUILD_STAMP)): Don't depend on generated file.
I came across a bug that the command line generated for passthrough
of the host parallel port /dev/parport0 by libvirt for QEMU is incorrect.
It currently produces:
-chardev tty,id=charparallel0,path=/dev/parport0
-device isa-parallel,chardev=charparallel0,id=parallel0
The first parameter is "tty". It sould be "parport".
If I launch qemu with -chardev parport,... it works as expected.
I have already filled a bug report (
https://bugzilla.redhat.com/show_bug.cgi?id=823879 ), the topic was
already on the list some months ago:
https://www.redhat.com/archives/libvirt-users/2011-September/msg00095.html
Signed-off-by: Eric Blake <eblake@redhat.com>
It is possible to deadlock libvirt by having a domain with XML
longer than PIPE_BUF, and by writing a hook script that closes
stdin early. This is because libvirt was keeping a copy of the
child's stdin read fd open, which means the write fd in the
parent will never see EPIPE (remember, libvirt should always be
run with SIGPIPE ignored, so we should never get a SIGPIPE signal).
Since there is no error, libvirt blocks waiting for a write to
complete, even though the only reader is also libvirt. The
solution is to ensure that only the child can act as a reader
before the parent does any writes; and then dealing with the
fallout of dealing with EPIPE.
Thankfully, this is not a security hole - since the only way to
trigger the deadlock is to install a custom hook script, anyone
that already has privileges to install a hook script already has
privileges to do any number of other equally disruptive things
to libvirt; it would only be a security hole if an unprivileged
user could install a hook script to DoS a privileged user.
* src/util/command.c (virCommandRun): Close parent's copy of child
read fd earlier.
(virCommandProcessIO): Don't let EPIPE be fatal; the child may
be done parsing input.
* tests/commandhelper.c (main): Set up a SIGPIPE situation.
* tests/commandtest.c (test20): Trigger it.
* tests/commanddata/test20.log: New file.
Although src/util/viratomic.h has been added to the repo, up until now
it hasn't been used. Stefan Berger is using it in his proposed dhcp
snooping patches, and an rpm build with those patches failed due to
viratomic.h not being packed up with the rest of the sources.
EBADF errors are logged as warnings as they normally indicate a double
close bug. This patch also provides VIR_MASS_CLOSE helper to be user in
the only case of mass close after fork when EBADF should rather be
ignored.
With support for multiple IP addresses per interface in place, this patch
now adds support for multiple IP addresses per interface for the DHCP
snooping code.
Testing:
Since the infrastructure I tested this with does not provide multiple IP
addresses per MAC address (anymore), I either had to plug the VM's interface
from the virtual bride connected directly to the infrastructure to virbr0
to get a 2nd IP address from dnsmasq (kill and run dhclient inside the VM)
or changed the lease file (/var/run/libvirt/network/nwfilter.leases) and
restart libvirtd to have a 2nd IP address on an existing interface.
Note that dnsmasq can take a lease timeout parameter as part of the --dhcp-range
command line parameter, so that timeouts can be tested that way
(--dhcp-range 192.168.122.2,192.168.122.254,120). So, terminating and restarting
dnsmasq with that parameter is another choice to watch an IP address disappear
after 120 seconds.
Regards,
Stefan
The goal of this patch is to prepare for support for multiple IP
addresses per interface in the DHCP snooping code.
Move the code for the IP address map that maps interface names to
IP addresses into their own file. Rename the functions on the way
but otherwise leave the code as-is. Initialize this new layer
separately before dependent layers (iplearning, dhcpsnooping)
and shut it down after them.
This patch adds DHCP snooping support to libvirt. The learning method for
IP addresses is specified by setting the "CTRL_IP_LEARNING" variable to one of
"any" [default] (existing IP learning code), "none" (static only addresses)
or "dhcp" (DHCP snooping).
Active leases are saved in a lease file and reloaded on restart or HUP.
The following interface XML activates and uses the DHCP snooping:
<interface type='bridge'>
<source bridge='virbr0'/>
<filterref filter='clean-traffic'>
<parameter name='CTRL_IP_LEARNING' value='dhcp'/>
</filterref>
</interface>
All filters containing the variable 'IP' are automatically adjusted when
the VM receives an IP address via DHCP. However, multiple IP addresses per
interface are silently ignored in this patch, thus only supporting one IP
address per interface. Multiple IP address support is added in a later
patch in this series.
Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Currently, monitoring QEMU virtual machines with standard Unix
sysadmin tools is harder than it has to be. The QEMU command line is
often miles long and mostly redundant, it's hard to tell which process
is which.
This patch reorders the QEMU -name argument to be the first, so it's
immediately visible in "ps x", htop and "atop -c" output.
This patch resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=827519
The problem is that an interface with type='hostdev' will have an
alias of the form "hostdev%d", while the function that looks through
existing netdevs to determine the name to use for a new addition will
fail if there's an existing entry that does not match the form
"net%d".
This is another of the handful of places that need an exception due to
the hybrid nature of <interface type='hostdev'> (which is not exactly
an <interface> or a <hostdev>, but is both at the same time).
If we migrate to fd, spec->fwdType is not MIGRATION_FWD_DIRECT,
we will close spec->dest.fd.local in qemuMigrationRun(). So we
should set spec->dest.fd.local to -1 in qemuMigrationRun().
Bug present since 0.9.5 (commit 326176179).
Wen Congyang reported that we have a double-close bug if we fail
virFDStreamOpenInternal, since childfd duplicated one of the fds[]
array contents. In truth, since we always transfer both members
of fds to other variables, we should close the fds through those
other names, and just use fds[] for pipe().
Bug present since 0.9.0 (commit e886237a).
* src/fdstream.c (virFDStreamOpenFileInternal): Swap scope of
childfd and fds[], to avoid a double close.
KAMEZAWA Hiroyuki reported a nasty double-free bug when virCommand
is used to convert a string into input to a child command. The
problem is that the poll() loop of virCommandProcessIO would close()
the write end of the pipe in order to let the child see EOF, then
the caller virCommandRun() would also close the same fd number, with
the second close possibly nuking an fd opened by some other thread
in the meantime. This in turn can have all sorts of bad effects.
The bug has been present since the introduction of virCommand in
commit f16ad06f.
This is based on his first attempt at a patch, at
https://bugzilla.redhat.com/show_bug.cgi?id=823716
* src/util/command.c (_virCommand): Drop inpipe member.
(virCommandProcessIO): Add argument, to avoid closing caller's fd
without informing caller.
(virCommandRun, virCommandNewArgs): Adjust clients.
Apart from the non-sanlock check build, there is also a little fix for
qemu (EXTRA_DIST had qemu.conf and others inside even if the build was
supposed to be without qemu).
Some of our rules used $(PERL), while others used 'perl'. Always
using the variable allows a developer to point to a different (often
better) perl than the default one found on $PATH.
* daemon/Makefile.am ($(srcdir)/remote_dispatch.h): s/perl/$(PERL).
* src/Makefile.am ($(srcdir)/remote/remote_client_bodies.h)
(PDWTAGS, %protocol.c, %_probes.stp): Likewise.
Without this fix, a VPATH build (such as used by ./autobuild.sh)
fails with messages like:
make[3]: Entering directory `/home/remote/eblake/libvirt-tmp2/build/daemon'
../../build-aux/augeas-gentest.pl libvirtd.conf ../../daemon/test_libvirtd.aug.in test_libvirtd.aug
cannot read libvirtd.conf: No such file or directory at ../../build-aux/augeas-gentest.pl line 38.
Since the test files are not part of the tarball, we can generate
them into the build dir, but rather than create a subdirectory
just for the test file, it is easier to test them directly in
libvirt.git/src.
* daemon/Makefile.am (AUG_GENTEST): Factor out definition.
(test_libvirtd.aug): Look for correct file.
* src/Makefile.am (AUG_GENTEST): Use $(PERL).
(qemu/test_libvirtd_qemu.aug, lxc/test_libvirtd_lxc.aug)
(locking/test_libvirt_sanlock.aug): Rename to avoid subdirectories.
(check-augeas-qemu, check-augeas-lxc, check-augeas-sanlock): Reflect
location of built tests.
* configure.ac (PERL): Substitute perl.
Currently, we are logging only one side of pipes we
create in virCommandRequireHandshake(); This is enough
in cases where pipe2() returns two consecutive FDs. However,
it is not guaranteed and it may return any FDs.
Therefore, it's wise to log the other ends as well.
When getting number of CPUs the host has assigned, there was always
number "1" returned. Even though all lxc domains with no pinning
launched by libvirt run on all pCPUs (by default, no matter what's the
number), we should at least return the same number as the user
specified when creating the domain.
I added libvirt_qemu_probes.h into BUILT_SOURCES. That makes it
generated, but most probably it is not the clearest way how to do
that, but it fixes the build.
The previous patch fixed an incremental build, but missed that on
a fresh checkout, we now have nothing left that stops make from
nuking libvirt_qemu_probes.o.
* src/Makefile.am ($(libvirt_driver_qemu_la_SOURCES)): Delete,
since this variable is empty.
(.PRECIOUS): Add %_probes.o, so they don't get nuked as an
intermediate by-product after creating %_probes.lo.
The moment you specify a _DEPENDENCIES, older automake (stupidly)
assumes that you will specify _all_ dependencies for that target.
This stupidity has been fixed in automake 1.12, but we cannot rely on
newer automake everywhere. For libvirt_la_DEPENDENCIES, we took
care of providing the full list, but for libvirt_qemu_la_DEPENDENCIES,
we were missing the dependency on libvirt_qemu_impl.la, which resulted
in a failed build:
make[3]: Entering directory `/home/ajia/Workspace/libvirt/src'
CCLD libvirt_driver_qemu.la
libtool: link: `libvirt_qemu_probes.lo' is not a valid libtool object
* src/Makefile.am (libvirt_driver_qemu_la_DEPENDENCIES): Delete;
automake does a better job if it does the entire job.
==3240== 23 bytes in 1 blocks are definitely lost in loss record 242 of 744
==3240== at 0x4C2A4CD: malloc (vg_replace_malloc.c:236)
==3240== by 0x8077537: __vasprintf_chk (vasprintf_chk.c:82)
==3240== by 0x509C677: virVasprintf (stdio2.h:199)
==3240== by 0x509C733: virAsprintf (util.c:1912)
==3240== by 0x1906583A: qemudStartup (qemu_driver.c:679)
==3240== by 0x511991D: virStateInitialize (libvirt.c:809)
==3240== by 0x40CD84: daemonRunStateInit (libvirtd.c:751)
==3240== by 0x5098745: virThreadHelper (threads-pthread.c:161)
==3240== by 0x7953D8F: start_thread (pthread_create.c:309)
==3240== by 0x805FF5C: clone (clone.S:115)
To ensure consistent error reporting of invalid arguments,
provide a number of predefined helper methods & macros.
- An arg which must not be NULL:
virCheckNonNullArgReturn(argname, retvalue)
virCheckNonNullArgGoto(argname, label)
- An arg which must be NULL
virCheckNullArgGoto(argname, label)
- An arg which must be positive (ie 1 or greater)
virCheckPositiveArgGoto(argname, label)
- An arg which must not be 0
virCheckNonZeroArgGoto(argname, label)
- An arg which must be zero
virCheckZeroArgGoto(argname, label)
- An arg which must not be negative (ie 0 or greater)
virCheckNonNegativeArgGoto(argname, label)
* src/libvirt.c, src/libvirt-qemu.c,
src/nodeinfo.c, src/datatypes.c: Update to use
virCheckXXXX macros
* po/POTFILES.in: Add libvirt-qemu.c and virterror_internal.h
* src/internal.h: Define macros for checking invalid args
* src/util/virterror_internal.h: Define macros for reporting
invalid args
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Libtool is picky about linking against a module library (aka a .so);
giving lots of warnings like this in the tests directory:
CCLD networkxml2argvtest
*** Warning: Linking the executable networkxml2argvtest against the loadable module
*** libvirt_driver_network.so is not portable!
Fix that by splitting things into a convenience library which can
be used directly by the tests, and making the real .so just wrap
the convenience library.
Based on a suggestion by Daniel P. Berrange.
* configure.ac (--with-driver-modules): Fix help test.
* src/Makefile.am (libvirt_driver_xen.la, libvirt_driver_libxl.la)
(libvirt_driver_qemu.la, libvirt_driver_lxc.la)
(libvirt_driver_uml.la): Factor into new convenience libraries.
* tests/Makefile.am (xen_LDADDS, qemu_LDADDS, lxc_LDADDS)
(networkxml2argvtest_LDADD): Link to convenience libraries, not
shared libraries.
There was no rule forcing libvirt_qemu_probes.o to be built
before libvirt_qemu_probes.lo was used. Also libvirtd was
still referencing the .o file, rather than the .lo file.
Both the .lo and .o file must be listed as DEPENDENCIES,
otherwise libtool will unhelpfully delete the .o file
once the .lo file is created.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The CoTaskMemFree function requires the ole32 DLL to be
linked against. Currently this is only done for the
VirtualBox driver. Also add it to libvirt_util.la
* configure.ac: Unconditionally add ole32 DLL to Win32
* src/Makefile.am: Link old32 to libvirt_util.la
When adding new config file parameters, the corresponding
additions to the augeas lens' are constantly forgotten.
Also there are augeas test cases, these don't catch the
error, since they too are never updated.
To address this, the augeas test cases need to be auto-generated
from the example config files.
* build-aux/augeas-gentest.pl: Helper to generate an
augeas test file, substituting in elements from the
example config files
* src/Makefile.am, daemon/Makefile.am: Switch to
auto-generated augeas test cases
* daemon/test_libvirtd.aug, daemon/test_libvirtd.aug.in,
src/locking/test_libvirt_sanlock.aug,
src/locking/test_libvirt_sanlock.aug.in,
src/lxc/test_libvirtd_lxc.aug,
src/lxc/test_libvirtd_lxc.aug.in,
src/qemu/test_libvirtd_qemu.aug,
src/qemu/test_libvirtd_qemu.aug.in: Remove example
config file data, replacing with a ::CONFIG:: placeholder
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently all the config options are listed under a 'vnc_entry'
group. Create a bunch of new groups & move options to the
right place
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add nmissing 'host_uuid' entry to libvirtd.conf lens and
rename spice_passwd to spice_password in qemu.conf lens
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Instead of doing
# example_config
use
#example_config
so it is possible to programatically uncomment example config
options, as distinct from their comment/descriptions
Also delete rogue trailing comma not allowed by lens
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add an impl of +virGetUserRuntimeDirectory, virGetUserCacheDirectory
virGetUserConfigDirectory and virGetUserDirectory for Win32 platform.
Also create stubs for non-Win32 platforms which lack getpwuid_r()
In adding these two helpers were added virFileIsAbsPath and
virFileSkipRoot, along with some macros VIR_FILE_DIR_SEPARATOR,
VIR_FILE_DIR_SEPARATOR_S, VIR_FILE_IS_DIR_SEPARATOR,
VIR_FILE_PATH_SEPARATOR, VIR_FILE_PATH_SEPARATOR_S
All this code was adapted from GLib2 under terms of LGPLv2+ license.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Remove the uid param from virGetUserConfigDirectory,
virGetUserCacheDirectory, virGetUserRuntimeDirectory,
and virGetUserDirectory
These functions were universally called with the
results of getuid() or geteuid(). To make it practical
to port to Win32, remove the uid parameter and hardcode
geteuid()
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When you try to connect to a socket in the abstract namespace,
the error will be ECONNREFUSED for a non-listening daemon. With
the non-abstract namespace though, you instead get ENOENT. Add
a check for this extra errno when auto-spawning the daemon
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Remove a number of pointless checks against PATH_MAX and
add a syntax-check rule to prevent its use in future
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Libtool supports linking directly against .o files on some platforms
(such as Linux), which happens to be the only place where we are
actually doing that (for the dtrace-generated probes.o files). However,
it raises a big stink about the non-portability, even though we don't
attempt it on platforms where it would actually fail:
CCLD libvirt_driver_qemu.la
*** Warning: Linking the shared library libvirt_driver_qemu.la against
the non-libtool
*** objects libvirt_qemu_probes.o is not portable!
This shuts libtool up by creating a proper .lo file that matches
what libtool normally expects.
* src/Makefile.am (%_probes.lo): New rule.
(libvirt_probes.stp, libvirt_qemu_probes.stp): Simplify into...
(%_probes.stp): ...shorter rule.
(CLEANFILES): Clean new .lo files.
(libvirt_la_BUILT_LIBADD, libvirt_driver_qemu_la_LIBADD)
(libvirt_lxc_LDADD, virt_aa_helper_LDADD): Link against .lo file.
* tests/Makefile.am (PROBES_O, qemu_LDADDS): Likewise.
If vdsm is installed and configured in Fedora 17, we add the following
items into qemu.conf:
spice_tls=1
spice_tls_x509_cert_dir="/etc/pki/vdsm/libvirt-spice"
However, after this changes, augtool cannot identify qemu.conf anymore.
Add a VIR_ERR_DOMAIN_LAST sentinel for virErrorDomain and
replace the virErrorDomainName function by a VIR_ENUM_IMPL
In the process the naming of error domains is sanitized
* src/util/virterror.c: Use VIR_ENUM_IMPL for converting
error domains to strings
* include/libvirt/virterror.h: Add VIR_ERR_DOMAIN_LAST
The libvirt_private.syms file exports virNetlinkEventServiceLocalPid
so there needs to be a no-op stub for Win32 to avoid linker errors
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
* daemon/libvirtd.c: Set custom driver module dir if the current
binary name is 'lt-libvirtd' (indicating execution directly
from GIT checkout)
* src/driver.c, src/driver.h, src/libvirt_driver_modules.syms: Add
virDriverModuleInitialize to allow driver module location to
be changed
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When building as driver modules, it is not possible for the QEMU
driver module to reference the DTrace/SystemTAP probes linked into
the main libvirt.so. Thus we need to move the QEMU probes into a
separate file 'libvirt_qemu_probes.d'. Also rename the existing
file from 'probes.d' to 'libvirt_probes.d' while we're at it
* daemon/Makefile.am, src/internal.h: Include libvirt_probes.h
instead of probes.h
* src/Makefile.am: Add rules for libvirt_qemu_probes.d
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor_json.c,
src/qemu/qemu_monitor_text.c: Include libvirt_qemu_probes.h
* src/libvirt_probes.d: Rename from probes.d
* src/libvirt_qemu_probes.d: QEMU specific probes formerly
in probes.d
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Since we have drivers which depend on each other (ie QEMU/LXC
depend on the network driver APIs), we need to use RTLD_GLOBAL
instead of RTLD_LOCAL. While this pollutes the calling binary
with many more symbols, this is no worse than if we directly
link to the drivers, and this only applies to libvirtd
* src/driver.c: s/RTLD_LOCAL/RTLD_GLOBAL/
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Only libvirt_driver_storage.la links to libblkid currently. If
we are running in a scenario with driver modules, LXC must
directly link to it, since it can't assume the storage driver
is present
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The libvirt_test.la library was introduced to allow test suites
to reference internal-only symbols. These days, nearly every
symbol we care about is in src/libvirt_private.syms, so there
is no need for libvirt_test.la to continue to exist
* src/Makefile.am: Delete libvirt_test.la & add new .syms files
* src/libvirt_private.syms: Export symbols needed by test suite
* tests/Makefile.am: Link to libvirt_test.la. Ensure LXC tests link
to network_driver.la
* src/libvirt_esx.syms, src/libvirt_openvz.syms: Add exports needed
by test suite
libvirt_driver_nodedev.la should not link against either
libvirt_util.la or gnulib.la, since libvirt.so brings
in those deps.
* src/Makefile.am: Fix broken linkage of libvirt_driver_nodedev.la
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The driver modules all use symbols which are defined in libvirt.so.
Thus for loading of modules to work, the binary that libvirt.so
is linked to must export its symbols back to modules. If the
libvirt.so itself is dlopen()d then the RTLD_GLOBAL flag must
be set. Unfortunately few, if any, programming languages use
the RTLD_GLOBAL flag when loading modules :-( This means is it
not practical to use driver modules for any libvirt client side
drivers (OpenVZ, VMWare, Hyper-V, Remote client, test).
This patch changes the build process so only server side drivers
are built as modules (Xen, QEMU, LXC, UML)
* daemon/libvirtd.c: Add missing load of 'interface' driver
* src/Makefile.am: Only build server side drivers as modules
* src/libvirt.c: Don't load any driver modules
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
and use it for virDomainParseMemory. This allows to parse arbitrary
scaled value, not only memory related values as needed for the
filesystem limits code following later in this series.
This reverts commit b1e374a7ac, which was
rather bad since I failed to consider all sides of the issue. The main
things I didn't consider properly are:
- a thread which sends a non-blocking call waits for the thread with
the buck to process the call
- the code doesn't expect non-blocking calls to remain in the queue
unless they were already partially sent
Thus, the reverted patch actually breaks more than what it fixes and
clients (which may even be libvirtd during p2p migrations) will likely
end up in a deadlock.
The pciDevice structure corresponding to the device being hot-unplugged
was freed after it was "stolen" from activeList. The pointer was still
used for eg-inactive list. This patch removes the free of the structure
and frees it only if reset fails on the device.
I'm tired of writing:
bool sep = false;
while (...) {
if (sep)
virBufferAddChar(buf, ',');
sep = true;
virBufferAdd(buf, str);
}
This makes it easier, allowing one to write:
while (...)
virBufferAsprintf(buf, "%s,", str);
virBufferTrim(buf, ",", -1);
to trim any remaining comma.
* src/util/buf.h (virBufferTrim): Declare.
* src/util/buf.c (virBufferTrim): New function.
* tests/virbuftest.c (testBufTrim): Test it.
This patch adds support for a new storage backend with RBD support.
RBD is the RADOS Block Device and is part of the Ceph distributed storage
system.
It comes in two flavours: Qemu-RBD and Kernel RBD, this storage backend only
supports Qemu-RBD, thus limiting the use of this storage driver to Qemu only.
To function this backend relies on librbd and librados being present on the
local system.
The backend also supports Cephx authentication for safe authentication with
the Ceph cluster.
For storing credentials it uses the built-in secret mechanism of libvirt.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
When the last reference to a virConnectPtr is released by
libvirtd, it was possible for a deadlock to occur in the
virDomainEventState functions. The virDomainEventStatePtr
holds a reference on virConnectPtr for each registered
callback. When removing a callback, the virUnrefConnect
function is run. If this causes the last reference on the
virConnectPtr to be released, then virReleaseConnect can
be run, which in turns calls qemudClose. This function has
a call to virDomainEventStateDeregisterConn which is intended
to remove all callbacks associated with the virConnectPtr
instance. This will try to grab a lock on virDomainEventState
but this lock is already held. Deadlock ensues
Thread 1 (Thread 0x7fcbb526a840 (LWP 23185)):
Since each callback associated with a virConnectPtr holds a
reference on virConnectPtr, it is impossible for the qemudClose
method to be invoked while any callbacks are still registered.
Thus the call to virDomainEventStateDeregisterConn must in fact
be a no-op. Thus it is possible to just remove all trace of
virDomainEventStateDeregisterConn and avoid the deadlock.
* src/conf/domain_event.c, src/conf/domain_event.h,
src/libvirt_private.syms: Delete virDomainEventStateDeregisterConn
* src/libxl/libxl_driver.c, src/lxc/lxc_driver.c,
src/qemu/qemu_driver.c, src/uml/uml_driver.c: Remove
calls to virDomainEventStateDeregisterConn
This patch adds support for the recent ipset iptables extension
to libvirt's nwfilter subsystem. Ipset allows to maintain 'sets'
of IP addresses, ports and other packet parameters and allows for
faster lookup (in the order of O(1) vs. O(n)) and rule evaluation
to achieve higher throughput than what can be achieved with
individual iptables rules.
On the command line iptables supports ipset using
iptables ... -m set --match-set <ipset name> <flags> -j ...
where 'ipset name' is the name of a previously created ipset and
flags is a comma-separated list of up to 6 flags. Flags use 'src' and 'dst'
for selecting IP addresses, ports etc. from the source or
destination part of a packet. So a concrete example may look like this:
iptables -A INPUT -m set --match-set test src,src -j ACCEPT
Since ipset management is quite complex, the idea was to leave ipset
management outside of libvirt but still allow users to reference an ipset.
The user would have to make sure the ipset is available once the VM is
started so that the iptables rule(s) referencing the ipset can be created.
Using XML to describe an ipset in an nwfilter rule would then look as
follows:
<rule action='accept' direction='in'>
<all ipset='test' ipsetflags='src,src'/>
</rule>
The two parameters on the command line are also the two distinct XML attributes
'ipset' and 'ipsetflags'.
FYI: Here is the man page for ipset:
https://ipset.netfilter.org/ipset.man.html
Regards,
Stefan
We were being lazy - virnetlink.c was getting uint32_t as a
side-effect from glibc 2.14's <unistd.h>, but older glibc 2.11
does not provide uint32_t from <unistd.h>. In fact, POSIX states
that <unistd.h> need only provide intptr_t, not all of <stdint.h>,
so the bug really is ours. Reported by Jonathan Alescio.
* src/util/virnetlink.h: Include <stdint.h>.
This involves setting the cpuacct cgroup to a per-vcpu granularity,
as well as summing the each vcpu accounting into a common array.
Now that we are reading more than one cgroup file, we double-check
that cpus weren't hot-plugged between reads to invalidate our
summing.
Signed-off-by: Eric Blake <eblake@redhat.com>
If qemuPrepareHostdevUSBDevices fail it will roll back devices added
to the driver list of used devices. However, if it may fail because
the device is being used already. But then again - with roll back.
Therefore don't try to remove a usb device manually if the function
fail. Although, we want to remove the device if any operation
performed afterwards fail.
Allow the logging APIs to be called with a va_list for format
args, instead of requiring var-args usage.
* src/util/logging.h, src/util/logging.c: Add virLogVMessage
The use of readlink() in lxc_container.c is intentional; we don't
want an absolute pathname there.
* src/util/cgroup.h (VIR_CGROUP_SYSFS_MOUNT): Indent properly.
* cfg.mk (exclude_file_name_regexp--sc_prohibit_readlink): Add
exemption.
One of our latest USB device handling patches
05abd1507d introduced a regression.
That is, we first create a temporary list of all USB devices that
are to be used by domain just starting up. Then we iterate over and
check if a device from the list is in the global list of currently
assigned devices (activeUsbHostdevs). If not, we add it there and
continue with next iteration then. But if a device from temporary
list is either taken already or adding to the activeUsbHostdevs fails,
we remove all devices in temp list from the activeUsbHostdevs list.
Therefore, if a device is already taken we remove it from
activeUsbHostdevs even if we should not. Thus, next time we allow
the device to be assigned to another domain.
Most versions of libselinux do not contain the function
selinux_lxc_contexts_path() that the security driver
recently started using for LXC. We must add a conditional
check for it in configure and then disable the LXC security
driver for builds where libselinux lacks this function.
* configure.ac: Check for selinux_lxc_contexts_path
* src/security/security_selinux.c: Disable LXC security
if selinux_lxc_contexts_path() is missing
Normal practice is for cgroups controllers to be mounted at
/sys/fs/cgroup. When setting up a container, /sys is mounted
with a new sysfs instance, thus we must re-mount all the
cgroups controllers. The complexity is that we must mount
them in the same layout as the host OS. ie if 'cpu' and 'cpuacct'
were mounted at the same location in the host we must preserve
this in the container. Also if any controllers are co-located
we must setup symlinks from the individual controller name to
the co-located mount-point
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Both /proc and /sys may have sub-mounts in them from the host
OS. We must explicitly unmount them all before mounting the
new instance over that location. If we don't then /proc/mounts
will show the sub-mounts as existing, even though nothing will
be able to access them, due to the over-mount.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
If the LXC config has a filesystem
<filesystem>
<source dir='/'/>
<target dir='/'/>
</filesystem>
then there is no need to go down the pivot root codepath.
We can simply use the existing root as needed.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently to make sysfs readonly, we remount the existing
instance and then bind it readonly. Unfortunately this means
sysfs is still showing device objects wrt the host OS namespace.
We need it to reflect the container namespace, so we must mount
a completely new instance of it. Do the same for selinuxfs since
there is no benefit to bind mounting & this lets us simplify
the code.
* src/lxc/lxc_container.c: Mount fresh sysfs instance
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Instead of hardcoding use of SELinux contexts in the LXC driver,
switch over to using the official security driver API.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Some security drivers require special options to be passed to
the mount system call. Add a security driver API for handling
this data.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The SELinux policy for LXC uses a different configuration file
than the traditional svirt one. Thus we need to load
/etc/selinux/targeted/contexts/lxc_contexts which contains
something like this:
process = "system_u:system_r:svirt_lxc_net_t:s0"
file = "system_u:object_r:svirt_lxc_file_t:s0"
content = "system_u:object_r:virt_var_lib_t:s0"
cleverly designed to be parsable by virConfPtr
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently the SELinux driver stores its state in a set of global
variables. This switches it to use a private data struct instead.
This will enable different instances to have their own data.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The AppArmour driver does not currently have support for LXC
so ensure that when probing, it claims to be disabled
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
To allow the security drivers to apply different configuration
information per hypervisor, pass the virtualization driver name
into the security manager constructor.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Thanks to this new option we are now able to use modern CPU models (such
as Westmere) defined in external configuration file.
The qemu-1.1{,-device} data files for qemuhelptest are filled in with
qemu-1.1-rc2 output for now. I will update those files with real
qemu-1.1 output once it is released.
The uhci1, uhci2, uhci3 companion controllers for ehci1 must
have a master start port set. Since this value is predictable
we should set it automatically if the app does not supply it
Currently each USB2 companion controller gets put on a separate
PCI slot. Not only is this wasteful of PCI slots, but it is not
in compliance with the spec for USB2 controllers. The master
echi1 and all companion controllers should be in the same slot,
with echi1 in function 7, and uhci1-3 in functions 0-2 respectively.
* src/qemu/qemu_command.c: Special case handling of USB2 controllers
to apply correct pci slot assignment
* tests/qemuxml2argvdata/qemuxml2argv-usb-ich9-ehci-addr.args,
tests/qemuxml2argvdata/qemuxml2argv-usb-ich9-ehci-addr.xml: Expand
test to cover automatic slot assignment
The virDomainDeviceInfoIsSet API was only checking if an
address or alias was set in the struct. Thus if only a
rom bar setting / filename, boot index, or USB master
value was set, they could be accidentally dropped when
formatting XML
Callers of virGetUser{Config,Runtime,Cache}Directory all
append further path component. We should not be
adding a trailing slash in the return path otherwise we
get paths containing '//'
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Sometimes it is useful to see the callpath for log messages.
This change enhances the log filter syntax so that stack traces
can be show by setting '1:+NAME' instead of '1:NAME'.
This results in output like:
2012-05-09 14:18:45.136+0000: 13314: debug : virInitialize:414 : register drivers
/home/berrange/src/virt/libvirt/src/.libs/libvirt.so.0(virInitialize+0xd6)[0x7f89188ebe86]
/home/berrange/src/virt/libvirt/tools/.libs/lt-virsh[0x431921]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x3a21e21735]
/home/berrange/src/virt/libvirt/tools/.libs/lt-virsh[0x40a279]
2012-05-09 14:18:45.136+0000: 13314: debug : virRegisterDriver:775 : driver=0x7f8918d02760 name=Test
/home/berrange/src/virt/libvirt/src/.libs/libvirt.so.0(virRegisterDriver+0x6b)[0x7f89188ec717]
/home/berrange/src/virt/libvirt/src/.libs/libvirt.so.0(+0x11b3ad)[0x7f891891e3ad]
/home/berrange/src/virt/libvirt/src/.libs/libvirt.so.0(virInitialize+0xf3)[0x7f89188ebea3]
/home/berrange/src/virt/libvirt/tools/.libs/lt-virsh[0x431921]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x3a21e21735]
/home/berrange/src/virt/libvirt/tools/.libs/lt-virsh[0x40a279]
* docs/logging.html.in: Document new syntax
* configure.ac: Check for execinfo.h
* src/util/logging.c, src/util/logging.h: Add support for
stack traces
* tests/testutils.c: Adapt to API change
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The current unprivileged user libvirtd sockets are in the abstract
namespace. This has a number of problems
- You can't connect to them remotely using the nc/ssh tunnel
- This is not portable for OS-X, BSD & probably others
- Parent directory permissions don't apply
"Instead of developing one CPU with 12 cores, the Magny Cours is
actually two 6 core “Bulldozer” CPUs combined in to one package"
I.e, each package has two NUMA nodes, and the two numa nodes share
the same core ID set (0-6), which means parsing the cores number
from sysfs doesn't work in this case.
And the wrong CPU number could cause three problems for libvirt:
1) performance lost
A domain without "cpuset" or "placement='auto'" (to drive numad)
specified will be only pinned to part of the CPUs.
2) domain can be started
If a domain uses numad, and the advisory nodeset returned from
numad contains node which exceeds the range of wrong total CPU
number. The domain will fail to start, as the bitmask passed to
sched_setaffinity could be fully filled with zero.
3) wrong CPU number affects lots of stuffs.
E.g. for command "virsh vcpuinfo", "virsh vcpupin", it will always
output with the truncated CPU list.
For more details:
https://www.redhat.com/archives/libvir-list/2012-May/msg00607.html
This patch is to fix the problem by parsing /proc/cpuinfo to get
the value of field "cpu cores", and use it as nodeinfo->cores if
it's greater than the cores number from sysfs.
Like for 'static' placement, when the memory policy mode is
'strict', set the memory policy by writing the advisory nodeset
returned from numad to cgroup file cpuset.mems,
On some of the NUMA platforms, the CPU index in each NUMA node
grows non-consecutive. While on other platforms, it can be inconsecutive,
E.g.
% numactl --hardware
available: 4 nodes (0-3)
node 0 cpus: 0 4 8 12 16 20 24 28
node 0 size: 131058 MB
node 0 free: 86531 MB
node 1 cpus: 1 5 9 13 17 21 25 29
node 1 size: 131072 MB
node 1 free: 127070 MB
node 2 cpus: 2 6 10 14 18 22 26 30
node 2 size: 131072 MB
node 2 free: 127758 MB
node 3 cpus: 3 7 11 15 19 23 27 31
node 3 size: 131072 MB
node 3 free: 127226 MB
node distances:
node 0 1 2 3
0: 10 20 20 20
1: 20 10 20 20
2: 20 20 10 20
3: 20 20 20 10
This patch is to fix the problem by using the CPU index in
caps->host.numaCell[i]->cpus[i] to set the bitmask instead of
assuming the CPU index of the NUMA nodes are always sequential.
For pseries guest, the default controller model is
ibmvscsi controller, this controller only can work
on spapr-vio address.
This patch is to assign spapr-vio address type to
ibmvscsi controller and correct vscsi test case.
Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
We had previously weakened our nodeinfotest in order to ignore parsed
node values, because the parse function was mistakenly relying on
host files. A better fix is to avoid using the numactl library, but
to instead parse the same files that numactl would read, all while
allowing the files to be relative to our choice of directory.
* src/nodeinfo.c (CPU_SYS_PATH, NODE_SYS_PATH): Replace with...
(SYSFS_SYSTEM_PATH): ...parent directory.
(linuxNodeInfoCPUPopulate): Check NUMA nodes from requested
directory (by inlining numactl code).
(nodeGetCPUmap, nodeGetMemoryStats): Adjust macro use.
* tests/nodeinfotest.c (linuxTestCompareFiles, linuxTestNodeInfo):
Update test to match.
We were wasting time to malloc a copy of a constant string, then
copy it into static storage, for every call to nodeGetInfo. At
least we were lucky that it was a constant source, and thus not
subject to even worse issues with one thread clobbering the static
storage while another was using it. This gets rid of the waste,
by passing the string through the stack instead, as well as renaming
internal functions to better match our conventions.
* src/nodeinfo.c (sysfs_path): Delete.
(get_cpu_value, count_thread_siblings, parse_socket): Add
parameter, and rename...
(virNodeGetCpuValue, virNodeCountThreadSiblings)
(virNodeParseSocket): ... into a common namespace.
(cpu_online, parse_core): Inline into callers.
(linuxNodeInfoCPUPopulate): Update caller.
(nodeGetInfo): Drop a useless malloc.
Commit cdce2f42d tried to silence a compiler warning on 32-bit builds,
but the gcc shipped with RHEL 5 is old enough that the type conversion
via multiplication by 1 was insufficient for the task.
* src/qemu/qemu_monitor.c (qemuMonitorBlockJob): Previous attempt
didn't get past all gcc versions.
As defined in:
http://standards.freedesktop.org/basedir-spec/basedir-spec-latest.html
This offers a number of advantages:
* Allows sharing a home directory between different machines, or
sessions (eg. using NFS)
* Cleanly separates cache, runtime (eg. sockets), or app data from
user settings
* Supports performing smart or selective migration of settings
between different OS versions
* Supports reseting settings without breaking things
* Makes it possible to clear cache data to make room when the disk
is filling up
* Allows us to write a robust and efficient backup solution
* Allows an admin flexibility to change where data and settings are stored
* Dramatically reduces the complexity and incoherence of the
system for administrators
Appending an item to a list transfers ownership of that item to the
list owner. But an error can occur in between item allocation and
appending it to the list. In this case the item has to be freed
explicitly. This was not done in some special cases resulting in
possible memory leaks.
Reported by Coverity.
This patch lifts the limit of calling thread detection code only on KVM
guests. With upstream qemu the thread mappings are reported also on
non-KVM machines.
QEMU adopted the thread_id information from the kvm branch.
To remain compatible with older upstream versions of qemu the check is
attempted but the failure to detect threads (or even run the monitor
command - on older versions without SMP support) is treated non-fatal
and the code reports one vCPU with pid of the hypervisor (in same
fashion this was done on non-KVM guests).
After a cpu hotplug the qemu driver did not refresh information about
virtual processors used by qemu and their corresponding threads. This
patch forces a re-detection as is done on start of QEMU.
This ensures that correct information is reported by the
virDomainGetVcpus API and "virsh vcpuinfo".
A failure to obtain the thread<->vcpu mapping is treated non-fatal and
the mapping is not updated in a case of failure as not all versions of
QEMU report this in the info cpus command.
This patch changes a switch statement into ifs when handling live vs.
configuration modifications getting rid of redundant code in case when
both live and persistent configuration gets changed.
when failing to attach another usb device to a domain for some reason
which has one use device attached before, the libvirtd crashed.
The crash is caused by null-pointer dereference error in invoking
usbDeviceListSteal passed in NULL value usb variable.
commit 05abd1507d introduces the bug.
Detected by valgrind. Leaks are introduced in commit 122fa379.
src/conf/storage_conf.c: fix memory leaks.
How to reproduce?
$ make && make -C tests check TESTS=storagepoolxml2xmltest
$ cd tests && valgrind -v --leak-check=full ./storagepoolxml2xmltest
actual result:
==28571== LEAK SUMMARY:
==28571== definitely lost: 40 bytes in 5 blocks
==28571== indirectly lost: 0 bytes in 0 blocks
==28571== possibly lost: 0 bytes in 0 blocks
==28571== still reachable: 1,054 bytes in 21 blocks
==28571== suppressed: 0 bytes in 0 blocks
Signed-off-by: Alex Jia <ajia@redhat.com>
No useful error was being reported when an invalid character device
target type is specified in the domainXML. E.g.
...
<console type="pty">
<source path="/dev/pts/2"/>
<target type="kvm" port="0"/>
</console>
...
resulted in
error: Failed to define domain from x.xml
error: An error occurred, but the cause is unknown
With this small patch, the error is more helpful
error: Failed to define domain from x.xml
error: XML error: unknown target type 'kvm' specified for character device
<vcpu> is not an optional node. The value for its 'placement'
actually always defaults to 'static' in the underlying codes.
(Even no 'cpuset' and 'placement' is specified, the domain
process will be pinned to all the available pCPUs).
Though numad will manage the memory allocation of task dynamically,
it wants management application (libvirt) to pre-set the memory
policy according to the advisory nodeset returned from querying numad,
(just like pre-bind CPU nodeset for domain process), and thus the
performance could benefit much more from it.
This patch introduces new XML tag 'placement', value 'auto' indicates
whether to set the memory policy with the advisory nodeset from numad,
and its value defaults to the value of <vcpu> placement, or 'static'
if 'nodeset' is specified. Example of the new XML tag's usage:
<numatune>
<memory placement='auto' mode='interleave'/>
</numatune>
Just like what current "numatune" does, the 'auto' numa memory policy
setting uses libnuma's API too.
If <vcpu> "placement" is "auto", and <numatune> is not specified
explicitly, a default <numatume> will be added with "placement"
set as "auto", and "mode" set as "strict".
The following XML can now fully drive numad:
1) <vcpu> placement is 'auto', no <numatune> is specified.
<vcpu placement='auto'>10</vcpu>
2) <vcpu> placement is 'auto', no 'placement' is specified for
<numatune>.
<vcpu placement='auto'>10</vcpu>
<numatune>
<memory mode='interleave'/>
</numatune>
And it's also able to control the CPU placement and memory policy
independently. e.g.
1) <vcpu> placement is 'auto', and <numatune> placement is 'static'
<vcpu placement='auto'>10</vcpu>
<numatune>
<memory mode='strict' nodeset='0-10,^7'/>
</numatune>
2) <vcpu> placement is 'static', and <numatune> placement is 'auto'
<vcpu placement='static' cpuset='0-24,^12'>10</vcpu>
<numatune>
<memory mode='interleave' placement='auto'/>
</numatume>
A follow up patch will change the XML formatting codes to always output
'placement' for <vcpu>, even it's 'static'.
It turns out that when cgroups are enabled, the use of a block device
for a snapshot target was failing with EPERM due to libvirt failing
to add the block device to the cgroup whitelist. See also
https://bugzilla.redhat.com/show_bug.cgi?id=810200
* src/qemu/qemu_driver.c
(qemuDomainSnapshotCreateSingleDiskActive)
(qemuDomainSnapshotUndoSingleDiskActive): Account for cgroup.
(qemuDomainSnapshotCreateDiskActive): Update caller.
qemu's behavior in this case is to change the spice server behavior to
require secure connection to any channel not otherwise specified as
being in plaintext mode. libvirt doesn't currently allow requesting this
(via plaintext-channel=<channel name>).
RHBZ: 819499
Signed-off-by: Alon Levy <alevy@redhat.com>
Until now, the nl_pid of the source address of every message sent by
virNetlinkCommand has been set to the value of getpid(). Most of the
time this doesn't matter, and in the one case where it does
(communication with lldpad), it previously was the proper thing to do,
because the netlink event service (which listens on a netlink socket
for unsolicited messages from lldpad) coincidentally always happened
to bind with a local nl_pid == getpid().
With the fix for:
https://bugzilla.redhat.com/show_bug.cgi?id=816465
that particular nl_pid is now effectively a reserved value, so the
netlink event service will always bind to something else
(coincidentally "getpid() + (1 << 22)", but it really could be
anything). The result is that communication between lldpad and
libvirtd is broken (lldpad gets a "disconnected" error when it tries
to send a directed message).
The solution to this problem caused by a solution, is to query the
netlink event service's nlhandle for its "local_port", and send that
as the source nl_pid (but only when sending to lldpad, of course - in
other cases we maintain the old behavior of sending getpid()).
There are two cases where a message is being directed at lldpad - one
in virNetDevLinkDump, and one in virNetDevVPortProfileOpSetLink.
The case of virNetDevVPortProfileOpSetLink is simplest to explain -
only if !nltarget_kernel, i.e. the message isn't targetted for the
kernel, is the dst_pid set (by calling
virNetDevVPortProfileGetLldpadPid()), so only in that case do we call
virNetlinkEventServiceLocalPid() to set src_pid.
For virNetDevLinkDump, it's a bit more complicated. The call to
virNetDevVPortProfileGetLldpadPid() was effectively up one level (in
virNetDevVPortProfileOpCommon), although obscured by an unnecessary
passing of a function pointer. This patch removes the function
pointer, and calls virNetDevVPortProfileGetLldpadPid() directly in
virNetDevVPortProfileOpCommon - if it's doing this, it knows that it
should also call virNetlinkEventServiceLocalPid() to set src_pid too;
then it just passes src_pid and dst_pid down to
virNetDevLinkDump. Since (src_pid == 0 && dst_pid == 0) implies that
the kernel is the destination, there is no longer any need to send
nltarget_kernel as an arg to virNetDevLinkDump, so it's been removed.
The disparity between src_pid being int and dst_pid being uint32_t may
be a bit disconcerting to some, but I didn't want to complicate
virNetlinkEventServiceLocalPid() by having status returned separately
from the value.
This value will be needed to set the src_pid when sending netlink
messages to lldpad. It is part of the solution to:
https://bugzilla.redhat.com/show_bug.cgi?id=816465
Note that libnl's port generation algorithm guarantees that the
nl_socket_get_local_port() will always be > 0 (since it is "getpid() +
(n << 22>" where n is always < 1024), so it is okay to cast the
uint32_t to int (thus allowing us to use -1 as an error sentinel).
Until now, virNetlinkCommand has assumed that the nl_pid in the source
address of outgoing netlink messages should always be the return value
of getpid(). In most cases it actually doesn't matter, but in the case
of communication with lldpad, lldpad saves this info and later uses it
to send netlink messages back to libvirt. A recent patch to fix Bug
816465 changed the order of the universe such that the netlink event
service socket is no longer bound with nl_pid == getpid(), so lldpad
could no longer send unsolicited messages to libvirtd. Adding src_pid
as an argument to virNetlinkCommand() is the first step in notifying
lldpad of the proper address of the netlink event service socket.
src/qemu/qemu_hostdev.c:
refactor qemuPrepareHostdevUSBDevices function, make it focus on
adding usb device to activeUsbHostdevs after check. After that,
the usb hotplug function qemuDomainAttachHostDevice also could use
it.
expand qemuPrepareHostUSBDevices to perform the usb search,
rollback on failure.
src/qemu/qemu_hotplug.c:
If there are multiple usb devices available with same vendorID and productID,
but with different value of "bus, device", we give an error to let user
use <address> to specify the desired one.
usbFindDevice():get usb device according to
idVendor, idProduct, bus, device
it is the exact match of the four parameters
usbFindDeviceByBus():get usb device according to bus, device
it returns only one usb device same as usbFindDevice
usbFindDeviceByVendor():get usb device according to idVendor,idProduct
it probably returns multiple usb devices.
usbDeviceSearch(): a helper function to do the actual search
When we added the default USB controller into domain XML, we efficiently
broke migration to older versions of libvirt that didn't support USB
controllers at all (0.9.4 and earlier) even for domains that don't use
anything that the older libvirt can't provide. We still want to present
the default USB controller in any XML seen by a user/app but we can
safely remove it from the domain XML used during migration. If we are
migrating to a new enough libvirt, it will add the controller XML back,
while older libvirt won't be confused with it although it will still
tell qemu to create the controller.
Similar approach can be used in the future whenever we find out we
always enabled some kind of device without properly advertising it in
domain XML.
Commit 4c82f09e added a capability check for qemu per-device io
throttling, but only applied it to domain startup. As mentioned
in the previous commit (98cec05), the user can still get an 'internal
error' message during a hotplug attempt, when the monitor command
doesn't exist. It is confusing to allow tuning on inactive domains
only to then be rejected when starting the domain.
* src/qemu/qemu_driver.c (qemuDomainSetBlockIoTune): Reject
offline tuning if online can't match it.
If you have a qemu build that lacks the blockio tune monitor command,
then this command:
$ virsh blkdeviotune rhel6u2 hda --total_bytes_sec 1000
error: Unable to change block I/O throttle
error: internal error Unexpected error
fails as expected (well, the error message is lousy), but the next
dumpxml shows that the domain was modified anyway. Worse, that means
if you save the domain then restore it, the restore will likely fail
due to throttling being unsupported, even though no throttling should
even be active because the monitor command failed in the first place.
* src/qemu/qemu_driver.c (qemuDomainSetBlockIoTune): Check for
error before making modification permanent.
These two functions are called from main() on all platforms, and
always return success on platforms that don't support libnl. They
still log an error message, though, which doesn't make sense - they
should just be NOPs on those platforms. (Per a suggestion during
review, I've turned the logs into debug messages rather than removing
them completely).
Error: STRING_NULL:
/libvirt/src/node_device/node_device_linux_sysfs.c:80:
string_null_argument: Function "saferead" does not terminate string "*buf".
/libvirt/src/util/util.c:101:
string_null_argument: Function "read" fills array "*buf" with a non-terminated string.
/libvirt/src/node_device/node_device_linux_sysfs.c:87:
string_null: Passing unterminated string "buf" to a function expecting a null-terminated string.
Error: STRING_NULL:
/libvirt/src/util/uuid.c:273:
string_null_argument: Function "getDMISystemUUID" does not terminate string "*dmiuuid".
/libvirt/src/util/uuid.c:241:
string_null_argument: Function "saferead" fills array "*uuid" with a non-terminated string.
/libvirt/src/util/util.c:101:
string_null_argument: Function "read" fills array "*buf" with a non-terminated string.
/libvirt/src/util/uuid.c:274:
string_null: Passing unterminated string "dmiuuid" to a function expecting a null-terminated string.
/libvirt/src/util/uuid.c:138:
var_assign_parm: Assigning: "cur" = "uuidstr". They now point to the same thing.
/libvirt/src/util/uuid.c:164:
string_null_sink_loop: Searching for null termination in an unterminated array "cur".
Error: RESOURCE_LEAK:
/libvirt/src/qemu/qemu_driver.c:6968:
alloc_fn: Calling allocation function "calloc".
/libvirt/src/qemu/qemu_driver.c:6968:
var_assign: Assigning: "nodeset" = storage returned from "calloc(1UL, 1UL)".
/libvirt/src/qemu/qemu_driver.c:6977:
noescape: Variable "nodeset" is not freed or pointed-to in function "virTypedParameterAssign".
/libvirt/src/qemu/qemu_driver.c:6997:
leaked_storage: Variable "nodeset" going out of scope leaks the storage it points to.
Error: RESOURCE_LEAK:
/libvirt/src/vmx/vmx.c:2431:
alloc_fn: Calling allocation function "calloc".
/libvirt/src/vmx/vmx.c:2431:
var_assign: Assigning: "networkName" = storage returned from "calloc(1UL, 1UL)".
/libvirt/src/vmx/vmx.c:2495:
leaked_storage: Variable "networkName" going out of scope leaks the storage it points to.
Error: RESOURCE_LEAK:
/builddir/build/BUILD/libvirt-0.9.10/src/nodeinfo.c:629: alloc_fn: Calling allocation function "fopen".
/builddir/build/BUILD/libvirt-0.9.10/src/nodeinfo.c:629: var_assign: Assigning: "cpuinfo" = storage returned from "fopen("/proc/cpuinfo", "r")".
/builddir/build/BUILD/libvirt-0.9.10/src/nodeinfo.c:638: leaked_storage: Variable "cpuinfo" going out of scope leaks the storage it points to.
Error: RESOURCE_LEAK:
/builddir/build/BUILD/libvirt-0.9.10/src/test/test_driver.c:1041: alloc_arg: Calling allocation function "virXPathNodeSet" on "devs".
/builddir/build/BUILD/libvirt-0.9.10/src/util/xml.c:621: alloc_arg: "virAllocN" allocates memory that is stored into "*list".
/builddir/build/BUILD/libvirt-0.9.10/src/util/memory.c:129: alloc_fn: Storage is returned from allocation function "calloc".
/builddir/build/BUILD/libvirt-0.9.10/src/util/memory.c:129: var_assign: Assigning: "*((void **)ptrptr)" = "calloc(count, size)".
/builddir/build/BUILD/libvirt-0.9.10/src/util/xml.c:625: noescape: Variable "*list" is not freed or pointed-to in function "memcpy".
/builddir/build/BUILD/libvirt-0.9.10/src/test/test_driver.c:1098: leaked_storage: Variable "devs" going out of scope leaks the storage it points to.
Coverity logs:
Error: RESOURCE_LEAK:
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xen_inotify.c:103: alloc_fn: Calling allocation function "xenDaemonLookupByUUID".
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xend_internal.c:2534: alloc_fn: Storage is returned from allocation function "virGetDomain".
/builddir/build/BUILD/libvirt-0.9.10/src/datatypes.c:191: alloc_arg: "virAlloc" allocates memory that is stored into "ret".
/builddir/build/BUILD/libvirt-0.9.10/src/util/memory.c:101: alloc_fn: Storage is returned from allocation function "calloc".
/builddir/build/BUILD/libvirt-0.9.10/src/util/memory.c:101: var_assign: Assigning: "*((void **)ptrptr)" = "calloc(1UL, size)".
/builddir/build/BUILD/libvirt-0.9.10/src/datatypes.c:210: return_alloc: Returning allocated memory "ret".
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xend_internal.c:2534: var_assign: Assigning: "ret" = "virGetDomain(conn, name, uuid)".
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xend_internal.c:2541: return_alloc: Returning allocated memory "ret".
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xen_inotify.c:103: var_assign: Assigning: "dom" = storage returned from "xenDaemonLookupByUUID(conn, rawuuid)".
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xen_inotify.c:126: leaked_storage: Variable "dom" going out of scope leaks the storage it points to.
Error: RESOURCE_LEAK:
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xen_hypervisor.c:2742: alloc_fn: Calling allocation function "fopen".
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xen_hypervisor.c:2742: var_assign: Assigning: "cpuinfo" = storage returned from "fopen("/proc/cpuinfo", "r")".
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xen_hypervisor.c:2763: noescape: Variable "cpuinfo" is not freed or pointed-to in function "xenHypervisorMakeCapabilitiesInternal".
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xen_hypervisor.c:2574:45: noescape: "xenHypervisorMakeCapabilitiesInternal" does not free or save its pointer parameter "cpuinfo".
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xen_hypervisor.c:2768: leaked_storage: Variable "cpuinfo" going out of scope leaks the storage it points to.
Error: RESOURCE_LEAK:
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xen_hypervisor.c:2752: alloc_fn: Calling allocation function "fopen".
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xen_hypervisor.c:2752: var_assign: Assigning: "capabilities" = storage returned from "fopen("/sys/hypervisor/properties/capabilities", "r")".
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xen_hypervisor.c:2763: noescape: Variable "capabilities" is not freed or pointed-to in function "xenHypervisorMakeCapabilitiesInternal".
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xen_hypervisor.c:2574:60: noescape: "xenHypervisorMakeCapabilitiesInternal" does not free or save its pointer parameter "capabilities".
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xen_hypervisor.c:2768: leaked_storage: Variable "capabilities" going out of scope leaks the storage it points to.
Coverity logs:
Error: RESOURCE_LEAK:
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:523: alloc_fn: Calling allocation function "fopen".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:523: var_assign: Assigning: "fd" = storage returned from "fopen(local_file, "rb")".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:540: noescape: Variable "fd" is not freed or pointed-to in function "fread".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:542: noescape: Variable "fd" is not freed or pointed-to in function "feof".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:575: leaked_storage: Variable "fd" going out of scope leaks the storage it points to.
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:585: leaked_storage: Variable "fd" going out of scope leaks the storage it points to.
Error: RESOURCE_LEAK:
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2088: alloc_fn: Calling allocation function "phypVolumeLookupByName".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2026: alloc_fn: Storage is returned from allocation function "virGetStorageVol".
/builddir/build/BUILD/libvirt-0.9.10/src/datatypes.c:724: alloc_arg: "virAlloc" allocates memory that is stored into "ret".
/builddir/build/BUILD/libvirt-0.9.10/src/util/memory.c:101: alloc_fn: Storage is returned from allocation function "calloc".
/builddir/build/BUILD/libvirt-0.9.10/src/util/memory.c:101: var_assign: Assigning: "*((void **)ptrptr)" = "calloc(1UL, size)".
/builddir/build/BUILD/libvirt-0.9.10/src/datatypes.c:753: return_alloc: Returning allocated memory "ret".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2026: var_assign: Assigning: "vol" = "virGetStorageVol(pool->conn, pool->name, volname, key)".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2030: return_alloc: Returning allocated memory "vol".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2088: leaked_storage: Failing to save storage allocated by "phypVolumeLookupByName(pool, voldef->name)" leaks it.
Error: RESOURCE_LEAK:
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2725: alloc_fn: Calling allocation function "phypGetStoragePoolLookUpByUUID".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2689: alloc_fn: Storage is returned from allocation function "virGetStoragePool".
/builddir/build/BUILD/libvirt-0.9.10/src/datatypes.c:592: alloc_arg: "virAlloc" allocates memory that is stored into "ret".
/builddir/build/BUILD/libvirt-0.9.10/src/util/memory.c:101: alloc_fn: Storage is returned from allocation function "calloc".
/builddir/build/BUILD/libvirt-0.9.10/src/util/memory.c:101: var_assign: Assigning: "*((void **)ptrptr)" = "calloc(1UL, size)".
/builddir/build/BUILD/libvirt-0.9.10/src/datatypes.c:610: return_alloc: Returning allocated memory "ret".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2689: var_assign: Assigning: "sp" = "virGetStoragePool(conn, pools[i], uuid)".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2694: return_alloc: Returning allocated memory "sp".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2725: leaked_storage: Failing to save storage allocated by "phypGetStoragePoolLookUpByUUID(conn, def->uuid)" leaks it.
Error: RESOURCE_LEAK:
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2719: alloc_fn: Calling allocation function "phypStoragePoolLookupByName".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2254: alloc_fn: Storage is returned from allocation function "virGetStoragePool".
/builddir/build/BUILD/libvirt-0.9.10/src/datatypes.c:592: alloc_arg: "virAlloc" allocates memory that is stored into "ret".
/builddir/build/BUILD/libvirt-0.9.10/src/util/memory.c:101: alloc_fn: Storage is returned from allocation function "calloc".
/builddir/build/BUILD/libvirt-0.9.10/src/util/memory.c:101: var_assign: Assigning: "*((void **)ptrptr)" = "calloc(1UL, size)".
/builddir/build/BUILD/libvirt-0.9.10/src/datatypes.c:610: return_alloc: Returning allocated memory "ret".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2254: return_alloc_fn: Directly returning storage allocated by "virGetStoragePool".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2719: leaked_storage: Failing to save storage allocated by "phypStoragePoolLookupByName(conn, def->name)" leaks it.
Error: RESOURCE_LEAK:
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2270: alloc_fn: Calling allocation function "phypStoragePoolLookupByName".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2254: alloc_fn: Storage is returned from allocation function "virGetStoragePool".
/builddir/build/BUILD/libvirt-0.9.10/src/datatypes.c:592: alloc_arg: "virAlloc" allocates memory that is stored into "ret".
/builddir/build/BUILD/libvirt-0.9.10/src/util/memory.c:101: alloc_fn: Storage is returned from allocation function "calloc".
/builddir/build/BUILD/libvirt-0.9.10/src/util/memory.c:101: var_assign: Assigning: "*((void **)ptrptr)" = "calloc(1UL, size)".
/builddir/build/BUILD/libvirt-0.9.10/src/datatypes.c:610: return_alloc: Returning allocated memory "ret".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2254: return_alloc_fn: Directly returning storage allocated by "virGetStoragePool".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2270: var_assign: Assigning: "sp" = storage returned from "phypStoragePoolLookupByName(vol->conn, vol->pool)".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2324: leaked_storage: Variable "sp" going out of scope leaks the storage it points to.
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2327: leaked_storage: Variable "sp" going out of scope leaks the storage it points t
On 32-bit platforms, gcc warns that the comparison between a long
and (ULLONG_MAX/1024/1024) is always false; throwing in a type
conversion shuts up the warning.
* src/qemu/qemu_monitor.c (qemuMonitorBlockJob): Shut gcc up.
configure.ac: check for libnl-3 in addition to libnl-1
src/Makefile.am: link against libnl when needed
src/util/virnetlink.c:
support libnl3 api. To minimize impact on code flow, wrap the
differences under the virNetlink* namespace.
Unfortunately libnl3 moves netlink/msg.h to
/usr/include/libnl3/netlink/msg.h, so the LIBNL_CFLAGS need to be added
to a bunch of places where they weren't needed with libnl1.
Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Add function virJSONValueObjectKeysNumber, virJSONValueObjectGetKey
and virJSONValueObjectGetValue, which allow you to iterate over all
fields of json object: you can get number of fields and then get
name and value, stored in field with that name by index.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
The ATTRIBUTE_NONNULL(m) macro normally resolves to the gcc builtin
__attribute__((__nonnull__(m))). The effect of this in gcc is
unfortunately only to make gcc believe that "m" can never possibly be
NULL, *not* to add in any checks to guarantee that it isn't ever NULL
(i.e. it is an optimization aid, *not* something to verify code
correctness.) - see the following gcc bug report for more details:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17308
Static source analyzers such as clang and coverity apparently can use
ATTRIBUTE_NONNULL(), though, to detect dead code (in the case that the
arg really is guaranteed non-NULL), as well as situations where an
obviously NULL arg is given to the function.
https://bugzilla.redhat.com/show_bug.cgi?id=815270 is a good example
of a bug caused by erroneous application of ATTRIBUTE_NONNULL().
Several people spent a long time staring at this code and not finding
the problem, because the problem wasn't in the function itself, but in
the prototype that specified ATTRIBUTE_NONNULL() for an arg that
actually *wasn't* always non-NULL, and caused a segv when dereferenced
(even though the code that dereferenced the pointer was inside an if()
that checked for a NULL pointer, that code was optimized out by gcc).
There may be some very small gain to be had from the optimizations
that can be inferred from ATTRIBUTE_NONNULL(), but it seems safer to
err on the side of generating code that behaves as expected, while
turning on the attribute for static analyzers.
Once lxcContainerSetStdio is invoked, logging will not work as
expected in libvirt_lxc. So make sure this is the last thing to
be called, in particular after setting the security process label
The virLogSetFromEnv call was done too late in startup to
catch many log messages (eg from security driver initialization).
To assist debugging also explicitly log the security details
at startup
The driver->securityDriverName field may be NULL, if automatic
probing is used to determine security driver. This meant that
unless selinux was explicitly requested in lxc.conf, it was
not being sent to the libvirt_lxc process.
The driver->securityManager field is guaranteed non-NULL, since
there will always be the 'none' security driver present if
nothing else exists. So use that to set the driver name for
libvirt_lxc
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently the libvirt_lxc process uses VIR_DOMAIN_XML_INACTIVE
when loading the XML for the container. This means it loses
any dynamic data such as the, just allocated, SELinux label.
Further there is an inconsistency in the libvirt LXC driver
whereby it saves the live config XML and then later overwrites
the file with the live status XML instead. Add a comment about
this for future reference.
* src/lxc/lxc_controller.c: Remove VIR_DOMAIN_XML_INACTIVE
when loading XML
* src/lxc/lxc_driver.c: Add comment about inconsistent
config file formats
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This works with newer qemu that doesn't allow escaping spaces.
It's backwards compatible as well.
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
In fact, the 'tapfd' is always NULL, the function 'virNetDevTapCreate()' hasn't
assign 'fd' to 'tapfd', when the function 'virNetDevSetMAC()' is failed then
goto 'error' label, finally, the VIR_FORCE_CLOSE() will deref a NULL 'tapfd'.
* util/virnetdevtap.c (virNetDevTapCreateInBridgePort): fix a NULL pointer derefing.
* How to reproduce?
$ cat > /tmp/net.xml <<EOF
<network>
<name>test</name>
<forward mode='nat'/>
<bridge name='br1' stp='off' delay='1' />
<mac address='00:00:00:00:00:00'/>
<ip address='192.168.100.1' netmask='255.255.255.0'>
<dhcp>
<range start='192.168.100.2' end='192.168.100.254' />
</dhcp>
</ip>
</network>
EOF
$ virsh net-define /tmp/net.xml
$ virsh net-start test
error: Failed to start network brTest
error: End of file while reading data: Input/output error
Signed-off-by: Alex Jia <ajia@redhat.com>
The previous storage patch missed an instance affected by the struct
member rename. It also had some botched whitespace detected by
'make check'.
* src/storage/storage_backend_iscsi.c
(virStorageBackendISCSIFindPoolSources): Adjust to new struct.
* src/conf/storage_conf.c (virStoragePoolSourceFormat): Fix
indentation.
It doesn't break out the "for" loop even if duplicate pool is
found, and thus the "matchpool" could be overriden as NULL again
if there is different pool afterwards.
To address the problem in libvirt-user list:
https://www.redhat.com/archives/libvirt-users/2012-April/msg00150.html
The current storage pools for NFS and iSCSI only require one host to
connect to. Future storage pools like RBD and Sheepdog will require
multiple hosts.
This patch allows multiple source hosts and rewrites the current
storage drivers.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
When libvirtd is started, we create "libvirt/qemu" directories under
hugetlbfs mount point. Only the "qemu" subdirectory is chowned to qemu
user and "libvirt" remains owned by root. If umask was too restrictive
when libvirtd started, qemu user may lose access to "qemu"
subdirectory. Let's explicitly grant search permissions to "libvirt"
directory for all users.
Currently, qemu GA is not providing 'desc' field for errors like
we are used to from qemu monitor. Therefore, we fall back to this
general 'unknown error' string. However, GA is reporting 'class' which
is not perfect, but much more helpful than generic error string.
Thus we should fall back to class firstly and if even no class
is presented, then we can fall back to that generic string.
Before this patch:
virsh # dompmsuspend --target mem f16
error: Domain f16 could not be suspended
error: internal error unable to execute QEMU command
'guest-suspend-ram': unknown QEMU command error
After this patch:
virsh # dompmsuspend --target mem f16
error: Domain f16 could not be suspended
error: internal error unable to execute QEMU command
'guest-suspend-ram': The command has not been found
More bug extermination in the category of:
Error: CHECKED_RETURN:
/libvirt/src/conf/network_conf.c:595:
check_return: Calling function "virAsprintf" without checking return value (as is done elsewhere 515 out of 543 times).
/libvirt/src/qemu/qemu_process.c:2780:
unchecked_value: No check of the return value of "virAsprintf(&msg, "was paused (%s)", virDomainPausedReasonTypeToString(reason))".
/libvirt/tests/commandtest.c:809:
check_return: Calling function "setsid" without checking return value (as is done elsewhere 4 out of 5 times).
/libvirt/tests/commandtest.c:830:
unchecked_value: No check of the return value of "virTestGetDebug()".
/libvirt/tests/commandtest.c:831:
check_return: Calling function "virTestGetVerbose" without checking return value (as is done elsewhere 41 out of 42 times).
/libvirt/tests/commandtest.c:833:
check_return: Calling function "virInitialize" without checking return value (as is done elsewhere 18 out of 21 times).
One note about the error in commandtest line 809: setsid() seems to fail when running the test -- could be removed ?
With RHEL 6.2, virDomainBlockPull(dom, dev, bandwidth, 0) has a race
with non-zero bandwidth: there is a window between the block_stream
and block_job_set_speed monitor commands where an unlimited amount
of data was let through, defeating the point of a throttle.
This race was first identified in commit a9d3495e, and libvirt was
able to reduce the size of the window for that race. In the meantime,
the qemu developers decided to fix things properly; per this message:
https://lists.gnu.org/archive/html/qemu-devel/2012-04/msg03793.html
the fix will be in qemu 1.1, and changes block-job-set-speed to use
a different parameter name, as well as adding a new optional parameter
to block-stream, which eliminates the race altogether.
Since our documentation already mentioned that we can refuse a non-zero
bandwidth for some hypervisors, I think the best solution is to do
just that for RHEL 6.2 qemu, so that the race is obvious to the user
(anyone using stock RHEL 6.2 binaries won't have this patch, and anyone
building their own libvirt with this patch for RHEL can also rebuild
qemu to get the modern semantics, so it is no real loss in behavior).
Meanwhile the code must be fixed to honor actual qemu 1.1 naming.
Rename the parameter to 'modern', since the naming difference now
covers more than just 'async' block-job-cancel. And while at it,
fix an unchecked integer overflow.
* src/qemu/qemu_monitor.h (enum BLOCK_JOB_CMD): Drop unused value,
rename enum to match conventions.
* src/qemu/qemu_monitor.c (qemuMonitorBlockJob): Reflect enum rename.
* src/qemu_qemu_monitor_json.h (qemuMonitorJSONBlockJob): Likewise.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONBlockJob): Likewise,
and support difference between RHEL 6.2 and qemu 1.1 block pull.
* src/qemu/qemu_driver.c (qemuDomainBlockJobImpl): Reject
bandwidth during pull with too-old qemu.
* src/libvirt.c (virDomainBlockPull, virDomainBlockRebase):
Document this.
Error: UNINIT:
/libvirt/src/lxc/lxc_driver.c:1412:
var_decl: Declaring variable "fd" without initializer.
/libvirt/src/lxc/lxc_driver.c:1460:
uninit_use_in_call: Using uninitialized value "fd" when calling "virFileClose".
/libvirt/src/util/virfile.c:50:
read_parm: Reading a parameter value.
Error: DEADCODE:
/libvirt/src/lxc/lxc_controller.c:960:
dead_error_condition: On this path, the condition "ret == 4" cannot be true.
/libvirt/src/lxc/lxc_controller.c:959:
at_most: After this line, the value of "ret" is at most -1.
/libvirt/src/lxc/lxc_controller.c:959:
new_values: Noticing condition "ret < 0".
/libvirt/src/lxc/lxc_controller.c:961:
dead_error_line: Execution cannot reach this statement "continue;".
Error: UNINIT:
/libvirt/src/lxc/lxc_controller.c:1104:
var_decl: Declaring variable "consoles" without initializer.
/libvirt/src/lxc/lxc_controller.c:1237:
uninit_use: Using uninitialized value "consoles".
QEMU binary is called several times when we probe different kinds of
capabilities the binary supports. This patch introduces new common
helper so that all probes use a consistent way of invoking qemu.
https://bugzilla.redhat.com/show_bug.cgi?id=816662 pointed out
that attempting 'virsh blockpull' on an offline domain gave a
misleading error message about qemu lacking support for the
operation, even when qemu was specifically updated to support it.
The real problem is that we have several capabilities that are
only determined when starting a domain, and therefore are still
clear when first working with an inactive domain (namely, any
capability set by qemuMonitorJSONCheckCommands).
While this patch was able to hoist an existing check in one of the
three culprits, it had to add redundant checks in the other two
places (because you always have to check for an active domain after
obtaining a VM job lock, but the capability bits were being checked
prior to obtaining the job lock).
Someday it would be nice to patch libvirt to cache the set of
capabilities per qemu binary (as determined by inode and timestamp),
rather than re-probing the binary every time a domain is started,
and to teach the cache how to query the monitor during the one
time the probe is made rather than having to wait until a guest
is started; then, a capability probe would succeed even for offline
guests because it just refers to the cache, and the single check for
an active domain after grabbing the job lock would be sufficient.
But since that will involve a lot more coding, I'm happy to go
with this simpler solution for an immediate solution.
* src/qemu/qemu_driver.c (qemuDomainPMSuspendForDuration)
(qemuDomainSnapshotCreateXML, qemuDomainBlockJobImpl): Check for
offline state before checking an online-only cap.
Below patch fixes the following coverity findings
Error: OVERRUN_STATIC:
/libvirt/src/qemu/qemu_command.c:152:
overrun-buffer-val: Overrunning static array "net->mac" of size 6 bytes by passing it as an argument to a function which indexes it at byte position 15.
/libvirt/src/util/virnetdevmacvlan.c:948:
access_dbuff_const: Calling "virNetDevMacVLanVPortProfileRegisterCallback" indexes array "macaddress" at byte position 15.
/libvirt/src/util/virnetdevmacvlan.c:773:
access_dbuff_const: Calling "memcpy" indexes array "macaddress" with index "16UL" at byte position 15.
Error: OVERRUN_STATIC:
/libvirt/src/qemu/qemu_migration.c:2744:
overrun-buffer-val: Overrunning static array "net->mac" of size 6 bytes by passing it as an argument to a function which indexes it at byte position 15.
/libvirt/src/util/virnetdevmacvlan.c:773:
access_dbuff_const: Calling "memcpy" indexes array "macaddress" with index "16UL" at byte position 15.
Error: OVERRUN_STATIC:
/libvirt/src/qemu/qemu_driver.c:435:
overrun-buffer-val: Overrunning static array "net->mac" of size 6 bytes by passing it as an argument to a function which indexes it at byte position 15.
/libvirt/src/util/virnetdevmacvlan.c:1036:
access_dbuff_const: Calling "virNetDevMacVLanVPortProfileRegisterCallback" indexes array "macaddress" at byte position 15.
/libvirt/src/util/virnetdevmacvlan.c:773:
access_dbuff_const: Calling "memcpy" indexes array "macaddress" with index "16UL" at byte position 15.
This patch addresses the following coverity findings:
/libvirt/src/conf/nwfilter_params.c:390:
var_assigned: Assigning: "varValue" = null return value from "virHashLookup".
/libvirt/src/conf/nwfilter_params.c:392:
dereference: Dereferencing a pointer that might be null "varValue" when calling "virNWFilterVarValueGetNthValue".
/libvirt/src/conf/nwfilter_params.c:399:
dereference: Dereferencing a pointer that might be null "tmp" when calling "virNWFilterVarValueGetNthValue".
This patch addresses the following coverity findings:
/libvirt/src/conf/nwfilter_params.c:157:
deref_parm: Directly dereferencing parameter "val".
/libvirt/src/conf/nwfilter_params.c:473:
negative_returns: Using variable "iterIndex" as an index to array "res->iter".
/libvirt/src/nwfilter/nwfilter_ebiptables_driver.c:2891:
unchecked_value: No check of the return value of "virAsprintf(&protostr, "-d 01:80:c2:00:00:00 ")".
/libvirt/src/nwfilter/nwfilter_ebiptables_driver.c:2894:
unchecked_value: No check of the return value of "virAsprintf(&protostr, "-p 0x%04x ", l3_protocols[protoidx].attr)".
/libvirt/src/nwfilter/nwfilter_ebiptables_driver.c:3590:
var_deref_op: Dereferencing null variable "inst".
Some of the error messages in this function should have been
virReportSystemError (since they have an errno they want to log), but
were mistakenly written as netlinkError, which expects a libvirt error
code instead. The result was that when one of the errors was
encountered, "No error message provided" would be printed instead of
something meaningful (see
https://bugzilla.redhat.com/show_bug.cgi?id=816465 for an example).
Once qemu monitor reports migration has completed, we just closed our
end of the pipe and let migration tunnel die. This generated bogus error
in case we did so before the thread saw EOF on the pipe and migration
was aborted even though it was in fact successful.
With this patch we first wake up the tunnel thread and once it has read
all data from the pipe and finished the stream we close the
filedescriptor.
A small additional bonus of this patch is that real errors reported
inside qemuMigrationIOFunc are not overwritten by virStreamAbort any
more.
When QEMU reported failed or canceled migration, we correctly detected
it but didn't really consider it as an error condition and migration
protocol just went on. Luckily, some of the subsequent steps eventually
failed end we reported an (unrelated and mostly random) error back to
the caller.
Currently, non-blocking calls are either sent immediately or discarded
in case sending would block. This was implemented based on the
assumption that the non-blocking keepalive call is not needed as there
are other calls in the queue which would keep the connection alive.
However, if those calls are no-reply calls (such as those carrying
stream data), the remote party knows the connection is alive but since
we don't get any reply from it, we think the connection is dead.
This is most visible in tunnelled migration. If it happens to be longer
than keepalive timeout (30s by default), it may be unexpectedly aborted
because the connection is considered to be dead.
With this patch, we only discard non-blocking calls when the last call
with a thread is completed and thus there is no thread left to keep
sending the remaining non-blocking calls.
In some cases (spotted with broken connection during tunneled migration)
we were overwriting the original error with worse or even misleading
errors generated when we were cleaning up after failed migration.
The docs for virConnectSetKeepAlive() advertise that this function
should be able to disable keepalives on negative or zero interval time.
This patch removes the check that prohibited this and adds code to
disable keepalives on negative/zero interval.
* src/libvirt.c: virConnectSetKeepAlive(): - remove check for negative
values
* src/rpc/virnetclient.c
* src/rpc/virnetclient.h: - add virNetClientKeepAliveStop() to disable
keepalive messages
* src/remote/remote_driver.c: remoteSetKeepAlive(): -add ability to
disable keepalives
This patch resolves https://bugzilla.redhat.com/show_bug.cgi?id=815270
The function virNetDevMacVLanVPortProfileRegisterCallback() takes an
arg "virtPortProfile", and was checking it for non-NULL before using
it. However, the prototype for
virNetDevMacVLanPortProfileRegisterCallback had marked that arg with
ATTRIBUTE_NONNULL(). Contrary to what one may think,
ATTRIBUTE_NONNULL() does not provide any guarantee that an arg marked
as such really is always non-null; the only effect to the code
generated by gcc, is that gcc *assumes* it is non-NULL; this results
in, for example, the check for a non-NULL value being optimized out.
(Unfortunately, this code removal only occurs when optimization is
enabled, and I am in the habit of doing local builds with optimization
off to ease debugging, so the bug didn't show up in my earlier local
testing).
In general, virPortProfile might always be NULL, so it shouldn't be
marked as ATTRIBUTE_NONNULL. One other function prototype made this
same error, so this patch fixes it as well.
Add 2 new functions to the virSocketAddr 'class':
- virSocketAddrEqual: tests whether two IP addresses and their ports are equal
- virSocketaddSetIPv4Addr: set a virSocketAddr given a 32 bit int
This patch improves the previously added virAtomicInt implementation
by using gcc-builtins if possible. The needed builtins are available
since GCC >= 4.1. At least the 4.0 docs don't mention them.
vboxArray is not castable to a COM item type. vboxArray is a
wrapper around the XPCOM and MSCOM specific array handling.
In this case we can avoid passing NULL as an empty array to
IMachine::Delete by passing a dummy IMedium* array with a single
NULL item.
In order to track a block copy job across libvirtd restarts, we
need to save internal XML that tracks the name of the file
holding the mirror. Displaying this name in dumpxml might also
be useful to the user, even if we don't yet have a way to (re-)
start a domain with mirroring enabled up front. This is done
with a new <mirror> sub-element to <disk>, as in:
<disk type='file' device='disk'>
<driver name='qemu' type='raw'/>
<source file='/var/lib/libvirt/images/original.img'/>
<mirror file='/var/lib/libvirt/images/copy.img' format='qcow2' ready='yes'/>
...
</disk>
For now, the element is output-only, in live domains; it is ignored
when defining a domain or hot-plugging a disk (since those contexts
use VIR_DOMAIN_XML_INACTIVE in parsing). The 'ready' attribute appears
when libvirt knows that the job has changed from the initial pulling
phase over to the mirroring phase, although absence of the attribute
is not a sure indicator of the current phase. If we come up with a way
to make qemu start with mirroring enabled, we can relax the xml
restriction, and allow <mirror> (but not attribute 'ready') on input.
Testing active-only XML meant tweaking the testsuite slightly, but it
was worth it.
* docs/schemas/domaincommon.rng (diskspec): Add diskMirror.
* docs/formatdomain.html.in (elementsDisks): Document it.
* src/conf/domain_conf.h (_virDomainDiskDef): New members.
* src/conf/domain_conf.c (virDomainDiskDefFree): Clean them.
(virDomainDiskDefParseXML): Parse them, but only internally.
(virDomainDiskDefFormat): Output them.
* tests/qemuxml2argvdata/qemuxml2argv-disk-mirror.xml: New test file.
* tests/qemuxml2xmloutdata/qemuxml2xmlout-disk-mirror.xml: Likewise.
* tests/qemuxml2xmltest.c (testInfo): Alter members.
(testCompareXMLToXMLHelper): Allow more test control.
(mymain): Run new test.
This patch introduces a new block job, useful for live storage
migration using pre-copy streaming. Justification for including
this under virDomainBlockRebase rather than adding a new command
includes: 1) there are now two possible block jobs in qemu, with
virDomainBlockRebase starting either type of command, and
virDomainBlockJobInfo and virDomainBlockJobAbort working to end
either type; 2) reusing this command allows distros to backport
this feature to the libvirt 0.9.10 API without a .so bump.
Note that a future patch may add a more powerful interface named
virDomainBlockJobCopy, dedicated to just the block copy job, in
order to expose even more options (such as setting an arbitrary
format type for the destination without having to probe it from a
pre-existing destination file); adding a new command for targetting
just block copy would be similar to how we already have
virDomainBlockPull for targetting just the block pull job.
Using a live VM with the backing chain:
base <- snap1 <- snap2
as the starting point, we have:
- virDomainBlockRebase(dom, disk, "/path/to/copy", 0,
VIR_DOMAIN_BLOCK_REBASE_COPY)
creates /path/to/copy with the same format as snap2, with no backing
file, so entire chain is copied and flattened
- virDomainBlockRebase(dom, disk, "/path/to/copy", 0,
VIR_DOMAIN_BLOCK_REBASE_COPY|VIR_DOMAIN_BLOCK_REBASE_COPY_RAW)
creates /path/to/copy as a raw file, so entire chain is copied and
flattened
- virDomainBlockRebase(dom, disk, "/path/to/copy", 0,
VIR_DOMAIN_BLOCK_REBASE_COPY|VIR_DOMAIN_BLOCK_REBASE_SHALLOW)
creates /path/to/copy with the same format as snap2, but with snap1 as
a backing file, so only snap2 is copied.
- virDomainBlockRebase(dom, disk, "/path/to/copy", 0,
VIR_DOMAIN_BLOCK_REBASE_COPY|VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT)
reuse existing /path/to/copy (must have empty contents, and format is
probed[*] from the metadata), and copy the full chain
- virDomainBlockRebase(dom, disk, "/path/to/copy", 0,
VIR_DOMAIN_BLOCK_REBASE_COPY|VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT|
VIR_DOMAIN_BLOCK_REBASE_SHALLOW)
reuse existing /path/to/copy (contents must be identical to snap1,
and format is probed[*] from the metadata), and copy only the contents
of snap2
- virDomainBlockRebase(dom, disk, "/path/to/copy", 0,
VIR_DOMAIN_BLOCK_REBASE_COPY|VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT|
VIR_DOMAIN_BLOCK_REBASE_SHALLOW|VIR_DOMAIN_BLOCK_REBASE_COPY_RAW)
reuse existing /path/to/copy (must be raw volume with contents
identical to snap1), and copy only the contents of snap2
Less useful combinations:
- virDomainBlockRebase(dom, disk, "/path/to/copy", 0,
VIR_DOMAIN_BLOCK_REBASE_COPY|VIR_DOMAIN_BLOCK_REBASE_SHALLOW|
VIR_DOMAIN_BLOCK_REBASE_COPY_RAW)
fail if source is not raw, otherwise create /path/to/copy as raw and
the single file is copied (no chain involved)
- virDomainBlockRebase(dom, disk, "/path/to/copy", 0,
VIR_DOMAIN_BLOCK_REBASE_COPY|VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT|
VIR_DOMAIN_BLOCK_REBASE_COPY_RAW)
makes little sense: the destination must be raw but have no contents,
meaning that it is an empty file, so there is nothing to reuse
The other three flags are rejected without VIR_DOMAIN_BLOCK_COPY.
[*] Note that probing an existing file for its format can be a security
risk _if_ there is a possibility that the existing file is 'raw', in
which case the guest can manipulate the file to appear like some other
format. But, by virtue of the VIR_DOMAIN_BLOCK_REBASE_COPY_RAW flag,
it is possible to avoid probing of raw files, at which point, probing
of any remaining file type is no longer a security risk.
It would be nice if we could issue an event when pivoting from phase 1
to phase 2, but qemu hasn't implemented that, and we would have to poll
in order to synthesize it ourselves. Meanwhile, qemu will give us a
distinct job info and completion event when we either cancel or pivot
to end the job. Pivoting is accomplished via the new:
virDomainBlockJobAbort(dom, disk, VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT)
Management applications can pre-create the copy with a relative
backing file name, and use the VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT
flag to have qemu reuse the metadata; if the management application
also copies the backing files to a new location, this can be used
to perform live storage migration of an entire backing chain.
* include/libvirt/libvirt.h.in (VIR_DOMAIN_BLOCK_JOB_TYPE_COPY):
New block job type.
(virDomainBlockJobAbortFlags, virDomainBlockRebaseFlags): New enums.
* src/libvirt.c (virDomainBlockRebase): Document the new flags,
and implement general restrictions on flag combinations.
(virDomainBlockJobAbort): Document the new flag.
(virDomainSaveFlags, virDomainSnapshotCreateXML)
(virDomainRevertToSnapshot, virDomainDetachDeviceFlags): Document
restrictions.
* include/libvirt/virterror.h (VIR_ERR_BLOCK_COPY_ACTIVE): New
error.
* src/util/virterror.c (virErrorMsg): Define it.
This patch modifies the CPU comparrison function to report the
incompatibilities in more detail to ease identification of problems.
* src/cpu/cpu.h:
cpuGuestData(): Add argument to return detailed error message.
* src/cpu/cpu.c:
cpuGuestData(): Add passthrough for error argument.
* src/cpu/cpu_x86.c
x86FeatureNames(): Add function to convert a CPU definition to flag
names.
x86Compute(): - Add error message parameter
- Add macro for reporting detailed error messages.
- Improve error reporting.
- Simplify calculation of forbidden flags.
x86DataIteratorInit():
x86cpuidMatchAny(): Remove functions that are no longer needed.
* src/qemu/qemu_command.c:
qemuBuildCpuArgStr(): - Modify for new function prototype
- Add detailed error reports
- Change error code on incompatible processors
to VIR_ERR_CONFIG_UNSUPPORTED instead of
internal error
* tests/cputest.c:
cpuTestGuestData(): Modify for new function prototype
virThreadSelf tries to access the virThreadPtr stored in TLS for the
current thread via TlsGetValue. When virThreadSelf is called on a thread
that was not created via virThreadCreate (e.g. the main thread) then
TlsGetValue returns NULL as TlsAlloc initializes TLS slots to NULL.
virThreadSelf can be called on the main thread via this call chain from
virsh
vshDeinit
virEventAddTimeout
virEventPollAddTimeout
virEventPollInterruptLocked
virThreadIsSelf
triggering a segfault as virThreadSelf unconditionally dereferences the
return value of TlsGetValue.
Fix this by making virThreadSelf check the TLS slot value for NULL and
setting the given virThreadPtr accordingly.
Reported by Marcel Müller.
POSIX says that sa_sigaction is only safe to use if sa_flags
includes SA_SIGINFO; conversely, sa_handler is only safe to
use when flags excludes that bit. Gnulib doesn't guarantee
an implementation of SA_SIGINFO, but does guarantee that
if SA_SIGINFO is undefined, we can safely define it to 0 as
long as we don't dereference the 2nd or 3rd argument of
any handler otherwise registered via sa_sigaction.
Based on a report by Wen Congyang.
* src/rpc/virnetserver.c (SA_SIGINFO): Stub for mingw.
(virNetServerSignalHandler): Avoid bogus dereference.
(virNetServerFatalSignal, virNetServerNew): Set flags properly.
(virNetServerAddSignalHandler): Drop unneeded #ifdef.
I almost copied-and-pasted some redundant () into my new code,
and figured a general cleanup prereq patch would be better instead.
No semantic change.
* src/conf/domain_conf.c (virDomainLeaseDefParseXML)
(virDomainDiskDefParseXML, virDomainFSDefParseXML)
(virDomainActualNetDefParseXML, virDomainNetDefParseXML)
(virDomainGraphicsDefParseXML, virDomainVideoAccelDefParseXML)
(virDomainVideoDefParseXML, virDomainHostdevFind)
(virDomainControllerInsertPreAlloced, virDomainDefParseXML)
(virDomainObjParseXML, virDomainCpuSetFormat)
(virDomainCpuSetParse, virDomainDiskDefFormat)
(virDomainActualNetDefFormat, virDomainNetDefFormat)
(virDomainTimerDefFormat, virDomainGraphicsListenDefFormat)
(virDomainDefFormatInternal, virDomainNetGetActualHostdev)
(virDomainNetGetActualBandwidth, virDomainGraphicsGetListen):
Reduce extra ().
Ensure we don't introduce any more lousy integer parsing in new
code, while avoiding a scrub-down of existing legacy code.
Note that we also need to enable sc_prohibit_atoi_atof (see cfg.mk
local-checks-to-skip) before we are bulletproof, but that also
entails scrubbing I'm not ready to do at the moment.
* src/util/util.c (virStrToLong_i, virStrToLong_ui)
(virStrToLong_l, virStrToLong_ul, virStrToLong_ll)
(virStrToLong_ull, virStrToDouble): Mark exemptions.
* src/util/virmacaddr.c (virMacAddrParse): Likewise.
* cfg.mk (sc_prohibit_strtol): New syntax check.
(exclude_file_name_regexp--sc_prohibit_strtol): Ignore files that
I'm not willing to fix yet.
(local-checks-to-skip): Re-enable sc_prohibit_atoi_atof.
https://bugzilla.redhat.com/show_bug.cgi?id=617711 reported that
even with my recent patched to allow <memory unit='G'>1</memory>,
people can still get away with trying <memory>1G</memory> and
silently get <memory unit='KiB'>1</memory> instead. While
virt-xml-validate catches the error, our C parser did not.
Not to mention that it's always fun to fix bugs while reducing
lines of code. :)
* src/conf/domain_conf.c (virDomainParseMemory): Check for parse error.
(virDomainDefParseXML): Avoid strtoll.
* src/conf/storage_conf.c (virStorageDefParsePerms): Likewise.
* src/util/xml.c (virXPathLongBase, virXPathULongBase)
(virXPathULongLong, virXPathLongLong): Likewise.
Commit 78345c68 makes at least gcc 4.1.2 on RHEL 5 complain:
cc1: warnings being treated as errors
In file included from vbox/vbox_V4_0.c:13:
vbox/vbox_tmpl.c: In function 'vboxDomainUndefineFlags':
vbox/vbox_tmpl.c:5298: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
* src/vbox/vbox_tmpl.c (vboxDomainUndefineFlags): Use union to
avoid compiler warning.
DBus connection. The HAL device code further requires that
the DBus connection is integrated with the event loop and
provides such glue logic itself.
The forthcoming FirewallD integration also requires a
dbus connection with event loop integration. Thus we need
to pull the current event loop glue out of the HAL driver.
Thus we create src/util/virdbus.{c,h} files. This contains
just one method virDBusGetSystemBus() which obtains a handle
to the single shared system bus instance, with event glue
automagically setup.
Fix the support for trusted DHCP server in the ebtables code's
hard-coded function applying DHCP only filtering rules:
Rather than using a char * use the more flexible
virNWFilterVarValuePtr that contains the trusted DHCP server(s)
IP address. Process all entries.
Since all callers so far provided NULL as parameter, no changes
are necessary in any other code.
Currently upon a migration a callback is created when a 802.1qbg link
is set to PREASSOCIATE, this should not happen because this is a no-op
on most switches, and does not lead to an ASSOCIATE state. This patch
only creates callbacks when CREATE or RESTORE is requested. Migration
and libvirtd restart scenarios are already handled elsewhere.
Signed-off-by: D. Herrendoerfer <d.herrendoerfer@herrendoerfer.name>