Commit Graph

11614 Commits

Author SHA1 Message Date
Chunyan Liu
b5d5eb9bc5 qemu: remove functions used internally only from qemu_hostdev.h 2014-03-12 16:03:04 +00:00
Chunyan Liu
802c59d4b9 qemu: reuse hostdev interfaces to avoid duplicate
Same logic of preparing/reattaching hostdevs could be used in attach/detach
hotplug places, so reuse hostdev interfaces to avoid duplicate, also for later
extracting general code to common library.
2014-03-12 16:03:04 +00:00
Chunyan Liu
95fa4906b2 update qemuPrepareHostUSBDevices parameters to keep consistency
Update parameters from vm->def to specific name, hostdevs, nhostdevs to keep
consistentcy with PreparePCIDevices and PrepareSCSIDevices. And, at the same
time, make it reusable in later patch.
2014-03-12 16:03:04 +00:00
Chunyan Liu
6b306d66fa virhostdev: use virObject to virHostdevManager to keep reference
Use virObject to virHostdevManager, so that each driver using virHostdevManager
can keep a reference to it, and through counting refs to make virHostdevManager
get freed.
2014-03-12 16:03:04 +00:00
Jiri Denemark
e562e82f76 Load CPU map from builddir when run uninstalled
When libvirtd is run from a build directory without being installed, it
should not depend on files from a libvirt package installed in the
system. Not only because there may not be any libvirt installed at all.
We already do a good job for plugins but cpu_map.xml was still loaded
from the system.

The Makefile.am change is necessary to make this all work from VPATH
builds since libvirtd has no idea where to find libvirt sources. It only
knows the path from which it was started, i.e, a builddir.

https://bugzilla.redhat.com/show_bug.cgi?id=1074327
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2014-03-12 16:31:57 +01:00
Ján Tomko
7b91dc3ecd Introduce vircommandpriv.h for functions used by tests
So far it's just virCommandSetDryRun.
2014-03-12 15:53:16 +01:00
Ján Tomko
94b57a9de0 Use size_t for ndevice in pool source definition
This allows it to be used by the VIR_*_ELEMENT macros.

Also use them for parsing the definiton and remove the redundant
freeing of 'nodeset' before jumping to the cleanup label.
2014-03-12 15:51:40 +01:00
Ján Tomko
20f0cd4ca3 Introduce virStoragePoolSourceDeviceClear
Open-coding one VIR_FREE in the test suite just doesn't seem right.
2014-03-12 15:51:40 +01:00
Ján Tomko
cc8bc54bfc Change virStorageBackendISCSISession 'probe' arg to bool
It quacks like a bool.
2014-03-12 15:51:40 +01:00
Stefan Berger
41064facd4 nwfilter: Add missing goto err_exit in error path
https://bugzilla.redhat.com/show_bug.cgi?id=1071095

Add a missing goto err_exit in the error path where an unsupported
value is assigned to the CTRL_IP_LEARNING key.

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
2014-03-12 10:35:13 -04:00
Daniel P. Berrange
06e788e518 Fix sec label setup when attaching to QEMU processes
When attaching to a QEMU process, the def->seclabels array is
going to be empty. The qemuProcessAttach method must thus
populate it with data for the security drivers.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2014-03-12 13:14:38 +00:00
Eric Blake
359f4b11a6 qemu: don't munge user input during block commit
While investigating https://bugzilla.redhat.com/show_bug.cgi?id=1061827
I noticed that we pass user input unscathed for block-pull, but
always pass a canonical absolute name through for block-commit.
[Note that we probably _ought_ to validate that the user's request
for block-pull actually matches the backing chain, the way we already
do for block-commit - but that's a separate issue.  Further note that
the ability to pass user input through unscathed allows backdoors
such as specifying a backing image that is a network URI such as
a gluster disk, instead of forcing things to the local file system;
which is an area still under active investigation on whether libvirt
needs to behave differently for network disks.]

Since qemu may write the name that the user passed in as the backing
file, a user may have a reason to want a relative file name passed
through to qemu, and always munging things to absolute prevents that.

Put another way, if you have the backing chain:

[A] <- [B(back=./A)] <- [C(back=./B)]

and commit B into A (virsh blockcommit $dom vda --base A --top B),
the metadata of C will have to be re-written. But should it be
rewritten as [C(back=./A)] or as [C(back=/path/to/A)]?  Still up in
the air is whether qemu's decision should be based on whether B
and/or C had relative paths, or on whether the --base and/or
--top arguments to the command were relative paths; but if we always
pass a canonical name, we've prevented the spelling of the command
arguments from being part of the hueristics that qemu uses.

I also audited the code, and verified that we never call
qemuMonitorBlockCommit() with a NULL base, either before or after
the change to qemu_driver.c.

* src/qemu/qemu_driver.c (qemuDomainBlockCommit): Preserve user's
spelling, since absolute vs. relative matters to qemu.
* src/qemu/qemu_monitor.h (qemuMonitorBlockCommit): Base is never
null.
* src/qemu/qemu_monitor.c (qemuMonitorBlockCommit): Likewise.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONBlockCommit):
Likewise.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONBlockCommit):
Likewise.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-03-11 17:53:19 -06:00
Eric Blake
e686ce8aa2 iptables: don't log command probe failures
Commit b9dd878f caused a regression in iptables interaction by
logging non-zero status at a higher level than VIR_INFO.  Revert
that portion of the commit, as well as adding a comment explaining
why we check the status ourselves.

Reported by Nehal J Wani.

* src/util/viriptables.c (virIpTablesOnceInit): Undo log regression.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-03-11 17:43:47 -06:00
Jim Fehlig
f68246ac94 libxl: support sexpr in native to XML conversion
Supporting sexpr in connectDomainXMLFromNative in the libxl driver
adds flexibility for users importing legacy Xen configuration into
libvirt.  E.g. this patch allows importing previous xend-managed
domains from /var/lib/xend/domains/<dom-uuid>/config.sxp into the
libvirt libxl driver.
2014-03-11 14:31:08 -06:00
John Ferlan
ea10cd76f8 storage: Fix bugs in VIR_APPEND_ELEMENT series
From commit id 'd53bbfd1'

Found one core and one possible memory leak. Core seen during local
virt-test/tp_libvirt run for the vol_create_from test. The memory leak
was seen by inspection during a review of all VIR_APPEND_ELEMENT changes

In storage_backend_disk/virStorageBackendDiskMakeDataVol(), the 'vol'
needs to be kept around since it's used later, so use the _COPY macro.
This caused a segv in libvirtd:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffe87c3700 (LWP 6919)]
virStorageBackendDiskMakeDataVol (vol=0x0, groups=0x7fffc8000d70, pool=0x7fffc8002460) at storage/storage_backend_disk.c:66
66          if (vol->target.path == NULL) {

In storage_backend_rbd/virStorageBackendRBDRefreshPool() there's a failure
path where the 'vol' needs to go through virStorageVolDefFree() since it
wouldn't be appended.
2014-03-11 15:51:47 -04:00
Daniel P. Berrange
cfb92c9b0c Remove broken error reporting in QEMU mac filtering
The qemu_bridge_filter.c file had some helpers for calling
the ebtablesXXX functions todo bridge filtering. The only
thing these helpers did was to overwrite the original error
message from the ebtables code. For added fun, the callers
of these helpers overwrote the errors yet again. For even
more fun, one of the helpers called another helper and
overwrite its errors too.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2014-03-11 11:04:55 +00:00
Daniel P. Berrange
dafa39adbc Remove unused ebtablesRemoveForwardPolicyReject method
The ebtablesRemoveForwardPolicyReject method was unused and
would not do anything useful even if called.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2014-03-11 11:01:52 +00:00
Daniel P. Berrange
6e69008f3e Remove worthless ebtRules data structure
The ebtRules data structure serves no useful purpose as
the table name is never used and only 1 single chain name
needs to be stored. Just store the chain name directly
in the ebtablesContext instead.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2014-03-11 11:01:52 +00:00
Daniel P. Berrange
78629cf531 Remove data structure holding list of ebtables rules
When adding/removing ebtables rules, the code would keep
an array of all rules in memory. This list of rules was
never used for any purpose and would be lost if libvirtd
restarted. Delete all the unused code.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2014-03-11 11:01:52 +00:00
Daniel P. Berrange
ca3dafef41 Remove unused variables from ebtablesContext
The input_filter and nat_postrouting variables were never
used to create any firewall rules.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2014-03-11 11:01:51 +00:00
Daniel P. Berrange
c383e13a37 Make ebtablesForwardPolicyReject static
The ebtablesForwardPolicyReject method is only used internally
to the ebtables code and thus should have been static.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2014-03-11 11:01:51 +00:00
Daniel P. Berrange
184d464661 Remove decl of method which doesn't exist in virebtables.h
There is no impl of the ebtablesSaveRules method and nothing
attempts to use it.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2014-03-11 11:01:51 +00:00
Daniel P. Berrange
a84f9bd555 Remove many decls from bridge driver platform header
The bridge_driver_platform.h defines many functions that
a platform driver must implement. Only two of these
functions are actually called from the main bridge driver
code. The remainder can be made internal to the linux
driver only.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2014-03-11 11:01:51 +00:00
Daniel P. Berrange
cbde35899b Cache result of QEMU capabilities extraction
Extracting capabilities from QEMU takes a notable amount of time
when all QEMU binaries are installed. Each system emulator
needs about 200-300ms multiplied by 26 binaries == ~5-8 seconds.

This change causes the QEMU driver to save an XML file containing
the content of the virQEMUCaps object instance in the cache
dir eg /var/cache/libvirt/qemu/capabilities/$SHA256(binarypath).xml
or $HOME/.cache/libvirt/qemu/cache/capabilities/$SHA256(binarypath).xml

We attempt to load this and only if it fails, do we fallback to
probing the QEMU binary. The ctime of the QEMU binary and libvirtd
are stored in the cached file and its data discarded if either
of them change.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2014-03-11 10:59:00 +00:00
Daniel P. Berrange
f5059a929e Change QEMU capabilities cache to check ctime instead of mtime
Debian's package manager will preserve mtime timestamp on binaries
from the time they are built, rather than installed. So if a
user downgrades their QEMU dpkg, the libvirt capabilities
cache will not refresh. The fix is to use ctime instead of mtime
since it cannot be faked.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2014-03-11 10:52:29 +00:00
Daniel P. Berrange
10ec072545 Add helper APIs to track if libvirtd or loadable modules have changed
The future QEMU capabilities cache needs to be able to invalidate
itself if the libvirtd binary or any loadable modules are changed
on disk. Record the 'ctime' value for these binaries and provide
helper APIs to query it. This approach assumes that if libvirt.so
is changed, then libvirtd will also change, which should usually
be the case with libtool's wrapper scripts that cause libvirtd to
get re-linked

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2014-03-11 10:51:49 +00:00
Michal Privoznik
f5796b61cc virSecurityDACSetSecurityImageLabel: Unmark @def as unused
The @def is clearly used just a few lines below. There's no need to use
ATTRIBUTE_UNUSED for it.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-03-11 11:18:06 +01:00
Stefan Berger
6768b21033 BZ1072677: Avoid freeing of 0 file descriptor
Avoid the freeing of an array of zero file descriptors in case
of error. Initialize the array to -1 using memset.

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
2014-03-10 18:47:19 -04:00
Daniel P. Berrange
ed839f9aef Convert lock driver plugins to use new crypto APIs
Convert the sanlock and lockd lock driver plugins over to use
the new virCryptoHashString APIs instead of having their own
duplicated code.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2014-03-10 16:44:14 +00:00
Daniel P. Berrange
3a7fe8d508 Add helper APIs for generating cryptographic hashes
GNULIB provides APIs for calculating md5 and sha256 hashes,
but these APIs only return you raw byte arrays. Most users
in libvirt want the hash in printable string format. Add
some helper APIs in util/vircrypto.{c,h} for doing this.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2014-03-10 16:39:18 +00:00
Ján Tomko
9b9d7704b5 Change file names in comments to match the files they are in
Some of these are leftovers from renaming the files, others
are just typos.

Also introduce an ugly awk script to enforce this.
2014-03-10 14:26:04 +01:00
Michal Privoznik
17d6a91854 src/xenxs: Utilize more of VIR_(APPEND|INSERT|DELETE)_ELEMENT
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-03-10 13:45:11 +01:00
Michal Privoznik
ce17ddacca src/xen: Utilize more of VIR_(APPEND|INSERT|DELETE)_ELEMENT
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-03-10 13:45:11 +01:00
Michal Privoznik
fb9bec1055 src/util: Utilize more of VIR_(APPEND|INSERT|DELETE)_ELEMENT
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-03-10 13:45:11 +01:00
Michal Privoznik
7e89de172d src/test: Utilize more of VIR_(APPEND|INSERT|DELETE)_ELEMENT
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-03-10 13:45:10 +01:00
Michal Privoznik
d53bbfd159 src/storage: Utilize more of VIR_(APPEND|INSERT|DELETE)_ELEMENT
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-03-10 13:45:10 +01:00
Michal Privoznik
ba52e4c715 src/rpc: Utilize more of VIR_(APPEND|INSERT|DELETE)_ELEMENT
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-03-10 13:45:10 +01:00
Michal Privoznik
5ab80fc1ae src/qemu: Utilize more of VIR_(APPEND|INSERT|DELETE)_ELEMENT
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-03-10 13:45:10 +01:00
Michal Privoznik
3f8b040d9a src/phyp: Utilize more of VIR_(APPEND|INSERT|DELETE)_ELEMENT
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-03-10 13:45:10 +01:00
Michal Privoznik
d7d06cc183 src/parallels: Utilize more of VIR_(APPEND|INSERT|DELETE)_ELEMENT
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-03-10 13:45:10 +01:00
Michal Privoznik
d9e4d5cb7c src/openvz: Utilize more of VIR_(APPEND|INSERT|DELETE)_ELEMENT
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-03-10 13:45:10 +01:00
Michal Privoznik
6c1bde6a94 src/nwfilter: Utilize more of VIR_(APPEND|INSERT|DELETE)_ELEMENT
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-03-10 13:45:10 +01:00
Michal Privoznik
6fca03f0a0 src/lxc/: Utilize more of VIR_(APPEND|INSERT|DELETE)_ELEMENT
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-03-10 13:45:10 +01:00
Michal Privoznik
2133441a07 conf: Utilize more of VIR_(APPEND|INSERT|DELETE)_ELEMENT
This fixes a possible double free. In virNetworkAssignDef() if
virBitmapNew() fails, then virNetworkObjFree(network) is called.
However, with network->def pointing to actual @def. So if caller
frees @def again, ...

Moreover, this fixes one possible memory leak too. In
virInterfaceAssignDef() if appending to the list of interfaces
fails, we ought to call virInterfaceObjFree() instead of bare
VIR_FREE().

Although, in order to do that some array size variables needs
to be turned into size_t rather than int.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-03-10 13:45:10 +01:00
Daniel P. Berrange
925de19ed7 Add a mutex to serialize updates to firewall
The nwfilter conf update mutex previously serialized
updates to the internal data structures for firewall
rules, and updates to the firewall itself. The latter
was recently turned into a read/write lock, and filter
instantiation allowed to proceed in parallel. It was
believed that this was ok, since each filter is created
on a separate iptables/ebtables chain.

It turns out that there is a subtle lock ordering problem
on virNWFilterObjPtr instances. __virNWFilterInstantiateFilter
will hold a lock on the virNWFilterObjPtr it is instantiating.
This in turn invokes virNWFilterInstantiate which then invokes
virNWFilterDetermineMissingVarsRec which then invokes
virNWFilterObjFindByName. This iterates over every single
virNWFilterObjPtr in the list, locking them and checking their
name. So if 2 or more threads try to instantiate a filter in
parallel, they'll all hold 1 lock at the top level in the
__virNWFilterInstantiateFilter method which will cause the
other thread to deadlock in virNWFilterObjFindByName.

The fix is to add an exclusive mutex to serialize the
execution of __virNWFilterInstantiateFilter.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2014-03-10 11:56:45 +00:00
John Ferlan
ea1eadd6a3 virscsi: Introduce virSCSIDeviceUsedByInfoFree
This resolves a Coverity RESOURCE_LEAK issue introduced by commit
id 'de6fa535' where the virSCSIDeviceSetUsedBy() didn't VIR_FREE
the 'copy' or possibly VIR_STRDUP()'d values.  It also ensures that
the VIR_APPEND_ELEMENT is successful...
2014-03-07 12:24:44 -05:00
Michael Chapman
1af9800b55 virIdentityGetSystem: don't fail if SELinux is disabled
If SELinux is compiled into libvirt but it is disabled on the host,
libvirtd logs:

  error : virIdentityGetSystem:173 : Unable to lookup SELinux process
  context: Invalid argument

on each and every client connection.

Use is_selinux_enabled() to skip retrieval of the process's SELinux
context if SELinux is disabled.

Signed-off-by: Michael Chapman <mike@very.puzzling.org>
2014-03-07 15:01:33 +01:00
Martin Kletzander
45ad1adb4a qemu: Reject unsupported tuning in session mode
When domain is started with setting that cannot be done, i.e. those
that require cgroups, there is no error reported and it succeeds
without any message whatsoever.

When setting with API, virsh, an error is reported, but only due to
the fact that no cgroups are mounted (priv->cgroup == NULL).

Given the above it seems reasonable to reject such unsupported
settings.

This patch effectively changes the error message from:

$ virsh -c qemu:///session schedinfo dummy
Scheduler      : Unknown
error: Requested operation is not valid: cgroup CPU controller is not mounted

to:

$ virsh -c qemu:///session schedinfo dummy
Scheduler      : Unknown
error: Operation not supported: CPU tuning is not available in session mode

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1023366

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-03-06 15:29:07 +01:00
Michael Chapman
e5cd28c023 datatypes: update comments of Dispose functions
As of commit 46ec5f85, the conn.lock mutex does not need to be held
when calling any vir*Dispose() function in datatypes.c (via virObjectUnref()).

Signed-off-by: Michael Chapman <mike@very.puzzling.org>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2014-03-06 09:39:49 +01:00
Peter Krempa
3e04d65a07 qemu: monitor: Provide more information in generic block job error
The qemuMonitorJSONBlockJob handles a few errors internally. If qemu
returns a different error we would report a rather unhelpful message:

 $ virsh blockpull gluster-job vda --base /dev/null
 error: internal error: Unexpected error

As the actual message from qemu contains a bit more info, let's use it
to report something a little more useful:

 $ virsh blockpull gluster-job vda --base /dev/null
 error: internal error: Unexpected error: (GenericError) 'Base '/dev/null' not found'
2014-03-05 15:08:56 +01:00
Peter Krempa
46446313e8 storage: Don't lie about path used to look up in error message
In storageVolLookupByPath the provided path is "sanitized" at first.
This removes some extra slashes and stuff. When the lookup of the volume
fails the original path is used which makes it hard to trace errors in
some cases.

Improve the error message to print the sanitized path along with the
user provided path if they are not equal.
2014-03-05 09:22:09 +01:00
Peter Krempa
7fb3902b0f storage: Avoid mangling paths of non-local filesystems when looking up
When looking up a volume by path on a non-local filesystem don't use the
"cleaned" path that might be mangled in such a way that it will differ
from a path provided by a storage backend.

Skip the cleanup step for gluster, sheepdog and RBD.
2014-03-05 09:20:05 +01:00
Peter Krempa
429bf2534c storage: Error out when attempting to vol-upload into a remote pool
Pools that are not backed by files in the filesystem cause problems with
some APIs. Error out when attempting to upload a volume in such a pool
as currently we expect a local file representation for it.
2014-03-05 09:08:32 +01:00
Peter Krempa
e45c30ee69 storage: Use cleanup label instead of out 2014-03-05 09:08:32 +01:00
Chunyan Liu
6b4c0a635e add virhostdev files to maintain global state of host devices
Signed-off-by: Chunyan Liu <cyliu@suse.com>
2014-03-04 12:28:45 +00:00
Chunyan Liu
de6fa535b0 add 'driver' info to used_by
Specify which driver and which domain in used_by area to avoid conflict among
different drivers.

Signed-off-by: Chunyan Liu <cyliu@suse.com>
2014-03-04 12:24:13 +00:00
Cédric Bosdonnat
9194ccecf1 apparmor: handle "none" type 2014-03-04 11:26:59 +00:00
Cédric Bosdonnat
ef7dc7d429 add support for apparmor in lxc-enter-namespace 2014-03-04 11:15:47 +00:00
Cédric Bosdonnat
500b2e9655 apparmor: add debug traces when changing profile.
The reason for these is that aa-status doesn't show the process using
the profile as they are in another namespace.
2014-03-04 11:07:05 +00:00
Cédric Bosdonnat
43c030f046 LXC driver: generate apparmor profiles for guests
use_apparmor() was first designed to be called from withing libvirtd,
but libvirt_lxc also uses it. in libvirt_lxc, there is no need to check
whether to use apparmor or not: just use it if possible.
2014-03-04 11:07:05 +00:00
Peter Krempa
a31bd18f43 qemu: monitor: Fix error message and comment when getting cpu info
In qemuMonitorJSONExtractCPUInfo an error message hinted on missing
character device data which is wrong.

Also a comment states that only qemu-kvm tree includes the thread_id
field. This is no longer true.
2014-03-04 11:17:52 +01:00
Peter Krempa
d410e6f19d qemu: snapshot: Use better check when reverting external snapshots
https://bugzilla.redhat.com/show_bug.cgi?id=1071264

Reverting of external snapshots is not supported currently. The check
that is present doesn't properly check for all aspects that make a
snapshot external. Use virDomainSnapshotIsExternal() to do the check.
2014-03-04 11:12:44 +01:00
Michal Privoznik
042c4ab1c9 qemuBuildNicDevStr: Adapt to new advisory on multiqueue
As I did previously in 4f588a1b46, libvirt needs to set virtio vectors.
Previously, we were advised to use vectors=N, where

N = 2 * (number of queues) + 1

However, just recently this advisory has changed on the Multiquue wiki
page [1] to:

N = 2 * (number of queues) + 2

1: http://www.linux-kvm.org/page/Multiqueue#Enable_MQ_feature

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-03-04 10:43:05 +01:00
Ján Tomko
12ee0b98d3 Check if systemd is running before creating machines
If systemd is installed, but is not the init system,
systemd-machined fails with an unhelpful error message:
Launch helper exited with unknown return code 1

Currently we only check if the "machine1" service is
available (in ListActivatableNames).
Also check if "systemd1" service is registered with DBus
(ListNames).

This fixes https://bugs.gentoo.org/show_bug.cgi?id=493246#c22
2014-03-04 09:14:52 +01:00
Ján Tomko
65a4cb03c7 Split out most of virDBusIsServiceEnabled
Introduce virDBusIsServiceInList which can be used to call other
methods for listing services (ListNames), not just ListActivatableNames.

No functional change, fixed the 'Retruns' typo.
2014-03-04 09:14:52 +01:00
Eric Blake
b75c7bd6b9 build: fix cppi warning
Jenkins pointed out that the previous commit violates syntax
check when cppi is installed.

* src/nwfilter/nwfilter_dhcpsnoop.c (SNOOP_POLL_MAX_TIMEOUT_MS):
Update indentation.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-03-03 14:02:42 -07:00
Stefan Berger
49b59a151f nwfilter: Increase buffer size for libpcap
Libpcap 1.5 requires a larger buffer than previous pcap versions.
Adjust the size of the buffer to 128kb.

This patch should address symptoms in BZ 1071181 and BZ 731059

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
2014-03-03 15:13:50 -05:00
Stefan Berger
64df4c7518 nwfilter: Display the pcap errror message
Display the pcap error message in the log.

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
2014-03-03 15:13:47 -05:00
Stefan Berger
a718eb19e3 nwfilter: Cap the poll timeout in the DHCP Snooping code
Cap the poll timeout in the DHCP Snooping code to a max. of 10 seconds
to not hold up the libvirt shutdown longer than this.

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
2014-03-03 15:13:44 -05:00
Eric Blake
25f87817ab virFork: simplify semantics
The old semantics of virFork() violates the priciple of good
usability: it requires the caller to check the pid argument
after use, *even when virFork returned -1*, in order to properly
abort a child process that failed setup done immediately after
fork() - that is, the caller must call _exit() in the child.
While uses in virfile.c did this correctly, uses in 'virsh
lxc-enter-namespace' and 'virt-login-shell' would happily return
from the calling function in both the child and the parent,
leading to very confusing results. [Thankfully, I found the
problem by inspection, and can't actually trigger the double
return on error without an LD_PRELOAD library.]

It is much better if the semantics of virFork are impossible
to abuse.  Looking at virFork(), the parent could only ever
return -1 with a non-negative pid if it misused pthread_sigmask,
but this never happens.  Up until this patch series, the child
could return -1 with non-negative pid if it fails to set up
signals correctly, but we recently fixed that to make the child
call _exit() at that point instead of forcing the caller to do
it.  Thus, the return value and contents of the pid argument are
now redundant (a -1 return now happens only for failure to fork,
a child 0 return only happens for a successful 0 pid, and a
parent 0 return only happens for a successful non-zero pid),
so we might as well return the pid directly rather than an
integer of whether it succeeded or failed; this is also good
from the interface design perspective as users are already
familiar with fork() semantics.

One last change in this patch: before returning the pid directly,
I found cases where using virProcessWait unconditionally on a
cleanup path of a virFork's -1 pid return would be nicer if there
were a way to avoid it overwriting an earlier message.  While
such paths are a bit harder to come by with my change to a direct
pid return, I decided to keep the virProcessWait change in this
patch.

* src/util/vircommand.h (virFork): Change signature.
* src/util/vircommand.c (virFork): Guarantee that child will only
return on success, to simplify callers.  Return pid rather than
status, now that the situations are always the same.
(virExec): Adjust caller, also avoid open-coding process death.
* src/util/virprocess.c (virProcessWait): Tweak semantics when pid
is -1.
(virProcessRunInMountNamespace): Adjust caller.
* src/util/virfile.c (virFileAccessibleAs, virFileOpenForked)
(virDirCreate): Likewise.
* tools/virt-login-shell.c (main): Likewise.
* tools/virsh-domain.c (cmdLxcEnterNamespace): Likewise.
* tests/commandtest.c (test23): Likewise.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-03-03 12:40:32 -07:00
Eric Blake
b9dd878ff8 util: make it easier to grab only regular command exit
Auditing all callers of virCommandRun and virCommandWait that
passed a non-NULL pointer for exit status turned up some
interesting observations.  Many callers were merely passing
a pointer to avoid the overall command dying, but without
caring what the exit status was - but these callers would
be better off treating a child death by signal as an abnormal
exit.  Other callers were actually acting on the status, but
not all of them remembered to filter by WIFEXITED and convert
with WEXITSTATUS; depending on the platform, this can result
in a status being reported as 256 times too big.  And among
those that correctly parse the output, it gets rather verbose.
Finally, there were the callers that explicitly checked that
the status was 0, and gave their own message, but with fewer
details than what virCommand gives for free.

So the best idea is to move the complexity out of callers and
into virCommand - by default, we return the actual exit status
already cleaned through WEXITSTATUS and treat signals as a
failed command; but the few callers that care can ask for raw
status and act on it themselves.

* src/util/vircommand.h (virCommandRawStatus): New prototype.
* src/libvirt_private.syms (util/command.h): Export it.
* docs/internals/command.html.in: Document it.
* src/util/vircommand.c (virCommandRawStatus): New function.
(virCommandWait): Adjust semantics.
* tests/commandtest.c (test1): Test it.
* daemon/remote.c (remoteDispatchAuthPolkit): Adjust callers.
* src/access/viraccessdriverpolkit.c (virAccessDriverPolkitCheck):
Likewise.
* src/fdstream.c (virFDStreamCloseInt): Likewise.
* src/lxc/lxc_process.c (virLXCProcessStart): Likewise.
* src/qemu/qemu_command.c (qemuCreateInBridgePortWithHelper):
Likewise.
* src/xen/xen_driver.c (xenUnifiedXendProbe): Simplify.
* tests/reconnect.c (mymain): Likewise.
* tests/statstest.c (mymain): Likewise.
* src/bhyve/bhyve_process.c (virBhyveProcessStart)
(virBhyveProcessStop): Don't overwrite virCommand error.
* src/libvirt.c (virConnectAuthGainPolkit): Likewise.
* src/openvz/openvz_driver.c (openvzDomainGetBarrierLimit)
(openvzDomainSetBarrierLimit): Likewise.
* src/util/virebtables.c (virEbTablesOnceInit): Likewise.
* src/util/viriptables.c (virIpTablesOnceInit): Likewise.
* src/util/virnetdevveth.c (virNetDevVethCreate): Fix debug
message.
* src/qemu/qemu_capabilities.c (virQEMUCapsInitQMP): Add comment.
* src/storage/storage_backend_iscsi.c
(virStorageBackendISCSINodeUpdate): Likewise.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-03-03 12:40:32 -07:00
Eric Blake
c72e76c3d9 util: make it easier to grab only regular process exit
Right now, a caller waiting for a child process either requires
the child to have status 0, or must use WIFEXITED() and friends
itself.  But in many cases, we want the middle ground of treating
fatal signals as an error, and directly accessing the normal exit
value without having to use WEXITSTATUS(), in order to easily
detect an expected non-zero exit status.  This adds the middle
ground to the low-level virProcessWait; the next patch will add
it to virCommand.

* src/util/virprocess.h (virProcessWait): Alter signature.
* src/util/virprocess.c (virProcessWait): Add parameter.
(virProcessRunInMountNamespace): Adjust caller.
* src/util/vircommand.c (virCommandWait): Likewise.
* src/util/virfile.c (virFileAccessibleAs): Likewise.
* src/lxc/lxc_container.c (lxcContainerHasReboot)
(lxcContainerAvailable): Likewise.
* daemon/libvirtd.c (daemonForkIntoBackground): Likewise.
* tools/virt-login-shell.c (main): Likewise.
* tools/virsh-domain.c (cmdLxcEnterNamespace): Likewise.
* tests/testutils.c (virtTestCaptureProgramOutput): Likewise.
* tests/commandtest.c (test23): Likewise.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-03-03 12:40:31 -07:00
Eric Blake
8b24a803ad util: preserve exit status from mount namespace callback
The documentation of namespace callbacks was inconsistent on whether
it preserved positive return values.  Now that we have a dedicated
EXIT_CANCELED to flag all errors before getting to the callback,
it is possible to use positive return values (not that any of the
current callers do, but it is better to match the docs).

Also, while vircommand.c is careful to close fds that a child should
not have, it's still better to be in the practice of setting
FD_CLOEXEC up front.

* src/util/virprocess.c (virProcessRunInMountNamespace): Tweak
return value to pass back non-zero status.  Avoid leaking pipe fds
to other threads.
* src/util/virprocess.h: Fix comment.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-03-03 12:40:31 -07:00
Eric Blake
2b4f162eb4 util: make it easier to reflect child exit status
Thanks to namespaces, we have a couple of places in the code
base that want to reflect a child exit status, including the
ability to detect death by a signal, back to a grandparent.
Best to make it a reusable function.

* src/util/virprocess.h (virProcessExitWithStatus): New prototype.
* src/libvirt_private.syms (util/virprocess.h): Export it.
* src/util/virprocess.c (virProcessExitWithStatus): New function.
* tests/commandtest.c (test23): Test it.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-03-03 12:40:31 -07:00
Eric Blake
631923e7f2 virFork: give specific status on failure prior to exec
When a child fails without exec'ing, we want a well-known status;
best is to match what env(1), nice(1), su(1), and other wrapper
programs do.  This patch adds enum values that later patches will
use, and sets up virFork as the first client of EXIT_CANCELED
for errors detected prior to even attempting exec, as well as
virExec to distinguish between a missing executable vs. a binary
that cannot be executed.

This is a slight semantic change in the unlikely case of a child
process failing to restore its signal mask - we now kill the
child with a known status instead of relying on the caller to
notice and do an appropriate _exit().  A subsequent patch will
make further cleanups based on an audit of all callers.

* src/internal.h (EXIT_CANCELED, EXIT_CANNOT_INVOKE)
(EXIT_ENOENT): New enum.
* src/util/vircommand.c (virFork): Document specific exit value if
child aborts early.
(virExec): Distinguish between various exec failures.
* tests/commandtest.c (test1): Enhance test.
(test22): New test.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-03-03 12:40:31 -07:00
Eric Blake
f972a7c72c nwfilter: make ignoring non-zero status easier to follow
While auditing all callers of virCommandRun, I noticed that nwfilter
code never paid attention to commands with a non-zero status; they
were merely passing a pointer to avoid spamming the logs with a
message about commands that might indeed fail.  But proving this
required chasing through a lot of code; refactoring things to
localize the decision of whether to ignore non-zero status makes
it easier to prove that later changes to virFork don't negatively
affect this code.

While at it, I also noticed that ebiptablesRemoveRules would
actually report success if the child process failed for a
reason other than non-zero status, such as OOM.

* src/nwfilter/nwfilter_ebiptables_driver.c (ebiptablesExecCLI):
Change parameter from pointer to bool.
(ebtablesApplyBasicRules, ebtablesApplyDHCPOnlyRules)
(ebtablesApplyDropAllRules, ebtablesCleanAll)
(ebiptablesApplyNewRules, ebiptablesTearNewRules)
(ebiptablesTearOldRules, ebiptablesAllTeardown)
(ebiptablesDriverInitWithFirewallD)
(ebiptablesDriverTestCLITools, ebiptablesDriverProbeStateMatch):
Adjust all clients.
(ebiptablesRemoveRules): Likewise, and fix return value on failure.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-03-03 12:40:31 -07:00
Oleg Strikov
72bddd5f2f qemu: Implement a stub cpuArchDriver.baseline() handler for arm
Openstack Nova calls virConnectBaselineCPU() during initialization
of the instance to get a full list of CPU features.
This patch adds a stub to arm-specific code to handle
this request (no actual work is done).

Signed-off-by: Oleg Strikov <oleg.strikov@canonical.com>
2014-03-03 11:06:25 -05:00
Daniel P. Berrange
36ff4ed1ec Generate a unique journald log for QEMU capabilities failure
When probing QEMU capabilities fails for a binary generate a
log message with MESSAGE_ID==8ae2f3fb-2dbe-498e-8fbd-012d40afa361.

This can be directly queried from journald based on the UUID
instead of needing string grep. This lets tools like libguestfs'
bug reporting tool trivially do automated sanity tests on the
host they're running on.

 $ journalctl MESSAGE_ID=8ae2f3fb-2dbe-498e-8fbd-012d40afa361
 Feb 21 17:11:01 localhost.localdomain lt-libvirtd[9196]:
 Failed to probe capabilities for /bin/qemu-system-alpha:
 internal error: Child process (LC_ALL=C LD_LIBRARY_PATH=
 /home/berrange/src/virt/libvirt/src/.libs PATH=/usr/lib64/
 ccache:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:
 /usr/bin:/root/bin HOME=/root USER=root LOGNAME=root
 /bin/qemu-system-alpha -help) unexpected exit status 127:
 /bin/qemu-system-alpha: error while loading shared libraries:
 libglapi.so.0: cannot open shared object file: No such file
 or directory

 $ journalctl MESSAGE_ID=8ae2f3fb-2dbe-498e-8fbd-012d40afa361 --output=json
 { ...snip...
  "LIBVIRT_SOURCE" : "file",
  "PRIORITY" : "3",
  "CODE_FILE" : "qemu/qemu_capabilities.c",
  "CODE_LINE" : "2770",
  "CODE_FUNC" : "virQEMUCapsLogProbeFailure",
  "MESSAGE_ID" : "8ae2f3fb-2dbe-498e-8fbd-012d40afa361",
  "LIBVIRT_QEMU_BINARY" : "/bin/qemu-system-xtensa",
  "MESSAGE" : "Failed to probe capabilities for /bin/qemu-system-xtensa:
   internal error: Child process (LC_ALL=C LD_LIBRARY_PATH=/home/berrange
   /src/virt/libvirt/src/.libs PATH=/usr/lib64/ccache:/usr/local/sbin:
   /usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin HOME=/root
   USER=root LOGNAME=root /bin/qemu-system-xtensa -help) unexpected
   exit status 127: /bin/qemu-system-xtensa: error while loading shared
   libraries: libglapi.so.0: cannot open shared object file: No such
    file or directory\n" }

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2014-03-03 11:42:37 +00:00
Roman Bogorodskiy
e2d85e6fa1 bhyve: add basic documentation 2014-03-01 23:44:58 +04:00
Roman Bogorodskiy
ae49a093c8 bhyve: defined domains should be persistent 2014-03-01 11:44:19 +04:00
Roman Bogorodskiy
91f396b33b bhyve: support domain undefine
Implement domainUndefine and required helper functions:
 - domainIsActive
 - domainIsPersistent
2014-02-28 23:23:44 +04:00
Daniel P. Berrange
f223b96051 Add comments describing the different log sources
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2014-02-28 17:38:46 +00:00
Daniel P. Berrange
0915053e97 Include error domain and code in log messages from errors
When a virError is raised, pass the error domain and code
onto the systemd journald using metadata fields.

This allows error messages to be queried by code eg

  $ journalctl LIBVIRT_CODE=43

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2014-02-28 17:38:46 +00:00
Daniel P. Berrange
21d370f0b9 Fix journald PRIORITY values
The systemd journal expects log record PRIORITY values to
be encoded using the syslog compatible numbering scheme,
not libvirt's own native numbering scheme. We must therefore
apply a conversion.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2014-02-28 17:37:38 +00:00
Daniel P. Berrange
54209df345 Send virLogMetadata fields onto the journal
The systemd journal accepts arbitrary user specified log
fields. These can be passed into virLogMessage via the
virLogMetadata structure. Allow up to 5 custom fields to
be reported by libvirt callers.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2014-02-28 17:37:38 +00:00
Oleg Strikov
97962616c1 qemu: Enable 'host-passthrough' cpu mode for arm
This patch allows libvirt user to specify 'host-passthrough'
cpu mode while using qemu/kvm backend on arm (arm32).
It uses 'host' as a CPU model name instead of some other stub
(correct CPU detection is not implemented yet) to allow libvirt
user to specify 'host-model' cpu mode as well.

Signed-off-by: Oleg Strikov <oleg.strikov@canonical.com>
2014-02-28 11:31:00 -05:00
Michal Privoznik
1df00e2b22 virDomainBlockStats(Flags): Produce saner error message on empty disk path
As of 0bd2ccdec an empty disk path for virDomainBlockStats (or the one
with Flags) is allowed meaning "get me overall summarized statistics".
However, running 'virsh domblkstat $dom' throws a misleading error:

  # ./tools/virsh domblkstat dom
  error: Failed to get block stats dom
  error: invalid argument: invalid path:

while after this commit

  # virsh domblkstat dom
  error: Operation not supported: summary statistics are not supported yet

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-02-28 09:50:01 +01:00
Jiri Denemark
8f10c1e77f sanlock: Truncate domain names longer than SANLK_NAME_LEN
Libvirt uses a domain name to fill in owner_name in sanlock_options in
virLockManagerSanlockAcquire. Unfortunately, owner_name is limited to
SANLK_NAME_LEN characters (including trailing '\0'), which means domains
with longer names fail to start when sanlock is enabled. However, we can
truncate the name when setting owner_name as explained by sanlock's
author:

Setting sanlk_options or the owner_name is unnecessary, and has very
little to no benefit.  If you do provide something in owner_name, it can
be anything, sanlock doesn't care or use it.

If you run the command "sanlock status", the output will display a list
of clients connected to the sanlock daemon.  This client list is
displayed as "pid owner_name" if the client has provided an owner_name
via sanlk_options. This debugging output is the only usage of
owner_name, so its only benefit is to potentially provide a more human
friendly output for debugging purposes.
2014-02-27 09:32:41 +01:00
Ian Campbell
bf5dbce61e libxl: Recognise ARM architectures
Only tested on v7 but the v8 equivalent seems pretty obvious.

XEN_CAP_REGEX already accepts more than it should (e.g. x86_64p or x86_32be)
but I have stuck with the existing pattern.

With this I can create a guest from:
  <domain type='xen'>
    <name>libvirt-test</name>
    <uuid>6343998e-9eda-11e3-98f6-77252a7d02f3</uuid>
    <memory>393216</memory>
    <currentMemory>393216</currentMemory>
    <vcpu>1</vcpu>
    <os>
      <type arch='armv7l' machine='xenpv'>linux</type>
      <kernel>/boot/vmlinuz-arm-native</kernel>
      <cmdline>console=hvc0 earlyprintk debug root=/dev/xvda1</cmdline>
    </os>
    <clock offset='utc'/>
    <on_poweroff>destroy</on_poweroff>
    <on_reboot>restart</on_reboot>
    <on_crash>destroy</on_crash>
    <devices>
      <disk type='block' device='disk'>
        <source dev='/dev/marilith-n0/debian-disk'/>
        <target dev='xvda1'/>
      </disk>
      <interface type='bridge'>
        <mac address='8e:a7:8e:3c:f4:f6'/>
        <source bridge='xenbr0'/>
      </interface>
    </devices>
  </domain>

Using virsh create and I can destroy it too.

Currently virsh console fails with:
  Connected to domain libvirt-test
  Escape character is ^]
  error: internal error: cannot find character device <null>

I haven't investigated yet.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2014-02-26 06:33:23 -07:00
Laine Stump
eed46d4cfe network: unplug bandwidth and call networkRunHook only when appropriate
According to commit b4e0299d if networkAllocateActualDevice() was
successful, it will *always* allocate an iface->data.network.actual,
so we can use this during networkReleaseActualDevice() to know if
there is really anything to undo. We were properly using this
information to only decrement the network connections counter if it
had previously been incremented, but we were unconditionally
unplugging bandwidth and calling the "unplugged" network hook for
*all* interfaces (during qemuProcessStop()) whether they had been
previously plugged or not. This caused problems if a domain failed to
start at some time prior to all interfaces being allocated. (I
encountered this when an interface had a bandwidth floor set but no
inbound QoS).

This patch changes both the call to networkUnplugBandwidth() and the
call to networkRunHook() to only be called if there was a previous
call to "plug" for the same interface.
2014-02-26 13:08:56 +02:00
Laine Stump
0700a3dac4 network: don't even call networkRunHook if there is no network
networkAllocateActualDevice() is called for *all* interfaces, not just
those with type='network'. In that case, it will jump down to its
validate: label immediately, without allocating anything. After
validation is done, two counters are potentially updated (one for the
network, and one for any particular physical device that is chosen),
and then networkRunHook() is called.

This patch refactors that code a slight bit so that networkRunHook()
doesn't get called if netdef is NULL (i.e. type != network) and to
place the conditional increment of dev->connections inside the "if
(netdef)" as well - dev can never be non-null if netdef is null
(because "dev" is the pointer to a device in a network's pool of
devices), so this doesn't have any functional effect, it just makes
the code clearer.
2014-02-26 13:03:49 +02:00
Nehal J Wani
969493f91d Fix memory leak in virSCSIDeviceListDel()
While running virscsitest, it was found that valgrind pointed out the following
memory leak:

==320== 5 bytes in 1 blocks are definitely lost in loss record 4 of 37
==320==    at 0x4A069EE: malloc (vg_replace_malloc.c:270)
==320==    by 0x3E6CE81171: strdup (strdup.c:43)
==320==    by 0x4CB28DF: virStrdup (virstring.c:554)
==320==    by 0x4CAC987: virSCSIDeviceSetUsedBy (virscsi.c:289)
==320==    by 0x402321: test2 (virscsitest.c:100)
==320==    by 0x403231: virtTestRun (testutils.c:199)
==320==    by 0x402121: mymain (virscsitest.c:180)
==320==    by 0x4039AD: virtTestMain (testutils.c:782)
==320==    by 0x3E6CE1ED1C: (below main) (libc-start.c:226)
==320==

Introduced by commit fd243fc.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2014-02-26 11:41:40 +01:00
Michal Privoznik
c0d162c68c virNetDevVethCreate: Serialize callers
Consider dozen of LXC domains, each of them having this type of interface:

    <interface type='network'>
      <mac address='52:54:00:a7:05:4b'/>
      <source network='default'/>
    </interface>

When starting these domain in parallel, all workers may meet in
virNetDevVethCreate() where a race starts. Race over allocating veth
pairs because allocation requires two steps:

  1) find first nonexistent '/sys/class/net/vnet%d/'
  2) run 'ip link add ...' command

Now consider two threads. Both of them find N as the first unused veth
index but only one of them succeeds allocating it. The other one fails.
For such cases, we are running the allocation in a loop with 10 rounds.
However this is very flaky synchronization. It should be rather used
when libvirt is competing with other process than when libvirt threads
fight each other. Therefore, internally we should use mutex to serialize
callers, and do the allocation in loop (just in case we are competing
with a different process). By the way we have something similar already
since 1cf97c87.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-02-26 08:50:47 +01:00
Eric Blake
fa2e4dbfd6 build: fix cgroups on non-Linux
Running ./autobuild.sh detected a mingw failure:

  CCLD     libvirt.la
Cannot export virCgroupGetPercpuStats: symbol not defined
Cannot export virCgroupSetOwner: symbol not defined

* src/util/vircgroup.c (virCgroupGetPercpuStats)
(virCgroupSetOwner): Implement stubs.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-02-25 17:38:46 -07:00
Jim Fehlig
4d975deddd libxl: queue domain event earlier in shutdown handler
The shutdown handler may restart a domain when handling a reboot
event or when <on_*> is set to 'restart'.  Restarting consists of
calling libxlVmCleanup followed by libxlVmStart.  libxlVmStart will
emit a VIR_DOMAIN_EVENT_STARTED event, but the SHUTDOWN event is
not emitted until exiting the shutdown handler, after the STARTED
event.

This patch changes the logic a bit to queue the event at the start
of the shutdown action, ensuring it is queued before any subsequent
events that may be generated while executing the shutdown action.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
2014-02-25 10:54:04 -07:00
Laine Stump
2122cf3979 network: include plugged interface XML in "plugged" network hook
The network hook script gets called whenever an interface is plugged
into or unplugged from a network, but even though the full XML of both
the network and the domain is included, there is no reasonable way to
determine what exact resources the plugged interface is using:

1) Prior to a recent patch which modified the status XML of interfaces
to include the information about actual hardware resources used, it
would be possible to scan through the domain XML output sent to the
hook, and from there find the correct interface, but that interface
definition would not include any runtime info (e.g. bandwidth or vlan
taken from a portgroup, or which physdev was used in case of a macvtap
network).

2) After the patch modifying the status XML of interfaces, the network
name would no longer be included in the domain XML, so it would be
completely impossible to determine which interface was the one being
plugged.

To solve that problem, this patch includes a single <interface>
element at the beginning of the XML sent to the network hook for
"plugged" and "unplugged" (just inside <hookData>) that is the status
XML of the interface being plugged. This XML will include all info
gathered from the chosen network and portgroup.

NB: due to hardcoded spaces in all of the device *Format() functions,
the <interface> element inside the <hookData> will be indented by 6
spaces rather than 2. I had intended to fix this, but it turns out
that to make virDomainNetDefFormat() indentation relative, I would
have to do the same to virDomainDeviceInfoFormat(), and that function
is called from 19 places - making that a prerequisite of this patch
would cause too many merge difficulties if we needed to backport
network hooks, so I chose to ignore the problem here and fix the
problem for *all* devices in a followup later.
2014-02-25 16:07:36 +02:00
Laine Stump
7d5bf48474 conf: output actual netdev status in <interface> XML
Until now, the "live" XML status of an <interface type='network'>
device would always show the network information, rather than the
exact hardware device that was used. It would also show the name of
any portgroup the interface belonged to, rather than providing the
configuration that was derived from that portgroup. As an example,
given the following network definition:

[A]
  <network>
    <name>testnet</name>
    <forward type='bridge' dev='p4p1_0'>
      <interface dev='p4p1_0'/>
      <interface dev='p4p1_1'/>
      <interface dev='p4p1_2'/>
      <interface dev='p4p1_3'/>
    </forward>
    <portgroup name='admin'>
      <bandwidth>
          <inbound average='1000' peak='5000' burst='1024'/>
          <outbound average='128' peak='256' burst='256'/>
      </bandwidth>
    </portgroup>
  </network>

and the following domain <interface>:

[B]
  <interface type='network'>
    <source network='testnet' portgroup='admin'/>
  </interface>

the output of "virsh dumpxml $domain" while the domain was running
would yield something like this:

[C]
  <interface type='network'>
    <source network='testnet' portgroup='admin'/>
    <target dev='macvtap0'/>
    <alias name='net0'/>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
  </interface>

In order to learn the exact bandwidth information of the interface, a
management application would need to retrieve the XML for testnet,
then search for the portgroup named "admin". Even worse, there was no
simple and standard way to learn which host physdev the macvtap0
device is attached to.

Internally, libvirt has always kept this information in the
virDomainDef that is held in memory, as well as storing it in the
(libvirt-internal-only) domain status XML (in
/var/run/libvirt/qemu/$domain.xml). In order to not confuse the runtime
"actual state" with the config of the device, it's internally stored
like this:

[D]
  <interface type='network'>
    <source network='testnet' portgroup='admin'/>
    <actual type='direct'>
      <source dev='p4p1_0' mode='bridge'/>
      <bandwidth>
          <inbound average='1000' peak='5000' burst='1024'/>
          <outbound average='128' peak='256' burst='256'/>
      </bandwidth>
    </actual>
    <target dev='macvtap0'/>
    <alias name='net0'/>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
  </interface>

This was never exposed outside of libvirt though, because I thought it
would be too awkward for a management application to need to look in
two places for the same information, but I also wasn't sure that it
would be okay to overwrite the config info (in this case "<source
network='testnet' portgroup='admin'/>") with the actual runtime info
(everything inside <actual> above).

Now we have a need for this information to be made available to
management applications (in particular, so that a network "plugged"
hook will have full information about the device that is being plugged
in), so it's time to take the leap and decide that it is acceptable
for the config info to be replaced with actual runtime state (but
*only* when reporting domain live status, *not* when saving state in
/var/run/libvirt/qemu/$domain.xml - that remains the same so that
there is no loss of information). That is what this patch does - once
applied, the output of "virsh dumpxml $domain" when the domain is
running will contain something like this:

[E]
  <interface type='direct'>
    <source dev='p4p1_0' mode='bridge'/>
    <bandwidth>
        <inbound average='1000' peak='5000' burst='1024'/>
        <outbound average='128' peak='256' burst='256'/>
    </bandwidth>
    <target dev='macvtap0'/>
    <alias name='net0'/>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
  </interface>

In effect, everything that is internally stored within <actual> is
moved up a level to where a management application will expect
it. This means that the management application will only look in a
single place to learn - the type of interface in use, the name of the
physdev (if relevant), the <bandwidth>, <vlan>, and <virtualport>
settings in use.

The potential downside is that a management app looking at this output
will not see that the physdev 'p4p1_0' was actually allocated from the
network named 'testnet', or that the bandwidth numbers were taken from
the portgroup 'admin'. However, if they are interested in that info,
they can always get the "inactive" XML for the domain.

An example of where this could cause problems is in virt-manager's
network device display, which shows the status of the device, but
allows you to edit that status info and save it as the new
config. Previously virt-manager would always display the information
in example [C] above, and allow editing that. With this patch, it will
instead display what is in [E] and allow editing it directly, which
could lead to some confusion. I would suggest that virt-manager have
an "edit" button which would change the display from the "live" xml to
the "inactive" xml, so that editing would be done on that; such a
change would both handle the new situation, and also be compatible
with older releases.
2014-02-25 16:06:43 +02:00
Laine Stump
9da98aa5e1 conf: new function virDomainActualNetDefContentsFormat
This function is currently only called from one place, but in a
subsequent patch will be called from a 2nd place.

The new function exactly replicates the original behavior of the part
of virDomainActualNetDefFormat() that it replaces, but takes a
virDomainNetDefPtr instead of virDomainActualNetDefPtr, and uses the
virDomainNetGetActual*() functions whenever possible, rather than
reaching into def->data.network.actual - this is to be sure that we
are reporting exactly what is being used internally, just in case
there are any discrepancies (there shouldn't be).
2014-02-25 16:04:26 +02:00
Laine Stump
65487c0fc5 conf: re-situate <bandwidth> element in <interface>
This moves the call to virNetDevBandwidthFormat() in
virDomainNetDefFormat() to be called right after the call to
virNetDevVPortProfileFormat(), so that a single chunk of that function
can be placed inside an if that conditionally calls
virDomainActualNetDefContentsFormat() instead (next patch). The
re-ordering necessitates modifying a couple of test data files.
2014-02-25 16:03:05 +02:00
Laine Stump
7c39214cd4 conf: make virDomainNetDefFormat a public function
We will need to call virDomainNetDefFormat() from the network hook (in
the network driver).
2014-02-25 16:01:39 +02:00
Laine Stump
79358733b0 conf: handle null pointer in virNetDevVlanFormat
Other *Format() functions (e.g. virNetDevBandwidthFormat()) return
with no action when called with a NULL *Def pointer. This makes
virNetDevVlanFormat() consistent with that behavior.
2014-02-25 15:56:12 +02:00
Laine Stump
6d4ffae4fc conf: clarify what is returned for actual bandwidth and vlan
In practice, if a virDomainNetDef has a virDomainActualNetDef
allocated, the ActualNetDef will *always* contain the bandwidth and
vlan data from the NetDef (unless there was also a portgroup involved
- see networkAllocateActualDevice()).

However, virDomainNetGetActual(Bandwidth|Vlan)() were coded to make it
appear as if it might be possible to have a valid bandwidth/vlan in
the NetDef, but a NULL in the ActualNetDef. Believing this un-truth
could lead to writing unnecessarily defensive code when dealing with
the virDomainGetActual*() functions, so this patch makes it more
obvious:

   If there is an ActualNetDef, it will always have a copy of the
   various appropriate bits from its parent NetDef, and the
   virDomainGetActual* function will *always* return the data from the
   ActualNetDef, not from the NetDef.

The reason for this effective-NOP patch is that a subsequent patch to
change virDomainNetDefFormat will rely on the above rule.
2014-02-25 15:55:19 +02:00
Wido den Hollander
60f70542f9 rbd: Set timeout options for librados
These timeout values make librados/librbd return -ETIMEDOUT when a
operation is blocking due to a failing/unreachable Ceph cluster.

By having the operations time out libvirt will not block.
2014-02-25 11:14:44 +01:00
Wido den Hollander
761491eb7c rbd: Include return statuses from librados/librbd in logging
With this information it's easier for the user to debug what is
going wrong.
2014-02-25 11:14:28 +01:00
Jim Fehlig
cfad607b23 libxl: handle on_crash coredump actions
Add support for coredump-{destroy,restart} actions of <on_crash> event.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
2014-02-24 10:39:44 -07:00
Jim Fehlig
c2de456e4e libxl: add dump dir to libxlDriverConfig object
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
2014-02-24 10:27:53 -07:00
Jim Fehlig
51b9b39127 libxl: honor domain lifecycle event configuration
The libxl driver was ignoring the <on_*> domain event configuration,
causing e.g. a domain to be rebooted even when on_reboot is set to
destroy.

This patch honors the <on_*> configuration in the shutdown event
handler.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
2014-02-24 10:26:52 -07:00
Richard Weinberger
6fb42d7cdc Ensure systemd cgroup ownership is delegated to container with userns
This function is needed for user namespaces, where we need to chmod()
the cgroup to the initial uid/gid such that systemd is allowed to
use the cgroup.

Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2014-02-24 15:35:47 +00:00
Roman Bogorodskiy
8ca5f46c59 bhyve: implement node information reporting
- Implement nodeGetCPUStats using nodeGetCPUStats()
- Implement nodeGetMemoryStats using nodeGetMemoryStats()
2014-02-24 19:03:46 +04:00
Daniel P. Berrange
66e3a3e914 Add virStringReplace method for substring replacement
Add a virStringReplace method to virstring.{h,c} to perform
substring matching and replacement

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2014-02-24 10:51:22 +00:00
Manuel VIVES
12aa71dfde Add virStringSearch method for regex matching
Add a virStringSearch method to virstring.{c,h} which performs
a regex match against a string and returns the matching substrings.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2014-02-24 10:46:28 +00:00
Michal Privoznik
68954fb25c virNetServerRun: Notify systemd that we're accepting clients
Systemd does not forget about the cases, where client service needs to
wait for daemon service to initialize and start accepting new clients.
Setting a dependency in client is not enough as systemd doesn't know
when the daemon has initialized itself and started accepting new
clients. However, it offers a mechanism to solve this. The daemon needs
to call a special systemd function by which the daemon tells "I'm ready
to accept new clients". This is exactly what we need with
libvirtd-guests (client) and libvirtd (daemon). So now, with this
change, libvirt-guests.service is invoked not any sooner than
libvirtd.service calls the systemd notify function.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-02-24 10:54:48 +01:00
Michal Privoznik
ba79e3879e virSystemdCreateMachine: Set dependencies for slices
https://bugzilla.redhat.com/show_bug.cgi?id=1031696

When creating a new domain, we let systemd know about it by calling
CreateMachine() function via dbus. Systemd then creates a scope and
places domain into it. However, later when the host is shutting
down, systemd computes the shutdown order to see what processes can
be shut down in parallel. And since we were not setting
dependencies at all, the slices (and thus domains) were most likely
killed before libvirt-guests.service. So user domains that had to
be saved, shut off, whatever were in fact killed.  This problem can
be solved by letting systemd know that scopes we're creating must
not be killed before libvirt-guests.service.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-02-24 10:21:00 +01:00
Ján Tomko
57e17a74b7 Ignore additional fields in iscsiadm output
There has been a new field introduced in iscsiadm --mode session
output [1], but our regex only expects four fields. This breaks
startup of iscsi pools:
error: Failed to start pool iscsi
error: internal error: cannot find session

Fix this by ignoring anything after the fourth field.

https://bugzilla.redhat.com/show_bug.cgi?id=1067173

[1] https://github.com/mikechristie/open-iscsi/commit/181af9a
2014-02-21 10:35:57 +01:00
Ján Tomko
abf1daf0d7 Add a stub for virCgroupGetDomainTotalCpuStats
Commit 6515889 broke the build on FreeBSD:
In function `qemuDomainGetCPUStats':
/../../src/qemu/qemu_driver.c:16102:
undefined reference to `virCgroupGetDomainTotalCpuStats'
2014-02-21 09:10:48 +01:00
Jim Fehlig
84a6209d7f libxl: queue shutdown event on domain shutdown
Emit libvirt shutdown event when receiving LIBXL_SHUTDOWN_REASON_POWEROFF
event from libxl.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
2014-02-20 15:50:06 -07:00
Jim Fehlig
d716d942e2 libxl: always use libxlVmCleanupJob in shutdown thread
Commit e4a0e900 missed calling libxlVmCleanupJob in the shutdown
handler when processing a reboot event.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
2014-02-20 11:50:33 -07:00
Eric Blake
60f7303c15 qemu: adjust maxmem/maxvcpu computation
https://bugzilla.redhat.com/show_bug.cgi?id=1038363

If a domain has a different maximum for persistent and live maxmem
or max vcpus, then it is possible to hit cases where libvirt
refuses to adjust the current values or gets halfway through
the adjustment before failing.  Better is to determine up front
if the change is possible for all requested flags.

Based on an idea by Geoff Franks.

* src/qemu/qemu_driver.c (qemuDomainSetMemoryFlags): Compute
correct maximum if both live and config are being set.
(qemuDomainSetVcpusFlags): Likewise.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-02-20 11:27:16 -07:00
Daniel P. Berrange
432a3fee3b Rename virDomainGetRootFilesystem to virDomainGetFilesystemForTarget
The virDomainGetRootFilesystem method can be generalized to allow
any filesystem path to be obtained.

While doing this, start a new test case for purpose of testing various
helper methods in the domain_conf.{c,h} files, such as this one.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2014-02-20 15:50:46 +00:00
Daniel P. Berrange
cb9b3bc257 Fix multiple bugs in LXC domainMemoryStats driver
The virCgroupXXX APIs' return value must be checked for
being less than 0, not equal to 0.

An VIR_ERR_OPERATION_INVALID error must also be raised
when the VM is not running to prevent a crash on NULL
priv->cgroup field.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2014-02-20 15:32:49 +00:00
Thorsten Behrens
0bd2ccdecc Widening API change - accept empty path for virDomainBlockStats
And provide domain summary stat in that case, for lxc backend.
Use case is a container inheriting all devices from the host,
e.g. when doing application containerization.
2014-02-20 16:20:09 +01:00
Thorsten Behrens
dcc85c603e Implement lxcDomainBlockStats* for lxc driver
Adds lxcDomainBlockStatsFlags and lxcDomainBlockStats functions.
2014-02-20 16:20:09 +01:00
Thorsten Behrens
4b3b2f6ceb Implement domainGetCPUStats for lxc driver. 2014-02-20 16:20:09 +01:00
Thorsten Behrens
65158899b7 Make qemuGetDomainTotalCPUStats a virCgroup function.
To reuse this from other drivers, like lxc.
2014-02-20 16:20:09 +01:00
Thorsten Behrens
192604ddee Implement domainMemoryStats API slot for LXC driver. 2014-02-20 16:20:09 +01:00
Thorsten Behrens
a2bb187c7e Add util virCgroupGetBlkioIo*Serviced methods.
This reads blkio stats from blkio.throttle.io_service_bytes and
blkio.throttle.io_serviced.
2014-02-20 16:20:09 +01:00
Richard Weinberger
39aad72510 lxc: Add destroy support for suspended domains
Destroying a suspended domain needs special action.
We cannot simply terminate all process because they are frozen.
Do deal with that we send them SIGKILL and thaw them.
Upon wakeup the process sees the pending signal and dies immediately.

Signed-off-by: Richard Weinberger <richard@nod.at>
2014-02-20 10:46:31 +01:00
Ján Tomko
057d26b2ac Fix build of portallocator on mingw
IN6ADDR_ANY_INIT does not seem to be working as expected on MinGW:
error: missing braces around initializer [-Werror=missing-braces]
         .sin6_addr = IN6ADDR_ANY_INIT,

Use the in6addr_any variable instead.

Reported by Daniel P. Berrange.
2014-02-20 10:16:07 +01:00
Michal Privoznik
83c404ff9b networkRunHook: Run hook only if possible
Currently, networkRunHook() is called in networkAllocateActualDevice and
friends. These functions, however, doesn't necessarily work on networks,
For example, if domain's interface is defined in this fashion:

    <interface type='bridge'>
      <mac address='52:54:00:0b:3b:16'/>
      <source bridge='virbr1'/>
      <model type='rtl8139'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
    </interface>

The networkAllocateActualDevice jumps directly onto 'validate' label as
the interface is not type of 'network'. Hence, @network is left
initialized to NULL and networkRunHook(network, ...) is called. One of
the things that the hook function does is dereference @network. Soupir.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-02-20 08:56:17 +01:00
Jim Fehlig
e6dcb0e2a1 libxl: use job functions in libxlDomainSetSchedulerParametersFlags
Modify operation that needs to wait in the queue of modify jobs.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
2014-02-19 11:10:01 -07:00
Jim Fehlig
7d9ff81603 libxl: use job functions in libxlDomainSetAutostart
Setting autostart is a modify operation that needs to wait in the
queue of modify jobs.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
2014-02-19 11:10:01 -07:00
Jim Fehlig
85ff3d7aec libxl: use job functions in device attach and detach functions
These operations aren't necessarily time consuming, but need to
wait in the queue of modify jobs.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
2014-02-19 11:10:01 -07:00
Jim Fehlig
7df46cff6b libxl: use job functions in vcpu set and pin functions
These operations aren't necessarily time consuming, but need to
wait in the queue of modify jobs.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
2014-02-19 11:10:01 -07:00
Jim Fehlig
f9e6b7024c libxl: use job functions in libxlDomainCoreDump
Dumping a domain's core can take considerable time.  Use the
recently added job functions and unlock the virDomainObj while
dumping core.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
2014-02-19 11:10:00 -07:00
Jim Fehlig
341870b10d libxl: use job functions in domain save operations
Saving domain memory and cpu state can take considerable time.
Use the recently added job functions and unlock the virDomainObj
while saving the domain.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
2014-02-19 11:10:00 -07:00
Jim Fehlig
e4a0e900d3 libxl: use job functions when cleaning up a domain
When explicitly destroying a domain (libxlDomainDestroyFlags), or
handling an out-of-band domain shutdown event, cleanup the domain
in the context of a job.  Introduce libxlVmCleanupJob to wrap
libxlVmCleanup in a job block.
2014-02-19 11:10:00 -07:00
Jim Fehlig
f5bc5bd4df libxl: use job functions in libxlDomain{Suspend,Resume}
These operations aren't necessarily time consuming, but need to
wait in the queue of modify jobs.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
2014-02-19 11:10:00 -07:00
Jim Fehlig
ac1444c35f libxl: use job functions in libxlDomainSetMemoryFlags
Large balloon operation can be time consuming.  Use the recently
added job functions and unlock the virDomainObj while ballooning.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
2014-02-19 11:10:00 -07:00
Jim Fehlig
491593e840 libxl: use job functions in libxlVmStart
Creating a large domain could potentially be time consuming.  Use the
recently added job functions and unlock the virDomainObj while
the create operation is in progress.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
2014-02-19 11:10:00 -07:00
Jim Fehlig
4b4b61c329 libxl: Add job support to libxl driver
Follows the pattern used in the QEMU driver for managing multiple,
simultaneous jobs within the driver.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
2014-02-19 11:10:00 -07:00
Jim Fehlig
343119a44b libxl: remove libxlVmReap function
This function, which only has five call sites, simply calls
libxl_domain_destroy and libxlVmCleanup.  Call those functions
directly at the call sites, allowing more control over how a
domain is destroyed and cleaned up.  This patch maintains the
existing semantic, leaving changes to a subsequent patch.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
2014-02-19 11:10:00 -07:00
Jim Fehlig
219d34cfe2 libxl: always set vm id to -1 on shutdown
Once a domain has reached the shutdown state, set its ID to -1.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
2014-02-19 11:10:00 -07:00
Oleg Strikov
41b9b71877 qemu: Use virtio network device for aarch64/virt
This patch changes network device type used by default from rtl8139
to virtio when architecture type is aarch64 and machine type is virt.
Qemu doesn't support any other machine types for aarch64 right now and
we can't make any other aarch64-specific tuning in this function yet.

Signed-off-by: Oleg Strikov <oleg.strikov@canonical.com>
2014-02-19 10:46:10 -05:00
Roman Bogorodskiy
0eb4a5f4f1 bhyve: add a basic driver
At this point it has a limited functionality and is highly
experimental. Supported domain operations are:

  * define
  * start
  * destroy
  * dumpxml
  * dominfo

It's only possible to have only one disk device and only one
network, which should be of type bridge.
2014-02-19 14:21:50 +00:00
Li Zhang
cffa51b81d Add a default USB keyboard and USB mouse for PPC64
There is no keyboard working on PPC64 and PS2 mouse is only for X86
when graphics are enabled.

Add a USB keyboard and USB mouse for PPC64 when graphics are enabled.

Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2014-02-19 09:16:31 +01:00
Li Zhang
2a81430c85 xen: format xen config for USB keyboard
Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2014-02-19 09:16:31 +01:00
Li Zhang
78730478aa qemu: format qemu command line for USB keyboard
Format qemu command line for USB keyboard
and add test cases for it.

Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2014-02-19 09:16:31 +01:00
Li Zhang
f5ffd45f4c qemu: Add USB keyboard capability
Add USB keyboard capability probing and test cases.

Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2014-02-19 09:16:31 +01:00
Li Zhang
b39275954b conf: Remove the implicit PS2 devices for non-X86 platforms
PS2 devices only work on X86 platform, other platforms may need
USB devices instead. Athough it doesn't influence the QEMU command line,
it's not right to add PS2 mouse/keyboard for non-X86 platform.

Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2014-02-19 09:16:31 +01:00
Li Zhang
bc18373391 conf: Add keyboard input device type
There is no keyboard support currently in libvirt.

For some platforms (PPC64 QEMU) this makes graphics unusable,
since the keyboard is not implicit and it can't be added via libvirt.

Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2014-02-19 09:16:31 +01:00
Li Zhang
f608a713f6 conf: Add one interface to add default input devices
Use it for the default mouse.

Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2014-02-19 09:16:30 +01:00
Michal Privoznik
4d88294483 bridge_driver.h: Fix build --without-network
The networkNotifyActualDevice function is accepting two arguments, not
one:

qemu/qemu_process.c: In function 'qemuProcessNotifyNets':
qemu/qemu_process.c:2776:47: error: macro "networkNotifyActualDevice" passed 2 arguments, but takes just 1
         if (networkNotifyActualDevice(def, net) < 0)
                                               ^

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-02-18 19:52:39 +01:00
Ján Tomko
adc8b2afbb Fix conflicting types of virInitctlSetRunLevel
aebbcdd didn't change the non-linux definition of the function,
breaking the build on FreeBSD:

../../src/util/virinitctl.c:164: error: conflicting types for
'virInitctlSetRunLevel'
../../src/util/virinitctl.h:40: error: previous declaration of
'virInitctlSetRunLevel' was here
2014-02-18 15:05:06 +01:00
Michal Privoznik
9de7309125 network: Taint networks that are using hook script
Basically, the idea is copied from domain code, where tainting
exists for a while. Currently, only one taint reason exists -
VIR_NETWORK_TAINT_HOOK to mark those networks which caused invoking
of hook script.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-02-18 14:46:49 +01:00
Michal Privoznik
f1ab06e43d network: Introduce network hooks
There might be some use cases, where user wants to prepare the host or
its environment prior to starting a network and do some cleanup after
the network has been shut down. Consider all the functionality that
libvirt doesn't currently have as an example what a hook script can
possibly do.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-02-18 14:46:49 +01:00
Michal Privoznik
e0a31274ec network_conf: Expose virNetworkDefFormatInternal
In the next patch I'm going to need the network format function that
takes virBuffer as argument. However, slightly change of name is more
appropriate then: virNetworkDefFormatBuf to match the rest of functions
that format an object to buffer.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-02-18 14:46:48 +01:00
Daniel P. Berrange
5fc590ad9f CVE-2013-6456: Avoid unsafe use of /proc/$PID/root in LXC hotunplug code
Rewrite multiple hotunplug functions to to use the
virProcessRunInMountNamespace helper. This avoids
risk of a malicious guest replacing /dev with an absolute
symlink, tricking the driver into changing the host OS
filesystem.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2014-02-18 12:59:14 +00:00
Daniel P. Berrange
1cadeafcaa CVE-2013-6456: Avoid unsafe use of /proc/$PID/root in LXC chardev hostdev hotplug
Rewrite lxcDomainAttachDeviceHostdevMiscLive function
to use the virProcessRunInMountNamespace helper. This avoids
risk of a malicious guest replacing /dev with a absolute
symlink, tricking the driver into changing the host OS
filesystem.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2014-02-18 12:59:14 +00:00
Daniel P. Berrange
1754c7f0ab CVE-2013-6456: Avoid unsafe use of /proc/$PID/root in LXC block hostdev hotplug
Rewrite lxcDomainAttachDeviceHostdevStorageLive function
to use the virProcessRunInMountNamespace helper. This avoids
risk of a malicious guest replacing /dev with a absolute
symlink, tricking the driver into changing the host OS
filesystem.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2014-02-18 12:59:11 +00:00
Daniel P. Berrange
7fba01c15c CVE-2013-6456: Avoid unsafe use of /proc/$PID/root in LXC USB hotplug
Rewrite lxcDomainAttachDeviceHostdevSubsysUSBLive function
to use the virProcessRunInMountNamespace helper. This avoids
risk of a malicious guest replacing /dev with a absolute
symlink, tricking the driver into changing the host OS
filesystem.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2014-02-18 12:59:07 +00:00
Daniel P. Berrange
4dd3a7d5bc CVE-2013-6456: Avoid unsafe use of /proc/$PID/root in LXC disk hotplug
Rewrite lxcDomainAttachDeviceDiskLive function to use the
virProcessRunInMountNamespace helper. This avoids risk of
a malicious guest replacing /dev with a absolute symlink,
tricking the driver into changing the host OS filesystem.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2014-02-18 12:59:05 +00:00
Eric Blake
aebbcdd33c CVE-2013-6456: Avoid unsafe use of /proc/$PID/root in LXC shutdown/reboot code
Use helper virProcessRunInMountNamespace in lxcDomainShutdownFlags and
lxcDomainReboot.  Otherwise, a malicious guest could use symlinks
to force the host to manipulate the wrong file in the host's namespace.

Idea by Dan Berrange, based on an initial report by Reco
<recoverym4n@gmail.com> at
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=732394

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-02-18 12:59:02 +00:00
Daniel P. Berrange
7c72ef6f55 Add helper for running code in separate namespaces
Implement virProcessRunInMountNamespace, which runs callback of type
virProcessNamespaceCallback in a container namespace. This uses a
child process to run the callback, since you can't change the mount
namespace of a thread. This implies that callbacks have to be careful
about what code they run due to async safety rules.

Idea by Dan Berrange, based on an initial report by Reco
<recoverym4n@gmail.com> at
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=732394

Signed-off-by: Daniel Berrange <berrange@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2014-02-18 12:45:41 +00:00
Daniel P. Berrange
c321bfc5c3 Add virFileMakeParentPath helper function
Add a helper function which takes a file path and ensures
that all directory components leading up to the file exist.
IOW, it strips the filename part of the path and passes
the result to virFileMakePath.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2014-02-18 12:39:06 +00:00
Daniel P. Berrange
c3eb12cace Move check for cgroup devices ACL upfront in LXC hotplug
The check for whether the cgroup devices ACL is available is
done quite late during LXC hotplug - in fact after the device
node is already created in the container in some cases. Better
to do it upfront so we fail immediately.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2014-02-17 15:40:01 +00:00
Daniel P. Berrange
d24e6b8b1e Disks are always block devices, never character devices
The LXC disk hotplug code was allowing block or character devices
to be given as disk. A disk is always a block device.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2014-02-17 15:39:55 +00:00
Daniel P. Berrange
2c2bec94d2 Fix reset of cgroup when detaching USB device from LXC guests
When detaching a USB device from an LXC guest we must remove
the device from the cgroup ACL. Unfortunately we were telling
the cgroup code to use the guest /dev path, not the host /dev
path, and the guest device node had already been unlinked.
This was, however, fortunate since the code passed &priv->cgroup
instead of priv->cgroup, so would have crash if the device node
were accessible.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2014-02-17 15:39:55 +00:00
Daniel P. Berrange
a537827d15 Record hotplugged USB device in LXC live guest config
After hotplugging a USB device, the LXC driver forgot
to add the device def to the virDomainDefPtr.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2014-02-17 15:39:37 +00:00
Daniel P. Berrange
c364897222 Fix path used for USB device attach with LXC
The LXC code missed the 'usb' component out of the path
/dev/bus/usb/$BUSNUM/$DEVNUM, so it failed to actually
setup cgroups for the device. This was in fact lucky
because the call to virLXCSetupHostUsbDeviceCgroup
was also mistakenly passing '&priv->cgroup' instead of
just 'priv->cgroup'. So once the path is fixed, libvirtd
would then crash trying to access the bogus virCgroupPtr
pointer. This would have been a security issue, were it
not for the bogus path preventing the pointer reference
being reached.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2014-02-17 15:11:06 +00:00
Daniel P. Berrange
7a44af963e Don't block use of USB with containers
virDomainDefCompatibleDevice blocks use of USB if no USB
controller is present. This is not correct for containers
since devices can be assigned directly regardless of any
controllers.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2014-02-17 15:11:06 +00:00
Michal Privoznik
3b2c279449 qemu: Implement VIR_DOMAIN_TAINT_HOOK
Currently, there's just one place where we care if hook script is
changing the domain XML: migration hook for incoming migration. In
all other places where a hook script is executed, we don't read the
XML back from the script.

Anyway, the hook script can alter domain XML and hence we should taint
it if the script did.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-02-17 11:38:15 +01:00
Michal Privoznik
287d30a816 virDomainTaintFlags: Introduce VIR_DOMAIN_TAINT_HOOK
This new flag is to be used for tainting domains which
XML definition was altered at runtime by a hook script.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-02-17 11:38:15 +01:00
Peter Krempa
98bbc8d59a Revert "storage: Introduce internal pool support"
The internal pools were an idea in one of the first iterations of the
gluster series, which we decided not to use. Somehow the patch still
got pushed. Remove it as the internal flag isn't needed.

This reverts commit 362da8209d.
2014-02-14 17:39:37 +01:00
Peter Krempa
0a584b1fcd lxc: Don't shadow global symbol "link"
Yet another variable name frowned upon by older compilers. Introduced in
commit b73c029d.
2014-02-14 14:01:45 +01:00
Ján Tomko
0ee9081215 Support IPv6 in port allocator
Also try to bind on IPv6 to check if the port is occupied.

Change the mocked bind in the test to return EADDRINUSE
for some ports only for the IPv4/IPv6 socket if we're testing
on a host with IPv6 compiled in.

Also mock socket() to make it fail with EAFNOTSUPPORTED
if LIBVIRT_TEST_IPV4ONLY is set in the environment, to
simulate a host without IPv6 support in the kernel. The
tests are repeated again with this variable set.

https://bugzilla.redhat.com/show_bug.cgi?id=1025407
2014-02-14 13:18:35 +01:00
Ján Tomko
531bc0bbd0 Split out bind() from virPortAllocatorAcquire 2014-02-14 13:18:35 +01:00
Peter Krempa
ad95fa5957 storage: gluster: Don't leak private data when storage file init fails
In a44b7b87bc I've introduced a function
that initializes a storage file wrapper object on gluster based volumes.

The initialization function leaks the private data pointer in case of
failure. This patch fixes it.

Reported by John Ferlan.
2014-02-14 13:08:39 +01:00
Peter Krempa
8d8b32b0da storage: Fix build with older compilers afeter gluster snapshot series
In commit e32268184b I accidentally added
twice a typedef for virStorageFileBackend when I moved it between files
across patch iterations. The double declaration breaks build on older
compilers in RHEL5 and FreeBSD.

Remove the spurious definition.
2014-02-14 11:46:37 +01:00
Peter Krempa
3cf074ee40 qemu: snapshot: Add support for external active snapshots on gluster
Add support for gluster backed images as sources for snapshots in the
qemu driver. This will also simplify adding further network backed
volumes as sources for snapshot in case qemu will support them.
2014-02-14 11:07:29 +01:00
Peter Krempa
7183d7d2e8 qemu: snapshot: Use new APIs to detect presence of existing storage files
Use the new storage driver based "stat" api to detect exiting files just
as we did with local files.
2014-02-14 11:07:29 +01:00
Peter Krempa
8f4091d677 qemu: Switch snapshot deletion to the new API functions
Use the new storage driver APIs to delete snapshot backing files in case
of failure instead of directly relying on "unlink". This will help us in
the future when we will be adding network based storage without local
representation in the host.
2014-02-14 11:07:29 +01:00
Peter Krempa
a44b7b87bc storage: Add storage file backends for gluster
Implement storage backend functions to deal with gluster volumes and
implement the "stat" and "unlink" backend APIs.
2014-02-14 11:07:23 +01:00
Peter Krempa
e62d09b155 storage: add file functions for local and block files
Implement the "stat" and "unlink" function for "file" volumes and "stat"
for "block" volumes using the regular system calls.
2014-02-14 10:47:57 +01:00
Peter Krempa
e32268184b storage: Add file storage APIs in the default storage driver
Add APIs that will allow to use the storage driver to assist in
operations on files even for remote filesystems without native
representation as files in the host.
2014-02-14 10:47:56 +01:00
Peter Krempa
6fb5a397bf conf: Move qemuSnapshotDiskGetActualType to virDomainSnapshotDiskGetActualType
All the data for getting the actual type is present in the snapshot
config. There is no need to have this function private to the qemu
driver and it will be re-used later in other parts of libvirt
2014-02-14 10:47:56 +01:00
Peter Krempa
f8f020da0a conf: Move qemuDiskGetActualType to virDomainDiskGetActualType
All the data for getting the actual type is present in the domain
config. There is no need to have this function private to the qemu
driver and it will be re-used later in other parts of libvirt
2014-02-14 10:47:56 +01:00
Cédric Bosdonnat
7554a85be2 lxc from native: removed now remaining useless line 2014-02-13 07:55:05 -05:00
Philipp Hahn
760498fdc7 Fix stream related spelling mistakes
Remove double "is".
Consistent spelling of all-uppercase I/O.

Signed-off-by: Philipp Hahn <hahn@univention.de>
2014-02-13 11:12:02 +01:00
Cédric Bosdonnat
3d58fa3f85 LXC from native: convert blkio throttle config 2014-02-12 17:52:47 +00:00
Cédric Bosdonnat
a09bbc024d LXC from native: map vlan network type
The problem with VLAN is that the user still has to manually create the
vlan interface on the host. Then the generated configuration will use
it as a nerwork hostdev device. So the generated configurations of the
following two fragments are equivalent (see rhbz#1059637).

lxc.network.type = phys
lxc.network.link = eth0.5

lxc.network.type = vlan
lxc.network.link = eth0
lxc.network.vlan.id = 5
2014-02-12 17:52:47 +00:00
Cédric Bosdonnat
d1520c5c9a LXC from native: map block filesystems 2014-02-12 17:52:47 +00:00
Cédric Bosdonnat
0f13a525d2 LXC from native: map lxc.arch to /domain/os/type@arch 2014-02-12 17:52:46 +00:00
Cédric Bosdonnat
5b8bfb0276 LXC from native: add lxc.cgroup.blkio.* mapping 2014-02-12 17:52:46 +00:00
Cédric Bosdonnat
281e2990ee LXC from native: map lxc.cgroup.cpuset.* 2014-02-12 17:52:46 +00:00
Cédric Bosdonnat
4f3f7aea6c LXC from native: map lxc.cgroup.cpu.* 2014-02-12 17:52:46 +00:00
Cédric Bosdonnat
13b9946eb5 LXC from native: migrate memory tuning 2014-02-12 17:52:46 +00:00
Cédric Bosdonnat
99d8cddfbe LXC from native: convert lxc.id_map into <idmap> 2014-02-12 17:52:46 +00:00
Cédric Bosdonnat
8e45b88772 LXC from native: convert macvlan network configuration 2014-02-12 17:52:46 +00:00
Cédric Bosdonnat
f01fe54e75 LXC from native: convert lxc.tty to console devices 2014-02-12 17:52:46 +00:00
Cédric Bosdonnat
69fc236243 LXC from native: convert phys network types to net hostdev devices 2014-02-12 17:52:46 +00:00
Cédric Bosdonnat
b73c029d83 LXC from native: migrate veth network configuration
Some of the LXC configuration properties aren't migrated since they
would only cause problems in libvirt-lxc:
  * lxc.network.ipv[46]: LXC driver doesn't setup IP address of guests,
    see rhbz#1059624
  * lxc.network.name, see rhbz#1059630
2014-02-12 17:52:46 +00:00
Cédric Bosdonnat
7bfd6e97ec LXC from native: implement no network conversion
If no network configuration is provided, LXC only provides the loopback
interface. To match this, we need to use the privnet feature. LXC will
also define a 'none' network type in its 1.0.0 version that fits
libvirt LXC driver's default.
2014-02-12 17:52:46 +00:00
Cédric Bosdonnat
a41680f8c5 LXC from native: migrate fstab and lxc.mount.entry
Tmpfs relative size and default 50% size values aren't supported as
we have no idea of the available memory at the conversion time.
2014-02-12 17:52:46 +00:00
Cédric Bosdonnat
197b13e5d9 LXC from native: import rootfs
LXC rootfs can be either a directory or a block device or an image
file. The first two types have been implemented, but the image file is
still to be done since LXC auto-guesses the file format at mount time
and the LXC driver doesn't support the 'auto' format.
2014-02-12 17:52:46 +00:00
Cédric Bosdonnat
7195c807b2 LXC driver: started implementing connectDomainXMLFromNative
This function aims at converting LXC configuration into a libvirt
domain XML description to help users migrate from LXC to libvirt.

Here is an example of how the lxc configuration works:
virsh -c lxc:/// domxml-from-native lxc-tools /var/lib/lxc/migrate_test/config

It is possible that some parts couldn't be properly mapped into a
domain XML fragment, so users should carefully review the result
before creating the domain.

fstab files in lxc.mount lines will need to be merged into the
configuration file as lxc.mount.entry.

As we can't know the amount of memory of the host, we have to set a
default value for max_balloon that users will probably want to adjust.
2014-02-12 17:52:46 +00:00
Cédric Bosdonnat
3daa14834a Improve virConf parse to handle LXC config format
virConf now honours a VIR_CONF_FLAG_LXC_FORMAT flag to handle LXC
configuration files. The differences are that property names can
contain '.' character and values are all strings without any bounding
quotes.

Provide a new virConfWalk function calling a handler on all non-comment
values. This function will be used by the LXC conversion code to loop
over LXC configuration lines.
2014-02-12 17:52:46 +00:00
Eric Blake
6831c1d327 event: pass reason for PM events
Commit 57ddcc23 (v0.9.11) introduced the pmwakeup event, with
an optional 'reason' field reserved for possible future expansion.
But it failed to wire the field through RPC, so even if we do
add a reason in the future, we will be unable to get it back
to the user.

Worse, commit 7ba5defb (v1.0.0) repeated the same mistake with
the pmsuspend_disk event.

As long as we are adding new RPC calls, we might as well fix
the events to actually match the signature so that we don't have
to add yet another RPC in the future if we do decide to start
using the reason field.

* src/remote/remote_protocol.x
(remote_domain_event_callback_pmwakeup_msg)
(remote_domain_event_callback_pmsuspend_msg)
(remote_domain_event_callback_pmsuspend_disk_msg): Add reason
field.
* daemon/remote.c (remoteRelayDomainEventPMWakeup)
(remoteRelayDomainEventPMSuspend)
(remoteRelayDomainEventPMSuspendDisk): Pass reason to client.
* src/conf/domain_event.h (virDomainEventPMWakeupNewFromDom)
(virDomainEventPMSuspendNewFromDom)
(virDomainEventPMSuspendDiskNewFromDom): Require additional
parameter.
* src/conf/domain_event.c (virDomainEventPMClass): New class.
(virDomainEventPMDispose): New function.
(virDomainEventPMWakeupNew*, virDomainEventPMSuspendNew*)
(virDomainEventPMSuspendDiskNew*)
(virDomainEventDispatchDefaultFunc): Use new class.
* src/remote/remote_driver.c (remoteDomainBuildEvent*PM*): Pass
reason through.
* src/remote_protocol-structs: Regenerate.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-02-12 10:48:16 -07:00
Eric Blake
158795d20e event: convert remaining domain events to new style
Following the patterns established by lifecycle events, this
creates all the new RPC calls needed to pass callback IDs
for every domain event, and changes the limits in client and
server codes to use modern style when possible.

I've tested all combinations: both 'old client and new server'
and 'new client and old server' continue to work with the old
RPCs, and 'new client and new server' benefit from server-side
filtering with the new RPCs.

* src/remote/remote_protocol.x (REMOTE_PROC_DOMAIN_EVENT_*): Add
REMOTE_PROC_DOMAIN_EVENT_CALLBACK_* counterparts.
* daemon/remote.c (remoteRelayDomainEvent*): Send callbackID via
newer RPC when used with new-style registration.
(remoteDispatchConnectDomainEventCallbackRegisterAny): Extend to
cover all domain events.
* src/remote/remote_driver.c (remoteDomainBuildEvent*): Add new
Callback and Helper functions.
(remoteEvents): Match order of RPC numbers, register new handlers.
(remoteConnectDomainEventRegisterAny)
(remoteConnectDomainEventDeregisterAny): Extend to cover all
domain events.
* src/remote_protocol-structs: Regenerate.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-02-12 10:48:16 -07:00
Eric Blake
355ea62650 event: client RPC protocol tweaks for domain lifecycle events
The counterpart to the server RPC additions; here, a single
function can serve both old and new calls, while incoming
events must be serviced by two different functions.  Again,
some wise choices in our XDR made it easier to share code
managing similar events.

While this only supports lifecycle events, it covers the
harder part of how Register and RegisterAny interact; the
remaining 15 events will be a mechanical change in a later
patch.  For Register, we now have a callbackID locally for
more efficient cleanup if the RPC fails; we also prefer to
use the newer RPC where we know it is supported (the older
RPC must be used if we don't know if RegisterAny is
supported).

* src/remote/remote_driver.c (remoteEvents): Register new RPC
event handler.
(remoteDomainBuildEventLifecycle): Move guts...
(remoteDomainBuildEventLifecycleHelper): ...here.
(remoteDomainBuildEventCallbackLifecycle): New function.
(remoteConnectDomainEventRegister)
(remoteConnectDomainEventDeregister)
(remoteConnectDomainEventRegisterAny)
(remoteConnectDomainEventDeregisterAny): Use new RPC when supported.
2014-02-12 10:48:16 -07:00
Eric Blake
caaf6ba1b6 event: prepare client to track domain callbackID
We want to convert over to server-side events, even for older
APIs.  To do that, the client side of the remote driver wants
to distinguish between legacy virConnectDomainEventRegister and
normal virConnectDomainEventRegisterAny, while knowing the
client callbackID and the server's serverID for both types of
registration.  The client also needs to probe whether the
server supports server-side filtering.  However, for ease of
review, we don't actually use the new RPCs until a later patch.

* src/conf/object_event_private.h (virObjectEventStateCallbackID):
Add parameter.
* src/conf/object_event.c (virObjectEventCallbackListAddID)
(virObjectEventStateRegisterID): Separate legacy from callbackID.
(virObjectEventStateCallbackID): Pass through parameter.
(virObjectEventCallbackLookup): Let legacy and global domain
lifecycle events share a common remoteID.
* src/conf/network_event.c (virNetworkEventStateRegisterID):
Update caller.
* src/conf/domain_event.c (virDomainEventStateRegister)
(virDomainEventStateRegisterID, virDomainEventStateDeregister):
Likewise.
(virDomainEventStateRegisterClient)
(virDomainEventStateCallbackID): Implement new functions.
* src/conf/domain_event.h (virDomainEventStateRegisterClient)
(virDomainEventStateCallbackID): New prototypes.
* src/remote/remote_driver.c (private_data): Add field.
(doRemoteOpen): Probe server feature.
(remoteConnectDomainEventRegister)
(remoteConnectDomainEventRegisterAny): Use new function.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-02-12 10:48:15 -07:00
Eric Blake
0372295770 event: server RPC protocol tweaks for domain lifecycle events
This patch adds some new RPC call numbers, but for ease of review,
they sit idle until a later patch adds the client counterpart to
drive the new RPCs.  Also for ease of review, I limited this patch
to just the lifecycle event; although converting the remaining
15 domain events will be quite mechanical.  On the server side,
we have to have a function per RPC call, largely with duplicated
bodies (the key difference being that we store in our callback
opaque pointer whether events should be fired with old or new
style); meanwhile, a single function can drive multiple RPC
messages.  With a strategic choice of XDR struct layout, we can
make the event generation code for both styles fairly compact.

I debated about adding a tri-state witness variable per
connection (values 'unknown', 'legacy', 'modern').  It would start
as 'unknown', move to 'legacy' if any RPC call is made to a legacy
event call, and move to 'modern' if the feature probe is made;
then the event code could issue an error if the witness state is
incorrect (a legacy RPC call while in 'modern', a modern RPC call
while in 'unknown' or 'legacy', and a feature probe while in
'legacy' or 'modern').  But while it might prevent odd behavior
caused by protocol fuzzing, I don't see that it would prevent
any security holes, so I considered it bloat.

Note that sticking @acl markers on the new RPCs generates unused
functions in access/viraccessapicheck.c, because there is no new
API call that needs to use the new checks; however, having a
consistent .x file is worth the dead code.

* src/libvirt_internal.h (VIR_DRV_FEATURE_REMOTE_EVENT_CALLBACK):
New feature.
* src/remote/remote_protocol.x
(REMOTE_PROC_CONNECT_DOMAIN_EVENT_CALLBACK_REGISTER_ANY)
(REMOTE_PROC_CONNECT_DOMAIN_EVENT_CALLBACK_DEREGISTER_ANY)
(REMOTE_PROC_DOMAIN_EVENT_CALLBACK_LIFECYCLE): New RPCs.
* daemon/remote.c (daemonClientCallback): Add field.
(remoteDispatchConnectDomainEventCallbackRegisterAny)
(remoteDispatchConnectDomainEventCallbackDeregisterAny): New
functions.
(remoteDispatchConnectDomainEventRegisterAny)
(remoteDispatchConnectDomainEventDeregisterAny): Mark legacy use.
(remoteRelayDomainEventLifecycle): Change message based on legacy
or new use.
(remoteDispatchConnectSupportsFeature): Advertise new feature.
* src/remote_protocol-structs: Regenerate.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-02-12 10:48:15 -07:00
Michael Chapman
74cf8202d2 storage: handle NULL return from virGetStorageVol
virGetStorageVol can return NULL on out-of-memory. If it does, cleanly
abort the volume clone operation.

Signed-off-by: Michael Chapman <mike@very.puzzling.org>
2014-02-12 15:18:43 +01:00
Ján Tomko
7236a473f0 Revert "storage: disk: Separate creating of the volume from building"
This reverts commit 67ccf91bf2.
We only generate the volume key after we've built it, but the storage
driver expects it to be filled after createVol finishes.
Squash the volume building back with creating to fulfill this
expectation.
2014-02-12 14:54:05 +01:00
Ján Tomko
42bbde2d06 Revert "storage: lvm: Separate creating of the volume from building"
This reverts commit af1fb38f55.
With it, creating new logical volumes fails:
https://www.redhat.com/archives/libvir-list/2014-February/msg00658.html

In the storage driver, we expect CreateVol to fill out the volume key,
but the LVM backend fills the key with the uuid reported by lvs after the
logical volume is created.
2014-02-12 14:51:05 +01:00
Cédric Bosdonnat
d385239260 Fixed build with clang.
Two unused global variables, and DBUS_TYPE_INVALID used as a const
char*.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-02-12 06:36:17 -07:00
Oleg Strikov
69fba97f63 qemu: Implement a stub cpuArchDriver.baseline() handler for aarch64
Openstack Nova calls virConnectBaselineCPU() during initialization
of the instance to get a full list of CPU features.
This patch adds a stub to aarch64-specific code to handle
this request (no actual work is done). That's enough to have
this stub with limited functionality because qemu/kvm backend
supports only 'host-passthrough' cpu mode on aarch64.

Signed-off-by: Oleg Strikov <oleg.strikov@canonical.com>
2014-02-11 17:34:55 -05:00
Jim Fehlig
2fbfedeb0d libxl: fix libxlDoDomainSave documentation
Update the function's comment, which was missed when removing use of
the driver lock everywhere.
2014-02-11 11:03:53 -07:00
Jim Fehlig
3d8a3d6e5b libxl: register for domain events immediately after creation
A small fix for the possiblitiy of jumping to an error path before
registering for domain events, preventing receiving important ones
like shutdown and death.
2014-02-11 11:03:53 -07:00
Jim Fehlig
e20bf46741 libxl: rename libxlCreateDomEvents to libxlDomEventsRegister
libxlDomEventsRegister better reflects its purpose: register for
domain events from libxl.
2014-02-11 11:03:53 -07:00
Ján Tomko
47fa97a799 Rename 'index' in virCapabilitiesGetCpusForNode
This shadows the index function on some systems (RHEL-6.4, FreeBSD 9):
../../src/conf/capabilities.c: In function 'virCapabilitiesGetCpusForNode':
../../src/conf/capabilities.c:1005: warning: declaration of'index'
      shadows a global declaration [-Wshadow]
/usr/include/strings.h:57: warning: shadowed declaration is here [-Wshadow]
2014-02-11 16:35:33 +01:00
Pradipta Kr. Banerjee
cd921cf077 Handle non-sequential NUMA node numbers
On some platforms like IBM PowerNV the NUMA node numbers can be
non-sequential. For eg. numactl --hardware o/p from such a machine looks
as given below

node distances:
   node   0   1  16  17
     0:  10  40  40  40
     1:  40  10  40  40
    16:  40  40  10  40
    17:  40  40  40  10

The NUMA nodes are 0,1,16,17

Libvirt uses sequential index as NUMA node numbers and this can
result in crash or incorrect results.

Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
Signed-off-by: Pradipta Kr. Banerjee <bpradip@in.ibm.com>
2014-02-11 14:44:20 +00:00
Peter Krempa
037ffda3c7 storage: gluster: Set volume metadata in a separate function
Extract the metadata setting code into a separate function for future
use.
2014-02-11 13:46:32 +01:00
Martin Kletzander
d27e6bc40f qemu: introduce spiceport chardev backend
Add a new backend for any character device.  This backend uses channel
in spice connection.  This channel is similar to spicevmc, but
all-purpose in contrast to spicevmc.

Apart from spicevmc, spiceport-backed chardev will not be formatted
into the command-line if there is no spice to use (with test for that
as well).  For this I moved the def->graphics counting to the start
of the function so its results can be used in rest of the code even in
the future.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-02-11 13:43:55 +01:00
Martin Kletzander
296a4791eb qemu: remove pointless condition
This patch is here just to ease the code review and make related
changes look more sensible.  Apart from removing the condition this is
merely a whitespace (indentation) change.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-02-11 13:43:55 +01:00
Martin Kletzander
a53e504052 qemu: rework '-serial none'
Limiting ourselves to qemu without QEMU_CAPS_DEVICE capability, we
used '-serial none' only if there was no serial device defined in the
domain XML.  This means that if we want to have a possibility of the
device being defined in XML, but not used in the command-line
(e.g. when it's pointless), we'll fail to attach '-serial none' to the
command-line (when skipping the device's command-line building and the
device being the only one).

Since there is no such device, this patch doesn't actually do
anything, but enables easier future additions in this manner.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-02-11 13:43:55 +01:00
Martin Kletzander
5b189541ac conf: introduce spiceport chardev backend
Add a new character device backend called 'spiceport' that uses
spice's channel for communications and apart from spicevmc can be used
as a backend for any character device from libvirt's point of view.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-02-11 13:43:55 +01:00
Wido den Hollander
0227889ab0 rbd: Use rbd_create3 to create RBD format 2 images by default
This new RBD format supports snapshotting and cloning. By having
libvirt create images in format 2 end-users of the created images
can benefit from the new RBD format.

Older versions of libvirt can work with this new RBD format as long
as librbd supports format 2. RBD format is supported by librbd since
version 0.56 (Ceph Bobtail).

Signed-off-by: Wido den Hollander <wido@widodh.nl>
2014-02-11 12:10:22 +00:00
Joel SIMOES
9741006333 Libvirt lose sheepdogs volumes on pool refresh or restart.
When restarting sheepdog pool, all volumes are missing.
This patch add automatically all volume from the added pool.

Adding last Daniel P. Berrange's syntaxes correction.
Adding vol on separeted function 'inspired' from parallels_storage :
parallelsAddDiskVolume
2014-02-11 11:32:04 +00:00
Laine Stump
0144d72963 build: correctly check for SOICGIFVLAN GET_VLAN_VID_CMD command
In order to make a client-only build successful on RHEL4 (yes, you
read that correctly!), commit 3ed2e54 modified src/util/virnetdev.c so
that the functional version of virNetDevGetVLanID() was only compiled
if GET_VLAN_VID_CMD was defined. However, it is *never* defined, but
is only an enum value, so the proper version was no longer compiled
even on platforms that support it. This resulted in the vlan tag not
being properly set for guest traffic on VEPA mode guest macvtap
interfaces that were bound to a vlan interface (that's the only place
that libvirt currently uses virNetDevGetVLanID)

Since there is no way to compile conditionally based on the presence
of an enum value, this patch modifies configure.ac to check for said
enum value with AC_CHECK_DECLS(), which #defines
HAVE_DECL_GET_VLAN_VID_CMD to 1 if it's successful compiling a test
program that uses GET_VLAN_VID_CMD (and still #defines it, but to 0,
if it's not successful).  We can then make the compilation of
virNetDevGetVLanID() conditional on the value of
HAVE_DECL_GET_VLAN_VID_CMD.
2014-02-11 01:43:38 +02:00
Yuri Myasoedov
cc25e45158 maint: fix line numbers in check-aclrules reports
Reset line numbering on each input file in check-aclrules.pl. Otherwise
it reports wrong line numbers in its error messages.

Signed-off-by: Yuri Myasoedov <ymyasoedov@yandex.ru>
Signed-off-by: Roman Bogorodskiy <bogorodskiy@gmail.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2014-02-10 14:07:22 -07:00
Michal Privoznik
28900766d5 virNetworkLoadState: Disallow mangled 'floor' element
In the network status XML we may have the <floor/> element with the
'sum' attribute. The attribute represents sum of all 'floor'-s of
computed over each interface connected to the network (this is needed to
guarantee certain bandwidth for certain domain). The sum is therefore a
number. However, if the number was mangled (e.g. by an user's
interference to network status file), we've just ignored it without
refusing to parse such file. This was all due to 'goto error' missing.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-02-10 19:26:16 +01:00
Peter Krempa
9bf629ab60 qemu: Use correct permissions when determining the image chain
The code took into account only the global permissions. The domains now
support per-vm DAC labels and per-image DAC labels. Use the most
specific label available.
2014-02-10 15:49:59 +01:00
Michal Privoznik
e209c07760 networkStartNetwork: Be more verbose
The lack of debug printings might be frustrating in the future.
Moreover, this function doesn't follow the usual pattern we have in the
rest of the code:

  int ret = -1;
  /* do some work */
  ret = 0;
cleanup:
  /* some cleanup work */
  return ret;

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-02-10 11:47:24 +01:00
Peter Krempa
600bca592b qemu: hyperv: Add support for timer enlightenments
Add a new <timer> for the HyperV reference time counter enlightenment
and the iTSC reference page for Windows guests.

This feature provides a paravirtual approach to track timer events for
the guest (similar to kvmclock) with the option to use real hardware
clock on systems with a iTSC with compensation across various hosts.
2014-02-10 11:30:10 +01:00
Peter Krempa
8ffaa42d7b conf: Enforce supported options for certain timers
According to the documentation various timer options are only supported
by certain timer types. Add a post parse check to verify that the user
didn't specify invalid options.

Also fix the qemu command line parsing function to set correct default
values for the kvmclock timer so that it passes the new check.
2014-02-10 11:17:32 +01:00
Peter Krempa
bbd392ff86 schema: Fix guest timer specification schema according to the docs
According to the documentation describing various tunables for domain
timers not all the fields are supported by all the driver types. Express
these in the RNG:

- rtc, platform: Only these support the "track" attribute.
- tsc: only one to support "frequency" and "mode" attributes
- hpet, pit: tickpolicy/catchup attribute/element
- kvmclock: no extra attributes are supported

Additionally the attributes of the <catchup> element for
tickpolicy='catchup' are optional according to the parsing code. Express
this in the XML and fix a spurious space added while formatting the
<catchup> element and add tests for it.
2014-02-10 11:09:14 +01:00
John Ferlan
b60644f38f virpci: Resolve coverity issues
Coverity complains about "USE_AFTER_FREE" due to how virPCIDeviceSetStubDriver
"could" return either -1, 0, or 1 from the VIR_STRDUP() and then possibly makes
a call to virPCIDeviceDetach().

The only way this could happen is if NULL were passed as the "driver" name
and virStrdup() returned 0.  Since the calling functions check < 0 on the
initial function call, the 0 possibility causes Coverity to complain.

To fix this - enforce that the second parameter is not NULL using
ATTRIBUTE_NONNULL(2) for the function prototype, then in virPCIDeviceDetach
add an sa_assert(dev->stubDriver). This will result in Coverity not complaining
any more.
2014-02-07 10:58:24 -05:00
Christophe Fergeau
f336b1cccb Add glusterfs to VIR_CONNECT_LIST_STORAGE_POOLS_FILTERS_POOL_TYPE
If it's not present in this list, we won't be able to get only
glusterfs pools when using virConnectListAllStoragePools.
2014-02-07 10:26:46 +01:00
Martin Kletzander
440a1aa508 qemu: keep pre-migration domain state after failed migration
Couple of codepaths shared the same code which can be moved out to a
function and on one of such places, qemuMigrationConfirmPhase(), the
domain was resumed even if it wasn't running before the migration
started.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1057407

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-02-07 10:07:38 +01:00
Jim Fehlig
630b645695 libxl: remove unneeded locking of driver when restoring
libxlDomainRestoreFlags acquires the driver lock while reading the
domain config from the save file and adding it to
libxlDriverPrivatePtr->domains.  But virDomainObjList provides
self-locking APIs, so remove the needless driver locking.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
2014-02-06 10:39:32 -07:00
Jim Fehlig
778067e195 libxl: improve subprocess handling
If available, let libxl handle reaping any children it creates by
specifying libxl_sigchld_owner_libxl_always_selective_reap.  This
feature was added to improve subprocess handling in libxl when used
in an application that does not install a SIGCHLD handler like
libvirt

http://lists.xen.org/archives/html/xen-devel/2014-01/msg01555.html

Prior to this patch, it is possible to hit asserts in libxl when
reaping subprocesses, particularly during simultaneous operations
on multiple domains.  With this patch, and the corresponding changes
to libxl, I no longer see the asserts.  Note that the libxl changes
will be included in Xen 4.4.0.  Previous Xen versions will be
susceptible to hitting the asserts even with this patch applied to
the libvirt libxl driver.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
2014-02-06 10:20:31 -07:00
Jim Fehlig
03b3f8940a libxl: handle domain shutdown events in a thread
Handling the domain shutdown event within the event handler seems
a bit unfair to libxl's event machinery.  Domain "shutdown" could
take considerable time.  E.g. if the shutdown reason is reboot,
the domain must be reaped and then started again.

Spawn a shutdown handler thread to do this work, allowing libxl's
event machinery to go about its business.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
2014-02-06 10:17:58 -07:00
Jim Fehlig
eaa8d9b2c7 libxl: remove list of timer registrations from libxlDomainObjPrivate
Due to some misunderstanding of requirements libxl places on timer
handling, I introduced the half-brained idea of maintaining a list
of timeouts that the driver could force to expire before freeing a
libxlDomainObjPrivate (and hence libxl_ctx).  But testing all
the latest versions of Xen supported by the libxl driver (4.2.3,
4.3.1, 4.4.0 RC3), I see that libxl will handle this just fine and
there is no need to force expiration behind libxl's back.  Indeed it
may be harmful to do so.

This patch removes the timer list, allowing libxl to handle cleanup
of its timer registrations.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
2014-02-06 10:08:11 -07:00
Jim Fehlig
cda52dbfe5 libxl: fix leaking libxlDomainObjPrivate
When libxl registers an FD with the libxl driver, the refcnt of the
associated libxlDomainObjPrivate object is incremented. The refcnt
is decremented when libxl deregisters the FD.  But some FDs are only
deregistered when their libxl ctx is freed, which unfortunately is
done in the libxlDomainObjPrivate dispose function.  With references
held by the FDs, libxlDomainObjPrivate is never disposed.

I added the ref/unref in FD registration/deregistration when adding
the same in timer registration/deregistration.  For timers, this
is a simple approach to ensuring the libxlDomainObjPrivate is not
disposed prior to their expirtation, which libxl guarantees will
occur.  It is not needed for FDs, and only causes
libxlDomainObjPrivate to leak.

This patch removes the reference on libxlDomainObjPrivate for FD
registrations, but retains them for timer registrations.  Tested on
the latest releases of Xen supported by the libxl driver:  4.2.3,
4.3.1, and 4.4.0 RC3.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
2014-02-06 10:06:26 -07:00
Matthieu Coudron
0778fc1ab9 qemu_driver: Introduce <filesystem/> support in device attach/detach
This commit allows to attach/detach a <filesystem> device in qemu. For
this purpose I'm introducing two new functions: virDomainFSInsert() and
virDomainFSRemove() and adding necessary code in the qemu driver.  It
compares filesystems based on their "destination" folder. So if two
filesystems share the same destination, they are considered equal and
the qemu driver would reject the insertion.

Signed-off-by: Matthieu Coudron <mattator@gmail.com>
2014-02-06 17:20:03 +01:00
Matthieu Coudron
8fc98ac8cf virDomainHostdev{Insert,Delete}: Replace VIR_REALLOC_N by VIR_{APPEND,DELETE}_ELEMENT
With this change the code gets shorter and more readable.

Signed-off-by: Matthieu Coudron <mattator@gmail.com>
2014-02-06 17:10:26 +01:00
Roman Bogorodskiy
3b00df01fb BSD: implement nodeGetCPUStats
Implementation obtains CPU usage information using
kern.cp_time and kern.cp_times sysctl(8)s and reports
CPU utilization.
2014-02-06 14:09:15 +01:00
Jiri Denemark
05bf937572 qemu: Fix crash in virDomainMemoryStats with old qemu
If virDomainMemoryStats was run on a domain with virtio balloon driver
running on an old qemu which supports QMP but does not support qom-list
QMP command, libvirtd would crash. The reason is we did not check if
qemuMonitorJSONGetObjectListPaths failed and moreover we even stored its
result in an unsigned integer type.
2014-02-06 11:29:29 +01:00
Peter Krempa
5d2691cc4c qemu: blockjob: Print correct file name in error message
When attempting a blockcommit from the top layer, the base argument
passed is NULL. This will be dereferenced when attempting a commit with
an empty image chain. Output the real volume path instead:

virsh blockcommit --verbose --path vda --domain DOMNAME --wait
error: invalid argument: top '/path/somefile' in chain for 'vda' has no backing file

instead of:

error: invalid argument: top '(null)' in chain for 'vda' has no backing file
2014-02-06 10:43:57 +01:00
Peter Krempa
cc3d335b76 maint: Change the text of the NULLSTR() macro to "<null>"
Eric Blake suggested to change this message to be different from the
glibc's NULL deref protection message in printf to be able to
differentiate errors.
2014-02-06 10:43:57 +01:00
Michal Privoznik
51bea5df5d qemuBuildClockArgStr: Allow localtime clock basis
https://bugzilla.redhat.com/show_bug.cgi?id=1046192

Commit b8bf79a, which adds clock='variable', forgets to check
localtime basis in qemuBuildClockArgStr(). So that localtime
basis could not be used.

Reported-by: Jincheng Miao <jmiao@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-02-06 07:51:07 +01:00
Ján Tomko
0db9b0883c Generate a valid imagelabel even for type 'none'
Commit 2ce63c1 added imagelabel generation when relabeling is turned
off. But we weren't filling out the sensitivity for type 'none' labels,
resulting in an invalid label:

$ virsh managedsave domain
error: unable to set security context 'system_u:object_r:svirt_image_t'
on fd 28: Invalid argument
2014-02-05 19:47:30 +01:00
Eric Blake
f34ea654de maint: fix grammar in conf file
Noticed a misuse of 'to' while testing my event regression under
polkit ACLs, and decided to review the entire conf files for
other legibility bugs.

* daemon/libvirtd.conf: Use correct grammar.
* src/qemu/qemu.conf: Likewise.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-02-05 10:40:14 -07:00
Eric Blake
11f20e43f1 event: move event filtering to daemon (regression fix)
https://bugzilla.redhat.com/show_bug.cgi?id=1058839

Commit f9f56340 for CVE-2014-0028 almost had the right idea - we
need to check the ACL rules to filter which events to send.  But
it overlooked one thing: the event dispatch queue is running in
the main loop thread, and therefore does not normally have a
current virIdentityPtr.  But filter checks can be based on current
identity, so when libvirtd.conf contains access_drivers=["polkit"],
we ended up rejecting access for EVERY event due to failure to
look up the current identity, even if it should have been allowed.

Furthermore, even for events that are triggered by API calls, it
is important to remember that the point of events is that they can
be copied across multiple connections, which may have separate
identities and permissions.  So even if events were dispatched
from a context where we have an identity, we must change to the
correct identity of the connection that will be receiving the
event, rather than basing a decision on the context that triggered
the event, when deciding whether to filter an event to a
particular connection.

If there were an easy way to get from virConnectPtr to the
appropriate virIdentityPtr, then object_event.c could adjust the
identity prior to checking whether to dispatch an event.  But
setting up that back-reference is a bit invasive.  Instead, it
is easier to delay the filtering check until lower down the
stack, at the point where we have direct access to the RPC
client object that owns an identity.  As such, this patch ends
up reverting a large portion of the framework of commit f9f56340.
We also have to teach 'make check' to special-case the fact that
the event registration filtering is done at the point of dispatch,
rather than the point of registration.  Note that even though we
don't actually use virConnectDomainEventRegisterCheckACL (because
the RegisterAny variant is sufficient), we still generate the
function for the purposes of documenting that the filtering
takes place.

Also note that I did not entirely delete the notion of a filter
from object_event.c; I still plan on using that for my upcoming
patch series for qemu monitor events in libvirt-qemu.so.  In
other words, while this patch changes ACL filtering to live in
remote.c and therefore we have no current client of the filtering
in object_event.c, the notion of filtering in object_event.c is
still useful down the road.

* src/check-aclrules.pl: Exempt event registration from having to
pass checkACL filter down call stack.
* daemon/remote.c (remoteRelayDomainEventCheckACL)
(remoteRelayNetworkEventCheckACL): New functions.
(remoteRelay*Event*): Use new functions.
* src/conf/domain_event.h (virDomainEventStateRegister)
(virDomainEventStateRegisterID): Drop unused parameter.
* src/conf/network_event.h (virNetworkEventStateRegisterID):
Likewise.
* src/conf/domain_event.c (virDomainEventFilter): Delete unused
function.
* src/conf/network_event.c (virNetworkEventFilter): Likewise.
* src/libxl/libxl_driver.c: Adjust caller.
* src/lxc/lxc_driver.c: Likewise.
* src/network/bridge_driver.c: Likewise.
* src/qemu/qemu_driver.c: Likewise.
* src/remote/remote_driver.c: Likewise.
* src/test/test_driver.c: Likewise.
* src/uml/uml_driver.c: Likewise.
* src/vbox/vbox_tmpl.c: Likewise.
* src/xen/xen_driver.c: Likewise.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-02-05 08:03:31 -07:00
Laine Stump
eafb53fec2 network: disallow <bandwidth>/<mac> for bridged/macvtap/hostdev networks
https://bugzilla.redhat.com/show_bug.cgi?id=1057321

pointed out that we weren't honoring the <bandwidth> element in
libvirt networks using <forward mode='bridge'/>. In fact, these
networks are just a method of giving a libvirt network name to an
existing Linux host bridge on the system, and libvirt doesn't have
enough information to know where to set such limits. We are working on
a method of supporting network bandwidths for some specific cases of
<forward mode='bridge'/>, but currently libvirt doesn't support it. So
the proper thing to do now is just log an error when someone tries to
put a <bandwidth> element in that type of network. (It's unclear if we
will be able to do proper bandwidth limiting for macvtap networks, and
most definitely we will not be able to support it for hostdev
networks).

While looking through the network XML documentation and comparing it
to the networkValidate function, I noticed that we also ignore the
presence of a mac address in the config in the same cases, rather than
failing so that the user will understand that their desired action has
not been taken.

This patch updates networkValidate() (which is called any time a
persistent network is defined, or a transient network created) to log
an error and fail if it finds either a <bandwidth> or <mac> element
and the network forward mode is anything except 'route'. 'nat', or
nothing. (Yes, neither of those elements is acceptable for any macvtap
mode, nor for a hostdev network).

NB: This does *not* cause failure to start any existing network that
contains one of those elements, so someone might have erroneously
defined such a network in the past, and that network will continue to
function unmodified. I considered it too disruptive to suddenly break
working configs on the next reboot after a libvirt upgrade.
2014-02-05 15:04:58 +02:00
John Ferlan
19259574d5 Honor blacklist for modprobe command
https://bugzilla.redhat.com/show_bug.cgi?id=1045124

When loading modules, libvirt does not honor the modprobe blacklist.
Use the new virKModLoad() API in order to attempt load with blacklist check.
Use the new virKModIsBlacklisted() API to check if the failure to load
was due to the blacklist

Signed-off-by: John Ferlan <jferlan@redhat.com>
2014-02-04 10:43:53 -05:00
John Ferlan
4a2179ea92 utils: Introduce functions for kernel module manipulation
virKModConfig()        - Return a buffer containing kernel module configuration
virKModLoad()          - Load a specific module into the kernel configuration
virKModUnload()        - Unload a specific module from the kernel configuration
virKModIsBlacklisted() - Determine whether a module is blacklisted within
                         the kernel configuration
2014-02-04 08:52:27 -05:00
Laine Stump
0d0a7bf45a qemu: be sure we're using the updated value of backend during hotplug
commit f094aaac changed qemuPrepareHostdevPCIDevices() such that it
may modify the "backend" (vfio vs. legacy kvm) setting in the
virHostdevDef. However, qemuDomainAttachHostPciDevice() (used by
hotplug) copies the backend setting into a local *before* calling
qemuPrepareHostdevPCIDevices(), and then later makes a decision based
on that pre-change value.

The result is that, if the backend had been set to "default" (i.e. not
specified in the config) and was later updated to "VFIO" by
qemuPrepareHostdevPCIDevices(), the qemu process' MacMemLock is not
increased (as is required for VFIO device assignment).

This patch delays making the local copy of backend until after its
potential modification.
2014-02-04 14:05:09 +02:00
Laine Stump
66f75925eb network: change default of forwardPlainNames to 'yes'
The previous patch fixed "forwardPlainNames" so that it really is
doing only what is intended, but left the default to be
"forwardPlainNames='no'". Discussion around the initial version of
that patch led to the decision that the default should instead be
"forwardPlainNames='yes'" (i.e. the original behavior before commit
f3886825). This patch makes that change to the default.
2014-02-04 12:00:26 +02:00
Laine Stump
f69a6b987d network: only prevent forwarding of DNS requests for unqualified names
In commit f386825 we began adding the options

  --domain-needed
  --local=/$mydomain/

to all dnsmasq commandlines with the stated reason of preventing
forwarding of DNS queries for names that weren't fully qualified
domain names ("FQDN", i.e. a name that included some "."s and a domain
name). This was later changed to

  domain-needed
  local=/$mydomain/

when we moved the options from the dnsmasq commandline to a conf file.

The original patch on the list, and discussion about it, is here:

  https://www.redhat.com/archives/libvir-list/2012-August/msg01594.html

When a domain name isn't specified (mydomain == ""), the addition of
"domain-needed local=//" will prevent forwarding of domain-less
requests to the virtualization host's DNS resolver, but if a domain
*is* specified, the addition of "local=/domain/" will prevent
forwarding of any requests for *qualified* names within that domain
that aren't resolvable by libvirt's dnsmasq itself.

An example of the problems this causes - let's say a network is
defined with:

   <domain name='example.com'/>
   <dhcp>
      ..
      <host mac='52:54:00:11:22:33' ip='1.2.3.4' name='myguest'/>
   </dhcp>

This results in "local=/example.com/" being added to the dnsmasq options.

If a guest requests "myguest" or "myguest.example.com", that will be
resolved by dnsmasq. If the guest asks for "www.example.com", dnsmasq
will not know the answer, but instead of forwarding it to the host, it
will return NOT FOUND to the guest. In most cases that isn't the
behavior an admin is looking for.

A later patch (commit 4f595ba) attempted to remedy this by adding a
"forwardPlainNames" attribute to the <dns> element. The idea was that
if forwardPlainNames='yes' (default is 'no'), we would allow
unresolved names to be forwarded. However, that patch was botched, in
that it only removed the "domain-needed" option when
forwardPlainNames='yes', and left the "local=/mydomain/".

Really we should have been just including the option "--domain-needed
--local=//" (note the lack of domain name) regardless of the
configured domain of the network, so that requests for names without a
domain would be treated as "local to dnsmasq" and not forwarded, but
all others (including those in the network's configured domain) would
be forwarded. We also shouldn't include *either* of those options if
forwardPlainNames='yes'. This patch makes those corrections.

This patch doesn't remedy the fact that default behavior was changed
by the addition of this feature. That will be handled in a subsequent
patch.
2014-02-04 12:00:26 +02:00
Martin Kletzander
b44f9e7ec9 spice: don't force user to specify spicevmc channel
We support only one spicevmc channel name anyway and the code is
prepared to use the default one, there's only one check missing.  It
is also mentioned in the documentation already and helps defining
domains with spice vdagent for people using virsh.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-02-03 09:46:47 +01:00
John Ferlan
5c36e63198 Resolve Coverity dead_error_begin
Coverity complains about default: label in libxl_driver.c not be able
to be reached. It's by design for the code and since it's not necessary
in the code nor does it elicit any compiler/make check warnings - just
remove it rather than adding a coverity[dead_error_begin] tag.

While I'm at it, lxc_driver.c and nodeinfo.c have the same design, so I
removed the default labels and the existing coverity tags.
2014-01-31 12:48:01 -05:00
Daniel P. Berrange
6e5c79a1b5 Push nwfilter update locking up to top level
The NWFilter code has as a deadlock race condition between
the virNWFilter{Define,Undefine} APIs and starting of guest
VMs due to mis-matched lock ordering.

In the virNWFilter{Define,Undefine} codepaths the lock ordering
is

  1. nwfilter driver lock
  2. virt driver lock
  3. nwfilter update lock
  4. domain object lock

In the VM guest startup paths the lock ordering is

  1. virt driver lock
  2. domain object lock
  3. nwfilter update lock

As can be seen the domain object and nwfilter update locks are
not acquired in a consistent order.

The fix used is to push the nwfilter update lock upto the top
level resulting in a lock ordering for virNWFilter{Define,Undefine}
of

  1. nwfilter driver lock
  2. nwfilter update lock
  3. virt driver lock
  4. domain object lock

and VM start using

  1. nwfilter update lock
  2. virt driver lock
  3. domain object lock

This has the effect of serializing VM startup once again, even if
no nwfilters are applied to the guest. There is also the possibility
of deadlock due to a call graph loop via virNWFilterInstantiate
and virNWFilterInstantiateFilterLate.

These two problems mean the lock must be turned into a read/write
lock instead of a plain mutex at the same time. The lock is used to
serialize changes to the "driver->nwfilters" hash, so the write lock
only needs to be held by the define/undefine methods. All other
methods can rely on a read lock which allows good concurrency.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2014-01-30 18:00:20 +00:00
Daniel P. Berrange
0240d94c36 Remove windows thread implementation in favour of pthreads
There are a number of pthreads impls available on Win32
these days, in particular the mingw64 project has a good
impl. Delete the native windows thread implementation and
rely on using pthreads everywhere.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2014-01-30 18:00:20 +00:00
Daniel P. Berrange
c065984b58 Add a read/write lock implementation
Add virRWLock backed up by a POSIX rwlock primitive

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2014-01-30 18:00:20 +00:00
Daniel P. Berrange
94e0906839 Skip check-augeas-lockd when QEMU is disabled
The check-augeas-lockd test depends on the file
locking/qemu-lockd.conf, so must be skipped when QEMU
is disabled.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2014-01-30 18:00:20 +00:00
Osier Yang
b1b81efe9a util: Accept test data path for scsi device's sg_path
Commit 10c9ceff6d intended to introduce new argument for the
testing purpose, but it missed the similar changing of the
device's sg_path. The problem was hidden since my laptop has
the /dev/sg0 and /dev/sg1.  A later patch will modify the tests
accordingly.

Signed-off-by: Osier Yang <jyang@redhat.com>
Reported-by: Pavel Hrdina <phrdina@redhat.com>
2014-01-30 16:34:43 +01:00
Osier Yang
f406aa25f2 qemu: Fix the error message for scsi host device's shareable checking
This fixes the wrong argument order.
2014-01-30 16:50:10 +08:00
Osier Yang
10c9ceff6d util: Add one argument for several scsi utils
To support passing the path of the test data to the utils, one
more argument is added to virSCSIDeviceGetSgName,
virSCSIDeviceGetDevName, and virSCSIDeviceNew, and the related
code is changed accordingly.

Later tests for the scsi utils will be based on this patch.

Signed-off-by: Osier Yang <jyang@redhat.com>
2014-01-30 15:48:28 +08:00
Osier Yang
fd243fc4ad qemu: Don't fail if the SCSI host device is shareable between domains
It doesn't make sense to fail if the SCSI host device is specified
as "shareable" explicitly between domains (NB, it works if and only
if the device is specified as "shareable" for *all* domains,
otherwise it fails).

To fix the problem, this patch introduces an array for virSCSIDevice
struct, which records all the names of domain which are using the
device (note that the recorded domains must specify the device as
shareable).  And the change on the data struct brings on many
subsequent changes in the code.

Prior to this patch, the "shareable" tag didn't work as expected,
it actually work like "non-shareable".  So this patch also added notes
in formatdomain.html to declare the fact.

* src/util/virscsi.h:
  - Remove virSCSIDeviceGetUsedBy
  - Change definition of virSCSIDeviceGetUsedBy and virSCSIDeviceListDel
  - Add virSCSIDeviceIsAvailable

* src/util/virscsi.c:
  - struct virSCSIDevice: Change "used_by" to be an array; Add
    "n_used_by" as the array count
  - virSCSIDeviceGetUsedBy: Removed
  - virSCSIDeviceFree: frees the "used_by" array
  - virSCSIDeviceSetUsedBy: Copy the domain name to avoid potential
    memory corruption
  - virSCSIDeviceIsAvailable: New
  - virSCSIDeviceListDel: Change the logic, for device which is already
    in the list, just remove the corresponding entry in "used_by". And
    since it's only used in one place, we can safely removing the code
    to find out the dev in the list first.
  - Copyright updating

* src/libvirt_private.sys:
  - virSCSIDeviceGetUsedBy: Remove
  - virSCSIDeviceIsAvailable: New

* src/qemu/qemu_hostdev.c:
  - qemuUpdateActiveScsiHostdevs: Check if the device existing before
    adding it to the list;
  - qemuPrepareHostdevSCSIDevices: Error out if the not all domains
    use the device as "shareable"; Also don't try to add the device
    to the activeScsiHostdevs list if it already there; And make
    more sensible error w.r.t the current "shareable" value in
    driver->activeScsiHostdevs.
  - qemuDomainReAttachHostScsiDevices: Change the logic according
    to the changes on helpers.

Signed-off-by: Osier Yang <jyang@redhat.com>
2014-01-30 15:46:24 +08:00
Roman Bogorodskiy
d779d218d4 maint: add configure checks for BSD CPU affinity
Check for presence of sys/cpuset.h header and cpuset_getaffinity()
in configure instead of just using #ifdef __FreeBSD__ for that code.
2014-01-29 12:11:48 -07:00
Michal Privoznik
122cd16982 Revert "networkAllocateActualDevice: Set QoS for bridgeless networks too"
This reverts commit 2996e6be19
and some parts of 2636dc8c4d.

The former one tried to implement QoS setting on bridgeless networks.
However, as discussed upstream [1], the patch is far away from being
useful in even a single case. The whole idea of network QoS is to have
aggregated limits over several interfaces. This patch is doing
completely the opposite when merging two QoS settings (from the network
and the domain interface) into one which is then set at the domain
interface itself, not the network.

The latter one is the test for the previous one. Now none of them makes
sense.

1: https://www.redhat.com/archives/libvir-list/2014-January/msg01441.html

Conflicts:
	tests/virnetdevbandwidthtest.c: New test has been introduced since
    then.
2014-01-29 19:01:19 +01:00
Michal Privoznik
550a2ceffb virCommand: Introduce virCommandSetDryRun
There are some units within libvirt that utilize virCommand API to run
some commands and deserve own unit testing. These units are, however,
not desired to be rewritten to dig virCommand API usage out. As a great
example virNetDevBandwidth could be used. The problem with the bandwidth
unit is: it uses virCommand API heavily. Therefore we need a mechanism
to not really run a command, but rather see its string representation
after which we can decide if the unit construct the correct sequence of
commands or not.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-01-29 18:01:36 +01:00
Peter Krempa
7076b4b72c snapshot: Add support for specifying snapshot disk backing type
Add support for specifying various types when doing snapshots. This will
later allow to do snapshots on network backed volumes. Disks of type
'volume' are not supported by snapshots (yet).

Also amend the test suite to check parsing of the various new disk
types that can now be specified.
2014-01-29 12:56:35 +01:00
Jim Fehlig
37564b471d xen: fix parsing xend http response
Commit df36af58 broke parsing of http response from xend.  The prior
use of atoi() would happily parse e.g. a string containing "200 OK\r\n",
whereas virStrToLong_i() will fail when called with a NULL end_ptr.
Change the calls to virStrToLong_i() to provide a non-NULL end_ptr.
2014-01-28 18:32:49 -07:00
Jiri Denemark
580ddf0d34 cpu: Try to use source CPU model in virConnectBaselineCPU
https://bugzilla.redhat.com/show_bug.cgi?id=1049391

When all source CPU XMLs contain just a single CPU model (with a
possibly varying set of additional feature elements),
virConnectBaselineCPU will try to use this CPU model in the computed
guest CPU. Thus, when used on just a single CPU (useful with
VIR_CONNECT_BASELINE_CPU_EXPAND_FEATURES), the result will not use a
different CPU model.

If the computed CPU uses the source model, set fallback mode to 'forbid'
to make sure the guest CPU will always be as close as possible to the
source CPUs.
2014-01-28 21:27:37 +01:00
Jiri Denemark
802f157e8c cpu: Fix VIR_CONNECT_BASELINE_CPU_EXPAND_FEATURES
https://bugzilla.redhat.com/show_bug.cgi?id=1049391

VIR_CONNECT_BASELINE_CPU_EXPAND_FEATURES flag for virConnectBaselineCPU
did not work if the resulting guest CPU would disable some features
present in its base model. This patch makes sure we won't try to add
such features twice.
2014-01-28 21:27:37 +01:00
Ján Tomko
1c1242a2ed Reword error message for oversized cpu time fields 2014-01-28 08:48:32 +01:00
Ján Tomko
61cf8d8461 Simplify linuxNodeGetCPUStats
Split out the repetitive code.
2014-01-28 08:48:31 +01:00
Roman Bogorodskiy
c022fbc9bb BSD: implement virProcess{Get,Set}Affinity
Implement virProcess{Get,Set}Affinity() using cpuset_getaffinity()
and cpuset_setaffinity() calls. Quick search showed that they are
only available on FreeBSD, so placed it inside existing #ifdef
blocks for FreeBSD instead of adding configure checks.
2014-01-27 09:51:55 -07:00
Pradipta Kr. Banerjee
c6320d3463 Add hw random number generator (/dev/hwrng) to cgroup ACL
Creating a qemu VM with /dev/hwrng as backend RNG device throws the
following error - "Could not open '/dev/hwrng': Permission denied"
This patch fixes the issue

Signed-off-by: Pradipta Kr. Banerjee <bpradip@in.ibm.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2014-01-27 09:48:39 -07:00
Michal Privoznik
2996e6be19 networkAllocateActualDevice: Set QoS for bridgeless networks too
https://bugzilla.redhat.com/show_bug.cgi?id=1055484

Currently, libvirt's XML schema of network allows QoS to be defined for
every network even though it has no bridge. For instance:

<network>
    <name>vdsm-no-bridge</name>
    <forward mode='passthrough'>
      <interface dev='em1.10'/>
    </forward>
    <bandwidth>
        <inbound average='1000' peak='5000' burst='1024'/>
        <outbound average='1000' burst='1024'/>
    </bandwidth>
</network>

The bandwidth limitations can be, however, applied even on such
networks. In fact, they are going to be applied on the interface that
will be connected to the network on a domain startup. This approach,
however, has one limitation. With bridged networks, there are two points
where QoS can be set: bridge and domain interface. The lower limit of
the two is enforced then. For instance, if the interface has 10Mbps
average, but the network only 1Mbps, there's no way for interface to
transmit packets faster than the 1Mbps limit. With two points this is
enforced by kernel.  With only one point, we must combine both QoS
settings into one which is set afterwards. Look at
virNetDevBandwidthMinimal() and you'll understand immediately what I
mean.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-01-27 12:11:27 +01:00
Ján Tomko
5099f745e6 Add test for linuxNodeGetCPUStats
Check if cpu stats are read correctly from a sample
/proc/stat collected from a 24 CPU machine.
2014-01-27 11:04:02 +01:00
Ján Tomko
b3b44c572c Move test-local declarations to nodeinfopriv.h
linuxNodeInfoCPUPopulate is only used in the nodeinfo.c file
and in the test suite.
2014-01-27 11:04:02 +01:00
Oleg Strikov
29ea437e40 qemu: Enable 'host-passthrough' cpu mode for aarch64
This patch allows libvirt user to specify 'host-passthrough'
cpu mode while using qemu/kvm backend on aarch64.
It uses 'host' as a CPU model name instead of some other stub
(correct CPU detection is not implemented yet) to allow libvirt
user to specify 'host-model' cpu mode as well.

Signed-off-by: Oleg Strikov <oleg.strikov@canonical.com>

(crobinso: fix some indentation)
2014-01-25 12:10:02 -05:00
John Ferlan
46a0737e13 Block info query: Add check for transient domain
Currently the qemuDomainGetBlockInfo will return allocation == physical
for most backing stores. For a qcow2 block backed device it's possible
to return the highest lv extent allocated from qemu for an active guest.
That is a value where allocation != physical and one would hope be less.
However, if the guest is not running, then the code falls back to returning
allocation == physical. This turns out to be problematic for rhev which
monitors the size of the backing store. During a migration, before the
VM has been started on the target and while it is deemed inactive on the
source, there's a small window of time where the allocation is returned
as physical triggering the code to extend the file unnecessarily.

Since rhev uses transient domains and this is edge condition for a transient
domain, rather than returning good status and allocation == physical when
this "window of opportunity" exists, this patch will check for a transient
(or non persistent) domain and return a failure to the caller rather than
returning the defaults. For a persistent domain, the defaults will be
returned. The description for the virDomainGetBlockInfo has been updated
to describe the phenomena.
2014-01-24 11:37:18 -05:00
Gao feng
71f7d5840f qemu: remove memset params array to zero in qemuDomainGetPercpuStats
the array params is allocated by VIR_ALLOC_N in
remoteDispatchDomainGetCPUStats. it had been set
to zero. No need to reset it to zero again, and
this reset here is incorrect too, nparams * ncpus
is the array length not the size of params array.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
2014-01-24 16:31:53 +08:00
Osier Yang
88ae5dc759 storage: Fix the memory leak
The return value of virGetFCHostNameByWWN is a strdup'ed string.
Also add comments to declare that the caller should take care of
freeing it.
2014-01-23 21:39:05 +08:00
Osier Yang
7519958735 util: Fix the indention
Left in the git cache without commit before pushing. Pushed under
build breaker and trivial rule.
2014-01-23 18:16:11 +08:00
Osier Yang
2b66504ded util: Add "shareable" field for virSCSIDevice struct
Unlike the host devices of other types, SCSI host device XML supports
"shareable" tag. This patch introduces it for the virSCSIDevice struct
for a later patch use (to detect if the SCSI device is shareable when
preparing the SCSI host device in QEMU driver).
2014-01-23 17:52:33 +08:00
Osier Yang
2340f0196f storage: Fix autostart of pool with "fc_host" type adapter
The "checkPool" is a bit different for pool with "fc_host"
type source adapter, since the vHBA it's based on might be
not created yet (it's created by "startPool", which is
involked after "checkPool" in storageDriverAutostart). So it
should not fail, otherwise the "autostart" of the pool will
fail either.

The problem is easy to reproduce:
    * Enable "autostart" for the pool
    * Restart libvirtd service
    * Check the pool's state
2014-01-23 17:50:29 +08:00
Bing Bu Cao
2310e631cd Fix buffer size in linuxNodeGetCPUstats
94f8205 added a space to the string but didn't change the buffer size.

Signed-off-by: Bing Bu Cao <mars@linux.vnet.ibm.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2014-01-23 10:29:14 +01:00
Osier Yang
6b29eb848f storage: Add document for possible problem on volume detection
For pool which relies on remote resources, such as a "iscsi" type
pool, since how long it takes to export the corresponding devices
to host's sysfs is really depended, it could depend on the network
connection, it also could depend on the host's udev procedures. So
it's likely that the volumes are not able to be detected during pool
starting process, polling the sysfs doesn't work, since we don't
know how much time is best for the polling, and even worse, the
volumes could still be not detected or partly not detected even after
the polling.  So we end up with a documentation to prompt the fact,
in virsh manual.

And as a small improvement, let's explicitly say no LUNs found in
the debug log in that case.
2014-01-23 13:47:55 +08:00
Osier Yang
ae2860b4c6 util: Correct the NUMA node range checking
There are 2 issues here: First we shouldn't add "1" to the return
value of numa_max_node(), since the semanteme of the error message
was changed, it's not saying about the number of total NUMA nodes
anymore.  Second, the value of "bit" is the position of the first
bit which exceeds either numa_max_node() or NUMA_NUM_NODES, it can
be any number in the range, so saying "bigger than $bit" is quite
confused now. For example, assuming there is a NUMA machine which
has 10 NUMA nodes, and one specifies the "nodeset" as "0,5,88",
the error message will be like:

Nodeset is out of range, host cannot support NUMA node bigger than 88

It sounds like all NUMA node number less than 88 is fine, but
actually the maximum NUMA node number the machine supports is 9.

This patch fixes the issues by removing the addition with "1" and
simplifies the error message as "NUMA node $bit is out of range".
Also simplifies the comparision in the while loop by getting the
smaller one of numa_max_node() and NUMA_NUM_NODES up front.
2014-01-23 13:19:56 +08:00
Eric Blake
7f2d27d1e3 api: require write permission for guest agent interaction
I noticed that we allow virDomainGetVcpusFlags even for read-only
connections, but that with a flag, it can require guest agent
interaction.  It is feasible that a malicious guest could
intentionally abuse the replies it sends over the guest agent
connection to possibly trigger a bug in libvirt's JSON parser,
or withhold an answer so as to prevent the use of the agent
in a later command such as a shutdown request.  Although we
don't know of any such exploits now (and therefore don't mind
posting this patch publicly without trying to get a CVE assigned),
it is better to err on the side of caution and explicitly require
full access to any domain where the API requires guest interaction
to operate correctly.

I audited all commands that are marked as conditionally using a
guest agent.  Note that at least virDomainFSTrim is documented
as needing a guest agent, but that such use is unconditional
depending on the hypervisor (so the existing domain:fs_trim ACL
should be sufficient there, rather than also requirng domain:write).
But when designing future APIs, such as the plans for obtaining
a domain's IP addresses, we should copy the approach of this patch
in making interaction with the guest be specified via a flag, and
use that flag to also require stricter access checks.

* src/libvirt.c (virDomainGetVcpusFlags): Forbid guest interaction
on read-only connection.
(virDomainShutdownFlags, virDomainReboot): Improve docs on agent
interaction.
* src/remote/remote_protocol.x
(REMOTE_PROC_DOMAIN_SNAPSHOT_CREATE_XML)
(REMOTE_PROC_DOMAIN_SET_VCPUS_FLAGS)
(REMOTE_PROC_DOMAIN_GET_VCPUS_FLAGS, REMOTE_PROC_DOMAIN_REBOOT)
(REMOTE_PROC_DOMAIN_SHUTDOWN_FLAGS): Require domain:write for any
conditional use of a guest agent.
* src/xen/xen_driver.c: Fix clients.
* src/libxl/libxl_driver.c: Likewise.
* src/uml/uml_driver.c: Likewise.
* src/qemu/qemu_driver.c: Likewise.
* src/lxc/lxc_driver.c: Likewise.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-01-22 16:52:41 -07:00
Jean-Baptiste Rouault
bb85da2cb1 vbox: add support for v4.2.20+ and v4.3.4+
Bugs have been found in the VirtualBox API C bindings. These bugs have
been fixed in versions 4.2.20 and 4.3.4. However, the changes in the
C bindings are incompatible with the vbox_CAPI_v4_2.h and vbox_CAPI_v4_3.h
files which are bundled in libvirt source code.
This is why the following patch adds vbox_CAPI_v4_2_20.h and
vbox_CAPI_v4_3_4.h.

The actual underlying problem here is that until now,
libvirt assumed that VirtualBox API can only change between minor
versions (4.2 -> 4.3), but we have a case here where it changed
(or got fixed) between patch versions (4.2.18 -> 4.2.20).

This patch makes the VBOX_API_VERSION represent the full API
version number (i.e 4002 => 4002000) so there are specific version
numbers for Vbox 4.2.20 (4002020) and 4.3.4 (4003004)
2014-01-22 23:12:52 +01:00
Peter Krempa
7f0fd42741 qemu: Avoid crash in qemuDiskGetActualType
Libvirtd would crash if a domain contained an empty cdrom drive of
type='volume' as the disk def->srcpool member would be dereferenced. Fix
it by checking if the source pool is present before dereferencing it.

Also alter tests to catch this issue in the future.

Reported by: Kevin Shanahan
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1056328
2014-01-22 11:33:31 +01:00
Michael Chapman
881a2cff0d virtlockd: make re-exec more robust
- Use $XDG_RUNTIME_DIR for re-exec state file when running unprivileged.

- argv[0] may not contain a full path to the binary, however it should
  contain something that can be looked up in the PATH. Use execvp() to
  do path lookup on re-exec.

- As per list discussion [1], ignore --daemon on re-exec.

[1] https://www.redhat.com/archives/libvir-list/2013-December/msg00514.html

Signed-off-by: Michael Chapman <mike@very.puzzling.org>
2014-01-22 10:44:41 +01:00
Bing Bu Cao
94f8205359 linuxNodeGetCPUStats: Correctly handle cpu prefix
To retrieve node cpu statistics on Linux system, the
linuxNodeGetCPUstats function simply uses STRPREFIX() to match the cpuid
with the one read from /proc/stat. However, as the file is read line by
line it may happen, that some CPUs share the same prefix. So if user
requested stats for the first CPU, which is offline, then there's no
cpu1 in the stats file so the one that we match is cpu10. Which is
obviously wrong. Fortunately, the IDs are terminated by a space, so we
can utilize that.

Signed-off-by: Bing Bu Cao <mars@linux.vnet.ibm.com>
2014-01-21 17:24:03 +01:00
Peter Krempa
3d1e9e4779 qemu: snapshot: Forbid snapshots when backing is a scsi passthrough disk
https://bugzilla.redhat.com/show_bug.cgi?id=1034993

SCSI passthrough disks (<disk .. device="lun">) can't be used as backing
for snapshots. Currently with upstream qemu the vm crashes on such
attempt.

This patch adds a early check to catch an attempt to do such a snapshot
and rejects it right away. qemu will fix the issue but this will let us
control the error message.
2014-01-21 17:05:21 +01:00