Commit Graph

17251 Commits

Author SHA1 Message Date
Martin Kletzander
7e72ac7878 qemu: leave restricting cpuset.mems after initialization
When domain is started with numatune memory mode strict and the
nodeset does not include host NUMA node with DMA and DMA32 zones, KVM
initialization fails.  This is because cgroup restrict even kernel
allocations.  We are already doing numa_set_membind() which does the
same thing, only it does not restrict kernel allocations.

This patch leaves the userspace numa_set_membind() in place and moves
the cpuset.mems setting after the point where monitor comes up, but
before vcpu and emulator sub-groups are created.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:46 +02:00
Martin Kletzander
aa668fccf0 qemu: split out cpuset.mems setting
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:46 +02:00
Martin Kletzander
1c19d3e072 qemu: pass numa node binding preferences to qemu
Currently, we only bind the whole QEMU domain to memory nodes
specified in nodemask altogether.  That, however, doesn't make much
sense when one wants to control from where the memory for particular
guest nodes should be allocated.  QEMU allows us to do that by
specifying 'host-nodes' parameter for the 'memory-backend-ram' object,
so let's use that.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:46 +02:00
Martin Kletzander
001b9dc1dc qemu: enable disjoint numa cpu ranges
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:46 +02:00
Martin Kletzander
1a324c2f88 qemu: newer -numa parameter capability probing
When qemu switched to using OptsVisitor for -numa parameter, it did
two things in the same patch.  One of them is that the numa parameter
is now visible in "query-command-line-options", the second one is that
it enabled using disjoint cpu ranges for -numa specification.  This
will be used in later patch.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:46 +02:00
Martin Kletzander
ad064ec6e6 qemu: memory-backend-ram capability probing
The numa patch series in qemu adds "memory-backend-ram" object type by
which we can tell whether we can use such objects.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:46 +02:00
Martin Kletzander
7bc1db5a1d qemu: allow qmp probing for cmdline options without params
That can be lately achieved with by having .param == NULL in the
virQEMUCapsCommandLineProps struct.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:46 +02:00
Martin Kletzander
1a7be8c600 numatune: add support for per-node memory bindings in private APIs
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:45 +02:00
Martin Kletzander
a05c01521c conf, schema: add support for memnode elements
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:45 +02:00
Martin Kletzander
93e82727ec numatune: Encapsulate numatune configuration in order to unify results
There were numerous places where numatune configuration (and thus
domain config as well) was changed in different ways.  On some
places this even resulted in persistent domain definition not to be
stable (it would change with daemon's restart).

In order to uniformly change how numatune config is dealt with, all
the internals are now accessible directly only in numatune_conf.c and
outside this file accessors must be used.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:45 +02:00
Martin Kletzander
e764ec7ae3 numatune: unify numatune struct and enum names
Since there was already public virDomainNumatune*, I changed the
private virNumaTune to match the same, so all the uses are unified and
public API is kept:

s/vir\(Domain\)\?Numa[tT]une/virDomainNumatune/g

then shrunk long lines, and mainly functions, that were created after
that:

sed -i 's/virDomainNumatuneMemPlacementMode/virDomainNumatunePlacement/g'

And to cope with the enum name, I haad to change the constants as
well:

s/VIR_NUMA_TUNE_MEM_PLACEMENT_MODE/VIR_DOMAIN_NUMATUNE_PLACEMENT/g

Last thing I did was at least a little shortening of already long
name:

s/virDomainNumatuneDef/virDomainNumatune/g

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:45 +02:00
Martin Kletzander
293d5f21b6 numatune: create new module for numatune
There are many places with numatune-related code that should be put
into special numatune_conf and this patch creates a basis for that.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:45 +02:00
Martin Kletzander
992000e6d8 conf, schema: add 'id' field for cells
In XML format, by definition, order of fields should not matter, so
order of parsing the elements doesn't affect the end result.  When
specifying guest NUMA cells, we depend only on the order of the 'cell'
elements.  With this patch all older domain XMLs are parsed as before,
but with the 'id' attribute they are parsed and formatted according to
that field.  This will be useful when we have tuning settings for
particular guest NUMA node.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:45 +02:00
Martin Kletzander
775c46956e conf: purely a code movement
to ease the review of commits to follow.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:45 +02:00
Martin Kletzander
92ff464bbb qemu: remove useless error check
Excerpt from the virCommandAddArgBuffer() description: "Correctly
transfers memory errors or contents from buf to cmd."

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:45 +02:00
Martin Kletzander
cee22001d3 qemu: purely a code movement
to ease the review of commits to follow.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:45 +02:00
Michele Paolino
a14abd463a support for QEMU vhost-user
This patch adds support for the QEMU vhost-user feature to libvirt.
vhost-user enables the communication between a QEMU virtual machine
and other userspace process using the Virtio transport protocol.
It uses a char dev (e.g. Unix socket) for the control plane,
while the data plane based on shared memory.

The XML looks like:

<interface type='vhostuser'>
    <mac address='52:54:00:3b:83:1a'/>
    <source type='unix' path='/tmp/vhost.sock' mode='server'/>
    <model type='virtio'/>
</interface>

Signed-off-by: Michele Paolino <m.paolino@virtualopensystems.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-07-16 18:44:57 +02:00
Eric Blake
97c59b9c46 blockjob: wait for pivot to complete
https://bugzilla.redhat.com/show_bug.cgi?id=1119173 documents that
commit eaba79d was flawed in the implementation of the
VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC flag when it comes to completing
a blockcopy.  Basically, the qemu pivot action is async (the QMP
command returns immediately, but the user must wait for the
BLOCK_JOB_COMPLETE event to know that all I/O related to the job
has finally been flushed), but the libvirt command was documented
as synchronous by default.  As active block commit will also be
using this code, it is worth fixing now.

* src/qemu/qemu_driver.c (qemuDomainBlockJobImpl): Don't skip wait
loop after pivot.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-07-16 07:23:24 -06:00
Eric Blake
a0b5ace28c util: forbid freeing const pointers
Now that we've finally fixed all the violators, it's time to
enforce that any pointer to a const object is never freed (it
is aliasing some other memory, where the non-const original
should be freed instead).  Alas, the code still needs a normal
vs. Coverity version, but at least we are still guaranteeing
that the macro call evaluates its argument exactly once.

I verified that we still get the following compiler warnings,
which in turn halts the build thanks to -Werror on gcc (hmm,
gcc 4.8.3's placement of the ^ for ?: type mismatch is a bit
off, but that's not our problem):

    int oops1 = 0;
    VIR_FREE(oops1);
    const char *oops2 = NULL;
    VIR_FREE(oops2);
    struct blah { int dummy; } oops3;
    VIR_FREE(oops3);

util/virauthconfig.c:159:35: error: pointer/integer type mismatch in conditional expression [-Werror]
     VIR_FREE(oops1);
                                   ^
util/virauthconfig.c:161:5: error: passing argument 1 of 'virFree' discards 'const' qualifier from pointer target type [-Werror]
     VIR_FREE(oops2);
     ^
In file included from util/virauthconfig.c:28:0:
util/viralloc.h:79:6: note: expected 'void *' but argument is of type 'const void *'
 void virFree(void *ptrptr) ATTRIBUTE_NONNULL(1);
      ^
util/virauthconfig.c:163:35: error: type mismatch in conditional expression
     VIR_FREE(oops3);
                                   ^

* src/util/viralloc.h (VIR_FREE): No longer cast away const.
* src/xenapi/xenapi_utils.c (xenSessionFree): Work around bogus
header.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-07-16 06:48:53 -06:00
Chunyan Liu
0b0c641b66 add nocow test case
Add file in storagevolxml2xmlin and storagevolxml2xmlout, let
storagevolxml2xmltest and storagevolschematest cover 'nocow'.
Add test case to storagevolxml2argvtest to cover 'nocow'.

Signed-off-by: Chunyan Liu <cyliu@suse.com>
2014-07-16 13:35:26 +02:00
Chunyan Liu
a9fd30e633 storagevol: add nocow to vol xml
Add 'nocow' to storage volume xml so that user can have an option
to set NOCOW flag to the newly created volume. It's useful on btrfs
file system to enhance performance.

Btrfs has low performance when hosting VM images, even more when the guest
in those VM are also using btrfs as file system. One way to mitigate this
bad performance is to turn off COW attributes on VM files. Generally, there
are two ways to turn off COW on btrfs: a) by mounting fs with nodatacow,
then all newly created files will be NOCOW. b) per file. Add the NOCOW file
attribute. It could only be done to empty or new files.

This patch tries the second way, according to 'nocow' option, it could set
NOCOW flag per file:
for raw file images, handle 'nocow' in libvirt code; for non-raw file images,
pass 'nocow=on' option to qemu-img, and let qemu-img to handle that (requires
qemu-img version >= 2.1).

Signed-off-by: Chunyan Liu <cyliu@suse.com>
2014-07-16 13:35:20 +02:00
Michal Privoznik
607806f87f Fix const correctness
In many places we define a variable as a 'const char *' when in fact
we modify it just a few lines below. Or even free it. We should not do
that.

There's one exception though, in xenSessionFree() xenapi_utils.c. We
are freeing the xen_session structure which is defined in
xen/api/xen_common.h public header. The structure contains session_id
which is type of 'const char *' when in fact it should have been just
'char *'. So I'm leaving this unmodified, just noticing the fact in
comment.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-07-16 12:07:24 +02:00
Peter Krempa
70120e2f5d storage: fs: Don't fail volume update if backing store isn't accessible
When the backing store of a volume wasn't accessible while updating the
volume definition the call would fail altogether. In cases where we
currently (incorrectly) treat remote backing stores as local one this
might lead to strange errors.

Ignore the opening errors until we figure out how to track proper volume
metadata.
2014-07-16 11:42:52 +02:00
Peter Krempa
dc2943579f storage: fs: Properly parse backing store info
Use the backing store parser to properly create the information about a
volume's backing store. Unfortunately as the storage driver isn't
prepared to allow volumes backed by networked filesystems add a
workaround that will avoid changing the XML output.
2014-07-16 11:42:51 +02:00
Peter Krempa
cd4d547576 storage: fs: Process backing store data in virStorageBackendProbeTarget
Move the processing of the backend metadata directly to the helper
instead of passing it through arguments to the function.
2014-07-16 11:42:51 +02:00
Peter Krempa
9f20d6a56d storage: backend: fs: Touch up coding style
virStorageBackendFileSystemRefresh() used "cleanup" label just for error
exits and didn't meet libvirt's standard for braces in one case.
2014-07-16 11:42:51 +02:00
Peter Krempa
15213d1e5d storage: Track backing store of a volume in the target struct
As we have a nested pointer for storing the backing store of a volume
there's no need to store it in a separate struct.
2014-07-16 11:42:51 +02:00
Peter Krempa
c861750ee9 storage: backend: Fix formatting of function arguments 2014-07-16 11:42:51 +02:00
Ján Tomko
3103a9770f Fix assignment of comparison against zero
Assign the value we're comparing:
(val = func()) < 0
instead of assigning the comparison value:
(val = func() < 0)

Both were introduced along with the code,
the TLS tests by commit bd789df in 0.9.4
net events by commit de87691 in 1.2.2.

Note that the event id type fix is a no-op:
vshNetworkEventIdTypeFromString can only return
-1 (failure) and the event is never used or
0 (the only possible event) and the value of 0 < 0 is still 0.
2014-07-16 09:39:57 +02:00
Ján Tomko
d7dedc3650 Fix error on fs pool build failure
https://bugzilla.redhat.com/show_bug.cgi?id=1119592

Introduced by commit 62927dd v0.7.6.
2014-07-16 09:39:57 +02:00
Eric Blake
13228b854c spec: fix invalid syntax
Commit 20e01504 broke 'make rpm':

error: line 540: Unknown tag:     %elif 020 >= 12 || 0 >= 6

Apparently, even though shell has elif so that you can do a chain
of conditionals, the rpm spec file does not, and you have to nest
things instead.

* libvirt.spec.in: Convert %elif to proper nested %if.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-07-15 17:11:56 -06:00
Cédric Bosdonnat
9265f8ab67 Rework lxc apparmor profile
Rework the apparmor lxc profile abstraction to mimic ubuntu's container-default.
This profile allows quite a lot, but strives to restrict access to
dangerous resources.

Removing the explicit authorizations to bash, systemd and cron files,
forces them to keep the lxc profile for all applications inside the
container. PUx permissions where leading to running systemd (and others
tasks) unconfined.

Put the generic files, network and capabilities restrictions directly
in the TEMPLATE.lxc: this way, users can restrict them on a per
container basis.
2014-07-15 12:57:05 -06:00
Roman Bogorodskiy
61bbdbb94c Implement interface stats for BSD 2014-07-15 22:00:59 +04:00
Roman Bogorodskiy
5559a8b838 util: virstatslinux: make more generic
Rename linuxDomainInterfaceStats to virNetInterfaceStats in order
to allow adding platform specific implementations without
making consumer worrying about specific implementation to be used.

Also, rename util/virstatslinux.c to util/virstats.c so placing
other platform specific implementations into this file don't
look unexpected from the file name.
2014-07-15 22:00:59 +04:00
Chunyan Liu
2f97ea328f libxl: fix return value error Attach|DetachDeviceFlags
Code logic in libxlDomainAttachDeviceFlags and libxlDomainDetachDeviceFlags
is wrong with return value in error cases.

'ret' was being set to 0 if 'flags & VIR_DOMAIN_DEVICE_MODIFY_CONFIG' was
false. Then if something like virDomainDeviceDefParse() failed in the
VIR_DOMAIN_DEVICE_MODIFY_LIVE logic, the error would be reported but the
function would return success.

Signed-off-by: Chunyan Liu <cyliu@suse.com>
2014-07-15 11:02:25 -06:00
Chunyan Liu
b0d2454023 libxl: support hotplug of <interface>
Add code to support attach/detaching a network device.

Signed-off-by: Chunyan Liu <cyliu@suse.com>
2014-07-15 11:00:47 -06:00
Chunyan Liu
232cf2a45c libxl: add HOSTDEV type in libxlDomainDetachDeviceConfig
Missing HOSTDEV type in libxlDomainDetachDeviceConfig. Add it.

Signed-off-by: Chunyan Liu <cyliu@suse.com>
2014-07-15 09:10:30 -06:00
Jiri Denemark
20e01504a1 spec: Update polkit dependencies for CVE-2013-4311
Use secured polkit on distros which provide it. However, RHEL-6 will
still allow for older polkit-0.93 rather than forcing polkit-0.96-5
which is not available in all RHEL-6 releases.
2014-07-15 16:34:53 +02:00
Peter Krempa
95d6aff787 qemu: blockcopy: Initialize correct source structure
4cc1f1a01f introduced a crash when doing a
block copy as virStorageSourceInitChainElement was called on
"disk->mirror" that is still NULL at that point instead of "mirror"
which temporarily holds the mirror source struct until it's fully
initialized. This resulted into a crash as a NULL was dereferenced.

Reported by: Shanzi Yu <shyu@redhat.com>
2014-07-15 10:31:36 +02:00
John Ferlan
54d4619cda GetBlockInfo: Use the correct path to qemuOpenFile
Commit id '3ea661de' refactored the code to use the 'disk->src->path'
instead of getting the path from virDomainDiskGetSource().  The one
call to qemuOpenFile() didn't use the disk source path, rather it used
the path as passed from the caller (in this case 'vda') - this caused
a failure with the virt-test/tp-libvirt as follows:

$ virsh domblkinfo virt-tests-vm1 vda
error: cannot stat file '/home/virt-test/shared/data/images/jeos-20-64.qcow2': Bad file descriptor

$
2014-07-14 13:19:28 -04:00
Eric Blake
58156f39ce capabilities: use bool instead of int
While preparing to add a capability for active commit, I noticed
that the existing code was abusing int for boolean values.

* src/conf/capabilities.h (_virCapsGuestFeature, _virCapsHost)
(virCapabilitiesNew, virCapabilitiesAddGuestFeature): Improve
types.
* src/conf/capabilities.c (virCapabilitiesNew)
(virCapabilitiesAddGuestFeature): Adjust signature.
* src/bhyve/bhyve_capabilities.c (virBhyveCapsBuild): Update
clients.
* src/esx/esx_driver.c (esxCapsInit): Likewise.
* src/libxl/libxl_conf.c (libxlMakeCapabilities): Likewise.
* src/lxc/lxc_conf.c (virLXCDriverCapsInit): Likewise.
* src/openvz/openvz_conf.c (openvzCapsInit): Likewise.
* src/parallels/parallels_driver.c (parallelsBuildCapabilities):
Likewise.
* src/phyp/phyp_driver.c (phypCapsInit): Likewise.
* src/qemu/qemu_capabilities.c (virQEMUCapsInit)
(virQEMUCapsInitGuestFromBinary): Likewise.
* src/security/virt-aa-helper.c (get_definition): Likewise.
* src/test/test_driver.c (testBuildCapabilities): Likewise.
* src/uml/uml_conf.c (umlCapsInit): Likewise.
* src/vbox/vbox_tmpl.c (vboxCapsInit): Likewise.
* src/vmware/vmware_conf.c (vmwareCapsInit): Likewise.
* src/xen/xen_hypervisor.c (xenHypervisorBuildCapabilities):
Likewise.
* src/xenapi/xenapi_driver.c (getCapsObject): Likewise.
* tests/qemucaps2xmltest.c (testGetCaps): Likewise.
* tests/testutils.c (virTestGenericCapsInit): Likewise.
* tests/testutilslxc.c (testLXCCapsInit): Likewise.
* tests/testutilsqemu.c (testQemuCapsInit): Likewise.
* tests/testutilsxen.c (testXenCapsInit): Likewise.
* tests/vircaps2xmltest.c (buildVirCapabilities): Likewise.
* tests/vircapstest.c (buildNUMATopology): Likewise.
* tests/vmx2xmltest.c (testCapsInit): Likewise.
* tests/xml2vmxtest.c (testCapsInit): Likewise.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-07-14 08:00:46 -06:00
Eric Blake
06cf86e94b docs: mention more about older capability feature bits
Our documentation for features was rather sparse; this fleshes out
more of the details for other existing capabilities (and cost me
some time trawling git history).

* docs/formatcaps.html.in: Document it feature bits.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-07-14 06:11:34 -06:00
Chunhe Li
33445ce844 openvswitch: Delete port if it exists while adding a new one
If the openvswitch service is stopped, and is followed by destroying a
VM, the openvswitch bridge translates into a state where it doesn't
recover the port configuration. While it successfully fetches data
from the internal DB, since the corresponding virtual interface does
not exists anymore the whole recovery process fails leaving restarted
VM with inability to connect to the bridge. The following set of
commands will trigger the problem:

virsh start vm
service openvswitch-switch stop
virsh destroy vm
service openvswitch-switch start
virsh start vm

Signed-off-by: Chunhe Li <lichunhe@huawei.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-07-14 12:49:30 +02:00
John Ferlan
1c89f6ebd4 virseclabel: Resolve Coverity FORWARD_NULL issue
Resolve issue introduced by commit id '13adf1b'
2014-07-14 05:44:20 -04:00
Michal Privoznik
da78351b57 virSecurityLabelDefParseXML: Rework
Instead of allocating the virSecurityLabelDef structure ourselves, we
can utilize virSecurityLabelDefNew which even sets the default values
for us.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-07-14 11:10:09 +02:00
Michal Privoznik
99c8d2e808 conf: Always format seclabel's model
https://bugzilla.redhat.com/show_bug.cgi?id=1113860

We've always done that. Well, until 990e46c45. Point is, if we don't
format model, we may lose a domain on libvirtd restart. If the
seclabel is implicit however, we should skip it's formatting.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-07-14 11:10:09 +02:00
Peter Krempa
6f04fb151b doc: Be more specific about semantics of _REUSE_EXT flag
Snapshots and block-copy have a flag that forces qemu to re-use existing
file. Our docs weren't exactly clear on what the existing file should
contain for this to actually work.

Re-word the docs a bit to state that the file needs to be pre-created in
the desired format and the backing chain metadata needs to be set prior
to handing it over to qemu.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1084360
2014-07-14 09:26:39 +02:00
Peter Krempa
500f80a595 doc: Document that snapshot name of block-backed disk isn't autogenerated
Libvirt generates external snapshot target file names for file backed
storage but not for block backed storage. Document the limitation.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1032363
2014-07-14 09:26:26 +02:00
Matthias Bolte
270969c4dd conf: Fix possible NULL dereference in virStorageVolTargetDefFormat
Commit dae1568c6c converted the perms
member of the virStorageVolTarget struct into a pointer to make it
optional. But virStorageVolTargetDefFormat did not check perms for
NULL before dereferencing it.
2014-07-11 17:00:46 -06:00
Cédric Bosdonnat
9b1e4cd503 aa-helper: adjust previous patch
Don't fail when there is nothing to do, as a tweak to the previous
patch regarding output of libvirt-UUID.files for LXC apparmor profiles

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-07-11 14:14:50 -06:00