12605 Commits

Author SHA1 Message Date
Chen Hanxiao
a86b6215a7 LXC: create a bind mount for sysfs when enable userns but disable netns
Kernel commit 7dc5dbc879bd0779924b5132a48b731a0bc04a1e
forbids doing a fresh mount of sysfs
when userns is enabled but netns is disabled.
This patch creates a bind mount in this scenario.

Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com>
2014-07-23 15:09:09 +08:00
Peter Krempa
1e833899ce qemu: snapshot: Forbid taking/reverting snapshots in PMSUSPENDED state
Qemu doesn't currently support them and behaves strangely. Just forbid
them.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1079162
2014-07-22 10:22:35 +02:00
Peter Krempa
c71045a9cb qemu: snapshot: Forbid taking snapshot in invalid state
Similarly to 49a3a649a85f9d3d478be355aa8694bce889586a forbid creating
snapshots in domain states impossible to reach in qemu.
2014-07-22 10:22:35 +02:00
Eric Blake
72823b4443 build: fix build without numactl
Under ./configure --without-numactl but with numactl-devel installed,
the build fails with:

../../src/util/virnuma.c: In function 'virNumaNodeIsAvailable':
../../src/util/virnuma.c:407:5: error: implicit declaration of function 'numa_bitmask_isbitset' [-Werror=implicit-function-declaration]
     return numa_bitmask_isbitset(numa_nodes_ptr, node);
     ^

and other failures, all because the configure results for particular
functions were used without regard to whether libnuma was even being
linked in.

* src/util/virnuma.c (virNumaGetPages): Fix message typo.
(virNumaNodeIsAvailable): Correct build when not using numactl.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-07-21 12:50:00 -06:00
Roman Bogorodskiy
53939d58cb storage: logical: drop useless if
virStorageBackendLogicalCreateVol contains a piece like:

    if (vol->target.path != NULL) {
        /* A target path passed to CreateVol has no meaning */
        VIR_FREE(vol->target.path);
    }

The 'if' is useless here, but 'syntax-check' doesn't catch that
because of the comment, so drop the 'if'.
2014-07-21 21:34:14 +04:00
Roman Bogorodskiy
b5f57be2a2 Fix build on non-Linux platforms
Commit ef48a1b introduced virFindSCSIHostByPCI for Linux and
a stub for other platforms that returns -1 while the function
should return 'char *', so use 'return NULL' instead.

Commit fbd91d4 introduced virReadSCSIUniqueId with the third
argument 'int *result', however the stub for non-Linux platforms
uses 'unsigned int *result', so change it to 'int *result'.

Pushed under the build breaker rule.
2014-07-21 21:26:00 +04:00
John Ferlan
ea37fb34a9 getAdapterName: Lookup stable scsi_host
If a parentaddr was provided in the XML, have getAdapterName lookup
the stable address.  This allows virStorageBackendSCSICheckPool() and
virStorageBackendSCSIRefreshPool() to automagically find the scsi_host
by its PCI address and unique_id
2014-07-21 12:55:11 -04:00
John Ferlan
ef48a1b613 scsi_host: Introduce virFindSCSIHostByPCI
Introduce a new function to parse the provided scsi_host parent address
and unique_id value in order to find the /sys/class/scsi_host directory
which will allow a stable SCSI host address

Add a test to scsihosttest to lookup the host# name by using the PCI address
and unique_id value
2014-07-21 12:55:11 -04:00
John Ferlan
f3271f4cb3 Add unique_id to nodedev output
Add an optional unique_id parameter to nodedev.  Allows for easier lookup
and display of the unique_id value in order to document for use with
scsi_host code.
2014-07-21 12:55:11 -04:00
John Ferlan
fbd91d496e virutil: Introduce virReadSCSIUniqueId
Introduce a new function to read the current scsi_host entry and return
the value found in the 'unique_id' file.

Add a 'scsihosttest' test (similar to the fchosttest, but incorporating some
of the concepts of the mocked pci test library) in order to read the
unique_id file as it would be found in the /sys/class/scsi_host tree.
2014-07-21 12:55:11 -04:00
John Ferlan
aa9dac09b3 scsi_backend: Use existing LINUX_SYSFS_SCSI_HOST_PREFIX definition
Rather than supplying the path again in the formatting of the sysfs
scsi_host directory.
2014-07-21 12:55:10 -04:00
Osier Yang
a4bd62adc1 storage: Introduce parentaddr into virStoragePoolSourceAdapter
Between reboots and kernel reloads, the SCSI host number used for SCSI
storage pools may change requiring modification to the storage pool XML
in order to use a specific SCSI host adapter.

This patch introduces the "parentaddr" element and "unique_id" attribute
for the SCSI host adapter in order to uniquely identify the adapter
between reboots and kernel reloads. For now the goal is to only parse
and format the XML. Both will be required to be provided in order to
uniquely identify the desired SCSI host.

The new XML is expected to be as follows:

  <adapter type='scsi_host'>
    <parentaddr unique_id='3'>
      <address domain='0x0000' bus='0x00' slot='0x1f' func='0x2'/>
    </parentaddr>
  </adapter>

where "parentaddr" is the parent device of the SCSI host using the PCI
address on which the device resides and the value from the unique_id file
for the device. Both the PCI address and unique_id values will be used
to traverse the /sys/class/scsi_host/ directories looking at each link
to match the PCI address reformatted to the directory link format where
"domain:bus:slot:function" is found.  Then for each matching directory
the unique_id file for the scsi_host will be used to match the unique_id
value in the xml.

For a PCI address listed above, this will be formatted to "0000:00:1f.2"
and the links in /sys/class/scsi_host will be used to find the host#
to be used for the 'scsi_host' device. Each entry is a link to the
/sys/bus/pci/devices directories, e.g.:

%  ls -al /sys/class/scsi_host/host2
lrwxrwxrwx. 1 root root 0 Jun  1 00:22 /sys/class/scsi_host/host2 -> ../../devices/pci0000:00/0000:00:1f.2/ata3/host2/scsi_host/host2

% cat /sys/class/scsi_host/host2/unique_id
3

The "parentaddr" and "name" attributes are mutually exclusive to identify
the SCSI host number. Use of the "parentaddr" element will be the preferred
mechanism.

This patch only supports parsing and formatting the XML. Later patches will
add code to find the SCSI host number.
2014-07-21 12:55:10 -04:00
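The lookup that a4bd62adc1 describes (format the PCI address, find the scsi_host link that passes through it, then compare unique_id) can be sketched as follows. This is an illustrative Python sketch and not libvirt's implementation: `format_pci_address`, `find_scsi_host` and the `prefix` parameter are hypothetical names, and `prefix` stands in for /sys/class/scsi_host so the logic can be tried against any directory tree.

```python
import os


def format_pci_address(domain: int, bus: int, slot: int, func: int) -> str:
    """Format a PCI address the way the sysfs directory names spell it."""
    return f"{domain:04x}:{bus:02x}:{slot:02x}.{func:x}"


def find_scsi_host(prefix: str, pci_addr: str, unique_id: int):
    """Return the matching 'hostN' entry name under prefix, or None.

    prefix stands in for /sys/class/scsi_host (assumption for this sketch);
    each entry is expected to be a link whose target path contains the
    parent PCI address as one of its components.
    """
    for entry in sorted(os.listdir(prefix)):
        link = os.path.join(prefix, entry)
        target = os.path.realpath(link)
        # Compare whole path components so one address cannot match
        # inside a longer, different directory name.
        if pci_addr not in target.split(os.sep):
            continue
        try:
            with open(os.path.join(link, "unique_id")) as handle:
                found = int(handle.read().strip())
        except (OSError, ValueError):
            # Missing or unparsable unique_id: treat as "no match".
            continue
        if found == unique_id:
            return entry
    return None
```

With the address from the XML above, `format_pci_address(0x0000, 0x00, 0x1f, 0x2)` yields the "0000:00:1f.2" form quoted in the commit message, which is then matched against the link targets.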
Osier Yang
53f620568e virStoragePoolSourceAdapter: Refine the SCSI_HOST adapter name
Preparation for future patches by creating a scsi_host union. For now,
just the 'name' will be present.
2014-07-21 12:55:10 -04:00
John Ferlan
8d854e5b5b getAdapterName: check for SCSI_HOST
Rather than assume that NOT FC_HOST is SCSI_HOST, let's call them out
specifically. Makes it easier to find SCSI_HOST code/structs and ensures
something isn't missed in the future
2014-07-21 12:55:10 -04:00
Peter Krempa
6b1f9feccf node_device: HAL: Ignore return value of virStrToLong_ui
Commit 5df813177c3b609a8cf5db26ae94b26d4a40063d forgot to adjust a few
callers of virStrToLong_ui to ignore the returned value in some ancient
parts of the code.
2014-07-21 16:32:53 +02:00
Peter Krempa
5df813177c util: Check return value from virStrToLong* functions
We do so in the vast majority of places, so there's no problem with adding
the attribute to have the compiler enforce it and fixing the few leftover
places.

This was originally pointed out by Coverity, as a recent change triggered
its warning that our code checks the vast majority of returns from
virStrToLong_ui.
2014-07-21 15:20:59 +02:00
Peter Krempa
49a3a649a8 qemu: snapshot: Reject reversion from clearly bad states
Report errors for states that snapshots done by qemu should never reach.
2014-07-21 11:09:53 +02:00
Peter Krempa
aa7e76a579 qemu: snapshot: Convert if-else switch to switch statement
Convert the target snapshot state selector to a switch statement
enumerating all possible values. This points out a few mistakes in the
original selector.

The logic of the code is preserved until later patches.
2014-07-21 11:00:11 +02:00
Roman Bogorodskiy
29e45ea15a bhyve: reconnect to domains after libvirtd restart
Try to reconnect to the running domains after libvirtd restart. To
achieve that, do:

 * Save domain state
  - Modify virBhyveProcessStart() to save domain state to the state
    dir
  - Modify virBhyveProcessStop() to cleanup the pidfile and the state

 * Detect if the state information loaded from the driver's state
   dir matches the actual state. Consider domain active if:
    - The PID it points to exists
    - Process title of this PID matches the expected one with the
      domain name

   Otherwise, mark the domain as shut off.

Note: earlier development versions of bhyve, before FreeBSD 10.0-RELEASE,
didn't set the proctitle we expect, so the current code will not detect
them. I don't plan to add support for this unless somebody requests
it.
2014-07-18 21:07:35 +04:00
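The liveness rule from 29e45ea15a (a saved domain counts as active only if its PID exists and the process title matches the expected one) can be sketched as below. This is a hedged Python illustration, not the bhyve driver code: `pid_exists`, `domain_is_active` and the injected `get_title` callable are hypothetical, the expected title is passed in rather than guessed, and exact string equality is an assumption about what "matches" means.

```python
import os


def pid_exists(pid: int) -> bool:
    """Probe a PID with signal 0, which checks existence without signalling."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


def domain_is_active(pid, expected_title, get_title, exists=pid_exists) -> bool:
    """Apply both checks; any failure means the domain is treated as shut off.

    get_title is a caller-supplied function mapping a PID to its process
    title, since how the title is read is platform specific.
    """
    if pid is None or pid <= 0 or not exists(pid):
        return False
    return get_title(pid) == expected_title
```

Passing the title reader and the existence probe as parameters keeps the decision logic testable without a running hypervisor.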
Peter Krempa
1f4933f0f4 qemu: snapshot: Forbid snapshots of iSCSI passthrough devices
As with the local SCSI passthrough devices, qemu can't support snapshots
on those as the block ops are handled by the device. This is also true
for iSCSI backing of the disk. Remove the check for the local block
device and just forbid snapshot when the disk is of type 'lun'.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1073368
2014-07-18 17:20:51 +02:00
Michal Privoznik
5028160523 Kill last strto{l,ll,d} scouts
There's no need to use them since we have these shiny functions
that even check for conversion and overflow errors.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-07-18 16:31:47 +02:00
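What "checks for conversion and overflow errors" means in 5028160523 can be illustrated with a small checked parser. This is only a Python sketch of the idea: `str_to_uint` and its `bits` parameter are hypothetical and do not mirror the signature of libvirt's virStrToLong helpers.

```python
def str_to_uint(text, base: int = 10, bits: int = 32):
    """Return the parsed value, or None on any conversion or range error."""
    try:
        value = int(text, base)
    except (TypeError, ValueError):
        # Empty strings, trailing garbage and non-string input all end up here.
        return None
    if value < 0 or value >= 1 << bits:
        # Out of range for an unsigned integer of the requested width.
        return None
    return value
```

Returning a single sentinel for every failure forces the caller to look at the result before using it, which is the same discipline the unused-result attribute in 5df813177c asks the compiler to enforce.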
Cédric Bosdonnat
7c10a77422 lxc conf2xml: convert lxc.network.name for veth networks
2014-07-18 14:26:03 +02:00
Cédric Bosdonnat
3ba0469ce6 lxc network configuration allows setting target container NIC name
LXC network devices can now be assigned a custom NIC device name on the
container side. For example, this is configured with:

    <interface type='network'>
      <source network='default'/>
      <guest dev="eth1"/>
    </interface>

In this example the network card will appear as eth1 in the guest.
2014-07-18 14:25:57 +02:00
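As a quick way to see what the `<guest dev=.../>` element from 3ba0469ce6 carries, the following Python sketch reads the container-side NIC name out of an interface definition. The helper name `guest_nic_name` is hypothetical and the code only inspects the XML; it says nothing about how the LXC driver applies the name.

```python
import xml.etree.ElementTree as ET


def guest_nic_name(interface_xml: str):
    """Return the container-side device name, or None if none was requested."""
    iface = ET.fromstring(interface_xml)
    guest = iface.find("guest")
    return None if guest is None else guest.get("dev")
```

For the interface shown in the commit message this returns "eth1"; an interface without a `<guest>` element yields None.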
John Ferlan
8a9f7cbecd storage: Disallow vol_wipe for sparse logical volumes
https://bugzilla.redhat.com/show_bug.cgi?id=1091866

Add a new boolean 'sparse'.  This will be used by the logical backend
storage driver to determine whether the target volume is sparse or not
(also known as a snapshot or thin logical volume). Setting sparse to true
at creation could be seen as duplicating the setting done in
virStorageBackendLogicalMakeVol(), but it is done in case there are ever other
code paths between Create and FindLVs that need to know about the volume being sparse.

Use the 'sparse' in a new virStorageBackendLogicalVolWipe() to decide whether
to attempt to wipe the logical volume or not. For now, I have found no
means to wipe the volume without writing to it. Writing to the sparse
volume causes it to be filled. A sparse logical volume is not completely
writeable as there exists metadata which if overwritten will cause the
sparse lv to go INACTIVE which means pool-refresh will not find it.
Access to whatever lvm uses to manage data blocks is not provided by
any API I could find.
2014-07-17 16:28:59 -04:00
John Ferlan
10087386b9 storage: Convert 'building' into a bool
Rather than an unsigned int, use a 'bool' since that's how it was used.
2014-07-17 16:28:50 -04:00
Geoff Hickey
325f98aa75 esx: Fix a comment about VSphere versions
Update the VSphere version comment in esx_vi.c for ESX 5.1 and 5.5.
2014-07-17 21:19:42 +02:00
Roman Bogorodskiy
479ef260d8 Fix build by dropping redefined typedefs
Commit 93e82727 introduced numatune_conf.h file that contains
typedefs already defined in domain_conf.h, such as:

 - virDomainNumatune
 - virDomainNumatunePtr
 - virDomainDef
 - virDomainDefPtr

As numatune_conf.h is included by domain_conf.h, clang
complains about redefinition of typedef and the build fails.

In order to fix it, drop typedefs already defined by numatune_conf.h
from domain_conf.h.
2014-07-17 21:53:43 +04:00
Ján Tomko
490bf29d50 Log an error when we fail to set the COW attribute
Coverity complains about the return value of ioctl not being checked.

Even though we carry on when this fails (just like qemu-img does),
we can log an error.
2014-07-17 14:32:29 +02:00
Peter Krempa
11d28050c5 storage: Split out volume wiping as separate backend function
For non-local storage drivers we can't expect to use the "scrub" tool to
wipe the volume. Split the code into a separate backend function so that
we can add protocol specific code later.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1118710
2014-07-17 10:12:34 +02:00
Peter Krempa
4d799b65cd storage: wipe: Move helper code into storage backend
The next patch will move the storage volume wiping code into the
individual backends. This patch splits out the common code to wipe a
local volume into a separate backend helper so that the next patch is
simpler.
2014-07-17 10:12:34 +02:00
Geoff Hickey
861eced6f4 esx: Fix a bug in the XML code for storage pools
For ESX, the code that builds XML descriptions for attached storage pools was
not setting the host count to anything when it returned a host name.
2014-07-16 17:26:23 -06:00
Martin Kletzander
7e72ac7878 qemu: leave restricting cpuset.mems after initialization
When a domain is started with numatune memory mode strict and the
nodeset does not include the host NUMA node with DMA and DMA32 zones, KVM
initialization fails.  This is because cgroups restrict even kernel
allocations.  We are already doing numa_set_membind() which does the
same thing, only it does not restrict kernel allocations.

This patch leaves the userspace numa_set_membind() in place and moves
the cpuset.mems setting after the point where monitor comes up, but
before vcpu and emulator sub-groups are created.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:46 +02:00
Martin Kletzander
aa668fccf0 qemu: split out cpuset.mems setting
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:46 +02:00
Martin Kletzander
1c19d3e072 qemu: pass numa node binding preferences to qemu
Currently, we only bind the whole QEMU domain to memory nodes
specified in nodemask altogether.  That, however, doesn't make much
sense when one wants to control from where the memory for particular
guest nodes should be allocated.  QEMU allows us to do that by
specifying 'host-nodes' parameter for the 'memory-backend-ram' object,
so let's use that.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:46 +02:00
Martin Kletzander
001b9dc1dc qemu: enable disjoint numa cpu ranges
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:46 +02:00
Martin Kletzander
1a324c2f88 qemu: newer -numa parameter capability probing
When qemu switched to using OptsVisitor for -numa parameter, it did
two things in the same patch.  One of them is that the numa parameter
is now visible in "query-command-line-options", the second one is that
it enabled using disjoint cpu ranges for -numa specification.  This
will be used in later patch.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:46 +02:00
Martin Kletzander
ad064ec6e6 qemu: memory-backend-ram capability probing
The numa patch series in qemu adds "memory-backend-ram" object type by
which we can tell whether we can use such objects.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:46 +02:00
Martin Kletzander
7bc1db5a1d qemu: allow qmp probing for cmdline options without params
That can now be achieved by having .param == NULL in the
virQEMUCapsCommandLineProps struct.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:46 +02:00
Martin Kletzander
1a7be8c600 numatune: add support for per-node memory bindings in private APIs
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:45 +02:00
Martin Kletzander
a05c01521c conf, schema: add support for memnode elements
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:45 +02:00
Martin Kletzander
93e82727ec numatune: Encapsulate numatune configuration in order to unify results
There were numerous places where numatune configuration (and thus
domain config as well) was changed in different ways.  In some
places this even resulted in the persistent domain definition not being
stable (it would change on daemon restart).

In order to uniformly change how numatune config is dealt with, all
the internals are now accessible directly only in numatune_conf.c and
outside this file accessors must be used.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:45 +02:00
Martin Kletzander
e764ec7ae3 numatune: unify numatune struct and enum names
Since there was already public virDomainNumatune*, I changed the
private virNumaTune to match the same, so all the uses are unified and
public API is kept:

s/vir\(Domain\)\?Numa[tT]une/virDomainNumatune/g

then shrunk long lines, and mainly functions, that were created after
that:

sed -i 's/virDomainNumatuneMemPlacementMode/virDomainNumatunePlacement/g'

And to cope with the enum name, I had to change the constants as
well:

s/VIR_NUMA_TUNE_MEM_PLACEMENT_MODE/VIR_DOMAIN_NUMATUNE_PLACEMENT/g

Last thing I did was at least a little shortening of already long
name:

s/virDomainNumatuneDef/virDomainNumatune/g

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:45 +02:00
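The four substitutions listed in e764ec7ae3 can be replayed on sample identifiers to see how they compose. The sketch below translates the sed expressions into Python regular expressions; applying them in the order the commit message lists them is an assumption, and `apply_renames` is a hypothetical helper.

```python
import re

# Each pair is (pattern, replacement), in the order given in the commit message.
RENAMES = [
    (r"vir(Domain)?Numa[tT]une", "virDomainNumatune"),
    (r"virDomainNumatuneMemPlacementMode", "virDomainNumatunePlacement"),
    (r"VIR_NUMA_TUNE_MEM_PLACEMENT_MODE", "VIR_DOMAIN_NUMATUNE_PLACEMENT"),
    (r"virDomainNumatuneDef", "virDomainNumatune"),
]


def apply_renames(text: str) -> str:
    """Apply every substitution globally, one after another."""
    for pattern, replacement in RENAMES:
        text = re.sub(pattern, replacement, text)
    return text
```

Because the first rule already normalises both spellings to virDomainNumatune, the later rules only need to handle the unified prefix.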
Martin Kletzander
293d5f21b6 numatune: create new module for numatune
There are many places with numatune-related code that should be put
into special numatune_conf and this patch creates a basis for that.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:45 +02:00
Martin Kletzander
992000e6d8 conf, schema: add 'id' field for cells
In XML format, by definition, order of fields should not matter, so
order of parsing the elements doesn't affect the end result.  When
specifying guest NUMA cells, we depend only on the order of the 'cell'
elements.  With this patch all older domain XMLs are parsed as before,
but with the 'id' attribute they are parsed and formatted according to
that field.  This will be useful when we have tuning settings for
particular guest NUMA node.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:45 +02:00
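The ordering rule from 992000e6d8 (document order for older XML, 'id' order when the attribute is present) can be sketched as follows. This Python illustration rests on assumptions: `ordered_cells` is a hypothetical helper, the `cpus` attribute is used only to make the result visible, and falling back to document order unless every cell has an id is a choice made for this sketch, not a statement about how libvirt treats mixed input.

```python
import xml.etree.ElementTree as ET


def ordered_cells(numa_xml: str):
    """Return the 'cpus' attribute of each cell in the order it would be used."""
    cells = ET.fromstring(numa_xml).findall("cell")
    if cells and all(cell.get("id") is not None for cell in cells):
        # Every cell is numbered explicitly: order by id, not by position.
        cells = sorted(cells, key=lambda cell: int(cell.get("id")))
    return [cell.get("cpus") for cell in cells]
```

With ids present, swapping the two `<cell>` elements in the XML no longer changes which guest node gets which CPUs.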
Martin Kletzander
775c46956e conf: purely a code movement
to ease the review of commits to follow.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:45 +02:00
Martin Kletzander
92ff464bbb qemu: remove useless error check
Excerpt from the virCommandAddArgBuffer() description: "Correctly
transfers memory errors or contents from buf to cmd."

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:45 +02:00
Martin Kletzander
cee22001d3 qemu: purely a code movement
to ease the review of commits to follow.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:45 +02:00
Michele Paolino
a14abd463a support for QEMU vhost-user
This patch adds support for the QEMU vhost-user feature to libvirt.
vhost-user enables communication between a QEMU virtual machine
and another userspace process using the Virtio transport protocol.
It uses a char dev (e.g. Unix socket) for the control plane,
while the data plane is based on shared memory.

The XML looks like:

<interface type='vhostuser'>
    <mac address='52:54:00:3b:83:1a'/>
    <source type='unix' path='/tmp/vhost.sock' mode='server'/>
    <model type='virtio'/>
</interface>

Signed-off-by: Michele Paolino <m.paolino@virtualopensystems.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-07-16 18:44:57 +02:00
Eric Blake
97c59b9c46 blockjob: wait for pivot to complete
https://bugzilla.redhat.com/show_bug.cgi?id=1119173 documents that
commit eaba79d was flawed in the implementation of the
VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC flag when it comes to completing
a blockcopy.  Basically, the qemu pivot action is async (the QMP
command returns immediately, but the user must wait for the
BLOCK_JOB_COMPLETE event to know that all I/O related to the job
has finally been flushed), but the libvirt command was documented
as synchronous by default.  As active block commit will also be
using this code, it is worth fixing now.

* src/qemu/qemu_driver.c (qemuDomainBlockJobImpl): Don't skip wait
loop after pivot.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-07-16 07:23:24 -06:00
Eric Blake
a0b5ace28c util: forbid freeing const pointers
Now that we've finally fixed all the violators, it's time to
enforce that any pointer to a const object is never freed (it
is aliasing some other memory, where the non-const original
should be freed instead).  Alas, the code still needs a normal
vs. Coverity version, but at least we are still guaranteeing
that the macro call evaluates its argument exactly once.

I verified that we still get the following compiler warnings,
which in turn halts the build thanks to -Werror on gcc (hmm,
gcc 4.8.3's placement of the ^ for ?: type mismatch is a bit
off, but that's not our problem):

    int oops1 = 0;
    VIR_FREE(oops1);
    const char *oops2 = NULL;
    VIR_FREE(oops2);
    struct blah { int dummy; } oops3;
    VIR_FREE(oops3);

util/virauthconfig.c:159:35: error: pointer/integer type mismatch in conditional expression [-Werror]
     VIR_FREE(oops1);
                                   ^
util/virauthconfig.c:161:5: error: passing argument 1 of 'virFree' discards 'const' qualifier from pointer target type [-Werror]
     VIR_FREE(oops2);
     ^
In file included from util/virauthconfig.c:28:0:
util/viralloc.h:79:6: note: expected 'void *' but argument is of type 'const void *'
 void virFree(void *ptrptr) ATTRIBUTE_NONNULL(1);
      ^
util/virauthconfig.c:163:35: error: type mismatch in conditional expression
     VIR_FREE(oops3);
                                   ^

* src/util/viralloc.h (VIR_FREE): No longer cast away const.
* src/xenapi/xenapi_utils.c (xenSessionFree): Work around bogus
header.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-07-16 06:48:53 -06:00