Commit Graph

18555 Commits

Author SHA1 Message Date
John Ferlan
f462f9ad6e storage: Clean up return value checking
Rather than special casing the VIR_STORAGE_BLKID_PROBE_UNKNOWN after
calling virStorageBackendBLKIDFindPart, just allow the switch statement
handle setting ret = -2.
2017-01-14 10:13:05 -05:00
John Ferlan
d1f5dfc416 storage: Alter logic when both BLKID and PARTED unavailable
If neither BLKID or PARTED is available and we're not writing, then
just return 0 which allows the underlying storage pool to generate
a failure. If both are unavailable and we're writing, then generate
a more generic error message.
2017-01-14 10:13:05 -05:00
Collin L. Walling
e8a43f1995 qemu-capabilities: Fix query-cpu-model-expansion on s390 with older kernel
When running on s390 with a kernel that does not support cpu model checking and
with a Qemu new enough to support query-cpu-model-expansion, the gathering of qemu
capabilities will fail. Qemu responds to the query-cpu-model-expansion qmp
command with an error because the needed kernel ioct does not exist. When this
happens a guest cannot even be defined due to missing qemu capabilities data.

This patch fixes the problem by silently ignoring generic errors stemming from
calls to query-cpu-model-expansion.

Reported-by: Farhan Ali <alifm@linux.vnet.ibm.com>
Signed-off-by: Collin L. Walling <walling@linux.vnet.ibm.com>
Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
2017-01-13 16:55:58 +01:00
Michal Privoznik
93a062c3b2 qemu: Copy SELinux labels for namespace too
When creating new /dev/* for qemu, we do chown() and copy ACLs to
create the exact copy from the original /dev. I though that
copying SELinux labels is not necessary as SELinux will chose the
sane defaults. Surprisingly, it does not leaving namespace with
the following labels:

crw-rw-rw-. root root system_u:object_r:tmpfs_t:s0     random
crw-------. root root system_u:object_r:tmpfs_t:s0     rtc0
drwxrwxrwt. root root system_u:object_r:tmpfs_t:s0     shm
crw-rw-rw-. root root system_u:object_r:tmpfs_t:s0     urandom

As a result, domain is unable to start:

error: internal error: process exited while connecting to monitor: Error in GnuTLS initialization: Failed to acquire random data.
qemu-kvm: cannot initialize crypto: Unable to initialize GNUTLS library: Failed to acquire random data.

The solution is to copy the SELinux labels as well.

Reported-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-13 14:45:52 +01:00
Peter Krempa
a80957c518 Revert "storage: Validate the device formats at logical startup"
The check is pointless since LVM is capable to detect it's own members
and the check is flawed as it would fail if neither libblkid nor parted
is installed.

We don't really need to babysit LVM in this way.

This reverts commit cb38b6cbc7.
2017-01-13 09:28:28 +01:00
Peter Krempa
9538dff96f Revert "storage: For FS pool check for properly formatted target volume"
The check does not work properly (crashes) with netfs filesystems and
also checking that a device is not empty when attempting to mount a
filesystem is not very usefull since the mount will fail anyways.

As the code would improve only a very minor corner case I don't really
see a reason to have this code at all.

This code would also fail if libvirt is compiled without support for
blkid and without parted.

This reverts commit a11fd69735.
2017-01-13 09:28:28 +01:00
Jim Fehlig
ecb587e4ca libxl: always enable pae for x86_64 HVM
For HVM domains, pae is only set in libxl_domain_build_info when
explicitly specified in the hypervisor <features> config. This is
fine for i686 machines, but is incorrect behavior for x86_64 machines
where pae must always be enabled. See the following discussion for
additional details

https://www.redhat.com/archives/libvir-list/2017-January/msg00254.html
2017-01-12 18:42:39 -07:00
Jiri Denemark
19e06cfa25 qemu: Ignore non-boolean CPU model properties
The query-cpu-model-expansion is currently implemented for s390(x) only
and all CPU properties it returns are booleans. However, x86
implementation will report more types of properties. Without making the
code more tolerant older libvirt would fail to probe newer QEMU
versions.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2017-01-12 11:58:25 +01:00
Jiri Denemark
ec23791517 qemu: Don't check CPU model property key
The qemuMonitorJSONParseCPUModelProperty function is a callback for
virJSONValueObjectForeachKeyValue and is called for each key/value pair,
thus it doesn't really make sense to check whether key is NULL.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2017-01-12 11:58:25 +01:00
Michal Privoznik
cbc45525cb qemuDomainCreateDevice: Canonicalize paths
So far the decision whether /dev/* entry is created in the qemu
namespace is really simple: does the path starts with "/dev/"?
This can be easily fooled by providing path like the following
(for any considered device like disk, rng, chardev, ..):

  /dev/../var/lib/libvirt/images/disk.qcow2

Therefore, before making the decision the path should be
canonicalized.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-11 18:08:13 +01:00
Michal Privoznik
49f326edc0 qemu: Use namespaces iff available on the host kernel
So far the namespaces were turned on by default unconditionally.
For all non-Linux platforms we provided stub functions that just
ignored whatever namespaces setting there was in qemu.conf and
returned 0 to indicate success. Moreover, we didn't really check
if namespaces are available on the host kernel.

This is suboptimal as we might have ignored user setting.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-11 18:07:43 +01:00
Michal Privoznik
41816751a7 util: Introduce virFileMoveMount
This is a simple wrapper over mount(). However, not every system
out there is capable of moving a mount point. Therefore, instead
of having to deal with this fact in all the places of our code we
can have a simple wrapper and deal with this fact at just one
place.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-11 18:06:30 +01:00
Michal Privoznik
cd32783cd4 lxc_container: Drop userns_supported
This is unnecessary wrapper around virProcessNamespaceAvailable().

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-11 18:05:03 +01:00
Michal Privoznik
083fcd06d3 lxc: Move lxcContainerAvailable to virprocess
Other drivers (like qemu) would like to know if the namespaces
are available therefore it makes sense to move this function to
a shared module.

At the same time, this function had some default namespaces that
are checked with every call. It is not necessary - let callers
pass just those namespaces they are interested in.

With the move the function is renamed to
virProcessNamespaceAvailable.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-11 18:02:35 +01:00
Michal Privoznik
2ff8c30548 qemuDomainSetupAllInputs: Update debug message
Due to a copy-paste error, the debug message reads:

  Setting up disks

It should have been:

  Setting up inputs.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-11 17:39:24 +01:00
Cédric Bosdonnat
162bb0e7b1 libxl: fix usb inputs loop error
List indexes where mixed up in the code looping over the USB
input devices.
2017-01-11 17:07:25 +01:00
Pino Toscano
1a5de3fe2e remote: do not check for an existing config dir
When composing the path to the default known_hosts file (for the libssh
and libssh2 drivers), do not check whether the configuration directory
(determined by virGetUserConfigDirectory()) exists: both the drivers can
handle non-existing files, and are able to create them (and their
directories) in that case.

This adds a small behaviour change: before, the key for an unknown host,
and manually accepted, was saved only if the configuration directory
existed -- a bit incoherent behaviour though.
2017-01-11 13:38:04 +01:00
Pino Toscano
45c4a70c70 remote: fix logic for known_hosts and keyfile checks
If any of them is specified for the libssh and libssh2 drivers, there is
no need to depend on checks based on other paths: in particular, a
specified path for known_hosts was ignored if the local config directory
could not be determined, and the path for keyfile was ignored if the
home could not be determined.

Instead, lazily determine and use these two paths only in case they are
needed.
2017-01-11 13:37:45 +01:00
Pino Toscano
408a1ce5f8 rpc: libssh: allow a NULL known_hosts file
Make sure that virNetLibsshSessionSetHostKeyVerification accepts a NULL
value for the path to the known_hosts file:
- call ssh_options_set(SSH_OPTIONS_KNOWNHOSTS) anyway, using /dev/null,
  otherwise libssh will use its default path
- do not call ssh_write_knownhost when no known hosts file was set

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1406457
2017-01-11 13:37:24 +01:00
Laine Stump
5949b53aec conf: eliminate virDomainPCIAddressReleaseSlot() in favor of ...Addr()
Surprisingly there was a virDomainPCIAddressReleaseAddr() function
already, but it was completely unused. Since we don't reserve entire
slots at once any more, there is no need to release entire slots
either, so we just replace the single call to
virDomainPCIAddressReleaseSlot() with a call to
virDomainPCIAddressReleaseAddr() and remove the now unused function.

The keen observer may be concerned that ...Addr() doesn't call
virDomainPCIAddressValidate(), as ...Slot() did. But really the
validation was pointless anyway - if the device hadn't been suitable
to be connected at that address, it would have failed validation
before every being reserved in the first place, so by definition it
will pass validation when it is being unplugged. (And anyway, even if
something "bad" happened and we managed to have a device incorrectly
at the given address, we would still want to be able to free it up for
use by a device that *did* validate properly).
2017-01-11 05:00:34 -05:00
Laine Stump
6cc2014202 qemu: rename qemuDomainPCIAddressReserveNextSlot() to ...Addr()
This function doesn't actually reserve an entire slot any more, it
reserves a single PCI address, so this name is more appropriate.
2017-01-11 05:00:08 -05:00
Laine Stump
c5aea19d56 qemu: remove qemuDomainPCIAddressReserveNextAddr()
This function is only called in two places, and the function itself is
just adding a single argument and calling
virDomainPCIAddressReserveNextAddr(), so we can remove it and instead
call virDomainPCIAddressReserveNextAddr() directly. (The main
motivation for doing this is to free up the name so that
qemuDomainPCIAddressReserveNextSlot() can be renamed in the next
patch, as its current name is now inaccurate and misleading).
2017-01-11 04:59:42 -05:00
Laine Stump
27b0f971c4 conf: rename virDomainPCIAddressReserveSlot() to ...Addr()
This function doesn't actually reserve an entire slot any more, it
reserves a single PCI address, so this name is more appropriate.
2017-01-11 04:58:32 -05:00
Laine Stump
640ce18679 conf: rename virDomainPCIAddressReserveAddr() to ...Internal()
This is in preparation for renaming virDomainPCIAddressReserveSlot()
to virDomainPCIAddressReserveAddr(), which is a better description of
what it does.
2017-01-11 04:57:06 -05:00
Laine Stump
24c8c47230 conf: make virDomainPCIAddressReserveAddr() a static function
It is now only used in domain_addr.c.
2017-01-11 04:55:43 -05:00
Laine Stump
905859a6e5 qemu: replace virDomainPCIAddressReserveAddr with virDomainPCIAddressReserveSlot
All occurences of the former use fromConfig=true, and that's exactly
how virDomainPCIAddressReserveSlot() calls
virDomainPCIaddressReserveAddr(), so just use *Slot() so that *Addr()
can be made static to conf/domain_addr.c (both functions will be
renamed in upcoming patches).
2017-01-11 04:55:06 -05:00
Laine Stump
43f8147749 conf: eliminate virDomainPCIAddressReserveNextSlot()
Since we don't actually reserve an entire slot at a time anymore, the
name of this function is just confusing, and it's almost identical in
operation to virDomainPCIAddressReserveNextAddr() anyway, so remove
the *Slot() function and replace calls to it with calls to *Addr(...,
-1).
2017-01-11 04:53:48 -05:00
Laine Stump
e97fab2665 conf: rename virDomainPCIAddressGetNextSlot() to ...GetNextAddr()
With the advent of VIR_PCI_CONNECT_AGGREGATE_SLOT, the new name is
more appropriate, since the address returned may be another address
on the same slot as last time, not necessarily a new slot.
2017-01-11 04:52:19 -05:00
Laine Stump
b59bbdba4b conf: fix fromConfig argument to virDomainPCIAddressValidate()
fromConfig should be true if the caller wants
virDomainPCIAddressValidate() to loosen restrictions on its
interpretation of the pciConnectFlags. In particular, either
PCI_DEVICE or PCIE_DEVICE will be counted as equivalent to both, and
HOTPLUG will be ignored. In a few cases where libvirt was manually
overriding automatic address assignment, it was setting fromConfig to
false when validating the hardcoded manual override. This patch
changes those to fromConfig=true as a preemptive strike against any
future bugs that might otherwise surface.
2017-01-11 04:51:54 -05:00
Laine Stump
79901543b9 conf: fix fromConfig argument to virDomainPCIAddressReserveAddr()
Although setting virDomainPCIAddressReserveAddr()'s fromConfig=true is
correct when a PCI addres is coming from a domain's config, the *true*
purpose of the fromConfig argument is to lower restrictions on what
kind of device can plug into what kind of controller - if fromConfig
is true, then a PCIE_DEVICE can plug into a slot that is marked as
only compatible with PCI_DEVICE (and vice versa), and the HOTPLUG flag
is ignored.

For a long time there have been several calls to
virDomainPCIAddressReserveAddr() that have fromConfig incorrectly set
to false - it's correct that the addresses aren't coming from user
config, but they are coming from hardcoded exceptions in libvirt that
should, if anything, pay *even less* attention to following the
pciConnectFlags (under the assumption that the libvirt programmer knew
what they were doing).

See commit b87703cf7 for an example of an actual bug caused by the
incorrect setting of the "fromConfig" argument to
virDomainPCIAddressReserveAddr(). Although they haven't resulted in
any reported bugs, this patch corrects all the other incorrect
settings of fromConfig in calls to virDomainPCIAddressReserveAddr().
2017-01-11 04:47:12 -05:00
Laine Stump
147ebe6ddf conf: aggregate multiple pcie-root-ports onto a single slot
Set the VIR_PCI_CONNECT_AGGREGATE_SLOT flag for pcie-root-ports so
that they will be assigned to all the functions on a slot.

Some qemu test case outputs had to be adjusted due to the
pcie-root-ports now being put on multiple functions.
2017-01-11 04:45:57 -05:00
Laine Stump
48d39cf96d conf: aggregate multiple devices on a slot when assigning PCI addresses
If a PCI device has VIR_PCI_CONNECT_AGGREGATE_SLOT set in its
pciConnectFlags, then during address assignment we allow multiple
instances of this type of device to be auto-assigned to multiple
functions on the same device. A slot is used for aggregating multiple
devices only if the first device assigned to that slot had
VIR_PCI_CONNECT_AGGREGATE_SLOT set. but any device types that have
AGGREGATE_SLOT set might be mix/matched on the same slot.

(NB: libvirt should never set the AGGREGATE_SLOT flag for a device
type that might need to be hotplugged. Currently it is only planned
for pcie-root-port and possibly other PCI controller types, and none
of those are hotpluggable anyway)

There aren't yet any devices that use this flag. That will be in a
later patch.
2017-01-11 04:43:22 -05:00
Laine Stump
8f4008713a qemu: use virDomainPCIAddressSetAllMulti() to set multi when needed
If there are multiple devices assigned to the different functions of a
single PCI slot, they will not work properly if the device at function
0 doesn't have its "multi" attribute turned on, so it makes sense for
libvirt to turn it on during PCI address assignment. Setting multi
then assures that the new setting is stored in the config (so it will
be used next time the domain is started), preventing any potential
problems in the case that a future change in the configuration
eliminates the devices on all non-0 functions (multi will still be set
for function 0 even though it is the only function in use on the slot,
which has no useful purpose, but also doesn't cause any problems).

(NB: If we were to instead just decide on the setting for
multifunction at runtime, a later removal of the non-0 functions of a
slot would result in a silent change in the guest ABI for the
remaining device on function 0 (although it may seem like an
inconsequential guest ABI change, it *is* a guest ABI change to turn
off the multi bit).)
2017-01-11 04:42:08 -05:00
Laine Stump
3c1a0fc27d conf: new function virDomainPCIAddressSetAllMulti()
This utility function iterates through all devices looking for any
with a PCI address that has function != 0 (which implies that multiple
functions are in use on that slot), then uses an inner iterator to
find the device that's on function 0 of that same slot and sets the
"multi" in its virDomainDeviceInfo (as long as it hasn't already been
set explicitly by someone who presumably has better information than
we do).

It isn't yet called from anywhere, so will have no functional effect.
2017-01-11 04:40:24 -05:00
Laine Stump
66e0b08d34 conf: start search for next unused PCI address at same slot as previous find
There is a very slight time advantage to beginning the search for the
next unused PCI address at the slot *after* the previous find (which
is now used), but if we do that, we will miss allocating the other
functions of the same slot (when we implement a
VIR_PCI_CONNECT_AGGREGATE_SLOT flag to support that).
2017-01-11 04:39:08 -05:00
Laine Stump
99bf66f3fa conf: eliminate repetitive code in virDomainPCIAddressGetNextSlot()
virDomainPCIAddressGetNextSlot() starts searching from the last
allocated address and goes to the end of all the buses, then goes back
to the first bus and searches from there up to the starting point (in
case any address has been freed since the last time an address was
allocated. The loops for these two are almost, but not exactly, the
same, so they have remained as separate loops with the same code
inside the loop. To lessen maintenance headaches, the identical code
has been moved out into the function
virDomainPCIAddressFindUnusedFunctionOnBus(), which is called in place
of the loop contents.
2017-01-11 04:38:04 -05:00
Laine Stump
9ff9d9f5a9 conf: eliminate concept of "reserveEntireSlot"
setting reserveEntireSlot really accomplishes nothing - instead of
going to the trouble of computing the value for reserveEntireSlot and
then possibly setting *all* functions of the slot as in-use, we can
just set the in-use bit only for the specific function being used by a
device.  Later we will know from the context (the PCI connect flags,
and whether we are reserving a specific address or asking for "the
next available") whether or not it is okay to allocate other functions
on the same slot.

Although it's not used yet, we allow specifying "-1" for the function
number when looking for the "next available slot" - this is going to
end up meaning "return the lowest available function in the slot, but
since we currently only provide a function from an otherwise unused
slot, "-1" ends up meaning "0".
2017-01-11 04:36:34 -05:00
Laine Stump
9838cad9cd conf: use struct instead of int for each slot in virDomainPCIAddressBus
When keeping track of which functions of which slots are allocated, we
will need to have more information than just the current bitmap with a
bit for each function that is currently stored for each slot in a
virDomainPCIAddressBus. To prepare for adding more per-slot info, this
patch changes "uint8_t slots" into "virDomainPCIAddressSlot slot", which
currently has a single member named "functions" that serves the same
purpose previously served directly by "slots".
2017-01-11 04:29:48 -05:00
Cédric Bosdonnat
a30b08b717 libxl: define a per-domain logger.
libxl doesn't provide a way to write one log for each domain. Thus
we need to demux the messages. If our logger doesn't know to which
domain to attribute a message, then it will write it to the default
log file.

Starting with Xen 4.9 (commit f9858025 and following), libxl will
write the domain ID in an easy to grab manner. The logger introduced
by this commit will use it to demux the libxl log messages.

Thanks to the default log file, this logger will also work with older
versions of Xen.
2017-01-11 09:32:47 +01:00
Dawid Zamirski
73c6f16baf vbox: consolidate vbox IID structures.
* remove _vboxIID_v2_x and _vboxIID_v3_x structs and repalce with one
  _vboxIID as all supprted vbox versions have the same IID structure.
* remove vboxIIDUnion that was used to abstract version depended IID
  differences.
* remove IID_MEMBER macro and use the new vboxIID directly.
2017-01-10 19:20:08 -05:00
Dawid Zamirski
3628891789 vbox: fix _displayTakeScreenShotPNGToArray
This function was not implemented for vbox 5+ which removed
TakeScreenShotPNGToArray but provides TakeScreenShotToArray with
BitmapFormat_PNG argument which is the same thing.
2017-01-10 19:20:07 -05:00
Dawid Zamirski
5a5c6de3a3 vbox: IVRDxServer to IVRDEServer.
The IVRDxServer was used because vbox < 4 used to have IVRDPServer
whereas vbox >= 4 has IVRDEServer. Now that support for legacy
versions is being removed, we can use IVRDEServer.
2017-01-10 19:20:06 -05:00
Dawid Zamirski
f2f70c21d0 vbox: remove code dealing with oldMediumInterface
* removed oldMediumInterface flag and related code that was used for
  vbox 2.x
* remove accelerate2DVideo and networkRemoveInterface flags which were
  also conditionals for handling legacy vbox versions.
2017-01-10 19:20:06 -05:00
Dawid Zamirski
1d963578e8 vbox: remove domain events support.
this was implemented only for vbox 3 series and was mostly stubs
anyway.
2017-01-10 19:20:06 -05:00
Dawid Zamirski
374422ea1c vbox: remove getMachineForSession flag.
* the getMachineForSession is always true for 4.0+. This also means that
  checkflag argument in openSessionForMachine no longer has any meaning
  because it was or'ed with getMachineForSession (always true)
* remove supportScreenshot flag - vbox 4.0+ supports it
* remove detachDevicesExplicitly flag only relevant for < 4.0
2017-01-10 19:19:49 -05:00
Dawid Zamirski
d7f369b571 vbox: do not use IHardDisk anymore.
VirtualBox 4.0+ uses IMedium and IHardDisk is no longer used, so

* remove typef IMedium IHardDisk
* merge UIHardDisk into UIMedium
* update all references accordingly
2017-01-10 19:19:49 -05:00
Dawid Zamirski
c7c286c6bd vbox: remove _vboxAttachDrivesOld
and fold vboxAttachDrivesNew into vboxAttachDrives
2017-01-10 19:19:49 -05:00
Dawid Zamirski
c8d7e90fd6 vbox: remove code for old API versions.
This removes most of the code wrapped in VBOX_API_VERSION < 4000000
preprocessor checks. Those are the ones that can be safely removed
without needing to update driver code to accomodate it.
2017-01-10 19:19:46 -05:00
Dawid Zamirski
655a99f166 vbox: remove calls to *InstallUniformedAPI macros.
That is, for versions older than 4.0. Also do not try to include
headers for those old versions.
2017-01-10 19:14:53 -05:00
Dawid Zamirski
7f10ac33e9 vbox: remove SDK header files for vbox 3 and older.
* delete SDK header files for vbox older than 4.0
* delete .c files for vbox older than 4.0
* update vbox_XPCOMCGlue to use oldest supported header file, that is 4.0
  going forward.
* remove deleted files from Makefile.am
2017-01-10 19:14:33 -05:00
Michal Privoznik
3027bacf95 virSecuritySELinuxSetFileconHelper: Fix build with broken selinux.h
There are still some systems out there that have broken
setfilecon*() prototypes. Instead of taking 'const char *tcon' it
is taking 'char *tcon'. The function should just set the context,
not modify it.

We had been bitten with this problem before which resulted in
292d3f2d and subsequently b109c09765. However, with one my latest
commits (4674fc6afd) I've changed the type of @tcon variable to
'const char *' which results in build failure on the systems from
above.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-10 19:23:49 +01:00
Michal Privoznik
269589146c qemu_domain: Move qemuDomainGetPreservedMounts
This function is used only from code compiled on Linux. Therefore
on non-Linux platforms it triggers compilation error:

../../src/qemu/qemu_domain.c:209:1: error: unused function 'qemuDomainGetPreservedMounts' [-Werror,-Wunused-function]

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-10 19:23:49 +01:00
Peter Krempa
b469853812 qemu: blockjob: Fix locking of block copy/active block commit
For the blockjobs, where libvirt is able to track the state internally
we can fix locking of images we can remove the appropriate locks.

Also when doing a pivoting operation we should not acquire the lock on
any of those images since both are actually locked already.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1302168
2017-01-10 19:12:19 +01:00
Peter Krempa
f61e40610d qemu: snapshot: Properly handle image locking
Images that became the backing chain of the current image due to the
snapshot need to be unlocked in the lock manager. Also if qemu was
paused during the snapshot the current top level images need to be
released until qemu is resumed so that they can be acquired properly.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1191901
2017-01-10 19:12:19 +01:00
Peter Krempa
cbb4d229de qemu: snapshot: Refactor snapshot rollback on failure
The code at first changed the definition and then rolled it back in case
of failure. This was ridiculous. Refactor the code so that the image in
the definition is changed only when the snapshot is successful.

The refactor will also simplify further fix of image locking when doing
snapshots.
2017-01-10 19:12:19 +01:00
Peter Krempa
7456c4f5f0 qemu: snapshot: Don't redetect backing chain after snapshot
Libvirt is able to properly model what happens to the backing chain
after a snapshot so there's no real need to redetect the data.
Additionally with the _REUSE_EXT flag this might end up in redetecting
wrong data if the user puts wrong backing chain reference into the
snapshot image.
2017-01-10 19:12:19 +01:00
Jim Fehlig
a05e2570c9 libxl: implement virDomainGetMaxVcpus
The libxl driver already supports getting maximum vcpu count via
libxlDomainGetVcpusFlags, allowing to trivially implement
virDomainGetMaxVcpus.
2017-01-10 11:07:08 -07:00
John Ferlan
bdd371c5c5 storage: Fix storage_backend probing when PARTED not installed.
Commit id 'a48c674f' caused problems for systems without PARTED installed.

So move the PARTED probing code back to storage_backend_disk.c and create
a shim within storage_backend.c to call it if WITH_STORAGE_DISK is true;
otherwise, just return -1 with the error.
2017-01-10 10:20:17 -05:00
John Ferlan
cb38b6cbc7 storage: Validate the device formats at logical startup
At startup time, rather than blindly trusting the target devices are
still properly formatted, let's check to make sure the pool's target
devices are all properly formatted before attempting to start the pool.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2017-01-10 08:44:50 -05:00
John Ferlan
f573f84eb7 storage: Add overwrite flag checking for logical pool
https://bugzilla.redhat.com/show_bug.cgi?id=1373711

Add support and documentation for the [NO_]OVERWRITE flags for the
logical backend.

Update virsh.pod with a description of the process for usage of
the flags and building of the pool's volume group.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2017-01-10 08:44:50 -05:00
John Ferlan
d5cc5f8997 storage: Extract logical device initialize into a helper
Make the remaining code a bit cleaner.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2017-01-10 08:44:50 -05:00
John Ferlan
71a08b5a5a storage: Clean up logical pool devices on build failure
If the build fails, then we need to ensure that we've run pvremove
on any devices which we've run pvcreate on; otherwise, a subsequent
build could fail since running pvcreate twice on a device requires
special force arguments.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2017-01-10 08:44:50 -05:00
John Ferlan
a4cb4a74f9 storage: Adjust disk label found to match labels
Currently as long as the disk is formatted using a known parted format
type, the algorithm is happy to continue. However, that leaves a scenario
whereby a disk formatted using "pc98" could be used by a pool that's defined
using "dvh" (or vice versa). Alter the check to be match and different
and adjust the caller.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2017-01-10 08:44:50 -05:00
John Ferlan
a48c674fba storage: Move and rename disk backend label checking
Rather than have the Disk code having to use PARTED to determine if
there's something on the device, let's use the virStorageBackendDeviceProbe.
and only fallback to the PARTED probing if the BLKID code isn't built in.

This will also provide a mechanism for the other current caller (File
System Backend) to utilize a PARTED parsing algorithm in the event that
BLKID isn't built in to at least see if *something* exists on the disk
before blindly trying to use. The PARTED error checking will not find
file system types, but if there is a partition table set on the device,
it will at least cause a failure.

Move virStorageBackendDiskValidLabel and virStorageBackendDiskFindLabel
to storage_backend and rename/rework the code to fit the new model.

Update the virsh.pod description to provide a more generic description
of the process since we could now use either blkid or parted to find
data on the target device.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2017-01-10 08:44:50 -05:00
John Ferlan
a11fd69735 storage: For FS pool check for properly formatted target volume
Prior to starting up, let's be sure the target volume device is
formatted as we expect; otherwise, inhibit the start.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2017-01-10 08:44:50 -05:00
John Ferlan
19ced38f1c storage: Add writelabel bool for virStorageBackendDeviceProbe
It's possible that the API could be called from a startup path in
order to check whether the label on the device matches what our
format is. In order to handle that condition, add a 'writelabel'
boolean to the API in order to indicate whether a write or just
read is about to happen.

This alters two "error" conditions that would care about knowing.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2017-01-10 08:44:50 -05:00
John Ferlan
a22e1a0032 storage: Add partition type checks for BLKID probing
A device may be formatted using some sort of disk partition format type.
We can check that using the blkid_ API's as well - so alter the logic to
allow checking the device for both a filesystem and a disk partition.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2017-01-10 08:44:50 -05:00
John Ferlan
f23d4bbce3 storage: Fix implementation of no-overwrite for file system backend
https://bugzilla.redhat.com/show_bug.cgi?id=1363586

Commit id '27758859' introduced the "NO_OVERWRITE" flag check for
file system backends; however, the implementation, documentation,
and algorithm was inconsistent. For the "flag" description for the
API the flag was described as "Do not overwrite existing pool";
however, within the storage backend code the flag is described
as "it probes to determine if filesystem already exists on the
target device, renurning an error if exists".

The code itself was implemented using the paradigm to set up the
superblock probe by creating a filter that would cause the code
to only search for the provided format type. If that type wasn't
found, then the algorithm would return success allowing the caller
to format the device. If the format type already existed on the
device, then the code would fail indicating that the a filesystem
of the same type existed on the device.

The result is that if someone had a file system of one type on the
device, it was possible to overwrite it if a different format type
was specified in updated XML effectively trashing whatever was on
the device already.

This patch alters what NO_OVERWRITE does for a file system backend
to be more realistic and consistent with what should be expected when
the caller requests to not overwrite the data on the disk.

Rather than filter results based on the expected format type, the
code will allow success/failure be determined solely on whether the
blkid_do_probe calls finds some known format on the device. This
adjustment also allows removal of the virStoragePoolProbeResult
enum that was under utilized.

If it does find a formatted file system different errors will be
generated indicating a file system of a specific type already exists
or a file system of some other type already exists.

In the original virsh support commit id 'ddcd5674', the description
for '--no-overwrite' within the 'pool-build' command help output
has an ambiguous "of this type" included in the short description.
Compared to the longer description within the "Build a given pool."
section of the virsh.pod file it's more apparent that the meaning
of this flag would cause failure if a probe of the target already
has a filesystem.

So this patch also modifies the short description to just be the
antecedent of the 'overwrite' flag, which matches the API description.
This patch also modifies the grammar in virsh.pod for no-overwrite
as well as reworking the paragraph formats to make it easier to read.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2017-01-10 08:44:50 -05:00
John Ferlan
553d21da6c storage: Introduce virStorageBackendDeviceIsEmpty
Rename virStorageBackendFileSystemProbe and to virStorageBackendBLKIDFindFS
and move to the more common storage_backend module.

Create a shim virStorageBackendDeviceIsEmpty which will make the call
to the virStorageBackendBLKIDFindFS and check the return value.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2017-01-10 08:44:50 -05:00
Michal Privoznik
406e390962 qemu: Drop qemuDomainDeleteNamespace
After previous commits, this function is no longer needed.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-10 13:04:57 +01:00
Michal Privoznik
5d198c2b2c qemuDomainCreateNamespace: move mkdir to qemuDomainBuildNamespace
Again, there is no need to create /var/lib/libvirt/$domain.*
directories in CreateNamespace(). It is sufficient to create them
as soon as we need them which is in BuildNamespace. This way we
don't leave them around for the whole lifetime of domain.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-10 13:04:57 +01:00
Michal Privoznik
5d30057695 qemuDomainGetPreservedMounts: Do not special case /dev
The c1140eb9e got me thinking. We don't want to special case /dev
in qemuDomainGetPreservedMounts(), but in all other places in the
code we special case it anyway. I mean,
/var/run/libvirt/$domain.dev path is constructed separately just
so that it is not constructed here. It makes only a little sense
(if any at all).

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-10 13:04:57 +01:00
Michal Privoznik
40ebbf72d5 qemuDomainCreateNamespace: s/unlink/rmdir/
If something goes wrong in this function we try a rollback. That
is unlink all the directories we created earlier. For some weird
reason unlink() was called instead of rmdir().

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-10 13:04:57 +01:00
Michal Privoznik
095f042ed6 qemu: Use transactions from security driver
So far if qemu is spawned under separate mount namespace in order
to relabel everything it needs an access to the security driver
to run in that namespace too. This has a very nasty down side -
it is being run in a separate process, so any internal state
transition is NOT reflected in the daemon. This can lead to many
sleepless nights. Therefore, use the transaction APIs so that
libvirt developers can sleep tight again.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-10 13:04:11 +01:00
Michal Privoznik
4674fc6afd security_selinux: Implement transaction APIs
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-10 12:50:00 +01:00
Michal Privoznik
67232478db security_dac: Implement transaction APIs
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-10 12:50:00 +01:00
Michal Privoznik
95576b4df0 security driver: Introduce transaction APIs
With our new qemu namespace code in place, the relabelling of
devices is done not as good is it could: a child process is
spawned, it enters the mount namespace of the qemu process and
then runs desired API of the security driver.

Problem with this approach is that internal state transition of
the security driver done in the child process is not reflected in
the parent process. While currently it wouldn't matter that much,
it is fairly easy to forget about that. We should take the extra
step now while this limitation is still fresh in our minds.

Three new APIs are introduced here:
  virSecurityManagerTransactionStart()
  virSecurityManagerTransactionCommit()
  virSecurityManagerTransactionAbort()

The Start() is going to be used to let security driver know that
we are starting a new transaction. During a transaction no
security labels are actually touched, but rather recorded and
only at Commit() phase they are actually updated. Should
something go wrong Abort() aborts the transaction freeing up all
memory allocated by transaction.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-10 12:49:59 +01:00
Michal Privoznik
39779eb195 security_dac: Resolve virSecurityDACSetOwnershipInternal const correctness
The code at the very bottom of the DAC secdriver that calls
chown() should be fine with read-only data. If something needs to
be prepared it should have been done beforehand.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-10 12:49:59 +01:00
Andrea Bolognani
1d8454639f qemu: Use virtio-pci by default for mach-virt guests
virtio-pci is the way forward for aarch64 guests: it's faster
and less alien to people coming from other architectures.
Now that guest support is finally getting there (Fedora 24,
CentOS 7.3, Ubuntu 16.04 and Debian testing all support
virtio-pci out of the box), we'd like to start using it by
default instead of virtio-mmio.

Users and applications can already opt-in by explicitly using

  <address type='pci'/>

inside the relevant elements, but that's kind of cumbersome and
requires all users and management applications to adapt, which
we'd really like to avoid.

What we can do instead is use virtio-mmio only if the guest
already has at least one virtio-mmio device, and use virtio-pci
in all other situations.

That means existing virtio-mmio guests will keep using the old
addressing scheme, and new guests will automatically be created
using virtio-pci instead. Users can still override the default
in either direction.

Existing tests such as aarch64-aavmf-virtio-mmio and
aarch64-virtio-pci-default already cover all possible
scenarios, so no additions to the test suites are necessary.
2017-01-10 12:33:53 +01:00
Peter Krempa
a946ea1a33 qemu: setvcpus: Properly coldplug vcpus when hotpluggable vcpus are present
When coldplugging vcpus to a VM that already has a few hotpluggable
vcpus the code might generate invalid configuration as
non-hotpluggable cpus need to be clustered starting from vcpu 0.

This fix forces the added vcpus to be hotpluggable in such case.

Fixes a corner case described in:
https://bugzilla.redhat.com/show_bug.cgi?id=1370357
2017-01-10 10:47:06 +01:00
Nitesh Konkar
ae16c95f1b perf: Add cache_l1d perf event support
This patch adds support and documentation for
a generalized hardware cache event called cache_l1d
perf event.

Signed-off-by: Nitesh Konkar <nitkon12@linux.vnet.ibm.com>
2017-01-09 18:15:31 -05:00
Jiri Denemark
dc2bfdc815 Update remote_protocol-structs for new events
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2017-01-09 19:53:55 +01:00
Daniel P. Berrange
42241208d9 secret: add support for value change events
Emit an event whenever a secret value changes

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2017-01-09 16:42:04 +00:00
Daniel P. Berrange
06fcee63cf secret: add support for lifecycle events
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2017-01-09 15:53:49 +00:00
Daniel P. Berrange
3b7bd6e540 remote: implement secret lifecycle event APIs
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2017-01-09 15:53:49 +00:00
Daniel P. Berrange
bd300b7194 conf: simplify internal virSecretDef handling of usage
The public virSecret object has a single "usage_id" field
but the virSecretDef object has a different 'char *' field
for each usage type, but the code all assumes every usage
type has a corresponding single string. Get rid of the
pointless union in virSecretDef and just use "usage_id"
everywhere. This doesn't impact public XML format, only
the internal handling.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2017-01-09 15:53:49 +00:00
Daniel P. Berrange
df740caf54 conf: add secret event handling
Add helper APIs / objects for managing secret events

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2017-01-09 15:53:48 +00:00
Daniel P. Berrange
34fd3caabf Introduce secret lifecycle event APIs
Add public APIs to allow applications to watch for define and
undefine of secret objects.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2017-01-09 15:53:48 +00:00
Daniel P. Berrange
89283c138e remote: fix struct for device removal failed event
The handler for the device removal failed event was using
the struct for the device added event. Fortunately the
layout was the same, so this was harmless.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2017-01-09 15:53:46 +00:00
Daniel P. Berrange
c50070173d Add domain event for metadata changes
When changing the metadata via virDomainSetMetadata, we now
emit an event to notify the app of changes. This is useful
when co-ordinating different applications read/write of
custom metadata.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2017-01-09 15:53:00 +00:00
Daniel P. Berrange
f1e48297cf cgroup: add virCgroupAddMachineTask stub for win32
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2017-01-09 14:27:34 +00:00
Daniel P. Berrange
44f79a0bd0 lxc: ensure libvirt_lxc and qemu-nbd move into systemd machine slice
Currently when spawning containers with systemd, the container PID 1
will get moved into the systemd machine slice. Libvirt then manually
moves the libvirt_lxc and qemu-nbd processes into the cgroups associated
with the slice, but skips the systemd controller cgroup. This means that
from systemd's POV, libvirt_lxc and qemu-nbd are still part of the
libvirtd.service unit.

On systemctl daemon-reload, it will notice that libvirt_lxc & qemu-nbd
are in the libvirtd.service unit for the systemd controller, but in the
machine cgroups for resources. Systemd will thus move them back into
the libvirtd.service resource cgroups next time libvirtd is restarted.
This causes libvirtd to kill off the container due to incorrect cgroup
placement.

The solution is to ensure that when moving libvirt_lxc & qemu-nbd, we
also move the systemd cgroup controller placement. Normally this is
not something we ever want todo, but this is a special case as we are
intentionally wanting to move them to a different systemd unit.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2017-01-09 12:46:52 +00:00
Michal Privoznik
65fb0b79f7 security_selinux: s/virSecuritySELinuxSecurity/virSecuritySELinux/
It doesn't make much sense to have two different prefix for
functions within the same driver.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-09 09:17:42 +01:00
Michal Privoznik
981d979047 virutil: Provide non-linux impl for virGetFCHostNameByFabricWWN
Currently, there's only linux implementation for
virGetFCHostNameByFabricWWN(). Since the symbol is exported in
our private symbols we ought to have implementation for other
platforms too. This also triggers compilation error on FreeBSD:

../src/.libs/libvirt_driver_storage_impl.a(libvirt_driver_storage_impl_la-storage_backend_scsi.o): In function `createVport':
/usr/home/jenkins/libvirt-master/systems/libvirt-freebsd/build/src/../../src/storage/storage_backend_scsi.c:740: undefined reference to `virGetFCHostNameByFabricWWN'

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-09 09:13:41 +01:00
Maxim Nestratov
af78cb0486 qemu: Allow to specify pit timer tick policy=discard
Separate out the "policy=discard" into it's own specific
qemu command line.

We'll rename "kvm-pit-device" test case to be "kvm-pit-discard"
since it has the syntax we'd be using.

Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
2017-01-06 18:27:06 -05:00
Maxim Nestratov
ef5c8bb412 qemu: Fix pit timer tick policy=delay
By a mistake, for the VIR_DOMAIN_TIMER_TICKPOLICY_DELAY qemu
command line creation, 'discard' was used instead of 'delay'
in commit id '1569fa14'.

Test "kvm-pit-delay" is fixed accordingly to show the correct
option being generated.

Remove the (now) redundant kvm-pit-device tests. As it turns
out there is no need to specify both QEMU_CAPS_NO_KVM_PIT and
QEMU_CAPS_KVM_PIT_TICK_POLICY since they are mutually exclusive
and "kvm-pit-device" becomes just the same as "kvm-pit-delay".

Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
2017-01-06 18:27:06 -05:00
John Ferlan
78be2e8b74 iscsi: Add parent wwnn/wwpn or fabric capability for createVport
https://bugzilla.redhat.com/show_bug.cgi?id=1349696

As it turns out using only the 'parent' to achieve the goal of a
consistent vHBA parent has issues with reboots where the scsi_hostX
parent could change to scsi_hostY causing either failure to create
the vHBA or usage of the wrong HBA for our vHBA.

Thus add the ability to search for the "parent" by the parent wwnn/
wwpn values or just a fabric_name if someone only cares to ensure
usage of the same SAN for the vHBA.
2017-01-06 17:15:34 -05:00
John Ferlan
8366cb0a20 util: Introduce virGetFCHostNameByFabricWWN
Create a utility routine in order to read the scsi_host fabric_name files
looking for a match to a passed fabric_name
2017-01-06 17:15:34 -05:00
John Ferlan
bb74a7ffeb conf: Add more fchost search fields for storage pool vHBA creation
Add new fields to the fchost structure to allow creation of a vHBA via
the storage pool when a parent_wwnn/parent_wwpn or parent_fabric_wwn is
supplied in the storage pool XML.
2017-01-06 17:15:34 -05:00
John Ferlan
2b13361bc7 nodedev: Add the ability to create vHBA by parent wwnn/wwpn or fabric_wwn
https://bugzilla.redhat.com/show_bug.cgi?id=1349696

When creating a vHBA, the process is to feed XML to nodeDeviceCreateXML
that lists the <parent> scsi_hostX to use to create the vHBA. However,
between reboots, it's possible that the <parent> changes its scsi_hostX
to scsi_hostY and saved XML to perform the creation will either fail or
create a vHBA using the wrong parent.

So add the ability to provide "wwnn" and "wwpn" or "fabric_wwn" to
the <parent> instead of a name of the scsi_hostN that is the parent.
The allowed XML will thus be:

  <parent>scsi_host3</parent>  (current)

or

  <parent wwnn='$WWNN' wwpn='$WWPN'/>

or

  <parent fabric_wwn='$WWNN'/>

Using the wwnn/wwpn or fabric_wwn ensures the same 'scsi_hostN' is
selected between hardware reconfigs or host reboots. The fabric_wwn
Using the wwnn/wwpn pair will provide the most specific search option,
while fabric_wwn will at least ensure usage of the same SAN, but maybe
not the same scsi_hostN.

This patch will add the new fields to the nodedev.rng for input purposes
only since the input XML is essentially thrown away, no need to Format
the values since they'd already be printed as part of the scsi_host
data block.

New API virNodeDeviceGetParentHostByWWNs will take the parent "wwnn" and
"wwpn" in order to search the list of devices for matching capability
data fields wwnn and wwpn.

New API virNodeDeviceGetParentHostByFabricWWN will take the parent "fabric_wwn"
in order to search the list of devices for matching capability data field
fabric_wwn.
2017-01-06 17:14:12 -05:00