Commit Graph

19199 Commits

Author SHA1 Message Date
Pavel Hrdina
c96bd78e4e conf: move iothread XML validation from qemu_command
This will ensure that IOThreads are properly validated while
a domain is defined.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2017-02-20 18:42:24 +01:00
Pavel Hrdina
5b37115c3c qemu_process: remove unnecessary iothread check
The situation covered by the removed code will not ever happen.
This code is called only while starting a new QEMU process where
the capabilities where already checked and while attaching to
existing QEMU process where we don't even detect the iothreads.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2017-02-20 18:41:51 +01:00
Pavel Hrdina
7e3dd50650 qemu_process: move capabilities check for iothreads
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2017-02-20 18:41:30 +01:00
Pavel Hrdina
caf66e0196 qemu_driver: check invalid iothread_id before we do anything else
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2017-02-20 18:41:06 +01:00
Pavel Hrdina
a4a1ad2066 conf: display all iothread ids in the XML if one of them is not generated
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2017-02-20 18:40:54 +01:00
Pavel Hrdina
3fc6512a3d conf: move iothread parse code into its own function
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2017-02-20 17:30:55 +01:00
Pavel Hrdina
875b77821f conf: remove redundant iothreads variable
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2017-02-20 17:30:55 +01:00
Pavel Hrdina
2b5dcda7a9 conf: fix indentation
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2017-02-20 17:30:54 +01:00
Peter Krempa
b4c7310633 Disallow inclusion of files from src/conf into src/utils
The utils code should stay separated from other code (except for very
well justified cases). Unfortunately commit 272769becc
made it trivial to break the separation (and not get slapped by the
syntax-check rule) by adding -I src/conf to the CFLAGS for utils.

Remove this shortcut and except the two offenders from the syntax check
so that the codebase can be kept separated.
2017-02-20 15:12:07 +01:00
Marc Hartmayer
3427b36cc1 node_device: Check return value for udev_new()
The comment was actually wrong as
https://www.freedesktop.org/software/systemd/man/udev_new.html#
mentions that on failure NULL is returned.  Also the same return value
is checked in src/interface/interface_backend_udev.c already.

Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
2017-02-20 14:44:27 +01:00
Michal Privoznik
5c74cf1f44 qemu: Allow @rendernode for virgl domains
When enabling virgl, qemu opens /dev/dri/render*. So far, we are
not allowing that in devices CGroup nor creating the file in
domain's namespace and thus requiring users to set the paths in
qemu.conf. This, however, is suboptimal as it allows access to
ALL qemu processes even those which don't have virgl configured.
Now that we have a way to specify render node that qemu will use
we can be more cautious and enable just that.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-20 10:44:22 +01:00
Michal Privoznik
1bb787fdc9 qemuDomainGetHostdevPath: Report /dev/vfio/vfio less frequently
So far, qemuDomainGetHostdevPath has no knowledge of the reasong
it is called and thus reports /dev/vfio/vfio for every VFIO
backed device. This is suboptimal, as we want it to:

a) report /dev/vfio/vfio on every addition or domain startup
b) report /dev/vfio/vfio only on last VFIO device being unplugged

If a domain is being stopped then namespace and CGroup die with
it so no need to worry about that. I mean, even when a domain
that's exiting has more than one VFIO devices assigned to it,
this function does not clean /dev/vfio/vfio in CGroup nor in the
namespace. But that doesn't matter.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-02-20 07:21:59 +01:00
Michal Privoznik
b8e659aa98 qemuDomainGetHostdevPath: Create /dev/vfio/vfio iff needed
So far, we are allowing /dev/vfio/vfio in the devices cgroup
unconditionally (and creating it in the namespace too). Even if
domain has no hostdev assignment configured. This is potential
security hole. Therefore, when starting the domain (or
hotplugging a hostdev) create & allow /dev/vfio/vfio too (if
needed).

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-02-20 07:21:58 +01:00
Michal Privoznik
9d92f533f8 qemuSetupHostdevCgroup: Use qemuDomainGetHostdevPath
Since these two functions are nearly identical (with
qemuSetupHostdevCgroup actually calling virCgroupAllowDevicePath)
we can have one function call the other and thus de-duplicate
some code.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-02-20 07:21:58 +01:00
Michal Privoznik
60ddceff8f qemu_cgroup: Kill qemuSetupHostSCSIVHostDeviceCgroup
There's no need for this function. Currently it is passed as a
callback to virSCSIVHostDeviceFileIterate(). However, SCSI host
devices have just one file path. Therefore we can mimic approach
used in qemuDomainGetHostdevPath() to get path and call
virCgroupAllowDevicePath() directly.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-02-20 07:21:58 +01:00
Michal Privoznik
7bb01ed3cd qemu_cgroup: Kill qemuSetupHostSCSIDeviceCgroup
There's no need for this function. Currently it is passed as a
callback to virSCSIDeviceFileIterate(). However, SCSI devices
have just one file path. Therefore we can mimic approach used in
qemuDomainGetHostdevPath() to get path and call
virCgroupAllowDevicePath() directly.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-02-20 07:21:58 +01:00
Michal Privoznik
4d7d1c4bc3 qemu_cgroup: Kill qemuSetupHostUSBDeviceCgroup
There's no need for this function. Currently it is passed as a
callback to virUSBDeviceFileIterate(). However, USB devices have
just one file path. Therefore we can mimic approach used in
qemuDomainGetHostdevPath() to get path and call
virCgroupAllowDevicePath() directly.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-02-20 07:21:58 +01:00
Pavel Hrdina
165c76acd0 util: virvhba: fix typo that breaks build on non-linux systems
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2017-02-19 15:47:27 +01:00
John Ferlan
f3b1b98121 tests: Add createVHBAByNodeDevice-parent-fabric-wwn to fchosttest
Add a test that allows providing the parent fabric_wwn in the input XML
in order to create the vHBA.

This also fixes a mixed setting of the fabric_wwn field from the read
test driver XML strings.
2017-02-19 06:45:09 -05:00
John Ferlan
7ad479d0bd nodedev: Rework virNodeDeviceGetParentHost
Rework the code to perform the various searches by parent, parent_wwnn/
parent_wwpn, parent_fabric_wwn, or vport capable in order to return the
'parent_host' number that is vHBA capable.

The former virNodeDeviceGetParentHost is renamed to add the ByParent
on it fixes an issue where if no parent was supplied in the XML to
create the vHBA, then virNodeDeviceFindByName was called with a NULL
second parameter which had bad results.

The reworked code will make the various calls to fetch the NPIV host
by the passed parameter options or if none are provided find a vport
capable NPIV HBA to perform the create. If the call is from the delete
path, then this option won't be allowed.

Each of virNodeDeviceGetParentHostBy* functions is now static, so
remove them external definitions.

A secondary benefit of this is the test_driver now can make use of
the new API to add some new tests to test the various creation options.
2017-02-19 06:45:09 -05:00
John Ferlan
ccb0d6e342 nodedev: Keep the node device lock longer in nodeDeviceDestroy
While perhaps improbable, it could be possible that after finding our
object that another thread running essentially in parallel could attempt
to delete the same vHBA.

So rather than dropping the lock right after finding the object, keep
the lock around while we drop the object lock and work on deleting the
object. Once the delete occurs we can safely drop the driver lock again.

Cleanup some of the usage of cleanup instead out for the goto label.
2017-02-19 06:45:09 -05:00
John Ferlan
03346def06 util: Move scsi_host specific functions from virutil
Create a virscsihost.c and place the functions there. That removes the
last #ifdef __linux__ from virutil.c.

Take the opporunity to also change the function names and in one case
the parameters slightly
2017-02-19 06:45:09 -05:00
John Ferlan
d2d74a986d util: Replace virStoragePoolGetVhbaSCSIHostParent
Use the new virNodeDeviceGetParentName instead. Modify the callers to
build the node device scsi_host# name string in order to call the new
function so that proper lookup occurs.
2017-02-19 06:45:09 -05:00
John Ferlan
aa6aa624ad nodedev: Introduce virNodeDeviceGetParentName
Create a function which takes a node device "name" entry to lookup
and returns a string containing the parent name for the node device.
2017-02-19 06:45:09 -05:00
John Ferlan
16416816c1 util: Create a new virvhba module and move/rename API's
Rather than have them mixed in with the virutil apis, create a separate
virvhba.c module and move the vHBA related calls into there. Soon there
will be more added.

Also modify the names of the functions and some arguments to be more
indicative of what is really happening. Adjust the callers respectively.

While I was changing fchosttest, rather than the non-descriptive names
test1...test6, rename them to match what the test is doing.
2017-02-19 06:45:09 -05:00
John Ferlan
8729ce56fe tests: Create a more realistic vHBA
Modify the code to react more like a real HBA -> vHBA creation.

Currently the code would just modify the input XML definition to
set the name to a wwpn and then modify the scsi_host capability
entry for the defintion to change the scsi_host# and unique_id
before adding that into the node device.

This patch does things a bit better. It finds and copies a known
existing vHBA (scsi_host11) in the node_device database and modifies
that definition to change the name to scsi_host12 and set the wwnn/
wwpn to what the input XML would expect before adding the def to the
node device object list.

Then rather than create a returned "dev" using the (poorly) mocked
name - perform the lookup using the new device name.
2017-02-19 06:45:09 -05:00
John Ferlan
0869d9b333 test: Add helper to create vHBA for testNodeDeviceCreateXML
Rather than inline the dummy creation of a vHBA to add to the node
devices - create a helper to do that work.

Also just tidy up a couple of things while at it...
2017-02-19 06:45:09 -05:00
John Ferlan
5c2ff641e1 test: Add new NPIV capable HBA and a vHBA
Predefine a second NPIV capable HBA as well as a vHBA using the first
NPIV capable HBA. This will allow for a mechanism to perform more
realistic create vHBA testing.
2017-02-19 06:45:09 -05:00
John Ferlan
779e49054a tests: Alter test_driver HBA name/data to be closer to reality
Alter "test-scsi-host-vport" to be "scsi_host1" to match the real
environment. This is the vport capable HBA - IOW the NPIV device.
Add more fields to scsi_host1 as well.

Alter the XML being used by the objecttest to create a vHBA in order
to match the scsi_host1 parent name and to use validateable wwnn/wwpn.
This will allow for realistic testing.
2017-02-19 06:45:09 -05:00
Roman Bogorodskiy
d3ffa0ece8 nodedev: fix build with clang
Build fails with:

conf/node_device_conf.c:825:62: error: comparison of unsigned enum expression < 0 is always false [-Werror,-Wtautological-compare]
    if ((data->drm.type = virNodeDevDRMTypeFromString(type)) < 0) {
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^ ~
conf/node_device_conf.c:1801:59: error: comparison of unsigned enum expression < 0 is always false [-Werror,-Wtautological-compare]
        if ((type = virNodeDevDevnodeTypeFromString(tmp)) < 0) {
            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^ ~
2 errors generated.

Fix by using intermediate variable to store the result similarly
to how it's done for other FromString* calls.
2017-02-18 17:49:27 +04:00
Michal Privoznik
78c018693b nodedev: Introduce new drm cap
After 7f1bdec5fa our nodedev driver is capable of
determining DRM devices (DRM stands for Direct Render Manager not
Digital rights management). There is still one bit missing
though: virConnectListAllNodeDevices() is capable of listing
either all devices or just those with specified capability. Well,
DRM capability is missing there.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-17 16:09:15 +01:00
Marc-André Lureau
e5bda10141 qemu: add rendernode argument
Add a new attribute 'rendernode' to <gl> spice element.

Give it to QEMU if qemu supports it (queued for 2.9).

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-17 15:47:58 +01:00
Marc-André Lureau
7f1bdec5fa nodedev: add drm capability
Add a new 'drm' capability for Direct Rendering Manager (DRM) devices,
providing device type information.

Teach the udev backend to populate those devices.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-17 15:47:58 +01:00
Marc-André Lureau
14a3e7ab5c nodedev: parse <path>
This should have been added with c4a4603de (or 0bdefd9b04).

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-02-17 15:47:58 +01:00
Marc-André Lureau
0809508ed2 nodedev: add <devnode> paths
Add new <devnode> top-level <device> element, that list the associated
/dev files. Distinguish the main /dev name from symlinks with a 'type'
attribute of value 'dev' or 'symlink'.

Update a test to check XML schema, and actually add it to the test list
since it was missing.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-02-17 15:47:58 +01:00
Marc-André Lureau
7f64435307 nodedev: fix extra space in dump
This is a cosmetic change, shouldn't change XML parsing, and doesn't
break any test.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-02-17 15:47:58 +01:00
John Ferlan
4337bc57be nodedev: Return the parent for a virNodeDevicePtr struct
When the 'parent' was added to the virNodeDevicePtr structure
by commit id 'e8a4ea75a' the 'parent' field was not properly filled
in when a virGetNodeDevice call was made within driver/config code.
Only the device name was ever filled in. Fetching the parent required
a second trip via virNodeDeviceGetParent into the node device lookup
code was required in order to retrieve the specific parent field (and
still the parent field was never filled in although it was free'd).

Since we have the data when we initially call virGetNodeDevice from
within driver/node_config code - let's just fill in the parent field
as well for anyone that wants it without requiring another trip into
the node_device lookup just to get the parent.

This will allow API's such as virConnectListAllNodeDevices,
virNodeDeviceLookupByName, and virNodeDeviceLookupSCSIHostByWWN
to retrieve both name and parent in the returned virNodeDevicePtr.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2017-02-16 12:23:04 -05:00
Joao Martins
1e6bc3c581 libxl: fix coverity issues introduced by 6a95edf
As discussed here [0][1] Coverity reported two issues:

- On libxlDomainMigrationPrepareTunnel3 @@mig will be leaked on failures
after sucessfull call libxlDomainMigrationPrepareAny hence we free it.

Setting mig = NULL after @mig is assigned plus adding libxlMigrationCookieFree
on error paths addresses the issue. In case virThreadCreate fails,
unref of args frees the cookie on dispose function (libxlMigrationDstArgsDispose)

- On libxlMigrationStartTunnel @tc would be leaked.

Fixed by correctly saving the newly allocated @tc onto @tnl such that
libxlMigrationStopTunnel would free it up.

[0] https://www.redhat.com/archives/libvir-list/2017-February/msg00791.html
[1] https://www.redhat.com/archives/libvir-list/2017-February/msg00833.html

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
2017-02-16 12:19:35 -05:00
Michal Privoznik
1d9ab0f04a qemu: Allow empty script path to <interface/>
Before 9c17d665fd (v1.3.2 - I know, right?) it was possible to
have the following interface configuration:

  <interface type='ethernet'/>
    <script path=''/>
  </interface>

This resulted in -netdev tap,script=,.. Fortunately, qemu helped
us to get away with this as it just ignored the empty script
path. However, after the commit mentioned above it's libvirtd
who is executing the script. Unfortunately without special
case-ing empty script path.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-16 17:39:34 +01:00
Ján Tomko
76fd798191 Validate required CPU features even for host-passthrough
Commit adff345 allowed enabling features with -cpu host
without ajdusting the validity checks on domain startup
and migration.
2017-02-16 15:22:49 +01:00
Nitesh Konkar
5729746543 Ensure disk names follow the disk name regex
Currently disk names do not follow the
(regex) /^[fhv]d[a-z]+[0-9]*$/ completely
and hence one can assign disk names like
vd2 etc. This patch ensures that the
disk names follow the regex mentioned.
This patch also adds a testcase.

Signed-off-by: Nitesh Konkar <nitkon12@linux.vnet.ibm.com>
2017-02-16 09:59:13 +01:00
Jim Fehlig
2dc1cf19db libxl: fix potential double free in libxlDriverGetDom0MaxmemConf
Commit 4ab0c959 fixed a memory leak in libxlDriverGetDom0MaxmemConf
but introduced a potential double free of mem_tokens

*** Error in `/usr/sbin/libvirtd': double free or corruption (out):
    0x00007fffc808cfd0 ***

Avoid double free by setting mem_tokens to NULL after calling
virStringListFree.
2017-02-15 18:24:58 -07:00
Bob Liu
6a95edf9ab libxl: add tunnelled migration support
Tunnelled migration doesn't require any extra network connections
beside the libvirt daemon.  It's capable of strong encryption and the
default option of openstack-nova.

This patch adds the tunnelled migration(Tunnel3params) support to
libxl.  On the source side, the data flow is:

 * libxlDoMigrateSend() -> pipe libxlTunnel3MigrationFunc() polls pipe
 * out and then write to dest stream.

While on the destination side:
 * Stream -> pipe -> 'recvfd of libxlDomainStartRestore'

The usage is the same as p2p migration, execpt adding one extra
'--tunnelled' to the libvirt p2p migration command.

Signed-off-by: Bob Liu <bob.liu@oracle.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
2017-02-15 14:47:14 -07:00
Joao Martins
d2100f2b4a libxl: refactor libxlDomainMigrationPrepare
The newly introduced function libxlDomainMigrationPrepareAny
will be shared between P2P and tunnelled variations.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
2017-02-15 10:15:50 -07:00
Michal Privoznik
27ac5f3741 qemu_conf: Properly check for retval of qemuDomainNamespaceAvailable
This function is returning a boolean therefore check for '< 0'
makes no sense. It should have been
'!qemuDomainNamespaceAvailable'.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-15 15:40:01 +01:00
Michal Privoznik
b57bd206b9 qemu_conf: Check for namespaces availability more wisely
The bare fact that mnt namespace is available is not enough for
us to allow/enable qemu namespaces feature. There are other
requirements: we must copy all the ACL & SELinux labels otherwise
we might grant access that is administratively forbidden or vice
versa.
At the same time, the check for namespace prerequisites is moved
from domain startup time to qemu.conf parser as it doesn't make
much sense to allow users to start misconfigured libvirt just to
find out they can't start a single domain.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-15 12:43:23 +01:00
Jim Fehlig
ec94e14b68 apparmor: don't fail on non-apparmor <seclabel>
If the apparmor security driver is loaded/enabled and domain config
contains a <seclabel> element whose type attribute is not 'apparmor',
starting the domain fails when attempting to label resources such
as tap FDs.

Many of the apparmor driver entry points attempt to retrieve the
apparmor security label from the domain def, returning failure if
not found. Functions such as AppArmorSetFDLabel fail even though
domain config contains an explicit 'none' secuirty driver, e.g.

  <seclabel type='none' model='none'/>

Change the entry points to succeed if the domain config <seclabel>
is not apparmor. This matches the behavior of the selinux driver.
2017-02-14 16:53:30 -07:00
Jim Fehlig
5cdfc80ba8 apparmor: don't overwrite error from reload_profile
Like other callers of reload_profile, don't overwrite errors in
AppArmorSetSecurityHostdevLabelHelper.
2017-02-14 16:53:30 -07:00
Jiri Denemark
598b6d7999 qemu_monitor_json: Properly check GetArray return value
Commit 2a8d40f4ec refactored qemuMonitorJSONGetCPUx86Data and replaced
virJSONValueObjectGet(reply, "return") with virJSONValueObjectGetArray.
While the former is guaranteed to always return non-NULL pointer the
latter may return NULL if the returned JSON object is not an array.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2017-02-14 23:09:31 +01:00
Andrea Bolognani
ee6ec7824d qemu: Call chmod() after mknod()
mknod() is affected my the current umask, so we're not
guaranteed the newly-created device node will have the
right permissions.

Call chmod(), which is not affected by the current umask,
immediately afterwards to solve the issue.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1421036
2017-02-14 19:23:05 +01:00
Ján Tomko
4a41cf18b1 util: fix off-by-one when expanding a bitmap
To make sure bit 'b' fits into the bitmap, we need to allocate b+1
bits, since we number from 0.

Adjust the bitmap test to set a bit at a multiple of 16.
That way the test fails without this fix, because the VIR_REALLOC
call clears the newly added memory even if the original pointer
has not changed.
2017-02-14 13:30:48 +01:00
Ján Tomko
723fef99c0 qemu: enforce maximum ports value for nec-xhci
This controller only allows up to 15 ports.

https://bugzilla.redhat.com/show_bug.cgi?id=1375417
2017-02-13 16:34:09 +01:00
Ján Tomko
4a7773f7ea conf: check port range even for USB hubs
Move the range check introduced by commit 2650d5e into
virDomainUSBAddressFindPort. That way both virDomainUSBAddressRelease
and virDomainUSBAddressSetAddHub can benefit from it.

Reported-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-13 13:08:36 +01:00
Ján Tomko
384504f7ba qemu: assign USB port on a selected hub for all devices
Due to a logic error, the autofilling of USB port when a bus is
specified:
    <address type='usb' bus='0'/>
does not work for non-hub devices on domain startup.

Fix the logic in qemuDomainAssignUSBPortsIterator to also
assign ports for USB addresses that do not yet have one.

https://bugzilla.redhat.com/show_bug.cgi?id=1374128
2017-02-13 09:46:15 +01:00
Marc Hartmayer
d6bc7622f0 rpc: Fix potentially segfaults
We have to allocate first and if, and only if, it was successful we
can set the count. A segfault has occurred in
virNetServerServiceNewPostExecRestart() when VIR_ALLOC_N(svc->socks,
n) has failed, but svc->nsocsk = n was already set. Thus
virObejectUnref(svc) was called and therefore it was possible that
virNetServerServiceDispose was called => segmentation fault.  For
safeness NULL pointer check were added in
virNetServerServiceDispose().

Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
2017-02-12 15:02:42 -05:00
Roman Bogorodskiy
5620c60959 bhyve: add e1000 nic support
Recently e1000 NIC support was added to bhyve; implement that in
the bhyve driver:

 - Add capability check by analyzing output of the 'bhyve -s 0,e1000'
   command
 - Modify bhyveBuildNetArgStr() to support e1000 and also pass
   virConnectPtr so it could call bhyveDriverGetCaps() to check if this
   NIC is supported
 - Modify command parsing code to add support for e1000 and adjust tests
 - Add net-e1000 test
2017-02-11 06:51:28 +04:00
Nitesh Konkar
ef41eda68a util: Fix indentation for virnetdevmacvlan
Signed-off-by: Nitesh Konkar <nitkon12@linux.vnet.ibm.com>
2017-02-10 14:13:50 -05:00
John Ferlan
4ab0c959e9 libxl: Resolve possible resource leak in dom0 maximum memory setting
If either the "if (STRPREFIX(mem_tokens[j], "max:"))" is never entered
or the "if (virStrToLong_ull(mem_tokens[j] + 4, &p, 10, maxmem) < 0)" break
is hit, control goes back to the outer loop processing 'cmd_tokens' and
it's possible that the 'mem_tokens' would be overwritten.

Found by Coverity
2017-02-10 14:11:04 -05:00
Erik Skultety
b2774db9c2 storage: Fix checking whether source filesystem is mounted
Right now, we use simple string comparison both on the source paths
(mount's output vs pool's source) and the target (mount's mnt_dir vs
pool's target). The problem are symlinks and mount indeed returns
symlinks in its output, e.g. /dev/mappper/lvm_symlink. The same goes for
the pool's source/target, so in order to successfully compare these two
replace plain string comparison with virFileComparePaths which will
resolve all symlinks and canonicalize the paths prior to comparison.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1417203

Signed-off-by: Erik Skultety <eskultet@redhat.com>
2017-02-10 17:01:12 +01:00
Erik Skultety
875894245a util: Introduce virFileComparePaths
So rather than comparing 2 paths (strings) as they are, which can very
easily lead to unnecessary errors (e.g. in storage driver) that the paths
are not the same when in fact they'd be e.g. just symlinks to the same
location, we should put our best effort into resolving any symlinks and
canonicalizing the path and only then compare the 2 paths for equality.

Signed-off-by: Erik Skultety <eskultet@redhat.com>
2017-02-10 17:01:12 +01:00
Erik Skultety
8fcf6330b6 storage: Fix reporting an error on an already mounted filesystem
When FS pool's source is already mounted on the target location instead
of just simply marking the pool as active, thus starting it we fail with
an error stating that the source is indeed already mounted on the target.

Signed-off-by: Erik Skultety <eskultet@redhat.com>
2017-02-10 17:01:12 +01:00
Boris Fiuczynski
d15b29be25 remote generator: Increase upper limit on lists of node devices
On a system with 697 SCSI disks each configured with 8 paths the command
virsh nodedev-list fails with
error: Failed to list node devices
error: internal error: Too many node_devices '16816' for limit '16384'
Increasing the upper limit on lists of node devices from 16K to 64K.

Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
2017-02-10 16:05:24 +01:00
Nitesh Konkar
f278a148e2 Mention the min duration for nodesuspend explicitly
Although currently this is documented in virsh man page
and virsh help, the expicit mention in the error message
is helful for tools using the API directly.

Signed-off-by: Nitesh Konkar <nitkon12@linux.vnet.ibm.com>
2017-02-10 09:13:30 -05:00
Michal Privoznik
732629dad3 qemuMonitorCPUModelInfoFree: Don't leak model_info->props
==11846== 240 bytes in 1 blocks are definitely lost in loss record 81 of 107
==11846==    at 0x4C2BC75: calloc (vg_replace_malloc.c:624)
==11846==    by 0x18C74242: virAllocN (viralloc.c:191)
==11846==    by 0x4A05E8: qemuMonitorCPUModelInfoCopy (qemu_monitor.c:3677)
==11846==    by 0x446E3C: virQEMUCapsNewCopy (qemu_capabilities.c:2171)
==11846==    by 0x437335: testQemuCapsCopy (qemucapabilitiestest.c:108)
==11846==    by 0x437CD2: virTestRun (testutils.c:180)
==11846==    by 0x437AD8: mymain (qemucapabilitiestest.c:176)
==11846==    by 0x4397B6: virTestMain (testutils.c:992)
==11846==    by 0x437B44: main (qemucapabilitiestest.c:188)

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-10 10:25:44 +01:00
Nitesh Konkar
09a91f0528 Fix indentation in datatypes.h
Signed-off-by: Nitesh Konkar <nitkon12@linux.vnet.ibm.com>
2017-02-09 17:31:41 -05:00
Marc Hartmayer
fd98631cf0 remote generator: handle remoteDomainCreateWithFlags()
This commit removes the handcrafted code for
remoteDomainCreateWithFlags() and lets it auto generate.

A little bit of history repeating...
Commit 03d813bbcd removed the auto generation of
remoteDomainCreateWithFlags() because it was thought that the design
flaw in the remote protocol for virDomainCreate is also within the
remote protocol for virDomainCreateWithFlags. As the commit message of
ddaf15d7a3 mentions this is not the case therefore we
can auto generate the client part.

Even worse there was a typo in remoteDomainCreateWithFlags()

'remote_domain_create_with_flags_args ret;' but in fact it has to be
'remote_domain_create_with_flags_ret ret;'.

Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
2017-02-09 17:21:15 -05:00
Marc Hartmayer
c26fe44be5 util: reset the counters to zero
After freeing the data structures we have to reset the counters to
zero. This fixes a segmentation fault when virNetDevIPInfoClear is
called twice (e.g. this is possible in virDomainNetDefParseXML() if
virDomainNetIPInfoParseXML(...) fails with ret < 0 (this leads to the
first call of 'virNetDevIPInfoClear(&def->guestIP)') and the resulting
call of virDomainNetDefFree(def) in the error path of
virDomainNetDefParseXML() (this leads to the second call of
virNetDevIPInfoClear(&def->guestIP), and finally to the segmentation
fault).

Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
2017-02-09 14:20:42 -05:00
Marc Hartmayer
28dd54a5b9 conf: Fix libvirtd free() segfault if virDomainChrSourceDefNew(...) fails
If virDomainChrSourceDefNew(xmlopt) fails, it will lead to free()ing
the uninitialized pointer bus. The fix for this is to initialize bus
with NULL.

Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
2017-02-09 14:18:51 -05:00
Marc Hartmayer
62b2c2fcdd qemu: Check if virQEMUCapsNewCopy(...) has failed
Check if virQEMUCapsNewCopy(...) has failed, thus a segmentation fault
in virQEMUCapsFilterByMachineType(...) will be avoided.

Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
2017-02-09 14:08:00 -05:00
David Dai
728c0e5df4 qemu: Fix live migration over RDMA with IPv6
Using libvirt to do live migration over RDMA via IPv6 address failed.

For example:
    rhel73_host1_guest1 qemu+ssh://[deba::2222]/system --verbose
root@deba::2222's password:
error: internal error: unable to execute QEMU command 'migrate': RDMA
ERROR: could not rdma_getaddrinfo address deba

As we can see, the IPv6 address used by rdma_getaddrinfo() has only
"deba" part because we didn't properly enclose the IPv6 address in []
and passed rdma:deba::2222:49152 as the migration URI in
qemuMonitorMigrateToHost.

Signed-off-by: David Dai <zdai@linux.vnet.ibm.com>
2017-02-09 19:47:09 +01:00
Jim Fehlig
79692c3874 libxl: fix dom0 maximum memory setting
When the libxl driver is initialized, it creates a virDomainDef
object for dom0 and adds it to the list of domains. Total memory
for dom0 was being set from the max_memkb field of libxl_dominfo
struct retrieved from libxl, but this field can be set to
LIBXL_MEMKB_DEFAULT (~0ULL) if dom0 maximum memory has not been
explicitly set by the user.

This patch adds some simple parsing of the Xen commandline,
looking for a dom0_mem parameter that also specifies a 'max' value.
If not specified, dom0 maximum memory is effectively all physical
host memory.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
2017-02-09 10:02:19 -07:00
Jim Fehlig
d2b77608e9 libxl: fix reporting of maximum memory
The libxl driver reports different values of maximum memory depending
on state of a domain. If inactive, maximum memory value is reported
correctly. When active, maximum memory is derived from max_pages value
returned by the XEN_SYSCTL_getdomaininfolist sysctl operation. But
max_pages can be changed by toolstacks and does not necessarily
represent the maximum memory a domain can use during its active
lifetime.

A better location for determining a domain's maximum memory is the
/local/domain/<id>/memory/static-max node in xenstore. This value
is set from the libxl_domain_build_info.max_memkb field when creating
the domain. Currently it cannot be changed nor can its value be
exceeded by a balloon operation. From libvirt's perspective, always
reporting maximum memory with virDomainDefGetMemoryTotal() will produce
the same results as reading the static-max node in xenstore.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
2017-02-09 09:38:34 -07:00
Jim Fehlig
bd1168101a libxl: fix disk detach when <driver> not specified
When a user does not explicitly set a <driver> in the disk config,
libvirt defers selection of a default to libxl. This approach works
fine when starting a domain with such configuration or attaching a
disk to a running domain. But when detaching such a disk, libxl
will fail with "unrecognized disk backend type: 0". libxl makes no
attempt to recalculate a default backend (driver) on detach and
simply fails when uninitialized.

This patch updates the libvirt disk config with the backend selected
by libxl when starting a domain or attaching a disk to a running
domain. Another benefit of this approach is that the live XML is
also updated with the backend driver selected by libxl.
2017-02-09 09:24:44 -07:00
Jim Fehlig
321a28c6ae libxl: set default disk format in device post-parse
When starting a domian, a libxl_domain_config object is created from
virDomainDef. Any virDomainDiskDef devices with a format of
VIR_STORAGE_FILE_NONE are mapped to LIBXL_DISK_FORMAT_RAW in the
corresponding libxl_disk_device, but the virDomainDiskDef format is
never updated to reflect the change.

A better place to set a default format for disk devices is the
device post-parse callback, ensuring the virDomainDiskDef object
reflects the default format.
2017-02-09 09:24:44 -07:00
Boris Fiuczynski
f4d06ca8fd network: allow to specify timeout for openvswitch calls
This patchs allows to set the timeout value used for all
openvswitch calls. The default timeout value remains as
before at 5 seconds.

Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-09 14:34:08 +01:00
Boris Fiuczynski
66583c0cf7 libvirtd: add openvitch timeout value
Provide the ability to specify a default timeout value for
successful completion of openvswitch calls in the libvirtd
configuration file.

Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-09 14:34:08 +01:00
Jaroslav Safka
1c4f3b56f8 qemu: Add args generation for file memory backing
This patch add support for file memory backing on numa topology.

The specified access mode in memoryBacking can be overriden
by specifying token memAccess in numa cell.
2017-02-09 14:27:19 +01:00
Jaroslav Safka
bc6d3121a4 conf: Add new xml elements for file memorybacking support
This part introduces new xml elements for file based
memorybacking support and their parsing.
(It allows vhost-user to be used without hugepages.)

New xml elements:
<memoryBacking>
  <source type="file|anonymous"/>
  <access mode="shared|private"/>
  <allocation mode="immediate|ondemand"/>
</memoryBacking>
2017-02-09 14:27:19 +01:00
Jaroslav Safka
48d9e6cdcc qemu_conf: Add param memory_backing_dir
Add new parameter memory_backing_dir where files will be stored when memoryBacking
source is selected as file.

Value is stored inside char* memoryBackingDir
2017-02-09 14:27:19 +01:00
Jaroslav Safka
7c0c5f6d4b qemu, conf: Rename virNumaMemAccess to virDomainMemoryAccess
Rename to avoid duplicate code. Because virDomainMemoryAccess will be
used in memorybacking for setting default behaviour.

NOTE: The enum cannot be moved to qemu/domain_conf because of headers
dependency
2017-02-09 14:27:19 +01:00
Maxim Nestratov
c3ee75e5aa cpu: fix typo: rename __kvm_hv_spinlock to __kvm_hv_spinlocks
Strings associated with virDomainHyperv values in domain_conf.c are used to
construct HyperV CPU features names to be compared with names defined in
cpu_x86_data.h and the names for HyperV "spinlocks" feature don't match.
This leads to a misleading warning:
"host doesn't support hyperv 'spinlocks' feature" even when it's supported.
Let's fix it and rename along with it VIR_CPU_x86_KVM_HV_SPINLOCK to
VIR_CPU_x86_KVM_HV_SPINLOCKS.

Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
2017-02-09 13:52:16 +01:00
Jiri Denemark
644804765b qemu_command: Fix check for gluster disks
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2017-02-09 11:48:10 +01:00
Jiri Denemark
2cc317b1f5 qemu_blockjob: Avoid dereferencing NULL on OOM
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2017-02-09 11:48:10 +01:00
Jiri Denemark
b97839b835 cpu_x86: Fix memory leak in virCPUx86Translate
virCPUDefStealModel is called with keepVendor == true which means the
cpu structure will keep its original vendor/vendor_id values. Thus it
makes no sense to copy them to the translated definition as they won't
be used there anyway. Except that the translated->vendor pointer might
get lost in x86Decode.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2017-02-09 11:48:10 +01:00
Maxim Nestratov
eda4ec94ff vz: cleanup: remove unused constant
PARALLELS_STATISTICS_DROP_COUNT isn't used anymore

Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
2017-02-09 13:06:15 +03:00
Maxim Nestratov
c52f5bea0d vz: fix event handle leak in prlsdkHandlePerfEvent
When we happen to lose a domain but still get a performance event
for it, we should also free the event handle.

Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
2017-02-09 13:06:15 +03:00
Maxim Nestratov
05456cc97e vz: fix handle leak in prlsdkHandleVmStateEvent
Every successful call of PrlEvent_GetParamByName allocates a handle,
which has to be freed.

Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
2017-02-09 13:06:15 +03:00
Michal Privoznik
c2130c0d47 qemu_security: Introduce ImageLabel APIs
Just like we need wrappers over other virSecurityManager APIs, we
need one for virSecurityManagerSetImageLabel and
virSecurityManagerRestoreImageLabel. Otherwise we might end up
relabelling device in wrong namespace.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-09 08:04:57 +01:00
Jim Fehlig
c89a6e7878 libxl: use init and dispose functions with libxl_physinfo
The typical pattern when calling libxl functions that populate a
structure is

  libxl_foo foo;
  libxl_foo_init(&foo);
  libxl_get_foo(ctx, &foo);
  ...
  libxl_foo_dispose(&foo);

Fix several instances of libxl_physinfo missing the init and
dispose calls.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
2017-02-08 09:12:00 -07:00
Jim Fehlig
ff225538d4 libxl: honor autoballoon setting in libxl.conf
libxlGetAutoballoonConf is supposed to honor user-specified
autoballoon setting in libxl.conf. As written, the user-specified
setting could be overwritten by the subsequent logic to check
dom0_mem parameter. If user-specified setting is present and
correct, accept it. Only fallback to checking Xen dom0_mem
command line parameter if user-specfied setting is not present.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
2017-02-08 09:10:49 -07:00
Joao Martins
91ac80a986 xenconfig: fix xml to xl.cfg conversion with no graphics
If no graphics element is in XML xenFormatXLSpice will access
graphics without checking it has one in the first place, leading to a
segmentation fault.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
2017-02-08 08:41:13 -07:00
Michal Privoznik
b7feabbfdc qemuDomainNamespaceSetupDisk: Simplify disk check
Firstly, instead of checking for next->path the
virStorageSourceIsEmpty() function should be used which also
takes disk type into account.
Secondly, not every disk source passed has the correct type set
(due to our laziness). Therefore, instead of checking for
virStorageSourceIsBlockLocal() and also S_ISBLK() the former can
be refined to just virStorageSourceIsLocalStorage().

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-08 15:56:21 +01:00
Michal Privoznik
786d8d91b4 qemuDomainDiskChainElement{Prepare,Revoke}: manage /dev entry
Again, one missed bit. This time without this commit there is no
/dev entry  in the namespace of the qemu process when doing disk
snapshots or block-copy.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-08 15:56:13 +01:00
Michal Privoznik
18ce9d139d qemuDomainNamespace{Setup,Teardown}Disk: Don't pass pointer to full disk
These functions do not need to see the whole virDomainDiskDef.
Moreover, they are going to be called from places where we don't
have access to the full disk definition. Sticking with
virStorageSource is more than enough.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-08 15:56:05 +01:00
Michal Privoznik
76d491ef14 qemuDomainNamespaceSetupDisk: Drop useless @src variable
Since its introduction in 81df21507b this variable was never
used.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-08 15:55:56 +01:00
Michal Privoznik
8dc867e978 qemu_domain: Don't pass virDomainDeviceDefPtr to ns helpers
There is no need for this. None of the namespace helpers uses it.
Historically it was used when calling secdriver APIs, but we
don't to that anymore.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-08 15:55:52 +01:00
Michal Privoznik
848dbe1937 qemu_security: Drop qemuSecuritySetRestoreAllLabelData struct
This struct is unused after 095f042ed6.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-08 15:55:46 +01:00
Michal Privoznik
45599e407c qemuDomainAttachSCSIVHostDevice: manage /dev entry
Again, one missed bit. This time without this commit there is no
/dev entry in the namespace of the qemu process when attaching
vhost SCSI device.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-08 15:54:52 +01:00
Michal Privoznik
7d93a88519 qemuDomainAttachSCSIVHostDevice: Prefer qemuSecurity wrappers
Since we have qemuSecurity wrappers over
virSecurityManagerSetHostdevLabel and
virSecurityManagerRestoreHostdevLabel we ought to use them
instead of calling secdriver APIs directly.  Without those
wrappers the labelling won't be done in the correct namespace
and thus won't apply to the nodes seen by qemu itself.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-08 15:53:43 +01:00
Laine Stump
2841e6756d qemu: propagate bridge MTU into qemu "host_mtu" option
libvirt was able to set the host_mtu option when an MTU was explicitly
given in the interface config (with <mtu size='n'/>), set the MTU of a
libvirt network in the network config (with the same named
subelement), and would automatically set the MTU of any tap device to
the MTU of the network.

This patch ties that all together (for networks based on tap devices
and either Linux host bridges or OVS bridges) by learning the MTU of
the network (i.e. the bridge) during qemuInterfaceBridgeConnect(), and
returning that value so that it can then be passed to
qemuBuildNicDevStr(); qemuBuildNicDevStr() then sets host_mtu in the
interface's commandline options.

The result is that a higher MTU for all guests connecting to a
particular network will be plumbed top to bottom by simply changing
the MTU of the network (in libvirt's config for libvirt-managed
networks, or directly on the bridge device for simple host bridges or
OVS bridges managed outside of libvirt).

One question I have about this - it occurred to me that in the case of
migrating a guest from a host with an older libvirt to one with a
newer libvirt, the guest may have *not* had the host_mtu option on the
older machine, but *will* have it on the newer machine. I'm curious if
this could lead to incompatibilities between source and destination (I
guess it all depends on whether or not the setting of host_mtu has a
practical effect on a guest that is already running - Maxime?)

Likewise, we could run into problems when migrating from a newer
libvirt to older libvirt - The guest would have been told of the
higher MTU on the newer libvirt, then migrated to a host that didn't
understand <mtu size='blah'/>. (If this really is a problem, it would
be a problem with or without the current patch).
2017-02-07 14:02:19 -05:00
Laine Stump
c0f706865e network: honor mtu setting when creating network
This resolves: https://bugzilla.redhat.com/1224348
2017-02-07 14:00:27 -05:00
Laine Stump
68a42bf6f7 conf: support configuring mtu size in a virtual network
Example:

  <network>
     ...
     <mtu size='9000'/>
     ...

If mtu is unset, it's assumed that we want the default for whatever is
the underlying transport (usually this is 1500).

This setting isn't yet wired in, so it will have no effect.

This partially resolves: https://bugzilla.redhat.com/1224348
2017-02-07 13:52:06 -05:00
Laine Stump
dd8ac030fb util: add MTU arg to virNetDevTapCreateInBridgePort()
virNetDevTapCreateInBridgePort() has always set the new tap device to
the current MTU of the bridge it's being attached to. There is one
case where we will want to set the new tap device to a different
(usually larger) MTU - if that's done with the very first device added
to the bridge, the bridge's MTU will be set to the device's MTU. This
patch allows for that possibility by adding "int mtu" to the arg list
for virNetDevTapCreateInBridgePort(), but all callers are sending -1,
so it doesn't yet have any effect.

Since the requested MTU isn't necessarily what is used in the end (for
example, if there is no MTU requested, the tap device will be set to
the current MTU of the bridge), and the hypervisor may want to know
the actual MTU used, we also return the actual MTU to the caller (if
actualMTU is non-NULL).
2017-02-07 13:45:08 -05:00
Andrea Bolognani
c2e60ad0e5 qemu: Forbid <memoryBacking><locked> without <memtune><hard_limit>
In order for memory locking to work, the hard limit on memory
locking (and usage) has to be set appropriately by the user.

The documentation mentions the requirement already: with this
patch, it's going to be enforced by runtime checks as well,
by forbidding a non-compliant guest from being defined as well
as edited and started.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1316774
2017-02-07 18:43:10 +01:00
Roman Bogorodskiy
66c21aee89 bhyve: fix virtio disk addresses
Like it usually happens, I fixed one thing and broke another:
in 803966c76 address allocation was fixed for SATA disks, but
broke that for virtio disks, because it dropped disk address
assignment completely. It's not needed for SATA disks anymore,
but still needed for the virtio ones.

Bring that back and add a couple of tests to make sure it won't
happen again.
2017-02-07 19:17:58 +04:00
Michal Privoznik
7f0b382522 qemuDomainAttachDeviceMknod: Don't loop endlessly
When working with symlinks it is fairly easy to get into a loop.
Don't.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-07 13:20:19 +01:00
Michal Privoznik
3f5fcacf89 qemuDomainAttachDeviceMknod: Deal with symlinks
Similarly to one of the previous commits, we need to deal
properly with symlinks in hotplug case too.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-07 13:20:17 +01:00
Michal Privoznik
4ac847f93b qemuDomainCreateDevice: Don't loop endlessly
When working with symlinks it is fairly easy to get into a loop.
Don't.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-07 13:18:32 +01:00
Michal Privoznik
54ed672214 qemuDomainCreateDevice: Properly deal with symlinks
Imagine you have a disk with the following source set up:

/dev/disk/by-uuid/$uuid (symlink to) -> /dev/sda

After cbc45525cb the transitive end of the symlink chain is
created (/dev/sda), but we need to create any item in chain too.
Others might rely on that.
In this case, /dev/disk/by-uuid/$uuid comes from domain XML thus
it is this path that secdriver tries to relabel. Not the resolved
one.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-07 13:18:10 +01:00
Michal Privoznik
b621291f5c qemuDomain{Attach,Detach}Device NS helpers: Don't relabel devices
After previous commit this has become redundant step.
Also setting up devices in namespace and setting their label
later on are two different steps and should be not done at once.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-07 10:40:53 +01:00
Michal Privoznik
0f0fcc2cd4 qemu_security: Use more transactions
The idea is to move all the seclabel setting to security driver.
Having the relabel code spread all over the place looks very
messy.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-07 10:40:53 +01:00
Michal Privoznik
3e6839d4e8 qemuSecurityRestoreAllLabel: Don't use transactions
Because of the nature of security driver transactions, it is
impossible to use them properly. The thing is, transactions enter
the domain namespace and commit all the seclabel changes.
However, in RestoreAllLabel() this is impossible - the qemu
process, the only process running in the namespace, is gone. And
thus is the namespace. Therefore we shouldn't use the transactions
as there is no namespace to enter.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-07 10:40:53 +01:00
Michal Privoznik
0a4652381f qemuDomainPrepareDisk: Fix ordering
The current ordering is as follows:
1) set label
2) create the device in namespace
3) allow device in the cgroup

While this might work for now, it will definitely not work if the
security driver would use transactions as in that case there
would be no device to relabel in the domain namespace as the
device is created in the second step.
Swap steps 1) and 2) to allow security driver to use more
transactions.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-07 10:40:53 +01:00
Michal Privoznik
6094169b83 util: Introduce virFileReadLink
We will need to traverse the symlinks one step at the time.
Therefore we need to see where a symlink is pointing to.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-07 10:40:53 +01:00
Michal Privoznik
3172d26730 virProcessRunInMountNamespace: Report errors from child
The comment to the function states that the errors from the child
process are reported. Well, the error buffer is filled with
possible error messages. But then it is thrown away. Among with
important error message from the child process.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-07 10:40:53 +01:00
Michal Privoznik
aaf0ac7e7c xenFormatXLDisk: Don't leak @target
==11260== 1,006 bytes in 1 blocks are definitely lost in loss record 106 of 111
==11260==    at 0x4C2AE5F: malloc (vg_replace_malloc.c:297)
==11260==    by 0x4C2BDFF: realloc (vg_replace_malloc.c:693)
==11260==    by 0x4EA430B: virReallocN (viralloc.c:245)
==11260==    by 0x4EA7C52: virBufferGrow (virbuffer.c:130)
==11260==    by 0x4EA7D28: virBufferAdd (virbuffer.c:165)
==11260==    by 0x4EA8E10: virBufferStrcat (virbuffer.c:718)
==11260==    by 0x42D263: xenFormatXLDiskSrcNet (xen_xl.c:960)
==11260==    by 0x42D4EB: xenFormatXLDiskSrc (xen_xl.c:1015)
==11260==    by 0x42D870: xenFormatXLDisk (xen_xl.c:1101)
==11260==    by 0x42DA89: xenFormatXLDomainDisks (xen_xl.c:1148)
==11260==    by 0x42EAF8: xenFormatXL (xen_xl.c:1558)
==11260==    by 0x40E85F: testCompareParseXML (xlconfigtest.c:105)

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-07 10:31:59 +01:00
John Ferlan
48ad600916 util: Fix domain object leaks on closecallbacks
Originally/discovered proposed by "Wang King <king.wang@huawei.com>"

When the virCloseCallbacksSet is first called, it increments the refcnt
on the domain object to ensure it doesn't get deleted before the callback
is called. The refcnt would be decremented in virCloseCallbacksUnset once
the entry is removed from the closeCallbacks has table.

When (mostly) normal shutdown occurs, the qemuProcessStop will end up
calling qemuProcessAutoDestroyRemove and will remove the callback from
the list and hash table normally and decrement the refcnt.

However, when qemuConnectClose calls virCloseCallbacksRun, it will scan
the (locked) closeCallbacks list for matching domain and callback function.
If an entry is found, it will be removed from the closeCallbacks list and
placed into a lookaside list to be processed when the closeCallbacks lock
is dropped. The callback function (e.g. qemuProcessAutoDestroy) is called
and will run qemuProcessStop. That code will fail to find the callback
in the list when qemuProcessAutoDestroyRemove is called and thus not decrement
the domain refcnt. Instead since the entry isn't found the code will just
return (mostly) harmlessly.

This patch will resolve the issue by taking another ref during the
search UUID process during virCloseCallackRun, decrementing the refcnt
taken by virCloseCallbacksSet, calling the callback routine and returning
overwriting the vm (since it could return NULL). Finally, it will call the
virDomainObjEndAPI to lower the refcnt and remove the lock taken during
the search UUID processing. This may cause the vm to be destroyed.
2017-02-03 19:38:39 -05:00
Daniel P. Berrange
aed0850e39 virtlockd: fix systemd unit file dependancies
After deploying virtlogd by default we identified a number of
mistakes in the systemd unit file. virtlockd's relationship
to libvirtd is the same as virtlogd, so we must apply the
same unit file fixes to virtlockd

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2017-02-03 16:40:08 +00:00
Jim Fehlig
f86a7a8372 libxl: fix dom0 autoballooning with Xen 4.8
xen.git commit 57f8b13c changed several of the libxl memory
get/set functions to take 64 bit parameters. The libvirt
libxl driver still uses uint32_t variables for these various
parameters, which is particularly problematic for the
libxl_set_memory_target() function.

When dom0 autoballooning is enabled, libvirt (like xl) determines
the memory needed to start a domain and the memory available. If
memory available is less than memory needed, dom0 is ballooned
down by passing a negative value to libxl_set_memory_target()
'target_memkb' parameter. Prior to xen.git commit 57f8b13c,
'target_memkb' was an int32_t. Subtracting a larger uint32 from
a smaller uint32 and assigning it to int32 resulted in a negative
number. After commit 57f8b13c, the same subtraction is widened
to a int64, resulting in a large positive number. The simple
fix taken by this patch is to assign the difference of the
uint32 values to a temporary int32 variable, which is then
passed to 'target_memkb' parameter of libxl_set_memory_target().

Note that it is undesirable to change libvirt to use 64 bit
variables since it requires setting LIBXL_API_VERSION to 0x040800.
Currently libvirt supports LIBXL_API_VERSION >= 0x040400,
essentially Xen >= 4.4.
2017-02-02 10:24:24 -07:00
Nitesh Konkar
4f405ebd1d qemu: Fix indentation in qemu_interface.h
Signed-off-by: Nitesh Konkar <nitkon12@linux.vnet.ibm.com>
2017-02-01 09:27:48 +01:00
Martin Kletzander
bb5d6379a0 qemu: Don't lose group_name
Now that we have a function for properly assigning the blockdeviotune
info, let's use it instead of dropping the group name on every
assignment.  Otherwise it will not work with both --live and --config
options.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2017-01-31 20:19:35 +01:00
Martin Kletzander
eae7cfd42d conf: Add virDomainDiskSetBlockIOTune
That function sets disk->blkdeviotune sensibly.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2017-01-31 20:19:35 +01:00
Martin Kletzander
8336cbca21 qemu: Fix indentation in qemu_domain.h for RNG Namespaces
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2017-01-31 16:13:32 +01:00
Maxim Nestratov
99fb668ede vz: change printing format specifier for network statistics
This is necessary to be able to get statistics for venet0 or
"host-routed" adapter, which has -1 index and thus, its statistics
is shown as "net.nic4294967295".

Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
2017-01-31 17:05:20 +03:00
Nikolay Shirokovskiy
4ebb75c364 vz: support virDomainReset 2017-01-31 17:03:22 +03:00
Nikolay Shirokovskiy
48317abbf7 vz: support virDomainAbortJob 2017-01-31 17:02:26 +03:00
Ján Tomko
3ac97c2ded qemu: Add enough USB hubs to accomodate all devices
Commit 815d98a started auto-adding one hub if there are more USB devices
than available USB ports.

This was a strange choice, since there might be even more devices.
Before USB address allocation was implemented in libvirt, QEMU
automatically added a new USB hub if the old one was full.

Adjust the logic to try adding as many hubs as will be needed
to plug in all the specified devices.

https://bugzilla.redhat.com/show_bug.cgi?id=1410188
2017-01-31 13:09:08 +01:00
Ján Tomko
077c6d450f conf: move VIR_DOMAIN_USB_HUB_PORTS to the header file
For reusing in qemu_domain_address.c.
2017-01-31 13:09:08 +01:00
Roman Bogorodskiy
803966c76d bhyve: fix SATA address allocation
As bhyve for a long time didn't have a notion of the explicit SATA
controller and created a controller for each drive, the bhyve driver
in libvirt acted in a similar way and didn't care about the SATA
controllers and assigned PCI addresses to drives directly, as
the generated command will look like this anyway:

 2:0,ahci-hd,somedisk.img

This no longer makes sense because:

 1. After commit c07d1c1c4f it's not possible to assign
    PCI addresses to disks
 2. Bhyve now supports multiple disk drives for a controller,
    so it's going away from 1:1 controller:disk mapping, so
    the controller object starts to make more sense now

So, this patch does the following:

 - Assign PCI address to SATA controllers (previously we didn't do this)
 - Assign disk addresses instead of PCI addresses for disks. Now, when
   building a bhyve command, we take PCI address not from the disk
   itself but from its controller
 - Assign addresses at XML parsing time using the
   assignAddressesCallback. This is done mainly for being able to
   verify address allocation via xml2xml tests
 - Adjust existing bhyvexml2{xml,argv} tests to chase the new
   address allocation

This patch is largely based on work of Fabian Freyer.
2017-01-30 20:48:42 +04:00
Roman Bogorodskiy
13a050b2c3 bhyve: add virBhyveDriverCreateXMLConf
Add virBhyveDriverCreateXMLConf, a simple wrapper around
virDomainXMLOptionNew that makes it easier to pass bhyveConnPtr
as a private data for parser. It will be used later for device
address allocation at parsing time.

Update consumers to use it instead of direct calls to
virDomainXMLOptionNew.

As we now have proper callbacks connected for the tests, update
test files accordingly to include the automatically generated
PCI root controller.
2017-01-30 20:48:42 +04:00
Fabian Freyer
20a7737d35 bhyve: detect 32 SATA devices per controller support
Introduce a BHYVE_CAP_AHCI32SLOT capability that shows
if 32 devices per SATA controller are supported, and
a bhyveProbeCapsAHCI32Slot function that probes it.
2017-01-30 20:48:42 +04:00
Nikolay Shirokovskiy
b66bf0730a vz: add state group to all domain stats 2017-01-30 19:44:13 +03:00
Nikolay Shirokovskiy
2a41a2301b vz: add balloon group to all domain stats 2017-01-30 19:44:13 +03:00
Nikolay Shirokovskiy
e15e94c2dd vz: add vcpu group to all domain stats 2017-01-30 19:44:13 +03:00
Nikolay Shirokovskiy
87f41f38e3 vz: add net group to all domain stats 2017-01-30 19:44:13 +03:00
Nikolay Shirokovskiy
0d5ca32e38 vz: provide block stats for all domain stats 2017-01-30 19:44:13 +03:00
Nikolay Shirokovskiy
9c10c03093 vz: don't show bootorder for containers
Because this is invalid xml for containers. This patch almost
reverts 7eda8369, but still skips converting vz sdk bootorder
for containers to libvirt bootorder because we use boot order
in containers for quite different purpurse.
2017-01-30 19:44:13 +03:00
Ján Tomko
de325472cc qemu: assign USB addresses on redirdev hotplug too
https://bugzilla.redhat.com/show_bug.cgi?id=1375410
2017-01-30 16:17:35 +01:00
Michal Privoznik
a5cae75a3e qemuBuildChrChardevStr: Don't leak @charAlias
==12618== 110 bytes in 10 blocks are definitely lost in loss record 269 of 295
==12618==    at 0x4C2AE5F: malloc (vg_replace_malloc.c:297)
==12618==    by 0x1CFC6DD7: vasprintf (vasprintf.c:73)
==12618==    by 0x1912B2FC: virVasprintfInternal (virstring.c:551)
==12618==    by 0x1912B411: virAsprintfInternal (virstring.c:572)
==12618==    by 0x50B1FF: qemuAliasChardevFromDevAlias (qemu_alias.c:638)
==12618==    by 0x518CCE: qemuBuildChrChardevStr (qemu_command.c:4973)
==12618==    by 0x522DA0: qemuBuildShmemBackendChrStr (qemu_command.c:8674)
==12618==    by 0x523209: qemuBuildShmemCommandLine (qemu_command.c:8789)
==12618==    by 0x526135: qemuBuildCommandLine (qemu_command.c:9843)
==12618==    by 0x48B4BA: qemuProcessCreatePretendCmd (qemu_process.c:5897)
==12618==    by 0x4378C9: testCompareXMLToArgv (qemuxml2argvtest.c:498)
==12618==    by 0x44D5A6: virTestRun (testutils.c:180)

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-30 10:38:03 +01:00
Martin Kletzander
b425245520 qemu: Add better message for some invalid block I/O settings
For example when both total_bytes_sec and total_bytes_sec_max are set,
but the former gets cleaned due to new call setting, let's say,
read_bytes_sec, we end up with this weird message for the command:

 $ virsh blkdeviotune fedora vda --read-bytes-sec 3000
 error: Unable to change block I/O throttle
 error: unsupported configuration: value 'total_bytes_sec_max' cannot be set if 'total_bytes_sec' is not set

So let's make it more descriptive.  This is how it looks after the change:

 $ virsh blkdeviotune fedora vda --read-bytes-sec 3000
 error: Unable to change block I/O throttle
 error: unsupported configuration: cannot reset 'total_bytes_sec' when 'total_bytes_sec_max' is set

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1344897

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2017-01-29 19:57:13 +01:00
Martin Kletzander
87ee705183 qemu: Miscellaneous Block I/O tune cleanups
Well, just two.  One indentation and the usage of 'ret'.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2017-01-29 19:53:52 +01:00
Martin Kletzander
e9d75343d4 qemu: Only set group_name when actually requested
We were setting it based on whether it was supported and that lead to
setting it to NULL, which our JSON code caught.  However it ended up
producing the following results:

 $ virsh blkdeviotune fedora vda --total-bytes-sec-max 2000
 error: Unable to change block I/O throttle
 error: internal error: argument key 'group' must not have null value

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2017-01-29 19:46:51 +01:00
Pavel Hrdina
f19390d2d3 domain_conf: vnc: preserve autoport value if no port was specified
The issue is that if this graphics definition is provided:

  <graphics type='vnc' port='0'/>

it's parsed as:

  <graphics type='vnc' autoport='no'>
    <listen type='address'/>
  </graphics>

but if the resulting XML is parsed again the output is:

  <graphics type='vnc' port='-1' autoport='yes'>
    <listen type='address'/>
  </graphics>

and this should not happen.  The XML have to always remain the same
after it was already parsed by libvirt.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1383039

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2017-01-27 09:46:03 +01:00
Nitesh Konkar
26ac16f3ce perf: Prevent enabling of already enabled perf event
Currently, on every --enable perf_event command,
a new event->fd is created and counting of perf
event counter starts from zero and previous
event->fd is lost. This patch prevents this
behaviour.

Signed-off-by: Nitesh Konkar <nitkon12@linux.vnet.ibm.com>
2017-01-26 15:13:58 -05:00
Olga Krishtal
d18c083c39 vstorage: Fix build
Needed storage_util.h - missed while merging '5f07c3c07' with
commit id '479a2f16'.

Signed-off-by: Olga Krishtal <okrishtal@virtuozzo.com>
2017-01-26 12:25:37 -05:00
John Ferlan
448e2d5e94 storage: Fix build due to recent storage backend code movement
Commit id '5f07c3c07' broke the freebsd build in the libvirt CI test
environment because the UMOUNT was not defined unless WITH_STORAGE_FS
is defined.

So remove the virStorageBackendUmountLocal from storage_util.c,h and
restore the code back in the storage_backend_fs.c and _vstorage.c
modules.
2017-01-26 11:43:30 -05:00
Olga Krishtal
479a2f16f1 storage: Introduce Virtuozzo vstorage pool and volume APIs
Added create/define/etc pool operations for vstorage backend.

Used the common/local pool API's from storage_util for operations
that are not specific to vstorage. In particular Refresh and Delete
Pool operations as well as all the Volume operations.

Signed-off-by: Olga Krishtal <okrishtal@virtuozzo.com>
2017-01-26 10:43:42 -05:00
Olga Krishtal
e590d5301e storage: Introduce Virtuozzo vstorage backend
Added general definitions for vstorage pool backend including
the build options to add --with-storage-vstorage checking.
In order to use vstorage as a backend for a storage pool
vstorage tools (vstorage and vstorage-mount) need to be installed.

Signed-off-by: Olga Krishtal <okrishtal@virtuozzo.com>
2017-01-26 10:43:42 -05:00
John Ferlan
1452c85fb7 storage: Create common file/dir volume backend helpers
Move all the volume functions to storage_util to create local/common helpers
using the same naming syntax as the existing upload, download, and wipe
virStorageBackend*Local API's.

In the process of doing so, found more API's that can now become local
to storage_util. In order to distinguish between local/external - I
changed the names of the now local only ones from "virStorageBackend..."
to just "storageBackend..."

Signed-off-by: John Ferlan <jferlan@redhat.com>
2017-01-26 10:40:05 -05:00
John Ferlan
5f07c3c079 storage: Create common file/dir pool backend helpers
Move some pool functions to storage_util to create local/common helpers
using the same naming syntax as the existing upload, download, and wipe
virStorageBackend*Local API's.

In the process of doing so, found a few API's that can now become local
to storage_util. In order to distinguish between local/external - I
changed the names of the now local only ones from "virStorageBackend..."
to just "storageBackend..."

Signed-off-by: John Ferlan <jferlan@redhat.com>
2017-01-26 10:40:05 -05:00
John Ferlan
e26c21629e storage: Move the virStorageBackendFileSystem{Start|Stop} API's
Just moving code around with minor adjustment to have the Stop
code combine with the Unmount code since all the Stop code did
was call the Unmount code.
2017-01-26 10:40:05 -05:00
Michal Privoznik
572eda12ad qemu: Implement mtu on interface
Not only we should set the MTU on the host end of the device but
also let qemu know what MTU did we set.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-26 10:00:01 +01:00
Michal Privoznik
b020cf73fe domain_conf: Introduce <mtu/> to <interface/>
So far we allow to set MTU for libvirt networks. However, not all
domain interfaces have to be plugged into a libvirt network and
even if they are, they might want to have a different MTU (e.g.
for testing purposes).

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-26 09:59:56 +01:00
Michal Privoznik
eebec1697e virDomainNetDefParseXML: s/ret/rv/
We use @ret to hold the actual return value of the function we
are currently in. To hold a return value of a function called we
use different variables: @rv, @rc, etc. Honour this naming
scheme in virDomainNetDefParseXML too.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-25 09:18:49 +01:00
Chen Hanxiao
f97a8a3284 THREADS.txt: fix typos
s/wakup/wakeup

Signed-off-by: Chen Hanxiao <chenhanxiao@gmail.com>
2017-01-25 09:18:49 +01:00
Jim Fehlig
b4386fdac7 xenconfig: add support for more timers
Currently xenconfig only supports the hpet timer for HVM domains.
Include support for tsc timer for both PV and HVM domains.
2017-01-24 16:18:13 -07:00
Jim Fehlig
87df87e06b libxl: support emulate mode of tsc timer
While at it, use members of libxl_tsc_mode enum instead of literal
int values.
2017-01-24 16:18:13 -07:00
Jim Fehlig
6e4759d069 libxl: fix timer configuration
The current logic around configuring timers in libxl based on
virDomainDef object is a bit brain dead. Unsupported timers are
silently ignored and tsc is only recognized if it is the first
timer specified.

Change the logic to reject unsupported timers and honor the tsc
timer regardless of its order when multiple timers are specified.
2017-01-24 16:18:13 -07:00
Shivaprasad G Bhat
bd12889616 util: Forbid resetting non-endpoint devices
It is destructive to attempt reset on a pci- or cardbus-bridge, the
host can crash.  The bridges won't contain any guest data and neither
they can be passed through using vfio/stub.  So, no point in allowing a
reset on them.

Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
2017-01-23 17:23:12 +01:00
Shivaprasad G Bhat
bec9b9b01a util: Forbid assigning a pci-bridge to a guest
Non-endpoint devices like pci-bridges cannot be assigned to guests.
Prevent such attempts.

Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
2017-01-23 17:23:03 +01:00
John Ferlan
5f580d82f3 util: Fix typo in previous commit
Should be Unlock not Lock... Bad fingers.
2017-01-21 12:46:09 -05:00
Wang King
a563451e2b util: unlock closeCallbacks if get callbacks for connect fail
Avoid return with the closeCallbacks locked when get callbacks list for connect fail.

Signed-off-by: Wang King <king.wang@huawei.com>
Signed-off-by: John Ferlan <jferlan@redhat.com>
2017-01-21 12:39:52 -05:00
Chen Hanxiao
980f2a35c7 qemu_domain: add timestamp in tainting of guests log
We lacked of timestamp in tainting of guests log,
which bring troubles for finding guest issues:
such as whether a guest powerdown caused by qemu-monitor-command
or others issues inside guests.
If we had timestamp in tainting of guests log,
it would be helpful when checking guest's /var/log/messages.

Signed-off-by: Chen Hanxiao <chenhanxiao@gmail.com>
2017-01-21 12:34:19 -05:00
Michal Privoznik
0cacdc6f24 virDomainHostdevSubsysSCSIVHostDefParseXML: Don't leak @protocol
==24748== 12 bytes in 2 blocks are definitely lost in loss record 25 of 84
==24748==    at 0x4C2BF80: malloc (vg_replace_malloc.c:296)
==24748==    by 0x1A1E1E78: xmlStrndup (in /usr/lib64/libxml2.so.2.9.4)
==24748==    by 0x18D0495F: virXMLPropString (virxml.c:506)
==24748==    by 0x18D1FB3E: virDomainHostdevSubsysSCSIVHostDefParseXML (domain_conf.c:6280)
==24748==    by 0x18D20350: virDomainHostdevDefParseXMLSubsys (domain_conf.c:6450)
==24748==    by 0x18D34E7D: virDomainHostdevDefParseXML (domain_conf.c:13218)
==24748==    by 0x18D42598: virDomainDefParseXML (domain_conf.c:17745)
==24748==    by 0x18D440A9: virDomainDefParseNode (domain_conf.c:18236)
==24748==    by 0x18D43EFA: virDomainDefParse (domain_conf.c:18180)
==24748==    by 0x18D43FA0: virDomainDefParseFile (domain_conf.c:18206)
==24748==    by 0x44EDA1: testCompareDomXML2XMLFiles (testutils.c:1140)
==24748==    by 0x4365F8: testXML2XMLActive (qemuxml2xmltest.c:59)

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-20 16:31:05 +01:00
Jiri Denemark
6cb204b7ac qemu: Reset hostModelInfo in virQEMUCapsReset
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2017-01-20 15:52:56 +01:00
Michal Privoznik
57b5e27d3d qemu: set default vhost-user ifname
Based on work of Mehdi Abaakouk <sileht@sileht.net>.

When parsing vhost-user interface XML and no ifname is found we
can try to fill it in in post parse callback. The way this works
is we try to make up interface name from given socket path and
then ask openvswitch whether it knows the interface.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-20 15:42:12 +01:00
Peter Krempa
1d4fd2dd0f qemu: hotplug: Properly emit "DEVICE_DELETED" event when unplugging memory
The event needs to be emitted after the last monitor call, so that it's
not possible to find the device in the XML accidentally while the vm
object is unlocked.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1414393
2017-01-20 14:24:35 +01:00
Roman Bogorodskiy
6a9c7468de bhyve: fix interface type handling for argv2xml
When generating a domain XML from native command (i.e. via
the connectDomainXMLFromNative call), we should use
interface type 'bridge' rather than 'ethernet' because we only
support bridges at this point.

As we don't have bridge name explicitly specified on the command line,
just use 'virbr0' as a default.
2017-01-19 20:19:44 +04:00
Daniel P. Berrange
2e045a4f9b storage: avoid use of undefined GLUSTER_CLI variable
Previous commit tried to change configure logic such that the
GLUSTER_CLI parameter would always be set:

  commit 9e97c8c0f0
  Author: Peter Krempa <pkrempa@redhat.com>
  Date:   Mon Jan 9 15:56:12 2017 +0100

    storage: gluster: Remove build-time dependency on the 'gluster' cli tool

This missed the fact that the AC_PATH_PROG call was itself inside an 'if'
conditional that would not be called in with_storage_gluster was false. As
a result, GLUSTER_CLI was still conditionally defined.

Just kill the GLUSTER_CLI parameter and AC_PATH_PROG call entirely and pass a
bare "gluster" string to virFindFileInPath instead.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2017-01-19 10:56:54 +00:00
Daniel P. Berrange
b9cc6316c0 qemu: catch failure of drive_add
Previously when QEMU failed "drive_add" due to an error opening
a file it would report

  "could not open disk image"

These days though, QEMU reports

  "Could not open '/tmp/virtd-test_e3hnhh5/disk1.qcow2': Permission denied"

which we were not detecting as an error condition.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2017-01-19 10:56:53 +00:00
Peter Krempa
01d9c3497c storage: sheepdog: Split out functions required for tests
Separate the headers so that functions only required for testing of the
sheepdog backend are separated into their own file.
2017-01-19 09:25:51 +01:00
Peter Krempa
ebc8564c1a storage: scsi: Remove private constants from header
They are used only in the SCSI backend driver so there's no need to
pollute the headers.
2017-01-19 09:25:51 +01:00
Peter Krempa
0de123c84e storage: scsi: Fix build if SCSI backend is disabled but iSCSI is enabled
The iSCSI backend driver was using stuff from the SCSI driver without
making sure that it's compiled in. Move the common code into the
storage_util.c since it does not contain any specific code.
2017-01-19 09:25:51 +01:00
Peter Krempa
d66dda6504 storage: fs: Compile file backends even if filesystem support is disabled
The file backend code was mistakenly put into #if WITH_STORAGE_FS. This
is not necessary since all the backends just access files on disk, and
thus the code for WITH_STORAGE_DIR is sufficient to compile everything.
2017-01-19 09:25:51 +01:00
Peter Krempa
46e8049c15 storage: Split utility functions from storage_backend.(ch)
The file became a garbage dump for all kinds of utility functions over
time. Move them to a separate file so that the files can become a clean
interface for the storage backends.
2017-01-19 09:25:51 +01:00
Peter Krempa
4417481d8a storage: Remove common code from specific driver backend
The storage driver helper functions that deal with parted were put into
the disk backend code but are used commonly across.
2017-01-19 09:25:51 +01:00
Boris Fiuczynski
666bee3973 nodedev: Fabric name must not be required for fc_host capability
fabric_name is one of many fc_host attributes in Linux that is optional
and left to the low-level driver to decide if it is implemented.
The zfcp device driver does not provide a fabric name for an fcp host.

This patch removes the requirement for a fabric name by making it optional.

Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
2017-01-18 06:31:54 -05:00
Boris Fiuczynski
d59226926e util: add file exists check in virReadFCHost
File open errors are prevented by a file exists check before
virFileReadAll is called since all callers of the virReadFCHost
method handle errors themselves based on the NULL return anyway.
Also included is a minor spelling correction in a comment.

Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
2017-01-18 06:25:55 -05:00
John Ferlan
0d157b3fed disk: Fixup error handling path for devmapper when part_separator='yes'
https://bugzilla.redhat.com/show_bug.cgi?id=1346566

If libvirt_parthelper is erroneously told to append the partition
separator 'p' onto the generated output for a disk pool using device
mapper that has 'user_friendly_names' set to true, then the error
recovery path will fail to find volume resulting in the pool being
in an unusable state.

So, augment the documentation to provide the better hint that the
part_separator='yes' should be set when user_friendly_names are not
being used. Additionally, once we're in the error path where the
returned name doesn't match the expected partition name try to see
if the reason is because the 'p' was erroneosly added. If so alter
the about to be removed vol->target.path so that the DiskDeleteVol
code can find the partition that was created and remove it.
2017-01-18 06:17:36 -05:00
John Ferlan
9508682ba0 storage: Allow probe of volume capacity for BLOCK type
If the voldef type is VIR_STORAGE_VOL_BLOCK, then as long as the
format is known, let's allow the probe to happen - gets a truer value
and the same probe/update would be allowed for the same volume defined
in a domain.
2017-01-18 06:09:38 -05:00
John Ferlan
d04bb05fb7 storage: Fix virStorageBackendUpdateVolTargetInfo type check
For volume processing in virStorageBackendUpdateVolTargetInfo to get
the capacity commit id 'a760ba3a7' added the ability to probe a volume
that didn't list a target format. Unfortunately, the code used the
virStorageSource  (e.g. target->type - virStorageType) rather than
virStorageVolDef (e.g. vol->type - virStorageVolType) in order to
make the comparison. As it turns out target->type for a volume is
not filled in at all for a voldef as the code relies on vol->type.
Ironically the result is that only VIR_STORAGE_VOL_BLOCK's would get
their capacity updated.

This patch will adjust the code to check the "vol->type" field instead
as an argument. This way for a voldef, the correct comparison is made.

Additionally for a backingStore, the 'type' field is never filled in;
however, since we know that the provided path is a location at which
the backing store can be accessed on the local filesystem thus just
pass VIR_STORAGE_VOL_FILE in order to satisfy the adjusted voltype
check. Whether it's a FILE or a BLOCK only matters if we're trying to
get more data based on the target->format.
2017-01-18 06:09:38 -05:00
Peter Krempa
9e97c8c0f0 storage: gluster: Remove build-time dependency on the 'gluster' cli tool
The tool is used for pool discovery. Since we call an external binary we
don't really need to compile out the code that uses it. We can check
whether it exists at runtime.
2017-01-18 10:45:15 +01:00
Peter Krempa
ce5055d7bc storage: gluster: Report error if no volumes were found in pool lookup
Similarly to the 'netfs' pool, return an error if the host does not have
any volumes.
2017-01-18 10:45:15 +01:00
Peter Krempa
7bdb4b8fda storage: Fix error reporting when looking up storage pool sources
In commit 4090e15399 we went back from reporting no errors if no storage
pools were found on a given host to reporting a bad error. And only in
cases when gluster was not installed.

Report a less bad error in case there are no volumes. Also report the
error when gluster is installed but no volumes were found, since
virStorageBackendFindGlusterPoolSources would return success in that
case.
2017-01-18 10:45:15 +01:00
Peter Krempa
9d14cf595a qemu: Move cpu hotplug code into qemu_hotplug.c
Move all the worker code into the appropriate file. This will also allow
testing of cpu hotplug.
2017-01-18 09:57:06 +01:00
Peter Krempa
5570f26763 qemu: Prepare for reuse of qemuDomainSetVcpusLive
Extract the call to qemuDomainSelectHotplugVcpuEntities outside of
qemuDomainSetVcpusLive and decide whether to hotplug or unplug the
entities specified by the cpumap using a boolean flag.

This will allow to use qemuDomainSetVcpusLive in cases where we prepare
the list of vcpus to enable or disable by other means.
2017-01-18 09:57:06 +01:00
Peter Krempa
15727562a6 util: json: Add helper to reformat JSON strings
For use in test cases it will be helpful to allow reformatting JSON
strings. Add a wrapper on top of the parser and formatter to achieve
this.
2017-01-18 09:57:06 +01:00
Peter Krempa
5cd670fea8 qemu: monitor: More strict checking of 'query-cpus' if hotplug is supported
In cases where CPU hotplug is supported by qemu force the monitor to
reject invalid or broken responses to 'query-cpus'. It's expected that
the command returns usable data in such case.
2017-01-18 09:57:06 +01:00
Erik Skultety
7e8b2da74f security: SELinux: fix the transaction model's list append
The problem is in the way how the list item is created prior to
appending it to the transaction list - the @path argument is just a
shallow copy instead of deep copy of the hostdev device's path.
Unfortunately, the hostdev devices from which the @path is extracted, in
order to add them into the transaction list, are only temporary and
freed before the buildup of the qemu namespace, thus making the @path
attribute in the transaction list NULL, causing 'permission denied' or
'double free' or 'unknown cause' errors.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1413773

Signed-off-by: Erik Skultety <eskultet@redhat.com>
2017-01-17 15:49:57 +01:00
Erik Skultety
df7f42d5be security: DAC: fix the transaction model's list append
The problem is in the way how the list item is created prior to
appending it to the transaction list - the @path attribute is just a
shallow copy instead of deep copy of the hostdev device's path.
Unfortunately, the hostdev devices from which the @path is extracted, in
order to add them into the transaction list, are only temporary and
freed before the buildup of the qemu namespace, thus making the @path
attribute in the transaction list NULL, causing 'permission denied' or
'double free' or 'unknown cause' errors.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1413773

Signed-off-by: Erik Skultety <eskultet@redhat.com>
2017-01-17 15:49:57 +01:00
Jiri Denemark
f66b185c46 qemu: Don't leak hostCPUModelInfo in virQEMUCaps
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2017-01-17 14:36:52 +01:00
Michal Privoznik
d0baf54e53 qemu: Actually unshare() iff running as root
https://bugzilla.redhat.com/show_bug.cgi?id=1413922

While all the code that deals with qemu namespaces correctly
detects whether we are running as root (and turn into NO-OP for
qemu:///session) the actual unshare() call is not guarded with
such check. Therefore any attempt to start a domain under
qemu:///session shall fail as unshare() is reserved for root.

The fix consists of moving unshare() call (for which we have a
wrapper called virProcessSetupPrivateMountNS) into
qemuDomainBuildNamespace() where the proper check is performed.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
2017-01-17 13:23:56 +01:00
Daniel P. Berrange
2d0c4947ab Revert "perf: Add cache_l1d perf event support"
This reverts commit ae16c95f1b.
2017-01-16 16:54:34 +00:00
John Ferlan
40ec4ff658 storage: Alter error message in probe/empty checks
For case VIR_STORAGE_BLKID_PROBE_DIFFERENT, clean up the message to
avoid using the virsh like --overwrite syntax. Additionally provide
a different error message when not writing the label to avoid confusion.
2017-01-14 10:13:05 -05:00
John Ferlan
f462f9ad6e storage: Clean up return value checking
Rather than special casing the VIR_STORAGE_BLKID_PROBE_UNKNOWN after
calling virStorageBackendBLKIDFindPart, just allow the switch statement
handle setting ret = -2.
2017-01-14 10:13:05 -05:00
John Ferlan
d1f5dfc416 storage: Alter logic when both BLKID and PARTED unavailable
If neither BLKID or PARTED is available and we're not writing, then
just return 0 which allows the underlying storage pool to generate
a failure. If both are unavailable and we're writing, then generate
a more generic error message.
2017-01-14 10:13:05 -05:00
Collin L. Walling
e8a43f1995 qemu-capabilities: Fix query-cpu-model-expansion on s390 with older kernel
When running on s390 with a kernel that does not support cpu model checking and
with a Qemu new enough to support query-cpu-model-expansion, the gathering of qemu
capabilities will fail. Qemu responds to the query-cpu-model-expansion qmp
command with an error because the needed kernel ioct does not exist. When this
happens a guest cannot even be defined due to missing qemu capabilities data.

This patch fixes the problem by silently ignoring generic errors stemming from
calls to query-cpu-model-expansion.

Reported-by: Farhan Ali <alifm@linux.vnet.ibm.com>
Signed-off-by: Collin L. Walling <walling@linux.vnet.ibm.com>
Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
2017-01-13 16:55:58 +01:00
Michal Privoznik
93a062c3b2 qemu: Copy SELinux labels for namespace too
When creating new /dev/* for qemu, we do chown() and copy ACLs to
create the exact copy from the original /dev. I though that
copying SELinux labels is not necessary as SELinux will chose the
sane defaults. Surprisingly, it does not leaving namespace with
the following labels:

crw-rw-rw-. root root system_u:object_r:tmpfs_t:s0     random
crw-------. root root system_u:object_r:tmpfs_t:s0     rtc0
drwxrwxrwt. root root system_u:object_r:tmpfs_t:s0     shm
crw-rw-rw-. root root system_u:object_r:tmpfs_t:s0     urandom

As a result, domain is unable to start:

error: internal error: process exited while connecting to monitor: Error in GnuTLS initialization: Failed to acquire random data.
qemu-kvm: cannot initialize crypto: Unable to initialize GNUTLS library: Failed to acquire random data.

The solution is to copy the SELinux labels as well.

Reported-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-13 14:45:52 +01:00
Peter Krempa
a80957c518 Revert "storage: Validate the device formats at logical startup"
The check is pointless since LVM is capable to detect it's own members
and the check is flawed as it would fail if neither libblkid nor parted
is installed.

We don't really need to babysit LVM in this way.

This reverts commit cb38b6cbc7.
2017-01-13 09:28:28 +01:00
Peter Krempa
9538dff96f Revert "storage: For FS pool check for properly formatted target volume"
The check does not work properly (crashes) with netfs filesystems and
also checking that a device is not empty when attempting to mount a
filesystem is not very usefull since the mount will fail anyways.

As the code would improve only a very minor corner case I don't really
see a reason to have this code at all.

This code would also fail if libvirt is compiled without support for
blkid and without parted.

This reverts commit a11fd69735.
2017-01-13 09:28:28 +01:00
Jim Fehlig
ecb587e4ca libxl: always enable pae for x86_64 HVM
For HVM domains, pae is only set in libxl_domain_build_info when
explicitly specified in the hypervisor <features> config. This is
fine for i686 machines, but is incorrect behavior for x86_64 machines
where pae must always be enabled. See the following discussion for
additional details

https://www.redhat.com/archives/libvir-list/2017-January/msg00254.html
2017-01-12 18:42:39 -07:00
Jiri Denemark
19e06cfa25 qemu: Ignore non-boolean CPU model properties
The query-cpu-model-expansion is currently implemented for s390(x) only
and all CPU properties it returns are booleans. However, x86
implementation will report more types of properties. Without making the
code more tolerant older libvirt would fail to probe newer QEMU
versions.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2017-01-12 11:58:25 +01:00
Jiri Denemark
ec23791517 qemu: Don't check CPU model property key
The qemuMonitorJSONParseCPUModelProperty function is a callback for
virJSONValueObjectForeachKeyValue and is called for each key/value pair,
thus it doesn't really make sense to check whether key is NULL.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2017-01-12 11:58:25 +01:00
Michal Privoznik
cbc45525cb qemuDomainCreateDevice: Canonicalize paths
So far the decision whether /dev/* entry is created in the qemu
namespace is really simple: does the path starts with "/dev/"?
This can be easily fooled by providing path like the following
(for any considered device like disk, rng, chardev, ..):

  /dev/../var/lib/libvirt/images/disk.qcow2

Therefore, before making the decision the path should be
canonicalized.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-11 18:08:13 +01:00
Michal Privoznik
49f326edc0 qemu: Use namespaces iff available on the host kernel
So far the namespaces were turned on by default unconditionally.
For all non-Linux platforms we provided stub functions that just
ignored whatever namespaces setting there was in qemu.conf and
returned 0 to indicate success. Moreover, we didn't really check
if namespaces are available on the host kernel.

This is suboptimal as we might have ignored user setting.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-11 18:07:43 +01:00
Michal Privoznik
41816751a7 util: Introduce virFileMoveMount
This is a simple wrapper over mount(). However, not every system
out there is capable of moving a mount point. Therefore, instead
of having to deal with this fact in all the places of our code we
can have a simple wrapper and deal with this fact at just one
place.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-11 18:06:30 +01:00
Michal Privoznik
cd32783cd4 lxc_container: Drop userns_supported
This is unnecessary wrapper around virProcessNamespaceAvailable().

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-11 18:05:03 +01:00
Michal Privoznik
083fcd06d3 lxc: Move lxcContainerAvailable to virprocess
Other drivers (like qemu) would like to know if the namespaces
are available therefore it makes sense to move this function to
a shared module.

At the same time, this function had some default namespaces that
are checked with every call. It is not necessary - let callers
pass just those namespaces they are interested in.

With the move the function is renamed to
virProcessNamespaceAvailable.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-11 18:02:35 +01:00
Michal Privoznik
2ff8c30548 qemuDomainSetupAllInputs: Update debug message
Due to a copy-paste error, the debug message reads:

  Setting up disks

It should have been:

  Setting up inputs.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-11 17:39:24 +01:00
Cédric Bosdonnat
162bb0e7b1 libxl: fix usb inputs loop error
List indexes where mixed up in the code looping over the USB
input devices.
2017-01-11 17:07:25 +01:00
Pino Toscano
1a5de3fe2e remote: do not check for an existing config dir
When composing the path to the default known_hosts file (for the libssh
and libssh2 drivers), do not check whether the configuration directory
(determined by virGetUserConfigDirectory()) exists: both the drivers can
handle non-existing files, and are able to create them (and their
directories) in that case.

This adds a small behaviour change: before, the key for an unknown host,
and manually accepted, was saved only if the configuration directory
existed -- a bit incoherent behaviour though.
2017-01-11 13:38:04 +01:00
Pino Toscano
45c4a70c70 remote: fix logic for known_hosts and keyfile checks
If any of them is specified for the libssh and libssh2 drivers, there is
no need to depend on checks based on other paths: in particular, a
specified path for known_hosts was ignored if the local config directory
could not be determined, and the path for keyfile was ignored if the
home could not be determined.

Instead, lazily determine and use these two paths only in case they are
needed.
2017-01-11 13:37:45 +01:00
Pino Toscano
408a1ce5f8 rpc: libssh: allow a NULL known_hosts file
Make sure that virNetLibsshSessionSetHostKeyVerification accepts a NULL
value for the path to the known_hosts file:
- call ssh_options_set(SSH_OPTIONS_KNOWNHOSTS) anyway, using /dev/null,
  otherwise libssh will use its default path
- do not call ssh_write_knownhost when no known hosts file was set

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1406457
2017-01-11 13:37:24 +01:00
Laine Stump
5949b53aec conf: eliminate virDomainPCIAddressReleaseSlot() in favor of ...Addr()
Surprisingly there was a virDomainPCIAddressReleaseAddr() function
already, but it was completely unused. Since we don't reserve entire
slots at once any more, there is no need to release entire slots
either, so we just replace the single call to
virDomainPCIAddressReleaseSlot() with a call to
virDomainPCIAddressReleaseAddr() and remove the now unused function.

The keen observer may be concerned that ...Addr() doesn't call
virDomainPCIAddressValidate(), as ...Slot() did. But really the
validation was pointless anyway - if the device hadn't been suitable
to be connected at that address, it would have failed validation
before every being reserved in the first place, so by definition it
will pass validation when it is being unplugged. (And anyway, even if
something "bad" happened and we managed to have a device incorrectly
at the given address, we would still want to be able to free it up for
use by a device that *did* validate properly).
2017-01-11 05:00:34 -05:00
Laine Stump
6cc2014202 qemu: rename qemuDomainPCIAddressReserveNextSlot() to ...Addr()
This function doesn't actually reserve an entire slot any more, it
reserves a single PCI address, so this name is more appropriate.
2017-01-11 05:00:08 -05:00
Laine Stump
c5aea19d56 qemu: remove qemuDomainPCIAddressReserveNextAddr()
This function is only called in two places, and the function itself is
just adding a single argument and calling
virDomainPCIAddressReserveNextAddr(), so we can remove it and instead
call virDomainPCIAddressReserveNextAddr() directly. (The main
motivation for doing this is to free up the name so that
qemuDomainPCIAddressReserveNextSlot() can be renamed in the next
patch, as its current name is now inaccurate and misleading).
2017-01-11 04:59:42 -05:00
Laine Stump
27b0f971c4 conf: rename virDomainPCIAddressReserveSlot() to ...Addr()
This function doesn't actually reserve an entire slot any more, it
reserves a single PCI address, so this name is more appropriate.
2017-01-11 04:58:32 -05:00
Laine Stump
640ce18679 conf: rename virDomainPCIAddressReserveAddr() to ...Internal()
This is in preparation for renaming virDomainPCIAddressReserveSlot()
to virDomainPCIAddressReserveAddr(), which is a better description of
what it does.
2017-01-11 04:57:06 -05:00
Laine Stump
24c8c47230 conf: make virDomainPCIAddressReserveAddr() a static function
It is now only used in domain_addr.c.
2017-01-11 04:55:43 -05:00
Laine Stump
905859a6e5 qemu: replace virDomainPCIAddressReserveAddr with virDomainPCIAddressReserveSlot
All occurences of the former use fromConfig=true, and that's exactly
how virDomainPCIAddressReserveSlot() calls
virDomainPCIaddressReserveAddr(), so just use *Slot() so that *Addr()
can be made static to conf/domain_addr.c (both functions will be
renamed in upcoming patches).
2017-01-11 04:55:06 -05:00
Laine Stump
43f8147749 conf: eliminate virDomainPCIAddressReserveNextSlot()
Since we don't actually reserve an entire slot at a time anymore, the
name of this function is just confusing, and it's almost identical in
operation to virDomainPCIAddressReserveNextAddr() anyway, so remove
the *Slot() function and replace calls to it with calls to *Addr(...,
-1).
2017-01-11 04:53:48 -05:00
Laine Stump
e97fab2665 conf: rename virDomainPCIAddressGetNextSlot() to ...GetNextAddr()
With the advent of VIR_PCI_CONNECT_AGGREGATE_SLOT, the new name is
more appropriate, since the address returned may be another address
on the same slot as last time, not necessarily a new slot.
2017-01-11 04:52:19 -05:00
Laine Stump
b59bbdba4b conf: fix fromConfig argument to virDomainPCIAddressValidate()
fromConfig should be true if the caller wants
virDomainPCIAddressValidate() to loosen restrictions on its
interpretation of the pciConnectFlags. In particular, either
PCI_DEVICE or PCIE_DEVICE will be counted as equivalent to both, and
HOTPLUG will be ignored. In a few cases where libvirt was manually
overriding automatic address assignment, it was setting fromConfig to
false when validating the hardcoded manual override. This patch
changes those to fromConfig=true as a preemptive strike against any
future bugs that might otherwise surface.
2017-01-11 04:51:54 -05:00
Laine Stump
79901543b9 conf: fix fromConfig argument to virDomainPCIAddressReserveAddr()
Although setting virDomainPCIAddressReserveAddr()'s fromConfig=true is
correct when a PCI addres is coming from a domain's config, the *true*
purpose of the fromConfig argument is to lower restrictions on what
kind of device can plug into what kind of controller - if fromConfig
is true, then a PCIE_DEVICE can plug into a slot that is marked as
only compatible with PCI_DEVICE (and vice versa), and the HOTPLUG flag
is ignored.

For a long time there have been several calls to
virDomainPCIAddressReserveAddr() that have fromConfig incorrectly set
to false - it's correct that the addresses aren't coming from user
config, but they are coming from hardcoded exceptions in libvirt that
should, if anything, pay *even less* attention to following the
pciConnectFlags (under the assumption that the libvirt programmer knew
what they were doing).

See commit b87703cf7 for an example of an actual bug caused by the
incorrect setting of the "fromConfig" argument to
virDomainPCIAddressReserveAddr(). Although they haven't resulted in
any reported bugs, this patch corrects all the other incorrect
settings of fromConfig in calls to virDomainPCIAddressReserveAddr().
2017-01-11 04:47:12 -05:00
Laine Stump
147ebe6ddf conf: aggregate multiple pcie-root-ports onto a single slot
Set the VIR_PCI_CONNECT_AGGREGATE_SLOT flag for pcie-root-ports so
that they will be assigned to all the functions on a slot.

Some qemu test case outputs had to be adjusted due to the
pcie-root-ports now being put on multiple functions.
2017-01-11 04:45:57 -05:00
Laine Stump
48d39cf96d conf: aggregate multiple devices on a slot when assigning PCI addresses
If a PCI device has VIR_PCI_CONNECT_AGGREGATE_SLOT set in its
pciConnectFlags, then during address assignment we allow multiple
instances of this type of device to be auto-assigned to multiple
functions on the same device. A slot is used for aggregating multiple
devices only if the first device assigned to that slot had
VIR_PCI_CONNECT_AGGREGATE_SLOT set. but any device types that have
AGGREGATE_SLOT set might be mix/matched on the same slot.

(NB: libvirt should never set the AGGREGATE_SLOT flag for a device
type that might need to be hotplugged. Currently it is only planned
for pcie-root-port and possibly other PCI controller types, and none
of those are hotpluggable anyway)

There aren't yet any devices that use this flag. That will be in a
later patch.
2017-01-11 04:43:22 -05:00
Laine Stump
8f4008713a qemu: use virDomainPCIAddressSetAllMulti() to set multi when needed
If there are multiple devices assigned to the different functions of a
single PCI slot, they will not work properly if the device at function
0 doesn't have its "multi" attribute turned on, so it makes sense for
libvirt to turn it on during PCI address assignment. Setting multi
then assures that the new setting is stored in the config (so it will
be used next time the domain is started), preventing any potential
problems in the case that a future change in the configuration
eliminates the devices on all non-0 functions (multi will still be set
for function 0 even though it is the only function in use on the slot,
which has no useful purpose, but also doesn't cause any problems).

(NB: If we were to instead just decide on the setting for
multifunction at runtime, a later removal of the non-0 functions of a
slot would result in a silent change in the guest ABI for the
remaining device on function 0 (although it may seem like an
inconsequential guest ABI change, it *is* a guest ABI change to turn
off the multi bit).)
2017-01-11 04:42:08 -05:00
Laine Stump
3c1a0fc27d conf: new function virDomainPCIAddressSetAllMulti()
This utility function iterates through all devices looking for any
with a PCI address that has function != 0 (which implies that multiple
functions are in use on that slot), then uses an inner iterator to
find the device that's on function 0 of that same slot and sets the
"multi" in its virDomainDeviceInfo (as long as it hasn't already been
set explicitly by someone who presumably has better information than
we do).

It isn't yet called from anywhere, so will have no functional effect.
2017-01-11 04:40:24 -05:00
Laine Stump
66e0b08d34 conf: start search for next unused PCI address at same slot as previous find
There is a very slight time advantage to beginning the search for the
next unused PCI address at the slot *after* the previous find (which
is now used), but if we do that, we will miss allocating the other
functions of the same slot (when we implement a
VIR_PCI_CONNECT_AGGREGATE_SLOT flag to support that).
2017-01-11 04:39:08 -05:00
Laine Stump
99bf66f3fa conf: eliminate repetitive code in virDomainPCIAddressGetNextSlot()
virDomainPCIAddressGetNextSlot() starts searching from the last
allocated address and goes to the end of all the buses, then goes back
to the first bus and searches from there up to the starting point (in
case any address has been freed since the last time an address was
allocated. The loops for these two are almost, but not exactly, the
same, so they have remained as separate loops with the same code
inside the loop. To lessen maintenance headaches, the identical code
has been moved out into the function
virDomainPCIAddressFindUnusedFunctionOnBus(), which is called in place
of the loop contents.
2017-01-11 04:38:04 -05:00
Laine Stump
9ff9d9f5a9 conf: eliminate concept of "reserveEntireSlot"
setting reserveEntireSlot really accomplishes nothing - instead of
going to the trouble of computing the value for reserveEntireSlot and
then possibly setting *all* functions of the slot as in-use, we can
just set the in-use bit only for the specific function being used by a
device.  Later we will know from the context (the PCI connect flags,
and whether we are reserving a specific address or asking for "the
next available") whether or not it is okay to allocate other functions
on the same slot.

Although it's not used yet, we allow specifying "-1" for the function
number when looking for the "next available slot" - this is going to
end up meaning "return the lowest available function in the slot, but
since we currently only provide a function from an otherwise unused
slot, "-1" ends up meaning "0".
2017-01-11 04:36:34 -05:00
Laine Stump
9838cad9cd conf: use struct instead of int for each slot in virDomainPCIAddressBus
When keeping track of which functions of which slots are allocated, we
will need to have more information than just the current bitmap with a
bit for each function that is currently stored for each slot in a
virDomainPCIAddressBus. To prepare for adding more per-slot info, this
patch changes "uint8_t slots" into "virDomainPCIAddressSlot slot", which
currently has a single member named "functions" that serves the same
purpose previously served directly by "slots".
2017-01-11 04:29:48 -05:00
Cédric Bosdonnat
a30b08b717 libxl: define a per-domain logger.
libxl doesn't provide a way to write one log for each domain. Thus
we need to demux the messages. If our logger doesn't know to which
domain to attribute a message, then it will write it to the default
log file.

Starting with Xen 4.9 (commit f9858025 and following), libxl will
write the domain ID in an easy to grab manner. The logger introduced
by this commit will use it to demux the libxl log messages.

Thanks to the default log file, this logger will also work with older
versions of Xen.
2017-01-11 09:32:47 +01:00
Dawid Zamirski
73c6f16baf vbox: consolidate vbox IID structures.
* remove _vboxIID_v2_x and _vboxIID_v3_x structs and repalce with one
  _vboxIID as all supprted vbox versions have the same IID structure.
* remove vboxIIDUnion that was used to abstract version depended IID
  differences.
* remove IID_MEMBER macro and use the new vboxIID directly.
2017-01-10 19:20:08 -05:00
Dawid Zamirski
3628891789 vbox: fix _displayTakeScreenShotPNGToArray
This function was not implemented for vbox 5+ which removed
TakeScreenShotPNGToArray but provides TakeScreenShotToArray with
BitmapFormat_PNG argument which is the same thing.
2017-01-10 19:20:07 -05:00
Dawid Zamirski
5a5c6de3a3 vbox: IVRDxServer to IVRDEServer.
The IVRDxServer was used because vbox < 4 used to have IVRDPServer
whereas vbox >= 4 has IVRDEServer. Now that support for legacy
versions is being removed, we can use IVRDEServer.
2017-01-10 19:20:06 -05:00
Dawid Zamirski
f2f70c21d0 vbox: remove code dealing with oldMediumInterface
* removed oldMediumInterface flag and related code that was used for
  vbox 2.x
* remove accelerate2DVideo and networkRemoveInterface flags which were
  also conditionals for handling legacy vbox versions.
2017-01-10 19:20:06 -05:00
Dawid Zamirski
1d963578e8 vbox: remove domain events support.
this was implemented only for vbox 3 series and was mostly stubs
anyway.
2017-01-10 19:20:06 -05:00
Dawid Zamirski
374422ea1c vbox: remove getMachineForSession flag.
* the getMachineForSession is always true for 4.0+. This also means that
  checkflag argument in openSessionForMachine no longer has any meaning
  because it was or'ed with getMachineForSession (always true)
* remove supportScreenshot flag - vbox 4.0+ supports it
* remove detachDevicesExplicitly flag only relevant for < 4.0
2017-01-10 19:19:49 -05:00
Dawid Zamirski
d7f369b571 vbox: do not use IHardDisk anymore.
VirtualBox 4.0+ uses IMedium and IHardDisk is no longer used, so

* remove typef IMedium IHardDisk
* merge UIHardDisk into UIMedium
* update all references accordingly
2017-01-10 19:19:49 -05:00
Dawid Zamirski
c7c286c6bd vbox: remove _vboxAttachDrivesOld
and fold vboxAttachDrivesNew into vboxAttachDrives
2017-01-10 19:19:49 -05:00
Dawid Zamirski
c8d7e90fd6 vbox: remove code for old API versions.
This removes most of the code wrapped in VBOX_API_VERSION < 4000000
preprocessor checks. Those are the ones that can be safely removed
without needing to update driver code to accomodate it.
2017-01-10 19:19:46 -05:00
Dawid Zamirski
655a99f166 vbox: remove calls to *InstallUniformedAPI macros.
That is, for versions older than 4.0. Also do not try to include
headers for those old versions.
2017-01-10 19:14:53 -05:00
Dawid Zamirski
7f10ac33e9 vbox: remove SDK header files for vbox 3 and older.
* delete SDK header files for vbox older than 4.0
* delete .c files for vbox older than 4.0
* update vbox_XPCOMCGlue to use oldest supported header file, that is 4.0
  going forward.
* remove deleted files from Makefile.am
2017-01-10 19:14:33 -05:00
Michal Privoznik
3027bacf95 virSecuritySELinuxSetFileconHelper: Fix build with broken selinux.h
There are still some systems out there that have broken
setfilecon*() prototypes. Instead of taking 'const char *tcon' it
is taking 'char *tcon'. The function should just set the context,
not modify it.

We had been bitten with this problem before which resulted in
292d3f2d and subsequently b109c09765. However, with one my latest
commits (4674fc6afd) I've changed the type of @tcon variable to
'const char *' which results in build failure on the systems from
above.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-10 19:23:49 +01:00
Michal Privoznik
269589146c qemu_domain: Move qemuDomainGetPreservedMounts
This function is used only from code compiled on Linux. Therefore
on non-Linux platforms it triggers compilation error:

../../src/qemu/qemu_domain.c:209:1: error: unused function 'qemuDomainGetPreservedMounts' [-Werror,-Wunused-function]

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-10 19:23:49 +01:00
Peter Krempa
b469853812 qemu: blockjob: Fix locking of block copy/active block commit
For the blockjobs, where libvirt is able to track the state internally
we can fix locking of images we can remove the appropriate locks.

Also when doing a pivoting operation we should not acquire the lock on
any of those images since both are actually locked already.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1302168
2017-01-10 19:12:19 +01:00
Peter Krempa
f61e40610d qemu: snapshot: Properly handle image locking
Images that became the backing chain of the current image due to the
snapshot need to be unlocked in the lock manager. Also if qemu was
paused during the snapshot the current top level images need to be
released until qemu is resumed so that they can be acquired properly.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1191901
2017-01-10 19:12:19 +01:00
Peter Krempa
cbb4d229de qemu: snapshot: Refactor snapshot rollback on failure
The code at first changed the definition and then rolled it back in case
of failure. This was ridiculous. Refactor the code so that the image in
the definition is changed only when the snapshot is successful.

The refactor will also simplify further fix of image locking when doing
snapshots.
2017-01-10 19:12:19 +01:00
Peter Krempa
7456c4f5f0 qemu: snapshot: Don't redetect backing chain after snapshot
Libvirt is able to properly model what happens to the backing chain
after a snapshot so there's no real need to redetect the data.
Additionally with the _REUSE_EXT flag this might end up in redetecting
wrong data if the user puts wrong backing chain reference into the
snapshot image.
2017-01-10 19:12:19 +01:00
Jim Fehlig
a05e2570c9 libxl: implement virDomainGetMaxVcpus
The libxl driver already supports getting maximum vcpu count via
libxlDomainGetVcpusFlags, allowing to trivially implement
virDomainGetMaxVcpus.
2017-01-10 11:07:08 -07:00
John Ferlan
bdd371c5c5 storage: Fix storage_backend probing when PARTED not installed.
Commit id 'a48c674f' caused problems for systems without PARTED installed.

So move the PARTED probing code back to storage_backend_disk.c and create
a shim within storage_backend.c to call it if WITH_STORAGE_DISK is true;
otherwise, just return -1 with the error.
2017-01-10 10:20:17 -05:00
John Ferlan
cb38b6cbc7 storage: Validate the device formats at logical startup
At startup time, rather than blindly trusting the target devices are
still properly formatted, let's check to make sure the pool's target
devices are all properly formatted before attempting to start the pool.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2017-01-10 08:44:50 -05:00
John Ferlan
f573f84eb7 storage: Add overwrite flag checking for logical pool
https://bugzilla.redhat.com/show_bug.cgi?id=1373711

Add support and documentation for the [NO_]OVERWRITE flags for the
logical backend.

Update virsh.pod with a description of the process for usage of
the flags and building of the pool's volume group.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2017-01-10 08:44:50 -05:00
John Ferlan
d5cc5f8997 storage: Extract logical device initialize into a helper
Make the remaining code a bit cleaner.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2017-01-10 08:44:50 -05:00
John Ferlan
71a08b5a5a storage: Clean up logical pool devices on build failure
If the build fails, then we need to ensure that we've run pvremove
on any devices which we've run pvcreate on; otherwise, a subsequent
build could fail since running pvcreate twice on a device requires
special force arguments.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2017-01-10 08:44:50 -05:00
John Ferlan
a4cb4a74f9 storage: Adjust disk label found to match labels
Currently as long as the disk is formatted using a known parted format
type, the algorithm is happy to continue. However, that leaves a scenario
whereby a disk formatted using "pc98" could be used by a pool that's defined
using "dvh" (or vice versa). Alter the check to be match and different
and adjust the caller.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2017-01-10 08:44:50 -05:00
John Ferlan
a48c674fba storage: Move and rename disk backend label checking
Rather than have the Disk code having to use PARTED to determine if
there's something on the device, let's use the virStorageBackendDeviceProbe.
and only fallback to the PARTED probing if the BLKID code isn't built in.

This will also provide a mechanism for the other current caller (File
System Backend) to utilize a PARTED parsing algorithm in the event that
BLKID isn't built in to at least see if *something* exists on the disk
before blindly trying to use. The PARTED error checking will not find
file system types, but if there is a partition table set on the device,
it will at least cause a failure.

Move virStorageBackendDiskValidLabel and virStorageBackendDiskFindLabel
to storage_backend and rename/rework the code to fit the new model.

Update the virsh.pod description to provide a more generic description
of the process since we could now use either blkid or parted to find
data on the target device.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2017-01-10 08:44:50 -05:00
John Ferlan
a11fd69735 storage: For FS pool check for properly formatted target volume
Prior to starting up, let's be sure the target volume device is
formatted as we expect; otherwise, inhibit the start.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2017-01-10 08:44:50 -05:00
John Ferlan
19ced38f1c storage: Add writelabel bool for virStorageBackendDeviceProbe
It's possible that the API could be called from a startup path in
order to check whether the label on the device matches what our
format is. In order to handle that condition, add a 'writelabel'
boolean to the API in order to indicate whether a write or just
read is about to happen.

This alters two "error" conditions that would care about knowing.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2017-01-10 08:44:50 -05:00
John Ferlan
a22e1a0032 storage: Add partition type checks for BLKID probing
A device may be formatted using some sort of disk partition format type.
We can check that using the blkid_ API's as well - so alter the logic to
allow checking the device for both a filesystem and a disk partition.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2017-01-10 08:44:50 -05:00
John Ferlan
f23d4bbce3 storage: Fix implementation of no-overwrite for file system backend
https://bugzilla.redhat.com/show_bug.cgi?id=1363586

Commit id '27758859' introduced the "NO_OVERWRITE" flag check for
file system backends; however, the implementation, documentation,
and algorithm was inconsistent. For the "flag" description for the
API the flag was described as "Do not overwrite existing pool";
however, within the storage backend code the flag is described
as "it probes to determine if filesystem already exists on the
target device, renurning an error if exists".

The code itself was implemented using the paradigm to set up the
superblock probe by creating a filter that would cause the code
to only search for the provided format type. If that type wasn't
found, then the algorithm would return success allowing the caller
to format the device. If the format type already existed on the
device, then the code would fail indicating that the a filesystem
of the same type existed on the device.

The result is that if someone had a file system of one type on the
device, it was possible to overwrite it if a different format type
was specified in updated XML effectively trashing whatever was on
the device already.

This patch alters what NO_OVERWRITE does for a file system backend
to be more realistic and consistent with what should be expected when
the caller requests to not overwrite the data on the disk.

Rather than filter results based on the expected format type, the
code will allow success/failure be determined solely on whether the
blkid_do_probe calls finds some known format on the device. This
adjustment also allows removal of the virStoragePoolProbeResult
enum that was under utilized.

If it does find a formatted file system different errors will be
generated indicating a file system of a specific type already exists
or a file system of some other type already exists.

In the original virsh support commit id 'ddcd5674', the description
for '--no-overwrite' within the 'pool-build' command help output
has an ambiguous "of this type" included in the short description.
Compared to the longer description within the "Build a given pool."
section of the virsh.pod file it's more apparent that the meaning
of this flag would cause failure if a probe of the target already
has a filesystem.

So this patch also modifies the short description to just be the
antecedent of the 'overwrite' flag, which matches the API description.
This patch also modifies the grammar in virsh.pod for no-overwrite
as well as reworking the paragraph formats to make it easier to read.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2017-01-10 08:44:50 -05:00
John Ferlan
553d21da6c storage: Introduce virStorageBackendDeviceIsEmpty
Rename virStorageBackendFileSystemProbe and to virStorageBackendBLKIDFindFS
and move to the more common storage_backend module.

Create a shim virStorageBackendDeviceIsEmpty which will make the call
to the virStorageBackendBLKIDFindFS and check the return value.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2017-01-10 08:44:50 -05:00
Michal Privoznik
406e390962 qemu: Drop qemuDomainDeleteNamespace
After previous commits, this function is no longer needed.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-10 13:04:57 +01:00
Michal Privoznik
5d198c2b2c qemuDomainCreateNamespace: move mkdir to qemuDomainBuildNamespace
Again, there is no need to create /var/lib/libvirt/$domain.*
directories in CreateNamespace(). It is sufficient to create them
as soon as we need them which is in BuildNamespace. This way we
don't leave them around for the whole lifetime of domain.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-10 13:04:57 +01:00
Michal Privoznik
5d30057695 qemuDomainGetPreservedMounts: Do not special case /dev
The c1140eb9e got me thinking. We don't want to special case /dev
in qemuDomainGetPreservedMounts(), but in all other places in the
code we special case it anyway. I mean,
/var/run/libvirt/$domain.dev path is constructed separately just
so that it is not constructed here. It makes only a little sense
(if any at all).

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-10 13:04:57 +01:00
Michal Privoznik
40ebbf72d5 qemuDomainCreateNamespace: s/unlink/rmdir/
If something goes wrong in this function we try a rollback. That
is unlink all the directories we created earlier. For some weird
reason unlink() was called instead of rmdir().

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-10 13:04:57 +01:00
Michal Privoznik
095f042ed6 qemu: Use transactions from security driver
So far if qemu is spawned under separate mount namespace in order
to relabel everything it needs an access to the security driver
to run in that namespace too. This has a very nasty down side -
it is being run in a separate process, so any internal state
transition is NOT reflected in the daemon. This can lead to many
sleepless nights. Therefore, use the transaction APIs so that
libvirt developers can sleep tight again.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-10 13:04:11 +01:00
Michal Privoznik
4674fc6afd security_selinux: Implement transaction APIs
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-10 12:50:00 +01:00
Michal Privoznik
67232478db security_dac: Implement transaction APIs
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-10 12:50:00 +01:00
Michal Privoznik
95576b4df0 security driver: Introduce transaction APIs
With our new qemu namespace code in place, the relabelling of
devices is done not as good is it could: a child process is
spawned, it enters the mount namespace of the qemu process and
then runs desired API of the security driver.

Problem with this approach is that internal state transition of
the security driver done in the child process is not reflected in
the parent process. While currently it wouldn't matter that much,
it is fairly easy to forget about that. We should take the extra
step now while this limitation is still fresh in our minds.

Three new APIs are introduced here:
  virSecurityManagerTransactionStart()
  virSecurityManagerTransactionCommit()
  virSecurityManagerTransactionAbort()

The Start() is going to be used to let security driver know that
we are starting a new transaction. During a transaction no
security labels are actually touched, but rather recorded and
only at Commit() phase they are actually updated. Should
something go wrong Abort() aborts the transaction freeing up all
memory allocated by transaction.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-10 12:49:59 +01:00
Michal Privoznik
39779eb195 security_dac: Resolve virSecurityDACSetOwnershipInternal const correctness
The code at the very bottom of the DAC secdriver that calls
chown() should be fine with read-only data. If something needs to
be prepared it should have been done beforehand.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-10 12:49:59 +01:00
Andrea Bolognani
1d8454639f qemu: Use virtio-pci by default for mach-virt guests
virtio-pci is the way forward for aarch64 guests: it's faster
and less alien to people coming from other architectures.
Now that guest support is finally getting there (Fedora 24,
CentOS 7.3, Ubuntu 16.04 and Debian testing all support
virtio-pci out of the box), we'd like to start using it by
default instead of virtio-mmio.

Users and applications can already opt-in by explicitly using

  <address type='pci'/>

inside the relevant elements, but that's kind of cumbersome and
requires all users and management applications to adapt, which
we'd really like to avoid.

What we can do instead is use virtio-mmio only if the guest
already has at least one virtio-mmio device, and use virtio-pci
in all other situations.

That means existing virtio-mmio guests will keep using the old
addressing scheme, and new guests will automatically be created
using virtio-pci instead. Users can still override the default
in either direction.

Existing tests such as aarch64-aavmf-virtio-mmio and
aarch64-virtio-pci-default already cover all possible
scenarios, so no additions to the test suites are necessary.
2017-01-10 12:33:53 +01:00
Peter Krempa
a946ea1a33 qemu: setvcpus: Properly coldplug vcpus when hotpluggable vcpus are present
When coldplugging vcpus to a VM that already has a few hotpluggable
vcpus the code might generate invalid configuration as
non-hotpluggable cpus need to be clustered starting from vcpu 0.

This fix forces the added vcpus to be hotpluggable in such case.

Fixes a corner case described in:
https://bugzilla.redhat.com/show_bug.cgi?id=1370357
2017-01-10 10:47:06 +01:00
Nitesh Konkar
ae16c95f1b perf: Add cache_l1d perf event support
This patch adds support and documentation for
a generalized hardware cache event called cache_l1d
perf event.

Signed-off-by: Nitesh Konkar <nitkon12@linux.vnet.ibm.com>
2017-01-09 18:15:31 -05:00
Jiri Denemark
dc2bfdc815 Update remote_protocol-structs for new events
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2017-01-09 19:53:55 +01:00
Daniel P. Berrange
42241208d9 secret: add support for value change events
Emit an event whenever a secret value changes

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2017-01-09 16:42:04 +00:00
Daniel P. Berrange
06fcee63cf secret: add support for lifecycle events
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2017-01-09 15:53:49 +00:00
Daniel P. Berrange
3b7bd6e540 remote: implement secret lifecycle event APIs
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2017-01-09 15:53:49 +00:00
Daniel P. Berrange
bd300b7194 conf: simplify internal virSecretDef handling of usage
The public virSecret object has a single "usage_id" field
but the virSecretDef object has a different 'char *' field
for each usage type, but the code all assumes every usage
type has a corresponding single string. Get rid of the
pointless union in virSecretDef and just use "usage_id"
everywhere. This doesn't impact public XML format, only
the internal handling.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2017-01-09 15:53:49 +00:00
Daniel P. Berrange
df740caf54 conf: add secret event handling
Add helper APIs / objects for managing secret events

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2017-01-09 15:53:48 +00:00
Daniel P. Berrange
34fd3caabf Introduce secret lifecycle event APIs
Add public APIs to allow applications to watch for define and
undefine of secret objects.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2017-01-09 15:53:48 +00:00
Daniel P. Berrange
89283c138e remote: fix struct for device removal failed event
The handler for the device removal failed event was using
the struct for the device added event. Fortunately the
layout was the same, so this was harmless.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2017-01-09 15:53:46 +00:00
Daniel P. Berrange
c50070173d Add domain event for metadata changes
When changing the metadata via virDomainSetMetadata, we now
emit an event to notify the app of changes. This is useful
when co-ordinating different applications read/write of
custom metadata.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2017-01-09 15:53:00 +00:00
Daniel P. Berrange
f1e48297cf cgroup: add virCgroupAddMachineTask stub for win32
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2017-01-09 14:27:34 +00:00
Daniel P. Berrange
44f79a0bd0 lxc: ensure libvirt_lxc and qemu-nbd move into systemd machine slice
Currently when spawning containers with systemd, the container PID 1
will get moved into the systemd machine slice. Libvirt then manually
moves the libvirt_lxc and qemu-nbd processes into the cgroups associated
with the slice, but skips the systemd controller cgroup. This means that
from systemd's POV, libvirt_lxc and qemu-nbd are still part of the
libvirtd.service unit.

On systemctl daemon-reload, it will notice that libvirt_lxc & qemu-nbd
are in the libvirtd.service unit for the systemd controller, but in the
machine cgroups for resources. Systemd will thus move them back into
the libvirtd.service resource cgroups next time libvirtd is restarted.
This causes libvirtd to kill off the container due to incorrect cgroup
placement.

The solution is to ensure that when moving libvirt_lxc & qemu-nbd, we
also move the systemd cgroup controller placement. Normally this is
not something we ever want todo, but this is a special case as we are
intentionally wanting to move them to a different systemd unit.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2017-01-09 12:46:52 +00:00
Michal Privoznik
65fb0b79f7 security_selinux: s/virSecuritySELinuxSecurity/virSecuritySELinux/
It doesn't make much sense to have two different prefix for
functions within the same driver.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-09 09:17:42 +01:00
Michal Privoznik
981d979047 virutil: Provide non-linux impl for virGetFCHostNameByFabricWWN
Currently, there's only linux implementation for
virGetFCHostNameByFabricWWN(). Since the symbol is exported in
our private symbols we ought to have implementation for other
platforms too. This also triggers compilation error on FreeBSD:

../src/.libs/libvirt_driver_storage_impl.a(libvirt_driver_storage_impl_la-storage_backend_scsi.o): In function `createVport':
/usr/home/jenkins/libvirt-master/systems/libvirt-freebsd/build/src/../../src/storage/storage_backend_scsi.c:740: undefined reference to `virGetFCHostNameByFabricWWN'

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-09 09:13:41 +01:00
Maxim Nestratov
af78cb0486 qemu: Allow to specify pit timer tick policy=discard
Separate out the "policy=discard" into it's own specific
qemu command line.

We'll rename "kvm-pit-device" test case to be "kvm-pit-discard"
since it has the syntax we'd be using.

Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
2017-01-06 18:27:06 -05:00
Maxim Nestratov
ef5c8bb412 qemu: Fix pit timer tick policy=delay
By a mistake, for the VIR_DOMAIN_TIMER_TICKPOLICY_DELAY qemu
command line creation, 'discard' was used instead of 'delay'
in commit id '1569fa14'.

Test "kvm-pit-delay" is fixed accordingly to show the correct
option being generated.

Remove the (now) redundant kvm-pit-device tests. As it turns
out there is no need to specify both QEMU_CAPS_NO_KVM_PIT and
QEMU_CAPS_KVM_PIT_TICK_POLICY since they are mutually exclusive
and "kvm-pit-device" becomes just the same as "kvm-pit-delay".

Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
2017-01-06 18:27:06 -05:00
John Ferlan
78be2e8b74 iscsi: Add parent wwnn/wwpn or fabric capability for createVport
https://bugzilla.redhat.com/show_bug.cgi?id=1349696

As it turns out using only the 'parent' to achieve the goal of a
consistent vHBA parent has issues with reboots where the scsi_hostX
parent could change to scsi_hostY causing either failure to create
the vHBA or usage of the wrong HBA for our vHBA.

Thus add the ability to search for the "parent" by the parent wwnn/
wwpn values or just a fabric_name if someone only cares to ensure
usage of the same SAN for the vHBA.
2017-01-06 17:15:34 -05:00
John Ferlan
8366cb0a20 util: Introduce virGetFCHostNameByFabricWWN
Create a utility routine in order to read the scsi_host fabric_name files
looking for a match to a passed fabric_name
2017-01-06 17:15:34 -05:00
John Ferlan
bb74a7ffeb conf: Add more fchost search fields for storage pool vHBA creation
Add new fields to the fchost structure to allow creation of a vHBA via
the storage pool when a parent_wwnn/parent_wwpn or parent_fabric_wwn is
supplied in the storage pool XML.
2017-01-06 17:15:34 -05:00
John Ferlan
2b13361bc7 nodedev: Add the ability to create vHBA by parent wwnn/wwpn or fabric_wwn
https://bugzilla.redhat.com/show_bug.cgi?id=1349696

When creating a vHBA, the process is to feed XML to nodeDeviceCreateXML
that lists the <parent> scsi_hostX to use to create the vHBA. However,
between reboots, it's possible that the <parent> changes its scsi_hostX
to scsi_hostY and saved XML to perform the creation will either fail or
create a vHBA using the wrong parent.

So add the ability to provide "wwnn" and "wwpn" or "fabric_wwn" to
the <parent> instead of a name of the scsi_hostN that is the parent.
The allowed XML will thus be:

  <parent>scsi_host3</parent>  (current)

or

  <parent wwnn='$WWNN' wwpn='$WWPN'/>

or

  <parent fabric_wwn='$WWNN'/>

Using the wwnn/wwpn or fabric_wwn ensures the same 'scsi_hostN' is
selected between hardware reconfigs or host reboots. The fabric_wwn
Using the wwnn/wwpn pair will provide the most specific search option,
while fabric_wwn will at least ensure usage of the same SAN, but maybe
not the same scsi_hostN.

This patch will add the new fields to the nodedev.rng for input purposes
only since the input XML is essentially thrown away, no need to Format
the values since they'd already be printed as part of the scsi_host
data block.

New API virNodeDeviceGetParentHostByWWNs will take the parent "wwnn" and
"wwpn" in order to search the list of devices for matching capability
data fields wwnn and wwpn.

New API virNodeDeviceGetParentHostByFabricWWN will take the parent "fabric_wwn"
in order to search the list of devices for matching capability data field
fabric_wwn.
2017-01-06 17:14:12 -05:00
Collin L. Walling
d47db7b16d qemu: command: Support new cpu feature argument syntax
Qemu has abandoned the +/-feature syntax in favor of key=value. Some
architectures (s390) do not support +/-feature. So we update libvirt to handle
both formats.

If we detect a sufficiently new Qemu (indicated by support for qmp
query-cpu-model-expansion) we use key=value else we fall back to +/-feature.

Signed-off-by: Collin L. Walling <walling@linux.vnet.ibm.com>
Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
2017-01-06 12:24:57 +01:00
Jiri Denemark
5d513d4659 qemu-caps: Get host model directly from Qemu when available
When qmp query-cpu-model-expansion is available probe Qemu for its view of the
host model. In kvm environments this can provide a more complete view of the
host model because features supported by Qemu and Kvm can be considered.

Signed-off-by: Collin L. Walling <walling@linux.vnet.ibm.com>
Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
2017-01-06 12:24:57 +01:00
Collin L. Walling
fab9d6e1a9 qemu: qmp query-cpu-model-expansion command
query-cpu-model-expansion is used to get a list of features for a given cpu
model name or to get the model and features of the host hardware/environment
as seen by Qemu/kvm.

Signed-off-by: Collin L. Walling <walling@linux.vnet.ibm.com>
Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
2017-01-06 12:24:57 +01:00
Jason J. Herne
8f77821522 s390-cpu: Remove nodeData and decode
On s390, the host's features are heavily influenced by not only the host
hardware but also by hardware microcode level, host OS version, qemu
version and kvm version. In this environment it does not make sense to
attempt to report exact host details.

Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
Acked-by: Jiri Denemark <jdenemar@redhat.com>
2017-01-06 12:24:56 +01:00
Jason J. Herne
79d72011ee s390: Cpu driver support for update and compare
Implement compare for s390. Required to test the guest against the host for
guest cpu model runnability checking. We always return IDENTICAL to bypass
Libvirt's checking. s390 will rely on Qemu to perform the runnability checking.

Implement update for s390. required to support use of cpu "host-model" mode.

Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
Acked-by: Jiri Denemark <jdenemar@redhat.com>
2017-01-06 12:24:56 +01:00
Martin Kletzander
c1140eb9ed qemu: Remove /dev mount info properly
Just so it doesn't bite us in the future, even though it's unlikely.

And fix the comment above it as well.  Commit e08ee7cd34 took the
info from the function it's calling, but that was lie itself in the
first place.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2017-01-05 16:24:55 +01:00
Martin Kletzander
08ad8f9fe2 util: Don't lie in virFileGetMount*Subtree's docstrings
The resulting function virFileGetMountSubtreeImpl() just uses
virStringSortRevCompare or virStringSortCompare which uses strcmp().

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2017-01-05 16:23:25 +01:00
Michal Privoznik
e08ee7cd34 qemuDomainGetPreservedMounts: Fetch list of /dev/* mounts dynamically
With my namespace patches, we are spawning qemu in its own
namespace so that we can manage /dev entries ourselves. However,
some filesystems mounted under /dev needs to be preserved in
order to be shared with the parent namespace (e.g. /dev/pts).
Currently, the list of mount points to preserve is hardcoded
which ain't right - on some systems there might be less or more
items under real /dev that on our list. The solution is to parse
/proc/mounts and fetch the list from there.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-05 16:00:20 +01:00
Michal Privoznik
486fd7f700 internal: Simplify STREQ_NULLABLE
Our STREQ_NULLABLE and STRNEQ_NULLABLE macros are too
complicated. This was a result of some broken version of gcc.
However, that is long gone and therefore we can simplify the
macros.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-05 14:40:15 +01:00
Michal Privoznik
6de3f11637 qemuProcessLaunch: fix indentation
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-05 14:38:45 +01:00
Wangjing (King, Euler)
3afaae4984 qemu: snapshot: restart CPUs when recover from interrupted snapshot job
If we restart libvirtd while VM was doing external memory snapshot, VM's
state be updated to paused as a result of running a migration-to-file
operation, and then VM will be left as paused state. In this case we must
restart the VM's CPUs to resume it.

Signed-off-by: Wang King <king.wang@huawei.com>
2017-01-05 10:47:03 +01:00
John Ferlan
1d0fde7ee1 util: Remove need for extra VIR_FREE's in virGetFCHostNameByWWN
Rather than extraneous VIR_FREE's depending on where we are in the code,
move them to the top of the loop and in the cleanup path.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2017-01-04 17:09:59 -05:00
John Ferlan
9fdc8c4269 scsi: Converge more createVport checks
Remove duplicated code - make one simple path through

Signed-off-by: John Ferlan <jferlan@redhat.com>
2017-01-04 17:09:59 -05:00
John Ferlan
476ecf2a2a scsi: Change order of checks in createVport
Move the check for an already existing vHBA to the top of the function.
No sense in first decoding a provided parent if the next thing we're going
to do is fail if a provided wwnn/wwpn already exists.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2017-01-04 17:09:59 -05:00
John Ferlan
79ab093518 scsi: Clean up createVport exit paths
Use the ret = -1, goto cleanup, etc. rather than current hodgepodge.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2017-01-04 17:09:59 -05:00
John Ferlan
8b629a3c01 nodedev: Add ability to find a vport capable vHBA
If a <parent> is not supplied in the XML used to create a non-persistent
vHBA, then instead of failing, let's try to find a "vports" capable node
device and use that.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2017-01-04 17:09:59 -05:00
John Ferlan
8f3054a0f8 nodedev: Create helpers to search for vport capable nodedevs
Extract out code from virNodeDeviceGetParentHost into helpers - it's
going to be reused in upcoming patches to search on more fields

Create virNodeDeviceFindVPORTCapDef in order to return a virNodeDevCapsDefPtr
of the VPORT_OPS and virNodeDeviceFindFCParentHost to use the function and
generate an error message if the device doesn't have the capability.

Also clean up the processing in virNodeDeviceGetParentHost to remove
need for goto's.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2017-01-04 17:09:59 -05:00
Peter Krempa
2e86c0816f qemu: snapshot: Resume VM after live snapshot
Commit 4b951d1e38 missed the fact that the
VM needs to be resumed after a live external checkpoint (memory
snapshot) where the cpus would be paused by the migration rather than
libvirt.
2017-01-04 16:50:18 +01:00
Michal Privoznik
dd78da09b0 qemuDomainCreateDevice: Be more careful about device path
Again, not something that I'd hit, but there is a chance in
theory that this might bite us. Currently the way we decide
whether or not to create /dev entry for a device is by marching
first four characters of path with "/dev". This might be not
enough. Just imagine somebody has a disk image stored under
"/devil/path/to/disk". We ought to be matching against "/dev/".

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-04 15:36:42 +01:00
Michal Privoznik
ce01a2b11c qemuDomainAttachDeviceMknodHelper: Don't unlink() so often
Not that I'd encounter any bug here, but the code doesn't look
100% correct. Imagine, somebody is trying to attach a device to a
domain, and the device's /dev entry already exists in the qemu
namespace. This is handled gracefully and the control continues
with setting up ACLs and calling security manager to set up
labels. Now, if any of these steps fail, control jump on the
'cleanup' label and unlink() the file straight away. Even when it
was not us who created the file in the first place. This can be
possibly dangerous.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-04 15:36:42 +01:00
Michal Privoznik
3aae99fe71 qemu: Handle EEXIST gracefully in qemuDomainCreateDevice
https://bugzilla.redhat.com/show_bug.cgi?id=1406837

Imagine you have a domain configured in such way that you are
assigning two PCI devices that fall into the same IOMMU group.
With mount namespace enabled what happens is that for the first
PCI device corresponding /dev/vfio/X entry is created and when
the code tries to do the same for the second mknod() fails as
/dev/vfio/X already exists:

2016-12-21 14:40:45.648+0000: 24681: error :
qemuProcessReportLogError:1792 : internal error: Process exited
prior to exec: libvirt: QEMU Driver error : Failed to make device
/var/run/libvirt/qemu/windoze.dev//vfio/22: File exists

Worse, by default there are some devices that are created in the
namespace regardless of domain configuration (e.g. /dev/null,
/dev/urandom, etc.). If one of them is set as backend for some
guest device (e.g. rng, chardev, etc.) it's the same story as
described above.

Weirdly, in attach code this is already handled.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-04 15:36:42 +01:00
Andrea Bolognani
f0af48f0dd util: Fix syntax-check
Commit b9cc24839b introduced a new #define but neglected
to format it properly, thus breaking syntax-check.
2017-01-04 12:47:01 +01:00
Andrea Bolognani
b9cc24839b util: Turn virFirewallAddRule() into a macro
Clang 3.9 refuses to compile the existing code with the
following error:

  util/virfirewall.c:425:20: error: passing an object that undergoes
                             default argument promotion to 'va_start'
                             has undefined behavior [-Werror,-Wvarargs]
      va_start(args, layer);
                     ^
  util/virfirewall.c:420:37: note: parameter of type 'virFirewallLayer'
                             is declared here
                     virFirewallLayer layer,
                                      ^

This happens because 'layer' is of type virFirewallLayer, which
is an enum type and not a standard type such as eg. void* or int.

To solve the issue, turn virFirewallAddRule() from a very thin
wrapper around virFirewallAddRuleFullV() to a macro that expands
to a call to virFirewallAddRuleFull() - itself a very thin wrapper
around the aforementioned virFirewallAddRuleFullV() - with no loss
of functionality or type safety.
2017-01-04 11:14:56 +01:00
John Ferlan
7f7d990483 qemu: Don't assume secret provided for LUKS encryption
https://bugzilla.redhat.com/show_bug.cgi?id=1405269

If a secret was not provided for what was determined to be a LUKS
encrypted disk (during virStorageFileGetMetadata processing when
called from qemuDomainDetermineDiskChain as a result of hotplug
attach qemuDomainAttachDeviceDiskLive), then do not attempt to
look it up (avoiding a libvirtd crash) and do not alter the format
to "luks" when adding the disk; otherwise, the device_add would
fail with a message such as:

   "unable to execute QEMU command 'device_add': Property 'scsi-hd.drive'
    can't find value 'drive-scsi0-0-0-0'"

because of assumptions that when the format=luks that libvirt would have
provided the secret to decrypt the volume.

Access to unlock the volume will thus be left to the application.
2017-01-03 12:59:18 -05:00
Michal Privoznik
a6f05c5a81 networkxml2conftest: s/lo/lo0/ on non-Linux
After 478ddedc12 a bug is fixed where we wrongly presumed loopack
device name on non-Linux systems. It's lo0. However, the fix is
not reflected in the tests which are failing now.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-02 13:30:35 +01:00
Michal Privoznik
70b0a8e542 src: Build libvirt_nss.la iff WITH_NSS
If the nss module is disabled we don't need to build the
supplementary library for it either.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-02 13:25:42 +01:00
Michal Privoznik
5dc6169bc8 virmacmap: Don't use hash table dataFree callback
Due to nature of operations we do over the string list (more
precisely due to how virStringListRemove() works), it is not the
best idea to use dataFree callback. Problem is, on MAC address
remove, the string list remove function modifies the original
list in place. Then, virHashUpdateEntry() is called which frees
all the data stored in the list rendering @newMacsList point to
freed data.

==16002== Invalid read of size 8
==16002==    at 0x50BC083: virFree (viralloc.c:582)
==16002==    by 0x513DC39: virStringListFree (virstring.c:251)
==16002==    by 0x51089B4: virMacMapHashFree (virmacmap.c:67)
==16002==    by 0x50EF30B: virHashAddOrUpdateEntry (virhash.c:352)
==16002==    by 0x50EF4FD: virHashUpdateEntry (virhash.c:415)
==16002==    by 0x5108BED: virMacMapRemoveLocked (virmacmap.c:129)
==16002==    by 0x51092D5: virMacMapRemove (virmacmap.c:346)
==16002==    by 0x402F02: testMACRemove (virmacmaptest.c:107)
==16002==    by 0x403F15: virTestRun (testutils.c:180)
==16002==    by 0x4032C4: mymain (virmacmaptest.c:205)
==16002==    by 0x405A3B: virTestMain (testutils.c:992)
==16002==    by 0x403D87: main (virmacmaptest.c:237)
==16002==  Address 0xdd5a4d0 is 0 bytes inside a block of size 24 free'd
==16002==    at 0x4C2AD6F: realloc (vg_replace_malloc.c:693)
==16002==    by 0x50BB99B: virReallocN (viralloc.c:245)
==16002==    by 0x513DC0B: virStringListRemove (virstring.c:235)
==16002==    by 0x5108BA6: virMacMapRemoveLocked (virmacmap.c:124)
==16002==    by 0x51092D5: virMacMapRemove (virmacmap.c:346)
==16002==    by 0x402F02: testMACRemove (virmacmaptest.c:107)
==16002==    by 0x403F15: virTestRun (testutils.c:180)
==16002==    by 0x4032C4: mymain (virmacmaptest.c:205)
==16002==    by 0x405A3B: virTestMain (testutils.c:992)
==16002==    by 0x403D87: main (virmacmaptest.c:237)

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-02 13:05:34 +01:00
Michal Privoznik
806582a5d1 virmacmap: Fix variable handling
In virMacMapRemoveLocked() we have two variables: @macsList and
@newMacsList. Obviously, @newMacsList is supposed to hold pointer
to modified list but in fact it holds pointer to the old list.
It's confusing.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-02 13:05:34 +01:00
Maxim Nestratov
e4aa80dfde vz: get disks statistics for CTs
A CT disk statistics is reported with prefix "hdd" and we should use
it to extract data.

Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
2016-12-22 22:06:40 +03:00
Maxim Nestratov
7eda8369fc vz: set boot from disk for CT only when there is no root filesystem
Before, boot devices information for CTs was always empty and we
didn't indicate that containers can boot from disk.

Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
2016-12-22 22:06:39 +03:00
Maxim Nestratov
8c9252aa6d vz: report disks either as disks or filesystems depending on original xml
Virtuozzo SDK interface doesn't differ filesystems from disks and sees them as disks.
Before, we always mistakenly presented disks based on files as filesystems, which is
not completely correct. Now we are going to show either disks or filesystems depending
on a hint, which uses boot device section of VZ config. Though this information
doesn't change booting order of a CT, it is used by vz libvirt interface as a hint
for libvirt representation of disks. Since now, if we have filesystems in input xml,
then we add them to VZ booting devices list and rely on this information to show
corresponding libvirt xml.

Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
2016-12-22 22:06:39 +03:00
Maxim Nestratov
1abc8b3966 vz: don't add implicit devices for CTs
Implicit devices like controllers are confusing for CTs and
function virDomainDefAddImplicitDevices never intended to be called
for CTs.

Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
2016-12-22 22:06:39 +03:00
Maxim Nestratov
e485310ab2 vz: report "scsi" bus for disks when nothing was set explixitly
This is necessary to show CTs created out of libvirt correctly.

Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
2016-12-22 22:06:39 +03:00
Shivaprasad G Bhat
5f65c96e8d Allow virtio-console on PPC64
virQEMUCapsSupportsChardev existing checks returns true
for spapr-vty alone. Instead verify spapr-vty validity
and let the logic to return true for other device types
so that virtio-console passes.

The non-pseries machines dont have spapr-vio-bus. So, the
function always returned false for them before.

Fixes - https://bugzilla.redhat.com/show_bug.cgi?id=1257813

Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
2016-12-21 18:01:10 +01:00
Nikolay Shirokovskiy
9f08b76631 qemu: clean out unused migrate to unix 2016-12-21 16:24:59 +01:00
Pavel Hrdina
02957106a0 configure: move XenAPI driver check to its own file
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2016-12-21 15:39:41 +01:00
Pavel Hrdina
60af91ca85 m4/virt-devmapper: use LIBVIRT_CHECK_(PKG|LIB)
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2016-12-21 15:39:39 +01:00
Pavel Hrdina
9587319333 configure: move windows common check to its own file
This renames MSCOM_LIBS to WIN32_EXTRA_LIBS to make it consistent with
WIN32_EXTRA_CFLAGS.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2016-12-21 15:39:39 +01:00
Pavel Hrdina
aee0043bd7 configure: move with-driver-modules check to its own file
Rename DRIVER_MODULE_(LDFLAGS|LIBS|CFLAGS) to unify the naming.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2016-12-21 15:39:38 +01:00
Cédric Bosdonnat
9cae9c886b xen: add QED format test
Follow up of commit 340bb6b7 to add unit tests for the QED format
support. Also add missing QED case in xenFormatXLDisk()
2016-12-21 15:06:40 +01:00
John Ferlan
0c234889c4 storage: Introduce virStorageVolInfoFlags
https://bugzilla.redhat.com/show_bug.cgi?id=1332019

This function will essentially be a wrapper to virStorageVolInfo in order
to provide a mechanism to have the "physical" size of the volume returned
instead of the "allocation" size. This will provide similar capabilities to
the virDomainBlockInfo which can return both allocation and physical of a
domain storage volume.

NB: Since we're reusing the _virStorageVolInfo and not creating a new
_virStorageVolInfoFlags structure, we'll need to generate the rpc APIs
remoteStorageVolGetInfoFlags and remoteDispatchStorageVolGetInfoFlags
(although both were originally created from gendispatch.pl and then
just copied into daemon/remote.c and src/remote/remote_driver.c).

The new API will allow the usage of a VIR_STORAGE_VOL_GET_PHYSICAL flag
and will make the decision to return the physical or allocation value
into the allocation field.

In order to get that physical value, virStorageBackendUpdateVolTargetInfoFD
adds logic to fill in physical value matching logic in qemuStorageLimitsRefresh
used by virDomainBlockInfo when the domain is inactive.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2016-12-20 13:52:39 -05:00
John Ferlan
78661cb1f4 conf: Display <physical> in output of voldef
Although the virStorageBackendUpdateVolTargetInfo will update the
target.physical value, there is no way to provide that information
via the virStorageGetVolInfo API since it only returns the capacity
and allocation of a volume. So as described in commit id '0282ca45',
it should be possible to generate an output only <physical> value
for that purpose.

This patch generates the <physical> value in the volume XML output
for the sole purpose of being able to view/see the value to allow
someone to parse the XML in order to obtain the value.

Update the documentation to describe the output only nature.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2016-12-20 13:52:39 -05:00
John Ferlan
b9b1aa6392 qemu: Adjust qemuDomainGetBlockInfo data for sparse backed files
According to commit id '0282ca45a' the 'physical' value should
essentially be the last offset of the image or the host physical
size in bytes of the image container. However, commit id '15fa84ac'
refactored the GetBlockInfo to use the same returned data as the
GetStatsBlock API for an active domain. For the 'entry->physical'
that would end up being the "actual-size" as set through the
qemuMonitorJSONBlockStatsUpdateCapacityOne (commit '7b11f5e5').
Digging deeper into QEMU code one finds that actual_size is
filled in using the same algorithm as GetBlockInfo has used for
setting the 'allocation' field when the domain is inactive.

The difference in values is seen primarily in sparse raw files
and other container type files (such as qcow2), which will return
a smaller value via the stat API for 'st_blocks'. Additionally
for container files, the 'capacity' field (populated via the
QEMU "virtual-size" value) may be slightly different (smaller)
in order to accomodate the overhead for the container. For
sparse files, the state 'st_size' field is returned.

This patch thus alters the allocation and physical values for
sparse backed storage files to be more appropriate to the API
contract. The result for GetBlockInfo is the following:

 capacity: logical size in bytes of the image (how much storage
           the guest will see)
 allocation: host storage in bytes occupied by the image (such
             as highest allocated extent if there are no holes,
             similar to 'du')
 physical: host physical size in bytes of the image container
           (last offset, similar to 'ls')

NB: The GetStatsBlock API allows a different contract for the
values:

 "block.<num>.allocation" - offset of the highest written sector
                            as unsigned long long.
 "block.<num>.capacity" - logical size in bytes of the block device
                          backing image as unsigned long long.
 "block.<num>.physical" - physical size in bytes of the container
                          of the backing image as unsigned long long.
2016-12-20 12:56:44 -05:00
Marc Hartmayer
c07d1c1c4f conf: Detect misconfiguration between disk bus and disk address
This patch detects a misconfiguration between the disk bus type and disk
address type for controller based disk buses (SATA, SCSI, FDC and
IDE). The addresses of these bus types are all managed in common code so
it's possible to decide in common code whether the disk address and bus
type are compatible or not.

Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
2016-12-20 11:34:30 +01:00
Marc Hartmayer
fb2cd32c9a qemu: qemuDomainDiskChangeSupported: Add missing 'address' check
Disk->info is not live updatable so add a check for this. Otherwise
libvirt reports success even though no data was updated.

Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
2016-12-20 11:22:44 +01:00
Marc Hartmayer
804eccf8f7 conf: Make virDomainDeviceInfoAddressIsEqual() public
This function will be needed by the QEMU driver in an upcoming
patch. Additionally, removed a useless empty line.

Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
2016-12-20 11:22:44 +01:00
Boris Fiuczynski
dbeaa7e666 cgroup: reduce complexity of controller disabling
This patch reduces the complexity of the filtering algorithm in
virCgroupDetect by first correcting the controller mask and then
checking for potential co-mounts without any correlating
controller mask modifications.

If you agree that this patch removes complexity and improves
readability it could simply be squashed into the first patch
of this series.

Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
2016-12-20 11:18:09 +01:00
Boris Fiuczynski
dfcfe0bb9c cgroup: unavailable controller prevents controller disabling
The cgroup controller filtering in virCgroupDetect does not work
properly if the following conditions are met:
1) the host system does not have a cgroup controller which
libvirt requests (unavailable controller) and
2) libvirt is configured to disable a controller (disabled controller) and
3) the disabled controller is located before the unavailable controller
in virCgroupController.

As an example: The memory controller is unavailable and the cpuset
controller is configured to be disabled.
In this scenario trying to start a domain results in the error
error: Controller 'cpuset' is not wanted, but 'memory' is co-mounted: Invalid argument

This error occurs when virCgroupDetect is called with a valid parent group.
The resulting group created by virCgroupCopyMounts holds for cpuset and
memory controller empty mount points. The filtering of disabled controllers
checks for co-mounts by comparing the mount points. The cpuset controller
causes the filtering to occur before the memory controller is marked as to be
ignored by modifying the controller mask since it is unavailable.
Therefore the co-mount detection logic compares the cpuset and memory controller
mount points and since both are empty the memory controller is regarded
erroneously as being co-mounted.

Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-20 11:17:22 +01:00
Peter Krempa
9ab36bc233 locking: Fix documentation on how automatic sanlock leases are stored
s/MD5 checkout/MD5 hash/
2016-12-19 17:28:41 +01:00
Peter Krempa
8551d39f4f qemu: blockcopy: Save monitor error prior to calling into lock manager
The error would be overwritten otherwise producing a meaningless error
message.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1302171
2016-12-19 17:28:41 +01:00
Jiri Denemark
3d98acc9e3 network: Add support for local PTR domains
Similarly to localOnly DNS domain, localPtr attribute can be used to
tell the DNS server not to forward reverse lookups for unknown IPs which
belong to the virtual network.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-12-19 09:03:29 +01:00
Jiri Denemark
acd547dc95 util: Introduce virSocketAddrPTRDomain
The API creates PTR domain which corresponds to a given addr/prefix.
Both IPv4 and IPv6 addresses are supported, but the prefix must be
divisible by 8 for IPv4 and divisible by 4 for IPv6.

The generated PTR domain has the following format

IPv4: 1.2.3.4.in-addr.arpa
IPv6: 0.1.2.3.4.5.6.7.8.9.a.b.c.d.e.f.0.1.2.3.4.5.6.7.8.9.a.b.c.d.e.f.ip6.arpa

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-12-19 09:03:29 +01:00
Jiri Denemark
770b1d2b56 conf: Make virNetworkIPDefParseXML a little bit saner
Iterating over all child nodes when we only support one instance of each
child is pretty weird. And it would even cause memory leaks if more
than one <tftp> element was specified.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-12-19 09:03:29 +01:00
Peter Krempa
9e9305542e qemu: block copy: Forbid block copy to relative paths
Similarly to 29bb066915 forbid paths used with blockjobs to be relative.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1300177
2016-12-16 18:30:39 +01:00
Michal Privoznik
50b2a2375a virfile: Support bind mount only on linux
Other systems (despite having sys/mount.h) do not support bind
mounts.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-16 11:51:06 +00:00
Michal Privoznik
ab41ce7f4e qemu: Mark more namespace code linux-only
Some of the functions are not called on non-linux platforms
which makes them useless there.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-16 11:51:06 +00:00
Daniel P. Berrange
1d29c889ad Make use of PERF_COUNT_HW_REF_CPU_CYCLES conditional
The PERF_COUNT_HW_REF_CPU_CYCLES constant is not available
on all Linux distros libvirt targets, so its use must be
made conditional. Other constant have existed long enough
that we can assume they exist, as we don't support very
old distros like RHEL-5 any more.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-12-16 10:47:05 +00:00
Nitesh Konkar
71bbe65311 perf: add ref_cpu_cycles perf event support
This patch adds support and documentation for
the ref_cpu_cycles perf event.

Signed-off-by: Nitesh Konkar <nitkon12@linux.vnet.ibm.com>
2016-12-15 17:32:03 -05:00
Nitesh Konkar
9ae79400ff perf: add stalled_cycles_backend perf event support
This patch adds support and documentation for
the stalled_cycles_backend perf event.

Signed-off-by: Nitesh Konkar <nitkon12@linux.vnet.ibm.com>
2016-12-15 16:47:05 -05:00
Nitesh Konkar
060c159b08 perf: add stalled_cycles_frontend perf event support
This patch adds support and documentation
for the stalled_cycles_frontend perf event.

Signed-off-by: Nitesh Konkar <nitkon12@linux.vnet.ibm.com>
2016-12-15 16:47:05 -05:00
Nitesh Konkar
7d34731067 perf: add bus_cycles perf event support
This patch adds support and documentation
for the bus_cycles perf event.

Signed-off-by: Nitesh Konkar <nitkon12@linux.vnet.ibm.com>
2016-12-15 16:47:05 -05:00
Erik Skultety
1a38fbaa86 admin: Introduce virAdmConnectSetLoggingFilters
Enable libvirt users to modify logging filters of a daemon from outside.

Signed-off-by: Erik Skultety <eskultet@redhat.com>
2016-12-15 10:36:23 +01:00
Erik Skultety
ceeb85bd00 admin: Introduce virAdmConnectSetLoggingOutputs
Enable libvirt users to modify daemon's logging output settings from outside.
If either an empty string or NULL is passed, a default logging output will be
used the same way as it would be in case writing an empty string to the
libvirtd.conf

Signed-off-by: Erik Skultety <eskultet@redhat.com>
2016-12-15 10:36:23 +01:00
Erik Skultety
cd484b534e admin: Introduce virAdmConnectGetLoggingFilters
Enable libvirt users to query logging filter settings.

Signed-off-by: Erik Skultety <eskultet@redhat.com>
2016-12-15 10:36:23 +01:00
Erik Skultety
fc7d1be79e admin: Introduce virAdmConnectGetLoggingOutputs
Enable libvirt users to query logging output settings.

Signed-off-by: Erik Skultety <eskultet@redhat.com>
2016-12-15 10:36:23 +01:00
Erik Skultety
94c465d0eb daemon: Hook up the virLog{Get,Set}DefaultOutput to the daemon's init routine
Now that virLog{Get,Set}DefaultOutput routines are introduced we can wire them
up to the daemon's logging initialization code. Also, change the order of
operations a bit so that we still strictly honor our precedence of settings:
cmdline > env > config now that outputs and filters are not appended anymore.

Signed-off-by: Erik Skultety <eskultet@redhat.com>
2016-12-15 10:36:23 +01:00
Erik Skultety
0d6cf32721 admin: Allow passing NULL to virLogSetOutputs
Along with an empty string, it should also be possible for users to pass
NULL to the public APIs which in turn would trigger a routine(future
work) responsible for defining an appropriate default logging output
given the current circumstances.

Signed-off-by: Erik Skultety <eskultet@redhat.com>
2016-12-15 10:36:23 +01:00
Erik Skultety
ae06048bf5 virlog: Introduce virLog{Get,Set}DefaultOutput
These helpers will manage the log destination defaults (fetch/set). The reason
for this is to stay consistent with the current daemon's behaviour with respect
to /etc/libvirt/<daemon>.conf file, since both assignment of an empty string
or not setting the log output variable at all trigger the daemon's decision on
the default log destination which depends on whether the daemon runs daemonized
or not.
This patch also changes the logic of the selection of the default
logging output compared to how it is done now. The main difference though is
that we should only really care if we're running daemonized or not, disregarding
the fact of (not) having a TTY completely (introduced by commit eba36a3878) as
that should be of the libvirtd's parent concern (what FD it will pass to it).

 Before:
 if (godaemon || !hasTTY):
     if (journald):
         use journald

 if (godaemon):
     if (privileged):
         use SYSCONFIG/libvirtd.log
     else:
         use XDG_CONFIG_HOME/libvirtd.log
 else:
     use stderr

 After:
 if (godaemon):
     if (journald):
         use journald

     else:
         if (privileged):
             use SYSCONFIG/libvirtd.log
         else:
             use XDG_CONFIG_HOME/libvirtd.log
 else:
     use stderr

Signed-off-by: Erik Skultety <eskultet@redhat.com>
2016-12-15 10:36:23 +01:00
Peter Krempa
4b951d1e38 qemu: snapshot: Don't attempt to resume cpus if they were not paused
External disk-only snapshots with recent enough qemu don't require
libvirt to pause the VM. The logic determining when to resume cpus was
slightly flawed and attempted to resume them even if they were not
paused by the snapshot code. This normally was not a problem, but with
locking enabled the code would attempt to acquire the lock twice.

The fallout of this bug would be a error from the API, but the actual
snapshot being created. The bug was introduced with when adding support
for external snapshots with memory (checkpoints) in commit f569b87.

Resolves problems described by:
https://bugzilla.redhat.com/show_bug.cgi?id=1403691
2016-12-15 09:46:41 +01:00
Peter Krempa
e8f167a623 qemu: monitor: Don't resume lockspaces in resume event handler
After qemu delivers the resume event it's already running and thus it's
too late to enter lockspaces since it may already have modified the
disk. The code only creates false log entries in the case when locking
is enabled. The lockspace needs to be acquired prior to starting cpus.
2016-12-15 09:46:41 +01:00
Michal Privoznik
f444faa94a qemu: Enable mount namespace
https://bugzilla.redhat.com/show_bug.cgi?id=1404952

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
661887f558 qemu: Let users opt-out from containerization
Given how intrusive previous patches are, it might happen that
there's a bug or imperfection. Lets give users a way out: if they
set 'namespaces' to an empty array in qemu.conf the feature is
suppressed.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
f95c5c48d4 qemu: Manage /dev entry on RNG hotplug
When attaching a device to a domain that's using separate mount
namespace we must maintain /dev entries in order for qemu process
to see them.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
f5fdf23a68 qemu: Manage /dev entry on chardev hotplug
When attaching a device to a domain that's using separate mount
namespace we must maintain /dev entries in order for qemu process
to see them.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
6e57492839 qemu: Manage /dev entry on hostdev hotplug
When attaching a device to a domain that's using separate mount
namespace we must maintain /dev entries in order for qemu process
to see them.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
81df21507b qemu: Manage /dev entry on disk hotplug
When attaching a device to a domain that's using separate mount
namespace we must maintain /dev entries in order for qemu process
to see them.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
eadaa97548 qemu: Enter the namespace on relabelling
Instead of trying to fix our security drivers, we can use a
simple trick to relabel paths in both namespace and the host.
I mean, if we enter the namespace some paths are still shared
with the host so any change done to them is visible from the host
too.
Therefore, we can just enter the namespace and call
SetAllLabel()/RestoreAllLabel() from there. Yes, it has slight
overhead because we have to fork in order to enter the namespace.
But on the other hand, no complexity is added to our code.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
2160f338a7 qemu: Prepare RNGs when starting a domain
When starting a domain and separate mount namespace is used, we
have to create all the /dev entries that are configured for the
domain.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
8ec8a8c5ff qemu: Prepare inputs when starting a domain
When starting a domain and separate mount namespace is used, we
have to create all the /dev entries that are configured for the
domain.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
2c654490f3 qemu: Prepare TPM when starting a domain
When starting a domain and separate mount namespace is used, we
have to create all the /dev entries that are configured for the
domain.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
4e4451019c qemu: Prepare chardevs when starting a domain
When starting a domain and separate mount namespace is used, we
have to create all the /dev entries that are configured for the
domain.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
73267cec46 qemu: Prepare hostdevs when starting a domain
When starting a domain and separate mount namespace is used, we
have to create all the /dev entries that are configured for the
domain.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
054202d020 qemu: Prepare disks when starting a domain
When starting a domain and separate mount namespace is used, we
have to create all the /dev entries that are configured for the
domain.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
bb4e529664 qemu: Spawn qemu under mount namespace
Prime time. When it comes to spawning qemu process and
relabelling all the devices it's going to touch, there's inherent
race with other applications in the system (e.g. udev). Instead
of trying convincing udev to not touch libvirt managed devices,
we can create a separate mount namespace for the qemu, and mount
our own /dev there. Of course this puts more work onto us as we
have to maintain /dev files on each domain start and device
hot(un-)plug. On the other hand, this enhances security also.

From technical POV, on domain startup process the parent
(libvirtd) creates:

  /var/lib/libvirt/qemu/$domain.dev
  /var/lib/libvirt/qemu/$domain.devpts

The child (which is going to be qemu eventually) calls unshare()
to create new mount namespace. From now on anything that child
does is invisible to the parent. Child then mounts tmpfs on
$domain.dev (so that it still sees original /dev from the host)
and creates some devices (as explained in one of the previous
patches). The devices have to be created exactly as they are in
the host (including perms, seclabels, ACLs, ...). After that it
moves $domain.dev mount to /dev.

What's the $domain.devpts mount there for then you ask? QEMU can
create PTYs for some chardevs. And historically we exposed the
host ends in our domain XML allowing users to connect to them.
Therefore we must preserve devpts mount to be shared with the
host's one.

To make this patch as small as possible, creating of devices
configured for domain in question is implemented in next patches.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
a5896e8ca4 qemu_cgroup: Expose defaultDeviceACL
This is a list of devices that qemu needs for its run (apart from
what's configured for domain). The devices on the list are
enabled in the CGroups by default so they will be good candidates
for initial /dev for new qemu.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
5ac52bd0fe virscsivhost: Introduce virSCSIVHostDeviceGetPath
We will need this function in near future so that we know what
/dev device corresponds to the SCSI device.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
6bcacd55e5 virscsi: Introduce virSCSIDeviceGetPath
We will need this function in near future so that we know what
/dev device corresponds to the SCSI device.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
c4237d8e0c virusb: Introduce virUSBDeviceGetPath
We will need this function in near future so that we know what
/dev device corresponds to the USB device.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
654b4d48bc virfile: Introduce ACL helpers
Namely, virFileGetACLs, virFileSetACLs, virFileFreeACLs and
virFileCopyACLs. These functions are going to be required when we
are creating /dev for qemu. We have copy anything that's in
host's /dev exactly as is. Including ACLs.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
1a7c9a5d50 virfile: Introduce virFileSetupDev
This part of code that LXC currently uses will be reused so move
to a generic function.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
48a12d3b25 virprocess: Introduce virProcessSetupPrivateMountNS
This part of code that LXC currently uses will be reused so move
to a generic function.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Joao Martins
de8607d77d libxl: reverse defaults on HVM net device attach
libvirt libxl picks its own default with respect to the default NIC
to use. libxlMakeNic is the one responsible for this and on boot it
picks LIBXL_NIC_TYPE_VIF_IOEMU for HVM domains such that it accomodates
both PV and emulated one. The good behaving guest at boot will then
select the pv and unplug the emulated device.

Now, on HVM when attaching an interface it will pick the same default
that is LIBXL_NIC_TYPE_VIF_IOEMU which as a result will fail the attach
(see xen commit 32e9d0f ("libxl: nic type defaults to vif in hotplug for
hvm guest"). Xen doesn't yet support the hotplug of emulated devices,
but we don't want to rule out that case either, which might get support
in the future. Hence we simply reverse the defaults when we are
attaching the interface which allows libvirt to prefer the PV nic first
without adding "model='netfront'" following the same pattern as above
commit. Also to avoid ruling out the emulated one we set to
LIBXL_NIC_TYPE_IOEMU when setting a model type that is not 'netfront'.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
2016-12-14 13:41:46 -07:00
Cédric Bosdonnat
340bb6b7ef libxl: add QED disk format support
If libxl has QED disk format support, then pass the feature
over to the user.
2016-12-14 18:03:08 +01:00
Cédric Bosdonnat
cb25972fd1 xenconfig: add default in xenParseXLDisk()'s switches
Without a default: case in the switches in xenParseXLDisk(), build
would fail with every new disk backend or image format added in libxl,
as this is the case in this error:

http://logs.test-lab.xenproject.org/osstest/logs/103325/build-amd64-libvirt/5.ts-libvirt-build.log
2016-12-14 18:02:58 +01:00
Daniel P. Berrange
3e8dac148a Remove reference to enum that never existed
The virDomainSendProcessSignal method says the flags values
come from virDomainProcessSignalFlag, but this enum has
never existed. No flags are needed for this method.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-12-14 16:42:27 +00:00
Jiri Denemark
c1cb4cb9f6 virjson: Remove const from virJSONValueObjectForeachKeyValue
Almost none of our virJSONValue*Get* functions accept const virJSONValue
pointers and it wouldn't even make sense since we sometimes modify what
we get. And because there is no reason for preventing callers of
virJSONValueObjectForeachKeyValue from modifying the values they get in
each iteration we can just stop doing it.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-12-14 16:21:57 +01:00
Daniel P. Berrange
a81cfb649d Avoid variable named 'stat'
Using a variable named 'stat' clashes with the system function
'stat()' causing compiler warnings on some platforms

cc1: warnings being treated as errors
../../src/qemu/qemu_monitor_text.c: In function 'parseMemoryStat':
../../src/qemu/qemu_monitor_text.c:604: error: declaration of 'stat' shadows a global declaration [-Wshadow]
/usr/include/sys/stat.h:455: error: shadowed declaration is here [-Wshadow]

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-12-14 12:17:08 +00:00
Peter Krempa
15398e6a4c log: Fix loading of conf file for log daemon
'log_outputs' would be read into the variable for log_filters
2016-12-14 07:24:24 +01:00
Peter Krempa
e98b30909b lock: Fix loading of config file for the lock daemon
'log_outputs' would be read into the variable for log_filters
2016-12-14 07:24:24 +01:00
Viktor Mihajlovski
283e290434 qemu: Allow use of hot plugged host CPUs if no affinity set
If the cpuset cgroup controller is disabled in /etc/libvirt/qemu.conf
QEMU virtual machines can in principle use all host CPUs, even if they
are hot plugged, if they have no explicit CPU affinity defined.

However, there's libvirt code supposed to handle the situation where
the libvirt daemon itself is not using all host CPUs. The code in
qemuProcessInitCpuAffinity attempts to set an affinity mask including
all defined host CPUs. Unfortunately, the resulting affinity mask for
the process will not contain the offline CPUs. See also the
sched_setaffinity(2) man page.

That means that even if the host CPUs come online again, they won't be
used by the QEMU process anymore. The same is true for newly hot
plugged CPUs. So we are effectively preventing that QEMU uses all
processors instead of enabling it to use them.

It only makes sense to set the QEMU process affinity if we're able
to actually grow the set of usable CPUs, i.e. if the process affinity
is a subset of the online host CPUs.

There's still the chance that for some reason the deliberately chosen
libvirtd affinity matches the online host CPU mask by accident. In this
case the behavior remains as it was before (CPUs offline while setting
the affinity will not be used if they show up later on).

Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Tested-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
2016-12-13 18:25:00 -05:00
Viktor Mihajlovski
1be35910f7 util: Allow to query the presence of host CPU bitmaps
The functions to retrieve online and present host CPU information
are only supported on Linux for the time being.

This leads to runtime errors if these function are used on other
platforms. To avoid that, code in higher levels using the functions
must replicate the conditional compilation in higher level which
is error prone (and is plainly spoken ugly).

Adding a function virHostCPUHasBitmap that can be used to check
for host CPU bitmap support.

NB: There are other functions including the host CPU count that
are lacking support on all platforms, but they are too essential
in order to be bypassed.

Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
2016-12-13 18:12:09 -05:00
Jiri Denemark
f00c00475f qemu: Fix virQEMUCapsFindTarget on ppc64le
virQEMUCapsFindTarget is supposed to find an alternative QEMU binary if
qemu-system-$GUEST_ARCH doesn't exist. The alternative is using host
architecture when it is compatible with $GUEST_ARCH. But a special
treatment has to be applied for ppc64le since the QEMU binary is always
called qemu-system-ppc64.

Broken by me in v2.2.0-171-gf2e71550d.

https://bugzilla.redhat.com/show_bug.cgi?id=1403745

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-12-13 22:11:33 +01:00
Nitesh Konkar
8981d7925e perf: add branch_misses perf event support
This patch adds support and documentation
for the branch_misses perf event.

Signed-off-by: Nitesh Konkar <nitkon12@linux.vnet.ibm.com>
2016-12-12 18:04:52 -05:00
Nikolay Shirokovskiy
cdd6819318 qemu: agent: take monitor lock in qemuAgentNotifyEvent
qemuAgentNotifyEvent accesses monitor structure and is called on qemu
reset/shutdown/suspend events under domain lock. Other monitor
functions on the other hand take monitor lock and don't hold domain lock.
Thus it is possible to have risky simultaneous access to the structure
from 2 threads. Let's take monitor lock here to make access exclusive.
2016-12-12 17:14:11 -05:00
Nikolay Shirokovskiy
c9a191fc48 qemu: don't use vm when lock is dropped in qemuDomainGetFSInfo
Current call to qemuAgentGetFSInfo in qemuDomainGetFSInfo is
unsafe. Domain lock is dropped and we use vm->def. Let's make
def copy to fix that.
2016-12-12 17:14:11 -05:00
Nikolay Shirokovskiy
3ab9652a86 qemu: agent: fix uninitialized var case in qemuAgentGetFSInfo
In case of 0 filesystems *info is not set while according
to virDomainGetFSInfo contract user should call free on it even
in case of 0 filesystems. Thus we need to properly set
it. NULL will be enough as free eats NULLs ok.
2016-12-12 17:14:11 -05:00
John Ferlan
cf436a560d qemu: Fix GetBlockInfo setting allocation from wr_highest_offset
The libvirt-domain.h documentation indicates that for a qcow2 file
in a filesystem being used for a backing store should report the disk
space occupied by a file; however, commit id '15fa84ac' altered the
code to trust that the wr_highest_offset should be used whenever
wr_highest_offset_valid was set.

As it turns out this will lead to indeterminite results. For an active
domain when qemu hasn't yet had the need to find the wr_highest_offset
value, qemu will report 0 even though qemu-img will report the proper
disk size. This causes reporting of the following XML:

  <disk type='file' device='disk'>
    <driver name='qemu' type='qcow2'/>
    <source file='/path/to/test-1g.qcow2'/>

to be as follows:

Capacity:       1073741824
Allocation:     0
Physical:       1074139136

with qemu-img indicating:

image: /path/to/test-1g.qcow2
file format: qcow2
virtual size: 1.0G (1073741824 bytes)
disk size: 1.0G

Once the backing source file is opened on the guest, then wr_highest_offset
is updated, but only to the high water mark and not the size of the file.

This patch will adjust the logic to check for the file backed qcow2 image
and enforce setting the allocation to the returned 'physical' value, which
is the 'actual-size' value from a 'query-block' operation.

NB: The other consumer of the wr_highest_offset output (GetAllDomainStats)
has a contract that indicates 'allocation' is the offset of the highest
written sector, so it doesn't need adjustment.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2016-12-12 16:04:17 -05:00
John Ferlan
9d734b60a7 util: Introduce virStorageSourceUpdateCapacity
Instead of having duplicated code in qemuStorageLimitsRefresh and
virStorageBackendUpdateVolTargetInfo to get capacity specific data
about the storage backing source or volume -- create a common API
to handle the details for both.

As a side effect, virStorageFileProbeFormatFromBuf returns to being
a local/static helper to virstoragefile.c

For the QEMU code - if the probe is done, then the format is saved so
as to avoid future such probes.

For the storage backend code, there is no need to deal with the probe
since we cannot call the new API if target->format == NONE.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2016-12-12 16:04:17 -05:00
John Ferlan
3039ec962e util: Introduce virStorageSourceUpdateBackingSizes
Instead of having duplicated code in qemuStorageLimitsRefresh and
virStorageBackendUpdateVolTargetInfoFD to fill in the storage backing
source or volume allocation, capacity, and physical values - create a
common API that will handle the details for both.

The common API will fill in "default" capacity values as well - although
those more than likely will be overridden by subsequent code. Having just
one place to make the determination of what the values should be will
make things be more consistent.

For the QEMU code - the data filled in will be for inactive domains
for the GetBlockInfo and DomainGetStatsOneBlock API's. For the storage
backend code - the data will be filled in during the volume updates.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2016-12-12 16:04:17 -05:00
John Ferlan
c5f6151390 util: Introduce virStorageSourceUpdatePhysicalSize
Commit id '8dc27259' introduced virStorageSourceUpdateBlockPhysicalSize
in order to retrieve the physical size for a block backed source device
for an active domain since commit id '15fa84ac' changed to use the
qemuMonitorGetAllBlockStatsInfo and qemuMonitorBlockStatsUpdateCapacity
API's to (essentially) retrieve the "actual-size" from a 'query-block'
operation for the source device.

However, the code only was made functional for a BLOCK backing type
and it neglected to use qemuOpenFile, instead using just open. After
the open the block lseek would find the end of the block and set the
physical value, close the fd and return.

Since the code would return 0 immediately if the source device wasn't
a BLOCK backed device, the physical would be displayed incorrectly,
such as follows in domblkinfo for a file backed source device:

Capacity:       1073741824
Allocation:     0
Physical:       0

This patch will modify the algorithm to get the physical size for other
backing types and it will make use of the qemuDomainStorageOpenStat
helper in order to open/stat the source file depending on its type.
The qemuDomainGetStatsOneBlock will no longer inhibit printing errors,
but it will still ignore them leaving the physical value set to 0.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2016-12-12 16:04:17 -05:00
John Ferlan
a7fea19fcd qemu: Introduce helper qemuDomainStorageUpdatePhysical
Currently just a shim to call virStorageSourceUpdateBlockPhysicalSize

Signed-off-by: John Ferlan <jferlan@redhat.com>
2016-12-12 16:04:17 -05:00
John Ferlan
732af77cce qemu: Add helpers to handle stat data for qemuStorageLimitsRefresh
Split out the opening of the file and fetch of the stat buffer into a
helper qemuDomainStorageOpenStat. This will handle either opening the
local or remote storage.

Additionally split out the cleanup of that into a separate helper
qemuDomainStorageCloseStat which will either close the file or
call the virStorageFileDeinit function.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2016-12-12 16:04:17 -05:00
John Ferlan
7149d1693d qemu: Clean up description for qemuStorageLimitsRefresh
Originally added by commit id '89646e69' prior to commit id '15fa84ac'
and '71d2c172' which ensured that qemuStorageLimitsRefresh was only called
for inactive domains.

Adjust the comment describing the need for FIXME and move all the text
to the function description.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2016-12-12 16:04:17 -05:00
John Ferlan
f17d68067e docs: Replace missing description for perf.cpu_cycles
Lost during merge of commit id '8546adf80' and '585ad00b5'
2016-12-11 07:56:53 -05:00
Pavel Glushchak
b1f916abbc vz: added VIR_MIGRATE_NON_SHARED_INC migration flag support
This flag is used in Virtuozzo backend implicitly, thus
we need to support it and don't fail if it's set.

Signed-off-by: Pavel Glushchak <pglushchak@virtuozzo.com>
2016-12-09 17:21:53 +03:00
Pavel Glushchak
5bafa1d721 vz: set PVMT_DONT_CREATE_DISK migration flag
This flag tells backend not to create instance
disks making behavior the same as in qemu driver.
Disk files have to be created beforehand on target
host manually or by upper management layer i.e.
OpenStack Nova.

Signed-off-by: Pavel Glushchak <pglushchak@virtuozzo.com>
2016-12-09 17:21:43 +03:00
Nikolay Shirokovskiy
61a0026a94 qemu: Fix xml dump of autogenerated websocket
When save/migrate a domain and we autogenerated a port, then if we
print the inactive domain config, write out a -1 for the socket value;
otherwise, it's possible that the subsequent start will fail if the
autogenerated websocket used conflicts with an existing running config
that also used autogenerated websockets.

Examples:

== A. Can not restore domain with autoconfigured websocket.

domain 1 and 2 have autoconfigured websocket.

1. domain 1 is started then, saved
2. domain 2 is started
3. domain 1 restoration is failed:

error: internal error: qemu unexpectedly closed the monitor: 2016-11-21T10:23:11.356687Z
qemu-kvm: -vnc 0.0.0.0:2,websocket=5700: Failed to start VNC server on `(null)':
Failed to bind socket: Address already in use

== B. Can not migrate domain with autoconfigured websocket.

domain 1 on host A, domain 2 on host B, both have autoconfigured websocket

1. domain 1 started, domain 2 started
2. domain 1 migration to host B is failed with the above error.
2016-12-09 07:54:39 -05:00
Nikolay Shirokovskiy
1215965a4c qemu: mark user defined websocket as used
We need extra state variable to distinguish between autogenerated
and user defined cases after auto generation is done.
2016-12-09 07:54:34 -05:00
Nikolay Shirokovskiy
b07cfd724f qemu: Refactor qemuProcessGraphicsReservePorts
Use switch for enums rather than if/else conditions.
2016-12-09 07:40:46 -05:00
Michal Privoznik
b492f7ef0f qemuGetDomainHugepagePath: Initialize @ret
The variable may be used uninitialized in this function.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-09 10:51:37 +01:00
Mehdi Abaakouk
e0d893e86d Move virstat.c code to virnetdevtap.c
This is just a code move of virstat.c to virnetdevtap.c
2016-12-09 10:28:07 +01:00
Mehdi Abaakouk
9b6de7c506 virstat: fix signature of virstat helper
In preparation to the code move to virnetdevtap.c, this change:

* renames virNetInterfaceStats to virNetDevTapInterfaceStats
* changes 'path' to 'ifname', to use the same vocable as other
  method in virnetdevtap.c.
* Add the attributes checker
2016-12-09 10:27:56 +01:00
Mehdi Abaakouk
013df874db Gathering vhostuser interface stats with ovs
When vhostuser interfaces are used, the interface statistics
are not available in /proc/net/dev.

This change looks at the openvswitch interfaces statistics
tables to provide this information for vhostuser interface.

Note that in openvswitch world drop/error doesn't always make sense
for some interface type. When these informations are not available we
set them to 0 on the virDomainInterfaceStats.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-09 10:23:09 +01:00
Peter Krempa
a4ed5b4212 qemu: Don't try to find compression program for "raw" memory images
There's nothing to compress if the requested snapshot memory format is
set to 'raw' explicitly. After commit 9e14689ea libvirt would try to
run /sbin/raw to process the memory stream if the qemu.conf option
snapshot_image_format is set.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1402726
2016-12-08 17:12:54 +01:00
Cédric Bosdonnat
3cd556d486 lxc: monitor now holds a reference to the domain
If the monitor doesn't hold a reference to the domain object
the object may be destroyed before the monitor actually stops.
2016-12-08 16:35:53 +01:00
Michal Privoznik
ce937d3710 security: Drop virSecurityManagerSetHugepages
Since its introduction in 2012 this internal API did nothing.
Moreover we have the same API that does exactly the same:
virSecurityManagerDomainSetPathLabel.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-08 15:45:52 +01:00
Michal Privoznik
f55afd83b1 qemu: Create hugepage path on per domain basis
If you've ever tried running a huge page backed guest under
different user than in qemu.conf, you probably failed. Problem is
even though we have corresponding APIs in the security drivers,
there's no implementation and thus we don't relabel the huge page
path. But even if we did, so far all of the domains share the
same path:

   /hugepageMount/libvirt/qemu

Our only option there would be to set 0777 mode on the qemu dir
which is totally unsafe. Therefore, we can create dir on
per-domain basis, i.e.:

   /hugepageMount/libvirt/qemu/domainName

and chown domainName dir to the user that domain is configured to
run under.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-08 15:45:52 +01:00
Michal Privoznik
7ed6934f3b virDomainObjGetShortName: take virDomainDef
So far this function takes virDomainObjPtr which:
1) is an overkill,
2) might be not available in all the places we will use it.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-08 15:45:52 +01:00
Martin Kletzander
dc18766b10 conf: Make scheduler formatting simpler
Since the great rework of how we store vcpu- and iothread-related
data, we have overly complex part of code that is trying to format the
scheduler tuning data in as less lines as possible by grouping
settings for multiple threads.  That was designed as an input syntax
sugar for users, but we don't need to also use that when formatting
the XML.  Switching to simple enumeration makes the code nicer,
shorter and more welcoming to future changes.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2016-12-08 15:27:52 +01:00
Daniel P. Berrange
0be9cea199 test: fix screenshot API impl
When redoing the website we deleted the libvirtLogo.png file
not remembering that the test driver screenshot API impl
relied on it.

Rather than having the test driver use the logo as a side
effect, give it its own dedicated image to use. This is
installed in /usr/share/libvirt/test-screenshot.png and
is taken from a NeXT Cube running WorldWideWeb[1]. The
very first web browser in existance, running on the
hardware it was originally written on.

[1] https://en.wikipedia.org/wiki/WorldWideWeb

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-12-08 10:57:32 +00:00
Pavel Hrdina
a96a256083 configure: remove check for CPUID
This check is not required because all i386 and x86_64 cpus have the
cpuid instruction.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2016-12-07 16:21:31 +01:00
Peter Krempa
cf44dc072a qemu: capabilities: Add gluster.debug_level detection for 2.8.0+
Qemu 2.8.0+ changes arguments structure for blockdev-add in the effort
to make it finally stable. Since libvirt recently added the detection of
gluster debug support relying on the old syntax we need to add the new
as well.
2016-12-07 13:34:22 +01:00
Nitesh Konkar
8546adf80b perf: add one more perf event support
With current perf framework, this patch adds support and documentation
for the branch_instructions perf event.

Signed-off-by: Nitesh Konkar <nitkon12@linux.vnet.ibm.com>
2016-12-07 07:03:57 -05:00
John Ferlan
016b63bdf0 docs: Fix code example formatting for virDomainInterfaceAddresses
Adjust the spacing so that the code examples are display in the code/text
box rather than just as paragraph text.
2016-12-07 06:07:16 -05:00
John Ferlan
585ad00b5b docs: Adjust formatting for virConnectGetAllDomainStats output
Adjust the spacing a bit in order to generate 'cleaner' looking output.
This matches what virDomainMemoryStats does and it creates text/code boxes
in order to list each of the stats for each category.
2016-12-07 06:07:16 -05:00
Viktor Mihajlovski
ac8ac9e052 cgroup: Use system reported "unlimited" value for comparison
With kernel 3.18 (since commit 3e32cb2e0a12b6915056ff04601cf1bb9b44f967)
the "unlimited" value for cgroup memory limits has changed once again as
its byte value is now computed from a page counter.
The new "unlimited" value reported by the cgroup fs is therefore 2**51-1
pages which is (VIR_DOMAIN_MEMORY_PARAM_UNLIMITED - 3072). This results
e.g. in virsh memtune displaying 9007199254740988 instead of unlimited
for the limits.

This patch uses the value of memory.limit_in_bytes from the cgroup
memory root which is the system's "real" unlimited value for comparison.

See also libvirt commit 231656bbeb for the
history for kernel 3.12 and before.

Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
2016-12-06 16:25:20 +01:00
Michal Privoznik
22f7ceb695 nss: Introduce libvirt-guest module
So far the NSS module looks up only hostnames as provided by
guests themselves. However, there are some cases where this is
not enough: e.g. when there's a fresh new guest being installed
(with some generic hostname) say from a live ISO image; or some
(older) systems don't advertise their hostname in DHCP
transactions at all.
In cases like that it would be helpful if we translate domain
name as seen by libvirt too so that users can:

  # virsh start $dom && ssh $dom

In order to achieve that new libvirt-guest module is introduced,
while older libvirt module maintains its current behaviour (that
is translating guest provided names into IP addresses).

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-06 13:34:00 +01:00
Michal Privoznik
1f9db235e7 network: Track MAC address map
Now that we have a module that's able to track
<domain, mac addres list> pairs, hook it up into
our network driver.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-06 13:33:18 +01:00
Michal Privoznik
86980bc75c util: Introduce virMACMap module
This module will be used to track:

  <domain, mac address list>

pairs. It will be important to know these mappings without
libvirt connection (that is from a JSON file), because NSS
module will use those to provide better host name translation.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-06 13:33:18 +01:00
Michal Privoznik
b9b664c5a8 util: Introduce virFileRewriteStr
There are couple of places where we have a string and want to
save it to a file. Atomically. In all those places we use
virFileRewrite() but also implement the very same callback which
takes the string and write it into temp file. This makes no
sense. Unify the callbacks and move them to one place.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-06 13:33:18 +01:00
Michal Privoznik
b379c44c35 virstring: Introduce virStringListRemove
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-06 13:33:18 +01:00
Michal Privoznik
ec38d6f741 virstring: Introduce virStringListAdd
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-06 13:33:18 +01:00
Michal Privoznik
03e3da2212 network: Don't unlock non-locked network driver
In dd7bfb2cdc I've removed locking of the network driver upon
it's allocation. However, I forgot to remove one location of the
driver unlock.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-06 13:33:18 +01:00
John Ferlan
1ff38366b8 qemu: Add the group name option to the iotune command line
Add in the block I/O throttling group parameter to the command line
if supported. If not supported, fail command creation.

Add the xml2argvtest for testing.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2016-12-05 18:30:38 -05:00
John Ferlan
32d99cb772 conf: Add support for blkiotune group_name option
Modify _virDomainBlockIoTuneInfo and rng schema to support the group_name
option for iotune throttling. Document the new value.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2016-12-05 18:30:34 -05:00
John Ferlan
c53bd25b13 qemu: Add support for parsing iotune group setting
Add support to read/parse the iotune group setting for qemu.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2016-12-05 18:12:08 -05:00
John Ferlan
d0f82df205 qemu: Adjust various bool BlockIoTune set_ values into a single mask
Rather than have multiple bool values, create a single enum with bits
representing what fields are set. Fields are generally set in groups
of 3 (read, write, total).
2016-12-05 18:12:08 -05:00
John Ferlan
ad9f127302 qemu: Alter qemuMonitorJSONSetBlockIoThrottle command logic
Currently we build the JSON object for the "block_set_io_throttle"
command using the knowledge that a NULL for a support*Options boolean
would essentially ignore the rest of the arguments.

This may not work properly if some capability was backported, plus it just
looks rather ugly. So instead, build the "base" arguments and then if
the support*Option bool capability is set, add in the arguments on the fly.

Then append those arguments to the basic command and send to qemu.
2016-12-05 18:12:08 -05:00
John Ferlan
c84ad82a2d qemu: Adjust maxparams logic for qemuDomainGetBlockIoTune
Rather than using negative logic and setting the maxparams to a lesser
value based on which capabilities exist, alter the logic to modify the
maxparams based on a base value plus the found capabilities. Reduces the
chance that some backported feature produces an incorrect value.
2016-12-05 18:12:08 -05:00
John Ferlan
d3364dfdc8 caps: Add new capability for the iotune group name
Add the capability to detect if the qemu binary can support the feature
to use throttling.group.
2016-12-05 18:12:08 -05:00
Lin Ma
c80e6b96e5 cpu: Add support for pku and ospke Intel features for Memory Protection Keys
qemu commit: f74eefe0
https://lwn.net/Articles/667156/

Signed-off-by: Lin Ma <lma@suse.com>
2016-12-05 22:18:28 +01:00
Lin Ma
2922cd9bbc cpu: Add support for more AVX512 Intel features
These features are included:
AVX512DQ, AVX512IFMA, AVX512BW, AVX512VL, AVX512VBMI, AVX512_4VNNIW and
AVX512_4FMAPS.

qemu commits: cc728d14 and 95ea69fb

Signed-off-by: Lin Ma <lma@suse.com>
2016-12-05 13:38:17 +01:00
John Ferlan
d3bba70771 storage: Fix type PLOOP type check for storageVolUpload
Commit id '03e750f3' added support for checking the PLOOP type; however,
it used 'target.type' which no storage code ever fills in, so it will
never be set.  Change to just vol->type (could use vol->target.format
as well).
2016-12-05 06:44:04 -05:00
Marc Hartmayer
0f2721d044 conf: add global check for duplicate drive addresses
Add a global check for duplicate drive addresses. This will fix the
problem of duplicate disk and hostdev drive addresses.

Example for duplicate drive addresses:
<disk>
  ...
  <target name='sda'/>
</disk>
<disk>
  ...
  <target name='sdb'/>
  <address type='drive' controller=0 bus=0 target=0 unit=0/>
</disk>

Another example:
<hostdev mode='subsystem' type='scsi' managed='no'>
  <source>
  ...
  </source>
  <address type='drive' controller='0' bus='0' target='0' unit='0'/>
</hostdev>
<hostdev mode='subsystem' type='scsi' managed='no'>
  <source>
  ...
  </source>
  <address type='drive' controller='0' bus='0' target='0' unit='0'/>
</hostdev>

Unfortunately the fixes (1b08cc170a,
8d46386bfe) weren't enough to catch these
cases and it isn't possible to add additional checks in
virDomainDeviceDefPostParseInternal() for SCSI hostdevs or
virDomainDiskDefAssignAddress() for SCSI/IDE/FDC/SATA disks without
adding another parse flag (virDomainDefParseFlags) to disable this
validation while updating or detaching a disk or hostdev.

Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-05 10:45:46 +01:00
Marc Hartmayer
dc8fb25734 conf: virDomainDriveAddressIsUsedByDisk: Rename type to bus_type
Comparing the parameter 'type' against the member 'bus' instead of
against the member 'type' is quite confusing. Rename the parameter
'type' to 'bus_type' to clarify its meaning.

Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
2016-12-05 10:45:46 +01:00
Marc Hartmayer
c344d4b73f conf: simplify functions virDomainSCSIDriveAddressIsUsedBy*()
Pass the virDomainDeviceDriveAddress as a struct instead of individual
arguments. Reworked the function descriptions.

Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-05 10:45:46 +01:00
Yuri Chornoivan
ff8e021225 Fix minor typos 2016-12-02 09:25:13 +01:00
gaohaifeng
f81b33b50c qemuDomainAttachNetDevice: pass mq and vectors for vhost-user with multiqueue
Two reasons:
1.in none hotplug, we will pass it. We can see from libvirt function
qemuBuildVhostuserCommandLine
2.qemu will use this vetcor num to init msix table. If we don't pass, qemu
will use default value, this will cause VM can only use default value
interrupts at most.

Signed-off-by: gaohaifeng <gaohaifeng.gao@huawei.com>
2016-12-01 15:02:35 +01:00
Eric Farman
655429a0d4 qemu: Prevent detaching SCSI controller used by hostdev
Consider the following XML snippets:

  $ cat scsicontroller.xml
      <controller type='scsi' model='virtio-scsi' index='0'/>
  $ cat scsihostdev.xml
      <hostdev mode='subsystem' type='scsi'>
        <source>
          <adapter name='scsi_host0'/>
          <address bus='0' target='8' unit='1074151456'/>
        </source>
      </hostdev>

If we create a guest that includes the contents of scsihostdev.xml,
but forget the virtio-scsi controller described in scsicontroller.xml,
one is silently created for us.  The same holds true when attaching
a hostdev before the matching virtio-scsi controller.
(See qemuDomainFindOrCreateSCSIDiskController for context.)

Detaching the hostdev, followed by the controller, works well and the
guest behaves appropriately.

If we detach the virtio-scsi controller device first, any associated
hostdevs are detached for us by the underlying virtio-scsi code (this
is fine, since the connection is broken).  But all is not well, as the
guest is unable to receive new virtio-scsi devices (the attach commands
succeed, but devices never appear within the guest), nor even be
shutdown, after this point.

While this is not libvirt's problem, we can prevent falling into this
scenario by checking if a controller is being used by any hostdev
devices.  The same is already done for disk elements today.

Applying this patch and then using the XML snippets from earlier:

  $ virsh detach-device guest_01 scsicontroller.xml
  error: Failed to detach device from scsicontroller.xml
  error: operation failed: device cannot be detached: device is busy

  $ virsh detach-device guest_01 scsihostdev.xml
  Device detached successfully

  $ virsh detach-device guest_01 scsicontroller.xml
  Device detached successfully

Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
2016-11-30 17:16:47 -05:00
Laine Stump
70249927b7 qemu: assign VFIO devices to PCIe addresses when appropriate
Although nearly all host devices that are assigned to guests using
VFIO ("<hostdev>" devices in libvirt) are physically PCI Express
devices, until now libvirt's PCI address assignment has always
assigned them addresses on legacy PCI controllers in the guest, even
if the guest's machinetype has a PCIe root bus (e.g. q35 and
aarch64/virt).

This patch tries to assign them to an address on a PCIe controller
instead, when appropriate. First we do some preliminary checks that
might allow setting the flags without doing any extra work, and if
those conditions aren't met (and if libvirt is running privileged so
that it has proper permissions), we perform the (relatively) time
consuming task of reading the device's PCI config to see if it is an
Express device. If this is successful, the connect flags are set based
on the result, but if we aren't able to read the PCI config (most
likely due to the device not being present on the system at the time
of the check) we assume it is (or will be) an Express device, since
that is almost always the case anyway.
2016-11-30 15:41:57 -05:00
Laine Stump
9b0848d523 qemu: propagate virQEMUDriver object to qemuDomainDeviceCalculatePCIConnectFlags
If libvirtd is running unprivileged, it can open a device's PCI config
data in sysfs, but can only read the first 64 bytes. But as part of
determining whether a device is Express or legacy PCI,
qemuDomainDeviceCalculatePCIConnectFlags() will be updated in a future
patch to call virPCIDeviceIsPCIExpress(), which tries to read beyond
the first 64 bytes of the PCI config data and fails with an error log
if the read is unsuccessful.

In order to avoid creating a parallel "quiet" version of
virPCIDeviceIsPCIExpress(), this patch passes a virQEMUDriverPtr down
through all the call chains that initialize the
qemuDomainFillDevicePCIConnectFlagsIterData, and saves the driver
pointer with the rest of the iterdata so that it can be used by
qemuDomainDeviceCalculatePCIConnectFlags(). This pointer isn't used
yet, but will be used in an upcoming patch (that detects Express vs
legacy PCI for VFIO assigned devices) to examine driver->privileged.
2016-11-30 15:28:07 -05:00
Laine Stump
bfdc145153 util: new function virPCIDeviceGetConfigPath()
The path to the config file for a PCI device is conventiently stored
in a virPCIDevice object, but that object's contents aren't directly
visible outside of virpci.c, so we need to have an accessor function
for it if anyone needs to look at it.
2016-11-30 15:24:35 -05:00
Laine Stump
e026563f01 util: new function virFileLength()
This new function just calls fstat() (if provided with a valid fd) or
stat() (if fd is -1) and returns st_size (or -1 if there is an
error). We may decide we want this function to be more complex, and
handle things like block devices - this is a placeholder (that works)
for any more complicated function.
2016-11-30 15:18:57 -05:00
Jiri Denemark
4d8d7c02d7 cpu: Add alternative feature spellings to CPU map
We can't change feature names for compatibility reasons even if they
contain typos or other software uses different names for the same
features. By adding alternative spellings in our CPU map we at least
allow anyone to grep for them and find the correct libvirt's name.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-11-30 14:19:40 +01:00
Jiri Denemark
29cabba3d7 cpu: Remove useless comments from CPU map
They didn't really help anything.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-11-30 14:19:40 +01:00
Ján Tomko
2650d5e1f5 qemu: error out on USB ports out of range
My overly sophisticated address reservation code forgot
to add an error message for user-requested ports out of range.

https://bugzilla.redhat.com/show_bug.cgi?id=1399260
2016-11-30 10:59:01 +01:00
Christian Ehrhardt
dffdac06c0 virt-aa-helper: fix parsing security labels by introducing VIR_DOMAIN_DEF_PARSE_SKIP_SECLABEL
When virt-aa-helper parses xml content it can fail on security labels.

It fails by requiring to parse active domain content on seclabels that
are not yet filled in.

Testcase with virt-aa-helper on a minimal xml:
 $ cat << EOF > /tmp/test.xml
<domain type='kvm'>
    <name>test-seclabel</name>
    <uuid>12345678-9abc-def1-2345-6789abcdef00</uuid>
    <memory unit='KiB'>1</memory>
    <os><type arch='x86_64'>hvm</type></os>
    <seclabel type='dynamic' model='apparmor' relabel='yes'/>
    <seclabel type='dynamic' model='dac' relabel='yes'/>
</domain>
EOF
 $ /usr/lib/libvirt/virt-aa-helper -d -r -p 0 \
   -u libvirt-12345678-9abc-def1-2345-6789abcdef00 < /tmp/test.xml

Current Result:
 virt-aa-helper: error: could not parse XML
 virt-aa-helper: error: could not get VM definition
Expected Result is a valid apparmor profile

Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Signed-off-by: Guido Günther <agx@sigxcpu.org>
2016-11-30 08:15:57 +01:00
Jiri Denemark
939c50c390 Consolidate documentation of virDomainMigrate{,ToURI}{,2,3}
Only the latest APIs are fully documented and the documentation of the
older variants (which are just limited versions of the new APIs anyway)
points to the newest APIs.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-11-29 16:52:10 +01:00
Jiri Denemark
0355de2e77 qemuProcessReconnect: Avoid relabeling images after migration
Restarting libvirtd on the source host at the end of migration when a
domain is already running on the destination would cause image labels to
be reset effectively killing the domain. Commit e8d0166e1d fixed similar
issue on the destination host, but kept the source always resetting the
labels, which was mostly correct except for the specific case handled by
this patch.

https://bugzilla.redhat.com/show_bug.cgi?id=1343858

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-11-29 12:37:04 +01:00
Jiri Denemark
ee3ea86b37 qemu: Report tunnelled post-copy migration as unsupported
Post-copy migration needs bi-directional communication between the
source and the destination QEMU processes, which is not supported by
tunnelled migration.

https://bugzilla.redhat.com/show_bug.cgi?id=1371358

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-11-29 12:31:25 +01:00
Chen Hanxiao
17879605fe storage_backend_rbd: check the return value of rados_conf_set
We had a lot of rados_conf_set and check works.
Use helper virStorageBackendRBDRADOSConfSet for them.

Signed-off-by: Chen Hanxiao <chenhanxiao@gmail.com>
2016-11-28 07:51:08 -05:00
Peter Krempa
b87a11340f qemu: capabilities: Don't partially reprope caps on process reconnect
Thanks to the complex capability caching code virQEMUCapsProbeQMP was
never called when we were starting a new qemu VM. On the other hand,
when we are reconnecting to the qemu process we reload the capability
list from the status XML file. This means that the flag preventing the
function being called was not set and thus we partially reprobed some of
the capabilities.

The recent addition of CPU hotplug clears the
QEMU_CAPS_QUERY_HOTPLUGGABLE_CPUS if the machine does not support it.
The partial re-probe on reconnect results into attempting to call the
unsupported command and then killing the VM.

Remove the partial reprobe and depend on the stored capabilities. If it
will be necessary to reprobe the capabilities in the future, we should
do a full reprobe rather than this partial one.
2016-11-28 10:02:36 +01:00
Jiri Denemark
a1adfb0f06 qemu: Add support for unavailable-features
QEMU 2.8.0 adds support for unavailable-features in
query-cpu-definitions reply. The unavailable-features array lists CPU
features which prevent a corresponding CPU model from being usable on
current host. It can only be used when all the unavailable features are
disabled. Empty array means the CPU model can be used without
modifications.

We can use unavailable-features for providing CPU model usability info
in domain capabilities XML:

    <domainCapabilities>
      ...
      <cpu>
        <mode name='host-passthrough' supported='yes'/>
        <mode name='host-model' supported='yes'>
          <model fallback='allow'>Skylake-Client</model>
          ...
        </mode>
        <mode name='custom' supported='yes'>
          <model usable='yes'>qemu64</model>
          <model usable='yes'>qemu32</model>
          <model usable='no'>phenom</model>
          <model usable='yes'>pentium3</model>
          <model usable='yes'>pentium2</model>
          <model usable='yes'>pentium</model>
          <model usable='yes'>n270</model>
          <model usable='yes'>kvm64</model>
          <model usable='yes'>kvm32</model>
          <model usable='yes'>coreduo</model>
          <model usable='yes'>core2duo</model>
          <model usable='no'>athlon</model>
          <model usable='yes'>Westmere</model>
          <model usable='yes'>Skylake-Client</model>
          ...
        </mode>
      </cpu>
      ...
    </domainCapabilities>

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-11-28 09:11:22 +01:00
Jiri Denemark
73411a7ff1 qemu: Avoid reporting "host" as a supported CPU model
"host" CPU model is supported by a special host-passthrough CPU mode and
users is not allowed to specify this model directly with custom mode.
Thus we should not advertise "host" CPU model in domain capabilities.
This worked well on architectures for which libvirt provides a list of
supported CPU models in cpu_map.xml (since "host" is not in the list).
But we need to explicitly filter "host" model out for all other
architectures.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-11-25 20:59:19 +01:00
Jiri Denemark
7bf6f345e0 qemu: Probe CPU models for KVM and TCG
CPU models (and especially some additional details which we will start
probing for later) differ depending on the accelerator. Thus we need to
call query-cpu-definitions in both KVM and TCG mode to get all data we
want.

Tests in tests/domaincapstest.c are temporarily switched to TCG to avoid
having to squash even more stuff into this single patch. They will all
be switched back later in separate commits.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-11-25 20:34:27 +01:00
Jiri Denemark
7c95619cb1 qemu: Introduce virQEMUCapsFormatCPUModels
This patch moves the CPU models formatting code from
virQEMUCapsFormatCache into a separate function.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-11-25 20:34:26 +01:00
Jiri Denemark
1bdcd7a4ee qemu: Introduce virQEMUCapsLoadCPUModels
This patch moves the CPU models parsing code from virQEMUCapsLoadCache
into a separate function.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-11-25 20:34:26 +01:00
Jiri Denemark
f9d57f2b57 qemu: Refresh caps in virQEMUCapsCacheLookupByArch
The function just returned cached capabilities without checking whether
they are still valid. We should check that and refresh the capabilities
to make sure we don't return stale data. In other words, we should do
what all other lookup functions do.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-11-25 20:34:26 +01:00
Jiri Denemark
72e5aa4e1e qemu: Refactor virQEMUCapsCacheLookup
The function is made a little bit more readable and the code which
refreshes cached capabilities if they are not valid any more was moved
into a separate function (virQEMUCapsCacheValidate) so that it can be
reused in other places.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-11-25 20:34:26 +01:00
Jiri Denemark
cd51b90fbf qemu: Don't return unusable virttype in domain capabilities
If a user asked for a KVM domain capabilities when KVM is not available,
we would happily return data we got when probing through TCG and
pretended they were relevant for KVM. Let's just report KVM is not
supported to avoid confusion.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-11-25 20:34:26 +01:00
Jiri Denemark
8f55eef246 qemu: Use saner defaults for domain capabilities
When domain capabilities were introduced we did not have enough data to
decide whether KVM works on the host or not and thus working legacy/VFIO
device assignment was used as a witness. Now that we know whether KVM
was enabled when probing QEMU capabilities (and thus we know it's
working), we can use this knowledge to provide better default value for
virttype.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-11-25 20:34:26 +01:00
Jiri Denemark
d87df9bd39 qemu: Discard caps cache when KVM availability changes
Since some may depend on the accelerator used when probing QEMU the
cache becomes invalid when KVM becomes available or if it is not
available anymore.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-11-25 20:34:26 +01:00
Jiri Denemark
25ba9c31f5 qemu: Enable KVM when probing capabilities
CPU related capabilities may differ depending on accelerator used when
probing. Let's use KVM if available when probing QEMU and fall back to
TCG. The created capabilities already contain all we need to distinguish
whether KVM or TCG was used:

    - KVM was used when probing capabilities:
        QEMU_CAPS_KVM is set
        QEMU_CAPS_ENABLE_KVM is not set

    - TCG was used and QEMU supports KVM, but it failed (e.g., missing
      kernel module or wrong /dev/kvm permissions)
        QEMU_CAPS_KVM is not set
        QEMU_CAPS_ENABLE_KVM is set

    - KVM was not used and QEMU does not support it
        QEMU_CAPS_KVM is not set
        QEMU_CAPS_ENABLE_KVM is not set

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-11-25 20:34:26 +01:00
Jiri Denemark
429a7b231c qemu: Probe KVM state earlier
Let's set QEMU_CAPS_KVM and QEMU_CAPS_ENABLE_KVM early so that the rest
of the probing code can use these capabilities to handle KVM/TCG replies
differently.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-11-25 20:34:26 +01:00
Jiri Denemark
e73447f693 qemu: Use -machine when probing capabilities via QMP
Using -machine instead of -M for QMP probing is safe because any QEMU
binary which is capable of QMP probing supports -machine.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-11-25 20:34:26 +01:00
Jiri Denemark
4c5d05ea8a qemu: Make QMP probing process reusable
The code that runs a new QEMU process to be used for probing
capabilities is separated into four reusable functions so that any code
that wants to probe a QEMU process may just follow a few simple steps:

    cmd = virQEMUCapsInitQMPCommandNew(...);
    virQEMUCapsInitQMPCommandRun(cmd);

    /* talk to the running QEMU process using its QMP monitor */

    if (reprobeIsRequired) {
        virQEMUCapsInitQMPCommandAbort(cmd, ...);
        virQEMUCapsInitQMPCommandRun(cmd);

        /* talk to the running QEMU process again */
    }

    virQEMUCapsInitQMPCommandFree(cmd);

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-11-25 20:34:26 +01:00
Maxim Nestratov
745263589f Revert "vz: fixed race in vzDomainAttach/DettachDevice"
This reverts commit 3a6cf6fc16.

Mistakenly this commit was pushed because I thought I missed the
corret one b880ff42dd while in fact I didn't.

Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
2016-11-25 17:26:55 +03:00
Michal Privoznik
c2a5a4e7ea virstring: Unify string list function names
We have couple of functions that operate over NULL terminated
lits of strings. However, our naming sucks:

virStringJoin
virStringFreeList
virStringFreeListCount
virStringArrayHasString
virStringGetFirstWithPrefix

We can do better:

virStringListJoin
virStringListFree
virStringListFreeCount
virStringListHasString
virStringListGetFirstWithPrefix

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-11-25 13:54:05 +01:00
Boris Fiuczynski
b178fa8ecb qemu: fix internal error: NUMA isn't available on this host
If libvirt is compiled without NUMACTL support starting libvirtd
reports a libvirt internal error "NUMA isn't available on this host"
without checking if NUMA support is compiled into the libvirt binaries.
This patch adds the missing NUMA support check to prevent the internal error.
It also includes a check if the cgroup controller cpuset is available before
using it.

The error was noticed when libvirtd was restarted with running domains and
on libvirtd start the qemuConnectCgroup gets called during qemuProcessReconnect.

Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
2016-11-25 09:48:41 +01:00
Eric Farman
ae5d30a0b3 conf: Wire up the vhost-scsi connection from/to XML
With the QEMU components in place, provide the XML parsing to
invoke that code when given the following XML snippet:

    <hostdev mode='subsystem' type='scsi_host'>
      <source protocol='vhost' wwpn='naa.501234567890abcd'/>
    </hostdev>

An optional address element can be specified within the hostdev
(pick CCW or PCI as necessary):

    <address type='ccw' cssid='0xfe' ssid='0x0' devno='0x0625'/>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>

Add basic vhost-scsi tests which were cloned from hostdev-scsi-virtio-scsi
in both xml2argv and xml2xml. Added ones for both vhost-scsi-ccw and
vhost-scsi-pci since the syntaxes are slightly different between them.

Also adjusted the docs to describe the changes.

Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
2016-11-24 12:22:25 -05:00
Eric Farman
81a206f52b security: Include vhost-scsi in security labels
Ensure that the vhost-scsi wwpn information is passed to the
different security policies.

Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
2016-11-24 12:16:26 -05:00
Eric Farman
8c6d365373 qemu: Allow hotplug of vhost-scsi device
Adjust the device string that is built for vhost-scsi devices so that it
can be invoked from hotplug.

From the QEMU command line, the file descriptors are expect to be numeric only.
However, for hotplug, the file descriptors are expected to begin with at least
one alphabetic character else this error occurs:

  # virsh attach-device guest_0001 ~/vhost.xml
  error: Failed to attach device from /root/vhost.xml
  error: internal error: unable to execute QEMU command 'getfd':
  Parameter 'fdname' expects a name not starting with a digit

We also close the file descriptor in this case, so that shutting down the
guest cleans up the host cgroup entries and allows future guests to use
vhost-scsi devices.  (Otherwise the guest will silently end.)

Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
2016-11-24 12:16:23 -05:00
Eric Farman
9cc26dc622 qemu: Add vhost-scsi string for -device parameter
Open /dev/vhost-scsi, and record the resulting file descriptor, so that
the guest has access to the host device outside of the libvirt daemon.
Pass this information, along with data parsed from the XML file, to build
a device string for the qemu command line.  That device string will be
for either a vhost-scsi-ccw device in the case of an s390 machine, or
vhost-scsi-pci for any others.

Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
2016-11-24 12:16:19 -05:00
Eric Farman
629544be0f util: Management routines for scsi_host devices
For a new hostdev type='scsi_host' we have a number of
required functions for managing, adding, and removing the
host device to/from guests.  Provide the basic infrastructure
for these tasks.

The name "SCSIVHost" (and its variants) is chosen to avoid
conflicts with existing code named "SCSIHost" to refer to
a hostdev type='scsi' protcol='none'.

Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
2016-11-24 12:15:26 -05:00
Eric Farman
fc0e627bac Introduce framework for a hostdev SCSI_host subsystem type
We already have a "scsi" hostdev subsys type, which refers to a single
LUN that is passed through to a guest.  But what of things where
multiple LUNs are passed through via a single SCSI HBA, such as with
the vhost-scsi target?  Create a new hostdev subsys type that will
carry this.

Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
2016-11-24 12:15:26 -05:00
Eric Farman
c271fc1f35 qemu: Introduce vhost-scsi capability
Do all the stuff for the vhost-scsi capability in QEMU,
so it's in place for our checks later.

Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
2016-11-24 12:15:26 -05:00
Roman Bogorodskiy
e4cb660160 genprotocol.pl: add darwin to fixup list 2016-11-24 17:17:43 +03:00
Dawid Zamirski
6358653596 vbox: get rid of g_pVBoxGlobalData
now that we have a new global vboxDriver object, remove the old
vboxGlobalData struct and all references to it.
2016-11-23 14:47:21 -05:00
Dawid Zamirski
04518c364b vbox: change how vbox API is initialized.
* add vboxDriver object to serve as a singleton global object that
  holds references to IVirtualBox and ISession to be shared among
  multiple connections. The vbox_driver is instantiated only once in
  the first call vboxGetDriverConnection function that is guarded by
  a mutex.

* call vbox API initialize only when the first connection is
  established, and likewise uninitialize when last connection
  disconnects. The prevents each subsequent connection from overwriting
  IVirtualBox/ISession instances of any other active connection that
  led to libvirtd segfaults. The virConnectOpen and virConnectClose
  implementations are guarded by mutex on the global vbox_driver_lock
  where the global vbox_driver object counts connectios and decides
  when it's safe to call vbox's init/uninit routines.

* add IVirutalBoxClient to vboxDriver and use it to in tandem with newer
  pfnClientInitialize/pfnClientUninitalize APIs for vbox versions that
  support it, to avoid usage of the old pfnComInitialize/Uninitialize.
2016-11-23 14:38:14 -05:00
Marc Hartmayer
b270ef9981 qemu: Removed an outdated comment in qemuDomainSaveImageStartVM()
Removed the comment 'Set the migration source' as it isn't valid anymore
and 'start it up' isn't useful as qemuProcessStart() is already a
speaking name.

Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
2016-11-23 12:33:38 -05:00
Bjoern Walk
fdb060f0b5 virutil: fix trailing '/' for path prefixes
The path prefixes for sysfs trees are always prepended by paths
beginning with a slash, making the trailing slash in the prefix
redundant.

Signed-off-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
2016-11-23 12:33:38 -05:00
Marc Hartmayer
7ab1fd91a4 virfile: Only generate a warning if there is something to report
Only generate a warning if there is something to report.

Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
2016-11-23 12:33:38 -05:00
Nitesh Konkar
d276da48bc Fix typos and grammar
Signed-off-by: Nitesh Konkar <nitkon12@linux.vnet.ibm.com>
2016-11-23 12:08:15 -05:00
Michal Privoznik
5d9c2c7081 qemu: Update cgroup on chardev hotplug
Just like in the previous commit, we are not updating CGroups on
chardev hot(un-)plug and thus leaving qemu unable to access any
non-default device users are trying to hotplug.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-11-23 16:38:02 +01:00
Michal Privoznik
085692c8bb qemu: Update cgroup on RNG hotplug
If users try to hotplug RNG device with a backend different to
/dev/random or /dev/urandom the whole operation fails as qemu is
unable to access the device. The problem is we don't update
device CGroups during the operation.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-11-23 16:37:57 +01:00
Nikolay Shirokovskiy
aaf2992d90 qemu: agent: fix unsafe agent access
qemuDomainObjExitAgent is unsafe.

First it accesses domain object without domain lock.
Second it uses outdated logic that goes back to commit 79533da1 of
year 2009 when code was quite different. (unref function
instead of unreferencing only unlocked and disposed object
in case of last reference and leaved unlocking to the caller otherwise).
Nowadays this logic may lead to disposing locked object
i guess.

Another problem is that the callers of qemuDomainObjEnterAgent
use domain object again (namely priv->agent) without domain lock.

This patch address these two problems.

qemuDomainGetAgent is dropped as unused.
2016-11-23 11:31:28 +03:00
Nikolay Shirokovskiy
3c1c56781d qemu: drop write-only agentStart 2016-11-23 11:31:14 +03:00
Nikolay Shirokovskiy
6ba861ae36 qemu: agent: cleanup agent error flag correctly
Sometimes after domain restart agent is unavailabe even
if it is up and running in guest. Diagnostic message is
"QEMU guest agent is not available due to an error"
that is 'priv->agentError' is set. Investiagion shows that
'priv->agent' is not NULL, so error flag is set probably
during domain shutdown process and not cleaned up eventually.

The patch is quite simple - just clean up error flag unconditionally
upon domain stop.

Other hunks address other cases when error flag is not cleaned up.

1. processSerialChangedEvent. We need to clean error flag
unconditionally here too. For example if upon first 'connected' event we
fail to connect and set error flag and then connect on second
'connected' event then error flag will remain set erroneously
and make agent unavailable.

2. qemuProcessHandleAgentEOF. If error flag is set and we get
EOF we need to change state (and diagnostic) from 'error' to
'not connected'.
2016-11-23 11:14:44 +03:00
Nikolay Shirokovskiy
f5109f20ff qemu: agent: remove redundant check 2016-11-23 11:14:28 +03:00
Nikolay Shirokovskiy
851ae08e3e qemu: agent: handle agent connection errors in one place
qemuConnectAgent return -1 or -2 in case of different errors.
A. -1 is a case of unsuccessuful connection to guest agent.
B. -2 is a case of destoyed domain during connection attempt.

All qemuConnectAgent callers handle the first error the same way
so let's move this logic into qemuConnectAgent itself. Patched
function returns 0 in case A and -1 in case B.
2016-11-23 11:14:11 +03:00
Nikolay Shirokovskiy
01079727fe libvirtd: systemd: add special target for system shutdown
It is already discussed in "[RFC] daemon: remove hardcode dep on libvirt-guests" [1].

Mgmt can use means to save/restore domains on system shutdown/boot other than
libvirt-guests.service. Thus we need to specify appropriate ordering dependency between
libvirtd, domains and save/restore service. This patch takes approach suggested
in RFC and introduces a systemd target, so that ordering can be built next way:

libvirtd -> domain -> virt-guest-shutdown.target -> save-restore.service.

This way domains are decoupled from specific shutdown service via intermediate
target.

[1] https://www.redhat.com/archives/libvir-list/2016-September/msg01353.html
2016-11-23 11:13:53 +03:00
Marc Hartmayer
1c122e737e Refactoring: Use virHostdevIsSCSIDevice()
Use the util function virHostdevIsSCSIDevice() to simplify if
statements.

Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
2016-11-22 14:37:36 +01:00
Marc Hartmayer
20bf8ea693 util: Add virHostdevIsSCSIDevice()
Add the function virHostdevIsSCSIDevice() which detects whether a
hostdev is a SCSI device or not.

Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
2016-11-22 14:37:36 +01:00
Marc Hartmayer
505bc9b025 qemu: Fix improper union member access on hostdevs
Add missing checks if a hostdev is a subsystem/SCSI device before access
the union member 'subsys'/'scsi'.  Also fix indentation and simplify
qemuDomainObjCheckHostdevTaint().

Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
2016-11-22 14:37:36 +01:00
Sławek Kapłoński
ae381879f3 Forbid new-line char in name of new storagepool
New line character in name of storagepool is now forbidden because it
mess virsh output and can be confusing for users.
Validation of name is done in driver, after parsing XML to avoid
problems with dissappeared pools which was already created with
new-line char in name.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-11-22 14:36:47 +01:00
Sławek Kapłoński
6c98ac2c62 Forbid new-line char in name of new domain
New line character in name of domain is now forbidden because it
mess virsh output and can be confusing for users.
Validation of name is done in drivers, after parsing XML to avoid
problems with dissappeared domains which was already created with
new-line char in name.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-11-22 14:35:14 +01:00
Peter Krempa
b6afa9a8b5 qemu: monitor: Properly propagate the 'qemu_id' field through the matcher
Commit 3f71c79768 added 'qemu_id' field to track the id of the cpu
as reported by query-cpus. The patch did not include changes necessary
to propagate the id through the functions matching the data to the
libvirt cpu structures and thus all vcpus had id 0.
2016-11-22 10:44:17 +01:00
Roman Bogorodskiy
0b4c3bd307 bhyve: cleanup bhyveBuildNetArgStr error handling
Use 'goto cleanup'-style error handling instead of explicitly
freeing variables in every error path.
2016-11-21 20:17:41 +03:00
Peter Krempa
0df2524acb qemu: domain: Refresh vcpu halted state using qemuMonitorGetCpuHalted
Don't use qemuMonitorGetCPUInfo which does a lot of matching to get the
full picture which is not necessary and would be mostly discarded.

Refresh only the vcpu halted state using data from query-cpus.
2016-11-21 17:19:48 +01:00
Peter Krempa
5d885f4ff3 qemu: monitor: Extract halted state to a bitmap indexed by cpu id
We don't need to call qemuMonitorGetCPUInfo which is very inefficient to
get data required to update the vcpu 'halted' state.

Add a monitor helper that will retrieve the halted state and return it
in a bitmap so that it can be indexed easily.
2016-11-21 17:19:48 +01:00
Peter Krempa
3f71c79768 qemu: monitor: Extract qemu cpu id along with other data
Storing of the ID will allow simpler extraction of data present only in
query-cpus without the need to call qemuMonitorGetCPUInfo in statistics
paths.
2016-11-21 17:19:48 +01:00
Jiri Denemark
2e0d6cdec4 qemu_monitor_json: Don't check existence of "return" object
Whenever qemuMonitorJSONCheckError returns 0, the "return" object is
guaranteed to exist. Thus virJSONValueObjectGetObject will never fail to
get it. On the other hand, virJSONValueObjectGetArray may fail since the
"return" object may not be an array.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-11-21 16:14:52 +01:00
Peter Krempa
4fa7ba0b32 qemu: process: Set current vcpu count to maximum if it was not specified
Mimic qemu's behavior on the given command line.
2016-11-21 14:35:20 +01:00
Peter Krempa
d3734b7a1d qemu: parse: Assign maximum cpu count from topology if not provided
qemu uses this if 'maxcpus' is not present. Do the same in the parsing
code.
2016-11-21 14:35:20 +01:00
Peter Krempa
0d9a76de6d qemu: parse: Assign topology info earlier
Qemu can also use the topology to calculate the total vcpu count. To
allow parsing this move the assignment earlier.
2016-11-21 14:35:20 +01:00
Peter Krempa
d78a8c26c2 qemu: parse: Allow the 'cpus=' prefix for current cpu number
qemu allows following syntax:

  -smp [cpus=]n[,cores=cores][,threads=threads][,sockets=sockets][,maxcpus=maxcpus]

Allow the "cpus" prefix.
2016-11-21 14:35:20 +01:00
Peter Krempa
4d72d80665 qemu: parse: Validate that the VM has at least one cpu
Libvirt's code relies on this fact so don't allow parsing a command line
which would have none.

Libvirtd would crash in the post parse callback on such config.
2016-11-21 14:35:20 +01:00
Michal Privoznik
0c1bfd2c8d tests: Adapt to gluster_debug_level in qemu.conf
After a944bd92 we gained support for setting gluster debug level.
However, due to a space we haven't tested whether augeas file
actually works.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-11-21 10:50:48 +01:00
Martin Kletzander
8158a19fd7 util: Print pid_t as long long
After commit f2bf5fbb04, MinGW strikes again.  Simply print pid as any
other place after commit b7d2d4af2b.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2016-11-20 21:46:21 +01:00
Martin Kletzander
f2bf5fbb04 Fix scheduler support check
Commit 94cc577807 tried fixing build on systems that did not have
SCHED_BATCH or SCHED_IDLE defined.  But instead of changing it to
conditional support, it rather completely disabled the support for
setting any scheduler.  Since then, such old systems are not
supported, but rather than reverting that commit, let's change that to
the conditional support.  That way any addition to the list of
schedulers can follow the same style so that we're consistent in the
future.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2016-11-18 16:08:52 +01:00
John Ferlan
135e77d32f fs: Add proper switch to create filesystem with overwrite
https://bugzilla.redhat.com/show_bug.cgi?id=1366460

When using the --overwrite switch on a pool-build or pool-create, the
The mkfs.ext{2|3|4} commands use mke2fs which requires using the '-F' switch
in order to force overwriting the current filesystem on the whole disk.

Likewise, the mkfs.vfat command uses mkfs.fat which requires using the '-I'
switch in order to force overwriting the current filesystem on the whole disk.
2016-11-16 06:52:35 -05:00
Jiri Denemark
29e17f65a8 Avoid compiler warnings in virCPUDefStealModel
Old GCC on CentOS 6 thinks vendor and vendor_id might be used
uninitialized in virCPUDefStealModel. The compiler is wrong, though.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-11-16 09:04:51 +01:00
Pino Toscano
22eaee8e01 remote: expose a new libssh transport
Implement in virtNetClient and VirNetSocket the needed functions to
expose a new libssh transport, providing all the options that the
libssh2 transport supports.
2016-11-15 15:50:51 +01:00
Pino Toscano
6917467c2b libssh_transport: add new libssh-based transport
Implement a new libssh transport, which uses libssh to communicate with
remote hosts, and add all the build system stuff (search of libssh,
private symbols, etc) to built it.

This new transport supports all the common ssh authentication methods,
making use of libvirt's auth callbacks for interaction with the user.
2016-11-15 15:50:51 +01:00
Pino Toscano
24ee5dc907 virnetsocket: improve search for default SSH key
Add a couple of helper functions to check whether one of the default
names of SSH keys (as documented in ssh-keygen(1)) exists, and use them
to specify a key for the libssh2 transport if none was passed.
2016-11-15 15:50:51 +01:00
Pino Toscano
f0e7f90bff virerror: add error for libssh transport
Add a new error domain and number for a new libssh-based transport.
2016-11-15 15:50:51 +01:00
Pino Toscano
0e9fec979d virNetSocket: allow to not close FD
Add an internal variable to mark the FD as "not owned" by the
virNetSocket, in case the internal implementation takes the actual
ownership of the descriptor; this avoids a warning when closing the
socket, as the FD would be invalid.
2016-11-15 15:50:51 +01:00
Jiri Denemark
03fa904c0c cpu: Drop cpuGuestData
The API is not used anywhere in the code.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-11-15 15:49:16 +01:00
Jiri Denemark
98b7c37d37 cpu: Avoid adding <vendor> to custom CPUs
Guest CPU definitions with mode='custom' and missing <vendor> are
expected to run on a host CPU from any vendor as long as the required
CPU model can be used as a guest CPU on the host. But even though no CPU
vendor was explicitly requested we would sometimes force it due to a bug
in virCPUUpdate and virCPUTranslate.

The bug would effectively forbid cross vendor migrations even if they
were previously working just fine.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-11-15 15:49:16 +01:00
Jiri Denemark
d73422c186 cpu: Introduce virCPUConvertLegacy API
PPC driver needs to convert POWERx_v* legacy CPU model names into POWERx
to maintain backward compatibility with existing domains. This patch
adds a new step into the guest CPU configuration work flow which CPU
drivers can use to convert legacy CPU definitions.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-11-15 15:49:16 +01:00
Jiri Denemark
2a2ce08a6d cpu: Make models array in virCPUTranslate constant
The API doesn't change the array so let's make it constant.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-11-15 15:49:16 +01:00
Jiri Denemark
53a5986ad6 cpu: Rename cpuDataFormat
The new name is virCPUDataFormat.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-11-15 15:49:15 +01:00
Jiri Denemark
be57e68954 cpu: Rename cpuDataParse
The new name is virCPUDataParse.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-11-15 15:49:15 +01:00
Jiri Denemark
938ec1620a cpu: Rename and document cpuModelIsAllowed
The new name is virCPUModelIsAllowed.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-11-15 15:49:15 +01:00
Jiri Denemark
b7011dfe44 cpu: Rename cpuGetModels
The new name is virCPUGetModels.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-11-15 15:49:15 +01:00
Maxim Nestratov
007fb4388f qemu: fix libvirtd crash when querying halted cpus info
It was introduced by commit 7a51d9ebb, which started to use
monitor commands without job acquiring, which is unsafe and leads
to simultaneous access to vm->mon structure by different threads.

Crash backtrace is the following (shortened):

Program received signal SIGSEGV, Segmentation fault.
qemuMonitorSend (mon=mon@entry=0x7f4ef4000d20, msg=msg@entry=0x7f4f18e78640) at qemu/qemu_monitor.c:1011
1011        while (!mon->msg->finished) {

0  qemuMonitorSend () at qemu/qemu_monitor.c:1011
1  0x00007f691abdc720 in qemuMonitorJSONCommandWithFd () at qemu/qemu_monitor_json.c:298
2  0x00007f691abde64a in qemuMonitorJSONCommand at qemu/qemu_monitor_json.c:328
3  qemuMonitorJSONQueryCPUs at qemu/qemu_monitor_json.c:1408
4  0x00007f691abcaebd in qemuMonitorGetCPUInfo g@entry=false) at qemu/qemu_monitor.c:1931
5  0x00007f691ab96863 in qemuDomainRefreshVcpuHalted at qemu/qemu_domain.c:6309
6  0x00007f691ac0af99 in qemuDomainGetStatsVcpu at qemu/qemu_driver.c:18945
7  0x00007f691abef921 in qemuDomainGetStats  at qemu/qemu_driver.c:19469
8  qemuConnectGetAllDomainStats at qemu/qemu_driver.c:19559
9  0x00007f693382e806 in virConnectGetAllDomainStats at libvirt-domain.c:11546
10 0x00007f6934470c40 in remoteDispatchConnectGetAllDomainStats at remote.c:6267

(gdb) p mon->msg
$1 = (qemuMonitorMessagePtr) 0x0

This change fixes it by calling qemuDomainRefreshVcpuHalted only when job is acquired.

Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
2016-11-15 17:39:24 +03:00
Laine Stump
70d15c9ac6 qemu: initially reserve one open pcie-root-port for hotplug
For machinetypes with a pci-root bus (all legacy PCI), libvirt will
make a "fake" reservation for one extra slot prior to assigning
addresses to unaddressed PCI endpoint devices in the domain. This will
trigger auto-adding of a pci-bridge for the final device to be
assigned an address *if that device would have otherwise instead been
the last device on the last available pci-bridge*; thus it assures
that there will always be at least one slot left open in the domain's
bus topology for expansion (which is important both for hotplug (since
a new pci-bridge can't be added while the guest is running) as well as
for offline additions to the config (since adding a new device might
otherwise in some cases require re-addressing existing devices, which
we want to avoid)).

It's important to note that for the above case (legacy PCI), we must
check for the special case of all slots on all buses being occupied
*prior to assigning any addresses*, and avoid attempting to reserve
the extra address in that case, because there is no free address in
the existing topology, so no place to auto-add a pci-bridge for
expansion (i.e. it would always fail anyway). Since that condition can
only be reached by manual intervention, this is acceptable.

For machinetypes with pcie-root (Q35, aarch64 virt), libvirt's
methodology for automatically expanding the bus topology is different
- pcie-root-ports are plugged into slots (soon to be functions) of
pcie-root as needed, and the new endpoint devices are assigned to the
single slot in each pcie-root-port. This is done so that the devices
are, by default, hotpluggable (the slots of pcie-root don't support
hotplug, but the single slot of the pcie-root-port does). Since
pcie-root-ports can only be plugged into pcie-root, and we don't
auto-assign endpoint devices to the pcie-root slots, this means
topology expansion doesn't compete with endpoint devices for slots, so
we don't need to worry about checking for all "useful" slots being
free *prior* to assigning addresses to new endpoint devices - as a
matter of fact, if we attempt to reserve the open slots before the
used slots, it can lead to errors.

Instead this patch just reserves one slot for a "future potential"
PCIe device after doing the assignment for actual devices, but only
if the only PCI controller defined prior to starting address
assignment was pcie-root, and only if we auto-added at least one PCI
controller during address assignment. This assures two things:

1) that reserving the open slots will only be done when the domain is
   initially defined, never at any time after, and

2) that if the user understands enough about PCI controllers that they
   are adding them manually, that we don't mess up their plan by
   adding extras - if they know enough to add one pcie-root-port, or
   to manually assign addresses such that no pcie-root-ports are
   needed, they know enough to add extra pcie-root-ports if they want
   them (this could be called the "libguestfs clause", since
   libguestfs needs to be able to create domains with as few
   devices/controllers as possible).

This is set to reserve a single free port for now, but could be
increased in the future if public sentiment goes in that direction
(it's easy to increase later, but essentially impossible to decrease)
2016-11-14 14:23:48 -05:00
Laine Stump
8d873a5a47 qemu: try to put ich9 sound device at 00:1B.0
Real Q35 hardware has an ICH9 chip that includes several integrated
devices at particular addresses (see the file docs/q35-chipset.cfg in
the qemu source). libvirt already attempts to put the first two sets
of ich9 USB2 controllers it finds at 00:1D.* and 00:1A.* to match the
real hardware. This patch does the same for the ich9 "HD audio"
device.

The main inspiration for this patch is that currently the *only*
device in a reasonable "workstation" type virtual machine config that
requires a legacy PCI slot is the audio device, Without this patch,
the standard Q35 machine created by virt-manager will have a
dmi-to-pci-bridge and a pci-bridge just for the sound device; with the
patch (and if you change the sound device model from the default
"ich6" to "ich9"), the machine definition constructed by virt-manager
has absolutely no legacy PCI controllers - any legacy PCI devices
(e.g. video and sound) are on pcie-root as integrated devices.
2016-11-14 14:23:01 -05:00
Laine Stump
d8bd837669 qemu: add a USB3 controller to Q35 domains by default
Previously we added a set of EHCI+UHCI controllers to Q35 machines to
mimic real hardware as closely as possible, but recent discussions
have pointed out that the nec-usb-xhci (USB3) controller is much more
virtualization-friendly (uses less CPU), so this patch switches the
default for Q35 machinetypes to add an XHCI instead (if it's
supported, which it of course *will* be).

Since none of the existing test cases left out USB controllers in the
input XML, a new Q35 test case was added which has *no* devices, so
ends up with only the defaults always put in by qemu, plus those added
by libvirt.
2016-11-14 14:22:23 -05:00
Laine Stump
807232203a qemu: don't force-add a dmi-to-pci-bridge just on principle
Now the a dmi-to-pci-bridge is automatically added just as it's needed
(when a pci-bridge is being added), we no longer have any need to
force-add one to every single Q35 domain.
2016-11-14 14:21:43 -05:00
Laine Stump
0702f48ef4 qemu: auto-add pcie-root-port/dmi-to-pci-bridge controllers as needed
Previously libvirt would only add pci-bridge devices automatically
when an address was requested for a device that required a legacy PCI
slot and none was available. This patch expands that support to
dmi-to-pci-bridge (which is needed in order to add a pci-bridge on a
machine with a pcie-root), and pcie-root-port (which is needed to add
a hotpluggable PCIe device). It does *not* automatically add
pcie-switch-upstream-ports or pcie-switch-downstream-ports (and
currently there are no plans for that).

Given the existing code to auto-add pci-bridge devices, automatically
adding pcie-root-ports is fairly straightforward. The
dmi-to-pci-bridge support is a bit tricky though, for a few reasons:

1) Although the only reason to add a dmi-to-pci-bridge is so that
   there is a reasonable place to plug in a pci-bridge controller,
   most of the time it's not the presence of a pci-bridge *in the
   config* that triggers the requirement to add a dmi-to-pci-bridge.
   Rather, it is the presence of a legacy-PCI device in the config,
   which triggers auto-add of a pci-bridge, which triggers auto-add of
   a dmi-to-pci-bridge (this is handled in
   virDomainPCIAddressSetGrow() - if there's a request to add a
   pci-bridge we'll check if there is a suitable bus to plug it into;
   if not, we first add a dmi-to-pci-bridge).

2) Once there is already a single dmi-to-pci-bridge on the system,
   there won't be a need for any more, even if it's full, as long as
   there is a pci-bridge with an open slot - you can also plug
   pci-bridges into existing pci-bridges. So we have to make sure we
   don't add a dmi-to-pci-bridge unless there aren't any
   dmi-to-pci-bridges *or* any pci-bridges.

3) Although it is strongly discouraged, it is legal for a pci-bridge
   to be directly plugged into pcie-root, and we don't want to
   auto-add a dmi-to-pci-bridge if there is already a pci-bridge
   that's been forced directly into pcie-root.

Although libvirt will now automatically create a dmi-to-pci-bridge
when it's needed, the code still remains for now that forces a
dmi-to-pci-bridge on all domains with pcie-root (in
qemuDomainDefAddDefaultDevices()). That will be removed in a future
patch.

For now, the pcie-root-ports are added one to a slot, which is a bit
wasteful and means it will fail after 31 total PCIe devices (30 if
there are also some PCI devices), but helps keep the changeset down
for this patch. A future patch will have 8 pcie-root-ports sharing the
functions on a single slot.
2016-11-14 14:19:36 -05:00
Laine Stump
b2c887844f qemu: only force an available legacy-PCI slot on domains with pci-root
Andrea had the right idea when he disabled the "reserve an extra
unused slot" bit for aarch64/virt. For *any* PCI Express-based
machine, it is pointless since 1) an extra legacy-PCI slot can't be
used for hotplug, since hotplug into legacy PCI slots doesn't work on
PCI Express machinetypes, and 2) even for "coldplug" expansion,
everybody will want to expand using Express controllers, not legacy
PCI.

This patch eliminates the extra slot reserve unless the system has a
pci-root (i.e. legacy PCI)
2016-11-14 14:18:49 -05:00
Laine Stump
5266426b21 qemu: assign nec-xhci (USB3) controller to a PCIe address when appropriate
The nec-usb-xhci device (which is a USB3 controller) has always
presented itself as a PCI device when plugged into a legacy PCI slot,
and a PCIe device when plugged into a PCIe slot, but libvirt has
always auto-assigned it to a legacy PCI slot.

This patch changes that behavior to auto-assign to a PCIe slot on
systems that have pcie-root (e.g. Q35 and aarch64/virt).

Since we don't yet auto-create pcie-*-port controllers on demand, this
means a config with an nec-xhci USB controller that has no PCI address
assigned will also need to have an otherwise-unused pcie-*-port
controller specified:

   <controller type='pci' model='pcie-root-port'/>
   <controller type='usb' model='nec-xhci'/>

(this assumes there is an otherwise-unused slot on pcie-root to accept
the pcie-root-port)
2016-11-14 14:18:06 -05:00
Laine Stump
9dfe733e99 qemu: assign e1000e network devices to PCIe slots when appropriate
The e1000e is an emulated network device based on the Intel 82574,
present in qemu 2.7.0 and later. Among other differences from the
e1000, it presents itself as a PCIe device rather than legacy PCI. In
order to get it assigned to a PCIe controller, this patch updates the
flags setting for network devices when the model name is "e1000e".

(Note that for some reason libvirt has never validated the network
device model names other than to check that there are no dangerous
characters in them. That should probably change, but is the subject of
another patch.)

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1343094
2016-11-14 14:17:14 -05:00
Laine Stump
c7fc151eec qemu: assign virtio devices to PCIe slot when appropriate
libvirt previously assigned nearly all devices to a "hotpluggable"
legacy PCI slot even on machines with a PCIe root bus (and even though
most such machines don't even support hotplug on legacy PCI slots!)
Forcing all devices onto legacy PCI slots means that the domain will
need a dmi-to-pci-bridge (to convert from PCIe to legacy PCI) and a
pci-bridge (to provide hotpluggable legacy PCI slots which, again,
usually aren't hotpluggable anyway).

To help reduce the need for these legacy controllers, this patch tries
to assign virtio-1.0-capable devices to PCIe slots whenever possible,
by setting appropriate connectFlags in
virDomainCalculateDevicePCIConnectFlags(). Happily, when that function
was written (just a few commits ago) it was created with a
"virtioFlags" argument, set by both of its callers, which is the
proper connectFlags to set for any virtio-*-pci device - depending on
the arch/machinetype of the domain, and whether or not the qemu binary
supports virtio-1.0, that flag will have either been set to PCI or
PCIe. This patch merely enables the functionality by setting the flags
for the device to whatever is in virtioFlags if the device is a
virtio-*-pci device.

NB: the first virtio video device will be placed directly on bus 0
slot 1 rather than on a pcie-root-port due to the override for primary
video devices in qemuDomainValidateDevicePCISlotsQ35(). Whether or not
to change that is a topic of discussion, but this patch doesn't change
that particular behavior.

NB2: since the slot must be hotpluggable, and pcie-root (the PCIe root
complex) does *not* support hotplug, this means that suitable
controllers must also be in the config (i.e. either pcie-root-port, or
pcie-downstream-port). For now, libvirt doesn't add those
automatically, so if you put virtio devices in a config for a qemu
that has PCIe-capable virtio devices, you'll need to add extra
pcie-root-ports yourself. That requirement will be eliminated in a
future patch, but for now, it's simple to do this:

   <controller type='pci' model='pcie-root-port'/>
   <controller type='pci' model='pcie-root-port'/>
   <controller type='pci' model='pcie-root-port'/>
   ...

Partially Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1330024
2016-11-14 14:16:12 -05:00
Laine Stump
b27375a9b8 qemu: set pciConnectFlags to 0 instead of PCI|HOTPLUGGABLE if device isn't PCI
This patch cleans up the connect flags for certain types/models of
devices that aren't PCI to return 0. In the future that may be used as
an indicator to the caller about whether or not a device needs a PCI
address. For now it's just ignored, except for in
virDomainPCIAddressEnsureAddr() - called during device hotplug - (and
in some cases actually needs to be re-set to PCI|HOTPLUGGABLE just in
case someone (in some old config) has manually set a PCI address for a
device that isn't PCI.
2016-11-14 14:14:38 -05:00