Commit Graph

773 Commits

Author SHA1 Message Date
Erik Skultety
c8e6775f30 qemu: Bump the memory locking limit for mdevs as well
Since mdevs are just another type of VFIO devices, we should increase
the memory locking limit the same way we do for VFIO PCI devices.

Signed-off-by: Erik Skultety <eskultet@redhat.com>
2017-03-27 15:39:35 +02:00
Erik Skultety
de4e8bdbc7 qemu: cgroup: Adjust cgroups' logic to allow mediated devices
As goes for all the other hostdev device types, grant the qemu process
access to /dev/vfio/<mediated_device_iommu_group>.

Signed-off-by: Erik Skultety <eskultet@redhat.com>
2017-03-27 15:39:35 +02:00
Erik Skultety
ec783d7c77 conf: Introduce new hostdev device type mdev
A mediated device will be identified by a UUID (with 'model' now being
a mandatory <hostdev> attribute to represent the mediated device API) of
the user pre-created mediated device. We also need to make sure that if
user explicitly provides a guest address for a mdev device, the address
type will be matching the device API supported on that specific mediated
device and error out with an incorrect XML message.

The resulting device XML:
<devices>
  <hostdev mode='subsystem' type='mdev' model='vfio-pci'>
    <source>
      <address uuid='c2177883-f1bb-47f0-914d-32a22e3a8804'>
    </source>
  </hostdev>
</devices>

Signed-off-by: Erik Skultety <eskultet@redhat.com>
2017-03-27 15:39:35 +02:00
Peter Krempa
9b93c4c264 qemu: domain: Add helper to look up disk soruce by the backing store string 2017-03-27 10:18:16 +02:00
Peter Krempa
4e1618ce72 qemu: domain: Add helper to generate indexed backing store names
The code is currently simple, but if we later add node names, it will be
necessary to generate the names based on the node name. Add a helper so
that there's a central point to fix once we add self-generated node
names.
2017-03-27 09:29:57 +02:00
Peter Krempa
1a5e2a8098 qemu: domain: Add helper to lookup disk by node name
Looks up a disk and its corresponding backing chain element by node
name.
2017-03-27 09:29:57 +02:00
John Ferlan
1a6b6d9a56 qemu: Set up the migration TLS objects for target
If the migration flags indicate this migration will be using TLS,
then set up the destination during the prepare phase once the target
domain has been started to add the TLS objects to perform the migration.

This will create at least an "-object tls-creds-x509,endpoint=server,..."
for TLS credentials and potentially an "-object secret,..." to handle the
passphrase response to access the TLS credentials. The alias/id used for
the TLS objects will contain "libvirt_migrate".

Once the objects are created, the code will set the "tls-creds" and
"tls-hostname" migration parameters to signify usage of TLS.

During the Finish phase we'll be sure to attempt to clear the
migration parameters and delete those objects (whether or not they
were created). We'll also perform the same reset during recovery
if we've reached FINISH3.

If the migration isn't using TLS, then be sure to check if the
migration parameters exist and clear them if so.
2017-03-25 08:19:49 -04:00
Jiri Denemark
fcd56ce866 qemu: Set default values for CPU check attribute
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2017-03-17 11:50:48 +01:00
Michal Privoznik
7b89f857d9 qemu: Namespaces for NVDIMM
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-03-15 17:04:33 +01:00
Michal Privoznik
1bc173199e qemu: Implement NVDIMM
So, majority of the code is just ready as-is. Well, with one
slight change: differentiate between dimm and nvdimm in places
like device alias generation, generating the command line and so
on.

Speaking of the command line, we also need to append 'nvdimm=on'
to the '-machine' argument so that the nvdimm feature is
advertised in the ACPI tables properly.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-03-15 14:16:32 +01:00
Michal Privoznik
b4e8a49f8d Introduce NVDIMM memory model
NVDIMM is new type of memory introduced into QEMU 2.6. The idea
is that we have a Non-Volatile memory module that keeps the data
persistent across domain reboots.

At the domain XML level, we already have some representation of
'dimm' modules. Long story short, NVDIMM will utilize the
existing <memory/> element that lives under <devices/> by adding
a new attribute 'nvdimm' to the existing @model and introduce a
new <path/> element for <source/> while reusing other fields. The
resulting XML would appear as:

    <memory model='nvdimm'>
      <source>
        <path>/tmp/nvdimm</path>
      </source>
      <target>
        <size unit='KiB'>523264</size>
        <node>0</node>
      </target>
      <address type='dimm' slot='0'/>
    </memory>

So far, this is just a XML parser/formatter extension. QEMU
driver implementation is in the next commit.

For more info on NVDIMM visit the following web page:

    http://pmem.io/

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-03-15 13:30:58 +01:00
Michal Privoznik
290a00e41d qemuDomainBuildNamespace: Handle file mount points
https://bugzilla.redhat.com/show_bug.cgi?id=1431112

Yeah, that's right. A mount point doesn't have to be a directory.
It can be a file too. However, the code that tries to preserve
mount points under /dev for new namespace for qemu does not count
with that option.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-03-13 13:32:45 +01:00
Michal Privoznik
e915942b05 qemuProcessHandleMonitorEOF: Disable namespace for domain
https://bugzilla.redhat.com/show_bug.cgi?id=1430634

If a qemu process has died, we get EOF on its monitor. At this
point, since qemu process was the only one running in the
namespace kernel has already cleaned the namespace up. Any
attempt of ours to enter it has to fail.

This really happened in the bug linked above. We've tried to
attach a disk to qemu and while we were in the monitor talking to
qemu it just died. Therefore our code tried to do some roll back
(e.g. deny the device in cgroups again, restore labels, etc.).
However, during the roll back (esp. when restoring labels) we
still thought that domain has a namespace. So we used secdriver's
transactions. This failed as there is no namespace to enter.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-03-10 16:02:34 +01:00
Pavel Hrdina
cd4a8b9304 conf: store "autoGenerated" for graphics listen in status XML
When libvirtd is started we call qemuDomainRecheckInternalPaths
to detect whether a domain has VNC socket path generated by libvirt
based on option from qemu.conf.  However if we are parsing status XML
for running domain the existing socket path can be generated also if
the config XML uses the new <listen type='socket'/> element without
specifying any socket.

The current code doesn't make difference how the socket was generated
and always marks it as "fromConfig".  We need to store the
"autoGenerated" value in the status XML in order to preserve that
information.

The difference between "fromConfig" and "autoGenerated" is important
for migration, because if the socket is based on "fromConfig" we don't
print it into the migratable XML and we assume that user has properly
configured qemu.conf on both hosts.  However if the socket is based
on "autoGenerated" it means that a new feature was used and therefore
we need to leave the socket in migratable XML to make sure that if
this feature is not supported on destination the migration will fail.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2017-03-09 10:22:43 +01:00
John Ferlan
b2e5de96c7 qemu: Rename variable
Rename 'secretUsageType' to 'usageType' since it's superfluous in an
API qemu*Secret*
2017-03-08 14:37:05 -05:00
John Ferlan
7c2b7891cc qemu: Introduce qemuDomainSecretInfoTLSNew
Building upon the qemuDomainSecretInfoNew, create a helper which will
build the secret used for TLS.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2017-03-08 14:31:09 -05:00
John Ferlan
c9a7b7b6ea qemu: Introduce qemuDomainSecretInfoNew
Create a helper which will create the secinfo used for disks, hostdevs,
and chardevs.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2017-03-08 14:31:07 -05:00
Pavel Hrdina
3ffea19acd qemu_domain: cleanup the controller post parse code
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2017-03-07 16:50:35 +01:00
Pavel Hrdina
57404ff7a7 qemu_domain: move controller post parse code into its own function
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2017-03-07 16:50:34 +01:00
Michal Privoznik
4da534c0b9 qemu: Enforce qemuSecurity wrappers
Now that we have some qemuSecurity wrappers over
virSecurityManager APIs, lets make sure everybody sticks with
them. We have them for a reason and calling virSecurityManager
API directly instead of wrapper may lead into accidentally
labelling a file on the host instead of namespace.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-03-06 08:54:28 +01:00
Marc Hartmayer
e22de286b1 qemu: Fix deadlock across fork() in QEMU driver
The functions in virCommand() after fork() must be careful with regard
to accessing any mutexes that may have been locked by other threads in
the parent process. It is possible that another thread in the parent
process holds the lock for the virQEMUDriver while fork() is called.
This leads to a deadlock in the child process when
'virQEMUDriverGetConfig(driver)' is called and therefore the handshake
never completes between the child and the parent process. Ultimately
the virDomainObjectPtr will never be unlocked.

It gets much worse if the other thread of the parent process, that
holds the lock for the virQEMUDriver, tries to lock the already locked
virDomainObject. This leads to a completely unresponsive libvirtd.

It's possible to reproduce this case with calling 'virsh start XXX'
and 'virsh managedsave XXX' in a tight loop for multiple domains.

This commit fixes the deadlock in the same way as it is described in
commit 61b52d2e38.

Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
2017-02-21 15:47:32 +01:00
Michal Privoznik
5c74cf1f44 qemu: Allow @rendernode for virgl domains
When enabling virgl, qemu opens /dev/dri/render*. So far, we are
not allowing that in devices CGroup nor creating the file in
domain's namespace and thus requiring users to set the paths in
qemu.conf. This, however, is suboptimal as it allows access to
ALL qemu processes even those which don't have virgl configured.
Now that we have a way to specify render node that qemu will use
we can be more cautious and enable just that.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-20 10:44:22 +01:00
Michal Privoznik
1bb787fdc9 qemuDomainGetHostdevPath: Report /dev/vfio/vfio less frequently
So far, qemuDomainGetHostdevPath has no knowledge of the reasong
it is called and thus reports /dev/vfio/vfio for every VFIO
backed device. This is suboptimal, as we want it to:

a) report /dev/vfio/vfio on every addition or domain startup
b) report /dev/vfio/vfio only on last VFIO device being unplugged

If a domain is being stopped then namespace and CGroup die with
it so no need to worry about that. I mean, even when a domain
that's exiting has more than one VFIO devices assigned to it,
this function does not clean /dev/vfio/vfio in CGroup nor in the
namespace. But that doesn't matter.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-02-20 07:21:59 +01:00
Michal Privoznik
b8e659aa98 qemuDomainGetHostdevPath: Create /dev/vfio/vfio iff needed
So far, we are allowing /dev/vfio/vfio in the devices cgroup
unconditionally (and creating it in the namespace too). Even if
domain has no hostdev assignment configured. This is potential
security hole. Therefore, when starting the domain (or
hotplugging a hostdev) create & allow /dev/vfio/vfio too (if
needed).

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-02-20 07:21:58 +01:00
Michal Privoznik
9d92f533f8 qemuSetupHostdevCgroup: Use qemuDomainGetHostdevPath
Since these two functions are nearly identical (with
qemuSetupHostdevCgroup actually calling virCgroupAllowDevicePath)
we can have one function call the other and thus de-duplicate
some code.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-02-20 07:21:58 +01:00
Michal Privoznik
b57bd206b9 qemu_conf: Check for namespaces availability more wisely
The bare fact that mnt namespace is available is not enough for
us to allow/enable qemu namespaces feature. There are other
requirements: we must copy all the ACL & SELinux labels otherwise
we might grant access that is administratively forbidden or vice
versa.
At the same time, the check for namespace prerequisites is moved
from domain startup time to qemu.conf parser as it doesn't make
much sense to allow users to start misconfigured libvirt just to
find out they can't start a single domain.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-15 12:43:23 +01:00
Andrea Bolognani
ee6ec7824d qemu: Call chmod() after mknod()
mknod() is affected my the current umask, so we're not
guaranteed the newly-created device node will have the
right permissions.

Call chmod(), which is not affected by the current umask,
immediately afterwards to solve the issue.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1421036
2017-02-14 19:23:05 +01:00
Ján Tomko
723fef99c0 qemu: enforce maximum ports value for nec-xhci
This controller only allows up to 15 ports.

https://bugzilla.redhat.com/show_bug.cgi?id=1375417
2017-02-13 16:34:09 +01:00
Michal Privoznik
c2130c0d47 qemu_security: Introduce ImageLabel APIs
Just like we need wrappers over other virSecurityManager APIs, we
need one for virSecurityManagerSetImageLabel and
virSecurityManagerRestoreImageLabel. Otherwise we might end up
relabelling device in wrong namespace.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-09 08:04:57 +01:00
Michal Privoznik
b7feabbfdc qemuDomainNamespaceSetupDisk: Simplify disk check
Firstly, instead of checking for next->path the
virStorageSourceIsEmpty() function should be used which also
takes disk type into account.
Secondly, not every disk source passed has the correct type set
(due to our laziness). Therefore, instead of checking for
virStorageSourceIsBlockLocal() and also S_ISBLK() the former can
be refined to just virStorageSourceIsLocalStorage().

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-08 15:56:21 +01:00
Michal Privoznik
786d8d91b4 qemuDomainDiskChainElement{Prepare,Revoke}: manage /dev entry
Again, one missed bit. This time without this commit there is no
/dev entry  in the namespace of the qemu process when doing disk
snapshots or block-copy.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-08 15:56:13 +01:00
Michal Privoznik
18ce9d139d qemuDomainNamespace{Setup,Teardown}Disk: Don't pass pointer to full disk
These functions do not need to see the whole virDomainDiskDef.
Moreover, they are going to be called from places where we don't
have access to the full disk definition. Sticking with
virStorageSource is more than enough.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-08 15:56:05 +01:00
Michal Privoznik
76d491ef14 qemuDomainNamespaceSetupDisk: Drop useless @src variable
Since its introduction in 81df21507b this variable was never
used.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-08 15:55:56 +01:00
Michal Privoznik
8dc867e978 qemu_domain: Don't pass virDomainDeviceDefPtr to ns helpers
There is no need for this. None of the namespace helpers uses it.
Historically it was used when calling secdriver APIs, but we
don't to that anymore.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-08 15:55:52 +01:00
Andrea Bolognani
c2e60ad0e5 qemu: Forbid <memoryBacking><locked> without <memtune><hard_limit>
In order for memory locking to work, the hard limit on memory
locking (and usage) has to be set appropriately by the user.

The documentation mentions the requirement already: with this
patch, it's going to be enforced by runtime checks as well,
by forbidding a non-compliant guest from being defined as well
as edited and started.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1316774
2017-02-07 18:43:10 +01:00
Michal Privoznik
7f0b382522 qemuDomainAttachDeviceMknod: Don't loop endlessly
When working with symlinks it is fairly easy to get into a loop.
Don't.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-07 13:20:19 +01:00
Michal Privoznik
3f5fcacf89 qemuDomainAttachDeviceMknod: Deal with symlinks
Similarly to one of the previous commits, we need to deal
properly with symlinks in hotplug case too.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-07 13:20:17 +01:00
Michal Privoznik
4ac847f93b qemuDomainCreateDevice: Don't loop endlessly
When working with symlinks it is fairly easy to get into a loop.
Don't.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-07 13:18:32 +01:00
Michal Privoznik
54ed672214 qemuDomainCreateDevice: Properly deal with symlinks
Imagine you have a disk with the following source set up:

/dev/disk/by-uuid/$uuid (symlink to) -> /dev/sda

After cbc45525cb the transitive end of the symlink chain is
created (/dev/sda), but we need to create any item in chain too.
Others might rely on that.
In this case, /dev/disk/by-uuid/$uuid comes from domain XML thus
it is this path that secdriver tries to relabel. Not the resolved
one.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-07 13:18:10 +01:00
Michal Privoznik
b621291f5c qemuDomain{Attach,Detach}Device NS helpers: Don't relabel devices
After previous commit this has become redundant step.
Also setting up devices in namespace and setting their label
later on are two different steps and should be not done at once.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-07 10:40:53 +01:00
Michal Privoznik
572eda12ad qemu: Implement mtu on interface
Not only we should set the MTU on the host end of the device but
also let qemu know what MTU did we set.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-26 10:00:01 +01:00
Michal Privoznik
b020cf73fe domain_conf: Introduce <mtu/> to <interface/>
So far we allow to set MTU for libvirt networks. However, not all
domain interfaces have to be plugged into a libvirt network and
even if they are, they might want to have a different MTU (e.g.
for testing purposes).

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-26 09:59:56 +01:00
Chen Hanxiao
980f2a35c7 qemu_domain: add timestamp in tainting of guests log
We lacked of timestamp in tainting of guests log,
which bring troubles for finding guest issues:
such as whether a guest powerdown caused by qemu-monitor-command
or others issues inside guests.
If we had timestamp in tainting of guests log,
it would be helpful when checking guest's /var/log/messages.

Signed-off-by: Chen Hanxiao <chenhanxiao@gmail.com>
2017-01-21 12:34:19 -05:00
Michal Privoznik
57b5e27d3d qemu: set default vhost-user ifname
Based on work of Mehdi Abaakouk <sileht@sileht.net>.

When parsing vhost-user interface XML and no ifname is found we
can try to fill it in in post parse callback. The way this works
is we try to make up interface name from given socket path and
then ask openvswitch whether it knows the interface.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-20 15:42:12 +01:00
Michal Privoznik
d0baf54e53 qemu: Actually unshare() iff running as root
https://bugzilla.redhat.com/show_bug.cgi?id=1413922

While all the code that deals with qemu namespaces correctly
detects whether we are running as root (and turn into NO-OP for
qemu:///session) the actual unshare() call is not guarded with
such check. Therefore any attempt to start a domain under
qemu:///session shall fail as unshare() is reserved for root.

The fix consists of moving unshare() call (for which we have a
wrapper called virProcessSetupPrivateMountNS) into
qemuDomainBuildNamespace() where the proper check is performed.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
2017-01-17 13:23:56 +01:00
Michal Privoznik
93a062c3b2 qemu: Copy SELinux labels for namespace too
When creating new /dev/* for qemu, we do chown() and copy ACLs to
create the exact copy from the original /dev. I though that
copying SELinux labels is not necessary as SELinux will chose the
sane defaults. Surprisingly, it does not leaving namespace with
the following labels:

crw-rw-rw-. root root system_u:object_r:tmpfs_t:s0     random
crw-------. root root system_u:object_r:tmpfs_t:s0     rtc0
drwxrwxrwt. root root system_u:object_r:tmpfs_t:s0     shm
crw-rw-rw-. root root system_u:object_r:tmpfs_t:s0     urandom

As a result, domain is unable to start:

error: internal error: process exited while connecting to monitor: Error in GnuTLS initialization: Failed to acquire random data.
qemu-kvm: cannot initialize crypto: Unable to initialize GNUTLS library: Failed to acquire random data.

The solution is to copy the SELinux labels as well.

Reported-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-13 14:45:52 +01:00
Michal Privoznik
cbc45525cb qemuDomainCreateDevice: Canonicalize paths
So far the decision whether /dev/* entry is created in the qemu
namespace is really simple: does the path starts with "/dev/"?
This can be easily fooled by providing path like the following
(for any considered device like disk, rng, chardev, ..):

  /dev/../var/lib/libvirt/images/disk.qcow2

Therefore, before making the decision the path should be
canonicalized.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-11 18:08:13 +01:00
Michal Privoznik
49f326edc0 qemu: Use namespaces iff available on the host kernel
So far the namespaces were turned on by default unconditionally.
For all non-Linux platforms we provided stub functions that just
ignored whatever namespaces setting there was in qemu.conf and
returned 0 to indicate success. Moreover, we didn't really check
if namespaces are available on the host kernel.

This is suboptimal as we might have ignored user setting.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-11 18:07:43 +01:00
Michal Privoznik
41816751a7 util: Introduce virFileMoveMount
This is a simple wrapper over mount(). However, not every system
out there is capable of moving a mount point. Therefore, instead
of having to deal with this fact in all the places of our code we
can have a simple wrapper and deal with this fact at just one
place.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-11 18:06:30 +01:00
Michal Privoznik
2ff8c30548 qemuDomainSetupAllInputs: Update debug message
Due to a copy-paste error, the debug message reads:

  Setting up disks

It should have been:

  Setting up inputs.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-11 17:39:24 +01:00
Michal Privoznik
269589146c qemu_domain: Move qemuDomainGetPreservedMounts
This function is used only from code compiled on Linux. Therefore
on non-Linux platforms it triggers compilation error:

../../src/qemu/qemu_domain.c:209:1: error: unused function 'qemuDomainGetPreservedMounts' [-Werror,-Wunused-function]

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-10 19:23:49 +01:00
Michal Privoznik
406e390962 qemu: Drop qemuDomainDeleteNamespace
After previous commits, this function is no longer needed.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-10 13:04:57 +01:00
Michal Privoznik
5d198c2b2c qemuDomainCreateNamespace: move mkdir to qemuDomainBuildNamespace
Again, there is no need to create /var/lib/libvirt/$domain.*
directories in CreateNamespace(). It is sufficient to create them
as soon as we need them which is in BuildNamespace. This way we
don't leave them around for the whole lifetime of domain.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-10 13:04:57 +01:00
Michal Privoznik
5d30057695 qemuDomainGetPreservedMounts: Do not special case /dev
The c1140eb9e got me thinking. We don't want to special case /dev
in qemuDomainGetPreservedMounts(), but in all other places in the
code we special case it anyway. I mean,
/var/run/libvirt/$domain.dev path is constructed separately just
so that it is not constructed here. It makes only a little sense
(if any at all).

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-10 13:04:57 +01:00
Michal Privoznik
40ebbf72d5 qemuDomainCreateNamespace: s/unlink/rmdir/
If something goes wrong in this function we try a rollback. That
is unlink all the directories we created earlier. For some weird
reason unlink() was called instead of rmdir().

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-10 13:04:57 +01:00
Martin Kletzander
c1140eb9ed qemu: Remove /dev mount info properly
Just so it doesn't bite us in the future, even though it's unlikely.

And fix the comment above it as well.  Commit e08ee7cd34 took the
info from the function it's calling, but that was lie itself in the
first place.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2017-01-05 16:24:55 +01:00
Michal Privoznik
e08ee7cd34 qemuDomainGetPreservedMounts: Fetch list of /dev/* mounts dynamically
With my namespace patches, we are spawning qemu in its own
namespace so that we can manage /dev entries ourselves. However,
some filesystems mounted under /dev needs to be preserved in
order to be shared with the parent namespace (e.g. /dev/pts).
Currently, the list of mount points to preserve is hardcoded
which ain't right - on some systems there might be less or more
items under real /dev that on our list. The solution is to parse
/proc/mounts and fetch the list from there.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-05 16:00:20 +01:00
Michal Privoznik
dd78da09b0 qemuDomainCreateDevice: Be more careful about device path
Again, not something that I'd hit, but there is a chance in
theory that this might bite us. Currently the way we decide
whether or not to create /dev entry for a device is by marching
first four characters of path with "/dev". This might be not
enough. Just imagine somebody has a disk image stored under
"/devil/path/to/disk". We ought to be matching against "/dev/".

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-04 15:36:42 +01:00
Michal Privoznik
ce01a2b11c qemuDomainAttachDeviceMknodHelper: Don't unlink() so often
Not that I'd encounter any bug here, but the code doesn't look
100% correct. Imagine, somebody is trying to attach a device to a
domain, and the device's /dev entry already exists in the qemu
namespace. This is handled gracefully and the control continues
with setting up ACLs and calling security manager to set up
labels. Now, if any of these steps fail, control jump on the
'cleanup' label and unlink() the file straight away. Even when it
was not us who created the file in the first place. This can be
possibly dangerous.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-04 15:36:42 +01:00
Michal Privoznik
3aae99fe71 qemu: Handle EEXIST gracefully in qemuDomainCreateDevice
https://bugzilla.redhat.com/show_bug.cgi?id=1406837

Imagine you have a domain configured in such way that you are
assigning two PCI devices that fall into the same IOMMU group.
With mount namespace enabled what happens is that for the first
PCI device corresponding /dev/vfio/X entry is created and when
the code tries to do the same for the second mknod() fails as
/dev/vfio/X already exists:

2016-12-21 14:40:45.648+0000: 24681: error :
qemuProcessReportLogError:1792 : internal error: Process exited
prior to exec: libvirt: QEMU Driver error : Failed to make device
/var/run/libvirt/qemu/windoze.dev//vfio/22: File exists

Worse, by default there are some devices that are created in the
namespace regardless of domain configuration (e.g. /dev/null,
/dev/urandom, etc.). If one of them is set as backend for some
guest device (e.g. rng, chardev, etc.) it's the same story as
described above.

Weirdly, in attach code this is already handled.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-04 15:36:42 +01:00
John Ferlan
7f7d990483 qemu: Don't assume secret provided for LUKS encryption
https://bugzilla.redhat.com/show_bug.cgi?id=1405269

If a secret was not provided for what was determined to be a LUKS
encrypted disk (during virStorageFileGetMetadata processing when
called from qemuDomainDetermineDiskChain as a result of hotplug
attach qemuDomainAttachDeviceDiskLive), then do not attempt to
look it up (avoiding a libvirtd crash) and do not alter the format
to "luks" when adding the disk; otherwise, the device_add would
fail with a message such as:

   "unable to execute QEMU command 'device_add': Property 'scsi-hd.drive'
    can't find value 'drive-scsi0-0-0-0'"

because of assumptions that when the format=luks that libvirt would have
provided the secret to decrypt the volume.

Access to unlock the volume will thus be left to the application.
2017-01-03 12:59:18 -05:00
Marc Hartmayer
fb2cd32c9a qemu: qemuDomainDiskChangeSupported: Add missing 'address' check
Disk->info is not live updatable so add a check for this. Otherwise
libvirt reports success even though no data was updated.

Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
2016-12-20 11:22:44 +01:00
Michal Privoznik
ab41ce7f4e qemu: Mark more namespace code linux-only
Some of the functions are not called on non-linux platforms
which makes them useless there.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-16 11:51:06 +00:00
Michal Privoznik
f444faa94a qemu: Enable mount namespace
https://bugzilla.redhat.com/show_bug.cgi?id=1404952

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
661887f558 qemu: Let users opt-out from containerization
Given how intrusive previous patches are, it might happen that
there's a bug or imperfection. Lets give users a way out: if they
set 'namespaces' to an empty array in qemu.conf the feature is
suppressed.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
f95c5c48d4 qemu: Manage /dev entry on RNG hotplug
When attaching a device to a domain that's using separate mount
namespace we must maintain /dev entries in order for qemu process
to see them.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
f5fdf23a68 qemu: Manage /dev entry on chardev hotplug
When attaching a device to a domain that's using separate mount
namespace we must maintain /dev entries in order for qemu process
to see them.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
6e57492839 qemu: Manage /dev entry on hostdev hotplug
When attaching a device to a domain that's using separate mount
namespace we must maintain /dev entries in order for qemu process
to see them.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
81df21507b qemu: Manage /dev entry on disk hotplug
When attaching a device to a domain that's using separate mount
namespace we must maintain /dev entries in order for qemu process
to see them.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
2160f338a7 qemu: Prepare RNGs when starting a domain
When starting a domain and separate mount namespace is used, we
have to create all the /dev entries that are configured for the
domain.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
8ec8a8c5ff qemu: Prepare inputs when starting a domain
When starting a domain and separate mount namespace is used, we
have to create all the /dev entries that are configured for the
domain.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
2c654490f3 qemu: Prepare TPM when starting a domain
When starting a domain and separate mount namespace is used, we
have to create all the /dev entries that are configured for the
domain.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
4e4451019c qemu: Prepare chardevs when starting a domain
When starting a domain and separate mount namespace is used, we
have to create all the /dev entries that are configured for the
domain.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
73267cec46 qemu: Prepare hostdevs when starting a domain
When starting a domain and separate mount namespace is used, we
have to create all the /dev entries that are configured for the
domain.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
054202d020 qemu: Prepare disks when starting a domain
When starting a domain and separate mount namespace is used, we
have to create all the /dev entries that are configured for the
domain.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
bb4e529664 qemu: Spawn qemu under mount namespace
Prime time. When it comes to spawning qemu process and
relabelling all the devices it's going to touch, there's inherent
race with other applications in the system (e.g. udev). Instead
of trying convincing udev to not touch libvirt managed devices,
we can create a separate mount namespace for the qemu, and mount
our own /dev there. Of course this puts more work onto us as we
have to maintain /dev files on each domain start and device
hot(un-)plug. On the other hand, this enhances security also.

From technical POV, on domain startup process the parent
(libvirtd) creates:

  /var/lib/libvirt/qemu/$domain.dev
  /var/lib/libvirt/qemu/$domain.devpts

The child (which is going to be qemu eventually) calls unshare()
to create new mount namespace. From now on anything that child
does is invisible to the parent. Child then mounts tmpfs on
$domain.dev (so that it still sees original /dev from the host)
and creates some devices (as explained in one of the previous
patches). The devices have to be created exactly as they are in
the host (including perms, seclabels, ACLs, ...). After that it
moves $domain.dev mount to /dev.

What's the $domain.devpts mount there for then you ask? QEMU can
create PTYs for some chardevs. And historically we exposed the
host ends in our domain XML allowing users to connect to them.
Therefore we must preserve devpts mount to be shared with the
host's one.

To make this patch as small as possible, creating of devices
configured for domain in question is implemented in next patches.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
7ed6934f3b virDomainObjGetShortName: take virDomainDef
So far this function takes virDomainObjPtr which:
1) is an overkill,
2) might be not available in all the places we will use it.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-08 15:45:52 +01:00
Laine Stump
9b0848d523 qemu: propagate virQEMUDriver object to qemuDomainDeviceCalculatePCIConnectFlags
If libvirtd is running unprivileged, it can open a device's PCI config
data in sysfs, but can only read the first 64 bytes. But as part of
determining whether a device is Express or legacy PCI,
qemuDomainDeviceCalculatePCIConnectFlags() will be updated in a future
patch to call virPCIDeviceIsPCIExpress(), which tries to read beyond
the first 64 bytes of the PCI config data and fails with an error log
if the read is unsuccessful.

In order to avoid creating a parallel "quiet" version of
virPCIDeviceIsPCIExpress(), this patch passes a virQEMUDriverPtr down
through all the call chains that initialize the
qemuDomainFillDevicePCIConnectFlagsIterData, and saves the driver
pointer with the rest of the iterdata so that it can be used by
qemuDomainDeviceCalculatePCIConnectFlags(). This pointer isn't used
yet, but will be used in an upcoming patch (that detects Express vs
legacy PCI for VFIO assigned devices) to examine driver->privileged.
2016-11-30 15:28:07 -05:00
Michal Privoznik
c2a5a4e7ea virstring: Unify string list function names
We have couple of functions that operate over NULL terminated
lits of strings. However, our naming sucks:

virStringJoin
virStringFreeList
virStringFreeListCount
virStringArrayHasString
virStringGetFirstWithPrefix

We can do better:

virStringListJoin
virStringListFree
virStringListFreeCount
virStringListHasString
virStringListGetFirstWithPrefix

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-11-25 13:54:05 +01:00
Nikolay Shirokovskiy
aaf2992d90 qemu: agent: fix unsafe agent access
qemuDomainObjExitAgent is unsafe.

First it accesses domain object without domain lock.
Second it uses outdated logic that goes back to commit 79533da1 of
year 2009 when code was quite different. (unref function
instead of unreferencing only unlocked and disposed object
in case of last reference and leaved unlocking to the caller otherwise).
Nowadays this logic may lead to disposing locked object
i guess.

Another problem is that the callers of qemuDomainObjEnterAgent
use domain object again (namely priv->agent) without domain lock.

This patch address these two problems.

qemuDomainGetAgent is dropped as unused.
2016-11-23 11:31:28 +03:00
Nikolay Shirokovskiy
3c1c56781d qemu: drop write-only agentStart 2016-11-23 11:31:14 +03:00
Marc Hartmayer
1c122e737e Refactoring: Use virHostdevIsSCSIDevice()
Use the util function virHostdevIsSCSIDevice() to simplify if
statements.

Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
2016-11-22 14:37:36 +01:00
Marc Hartmayer
505bc9b025 qemu: Fix improper union member access on hostdevs
Add missing checks if a hostdev is a subsystem/SCSI device before access
the union member 'subsys'/'scsi'.  Also fix indentation and simplify
qemuDomainObjCheckHostdevTaint().

Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
2016-11-22 14:37:36 +01:00
Peter Krempa
0df2524acb qemu: domain: Refresh vcpu halted state using qemuMonitorGetCpuHalted
Don't use qemuMonitorGetCPUInfo which does a lot of matching to get the
full picture which is not necessary and would be mostly discarded.

Refresh only the vcpu halted state using data from query-cpus.
2016-11-21 17:19:48 +01:00
Peter Krempa
3f71c79768 qemu: monitor: Extract qemu cpu id along with other data
Storing of the ID will allow simpler extraction of data present only in
query-cpus without the need to call qemuMonitorGetCPUInfo in statistics
paths.
2016-11-21 17:19:48 +01:00
Laine Stump
d8bd837669 qemu: add a USB3 controller to Q35 domains by default
Previously we added a set of EHCI+UHCI controllers to Q35 machines to
mimic real hardware as closely as possible, but recent discussions
have pointed out that the nec-usb-xhci (USB3) controller is much more
virtualization-friendly (uses less CPU), so this patch switches the
default for Q35 machinetypes to add an XHCI instead (if it's
supported, which it of course *will* be).

Since none of the existing test cases left out USB controllers in the
input XML, a new Q35 test case was added which has *no* devices, so
ends up with only the defaults always put in by qemu, plus those added
by libvirt.
2016-11-14 14:22:23 -05:00
Laine Stump
807232203a qemu: don't force-add a dmi-to-pci-bridge just on principle
Now the a dmi-to-pci-bridge is automatically added just as it's needed
(when a pci-bridge is being added), we no longer have any need to
force-add one to every single Q35 domain.
2016-11-14 14:21:43 -05:00
Laine Stump
50adb8a660 qemu: new functions qemuDomainMachineHasPCI[e]Root()
These functions provide a simple one line method of learning if the
current domain has a pci-root or pcie-root bus.
2016-11-14 14:03:09 -05:00
Martin Kletzander
acf0ec024a qemu: Save various defaults for shmem
We're keeping some things at default and that's not something we want to
do intentionally.  Let's save some sensible defaults upfront in order to
avoid having problems later.  The details for the defaults (of the newer
implementation) can be found in qemu's commit 5400c02b90bb:

  http://git.qemu.org/?p=qemu.git;a=commit;h=5400c02b90bb

Since we are merely saving the defaults it will not change the guest ABI
and thanks to the fact that we're doing it in the PostParse callback it
will not break the ABI stability checks.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2016-11-02 16:05:39 +01:00
John Ferlan
daf5c651f0 qemu: Add a secret object to/for a char source dev
Add the secret object so the 'passwordid=' can be added if the command line
if there's a secret defined in/on the host for TCP chardev TLS objects.

Preparation for the secret involves adding the secinfo to the char source
device prior to command line processing. There are multiple possibilities
for TCP chardev source backend usage.

Add test for at least a serial chardev as an example.
2016-10-26 07:18:25 -04:00
Viktor Mihajlovski
08f22976b1 qemu: Add domain support for VCPU halted state
Adding a field to the domain's private vcpu object to hold the halted
state information.
Adding two functions in support of the halted state:
- qemuDomainGetVcpuHalted: retrieve the halted state from a
  private vcpu object
- qemuDomainRefreshVcpuHalted: obtain the per-vcpu halted states
  via qemu monitor and store the results in the private vcpu objects

Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Hao QingFeng <haoqf@linux.vnet.ibm.com>
Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
2016-10-24 18:52:36 -04:00
Pavel Hrdina
7c8df1e82f domain: fix migration to older libvirt
Since TLS was introduced hostwide for libvirt 2.3.0 and a domain
configurable haveTLS was implemented for libvirt 2.4.0, we have to
modify the migratable XML for specific case where the 'tls' attribute
is based on setting from qemu.conf.

The "tlsFromConfig" is libvirt internal attribute and is stored only in
status XML to ensure that when libvirtd is restarted this internal flag
is not lost by the restart.

That flag is used to decide whether we should put *tls* attribute to
migratable XML or not.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2016-10-24 16:29:26 +02:00
Pavel Hrdina
0298531b29 domain: Add optional 'tls' attribute for TCP chardev
Add an optional "tls='yes|no'" attribute for a TCP chardev.

For QEMU, this will allow for disabling the host config setting of the
'chardev_tls' for a domain chardev channel by setting the value to "no" or
to attempt to use a host TLS environment when setting the value to "yes"
when the host config 'chardev_tls' setting is disabled, but a TLS environment
is configured via either the host config 'chardev_tls_x509_cert_dir' or
'default_tls_x509_cert_dir'

Signed-off-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2016-10-24 16:05:33 +02:00
John Ferlan
7bd8312e7f conf: Move the privateData from virDomainChrDef to virDomainChrSourceDef
Commit id '5f2a132786' should have placed the data in the host source
def structure since that's also used by smartcard, redirdev, and rng in
order to provide a backend tcp channel.  The data in the private structure
will be necessary in order to provide the secret properly.

This also renames the previous names from "Chardev" to "ChrSource" for
the private data structures and API's
2016-10-21 16:42:59 -04:00
John Ferlan
77a12987a4 Introduce virDomainChrSourceDefNew for virDomainChrDefPtr
Change the virDomainChrDef to use a pointer to 'source' and allocate
that pointer during virDomainChrDefNew.

This has tremendous "fallout" in the rest of the code which mainly
has to change source.$field to source->$field.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2016-10-21 14:03:36 -04:00
John Ferlan
5f2a132786 qemu: Introduce qemuDomainChardevPrivatePtr
Modeled after the qemuDomainHostdevPrivatePtr (commit id '27726d8c'),
create a privateData pointer in the _virDomainChardevDef to allow storage
of private data for a hypervisor in order to at least temporarily store
secret data for usage during qemuBuildCommandLine.

NB: Since the qemu_parse_command (qemuParseCommandLine) code is not
expecting to restore the secret data, there's no need to add code
code to handle this new structure there.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2016-10-19 15:40:29 -04:00
John Ferlan
6262a9b282 qemu: Remove unnecessary NULL arg check
qemuDomainSecret{Disk|Hostdev}Prepare has a prototype that checks for
ATTRIBUTE_NONNULL(1) for 'conn'.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2016-10-17 15:38:32 -04:00
Pavel Hrdina
fb8f3b1c22 qemu_command: add support to use virtio as secondary video device
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1369633

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2016-10-12 17:46:48 +02:00
Pavel Hrdina
4c029e8cfa qemu_command: properly detect which model to use for video device
This improves commit 706b5b6277 in a way that we check qemu capabilities
instead of what architecture we are running on to detect whether we can
use *virtio-vga* model or not.  This is not a case only for arm/aarch64.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2016-10-12 17:46:48 +02:00
Pavel Hrdina
133fb1401f qemu_domain: move video validation out of qemu_command
All definition validation that doesn't depend on qemu capabilities
and was allowed previously as valid definition should be placed into
qemuDomainDefValidate.

The check whether video type is supported or not was based on an enum
that translates type into model.  Use switch to ensure that if new
video type is added, it will be properly handled.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2016-10-12 17:46:47 +02:00
Peter Krempa
043ba4a40a qemu: Reuse virDomainDeGetVcpusTopology to calculate total vcpu count
Rather than multiplying sockets, cores, and threads use the new helper
for getting the vcpu count resulting from the topology.
2016-10-11 13:52:09 +02:00
Michal Privoznik
8cfdd6e4f5 Revert "conf: Skip post parse callbacks when creating copy"
This breaks vCPU hotplug, because when starting a domain, we
create a copy of domain definition (which becomes live XML) and
during the post parse callbacks we might adjust some tunings so
that vCPU hotplug is possible.

This reverts commit 581b7756af.
2016-10-04 18:00:02 +02:00
Michal Privoznik
581b7756af conf: Skip post parse callbacks when creating copy
When creating a copy of virDomainDef we save ourselves the
trouble of writing deep-copy functions and just format and parse
back domain/device XML. However, the XML we are parsing was
already fully formatted - there is no reason to run post parse
callbacks (which fill in blanks - there are none!).

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-09-26 16:50:12 +02:00
Michal Privoznik
4172ae371b qemuDomainDefAssignAddresses: Fetch caps from domain object
Just like we did two commits ago, don't try to fetch capabilities
for non-existing binary. Re-use the ones we have for running
domain.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-09-26 16:50:12 +02:00
Michal Privoznik
1e501043f7 qemuDomainDeviceDefPostParse: Fetch caps from domain object
Just like we did two commits ago, don't try to fetch capabilities
for non-existing binary. Re-use the ones we have for running
domain.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-09-26 16:50:12 +02:00
Michal Privoznik
70b36a7b7e qemuDomainDefPostParse: Fetch qemuCaps from domain object
We can't rely on def->emulator path. It may be provided by user
as we give them opportunity to provide their own XML for
migration. Therefore the path may point to just whatever binary
(or even to a non-existent file). Moreover, this path is meant
for destination, but the capabilities lookup is done on source.
What we can do is to assume same capabilities for post parse
callbacks as the running domain has. They will be used just to
add some default models/controllers/devices/... anyway.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-09-26 16:50:12 +02:00
Michal Privoznik
cf198684a8 conf: Extend virDomainDefAssignAddressesCallback for parseOpaque
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-09-26 16:50:12 +02:00
Michal Privoznik
78ab5dcea0 conf: Extend virDomainDeviceDefPostParse for parseOpaque
Just like virDomainDefPostParseCallback has gained new
parseOpaque argument, we need to follow the logic with
virDomainDeviceDefPostParse.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-09-26 16:50:12 +02:00
Michal Privoznik
2e056b5c51 virDomainDefCopy: Introduce @parseOpaque argument
We want to pass the proper opaque pointer instead of NULL to
virDomainDefParseString.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-09-26 16:50:12 +02:00
Michal Privoznik
c41b989112 virDomainDefParse{File,String}: Introduce @parseOpaque argument
We want to pass the proper opaque pointer instead of NULL to
virDomainDefParse and subsequently virDomainDefParseNode too.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-09-26 16:50:12 +02:00
Michal Privoznik
940d91c55b virDomainDefPostParse: Introduce @parseOpaque argument
Some callers might want to pass yet another pointer to opaque
data to post parse callbacks. The driver generic one is not
enough because two threads executing post parse callback might
want to see different data (e.g. domain object pointer that
domain def belongs to).

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-09-26 16:50:12 +02:00
Jiri Denemark
7ce711a30e qemu: Update guest CPU def in live XML
Storing the updated CPU definition in the live domain definition saves
us from having to update it over and over when we need it. Not to
mention that we will soon further update the CPU definition according to
QEMU once it's started.

A highly wanted side effect of this patch, libvirt will pass all CPU
features explicitly specified in domain XML to QEMU, even those that are
already included in the host model.

This patch should fix the following bugs:
    https://bugzilla.redhat.com/show_bug.cgi?id=1207095
    https://bugzilla.redhat.com/show_bug.cgi?id=1339680
    https://bugzilla.redhat.com/show_bug.cgi?id=1371039
    https://bugzilla.redhat.com/show_bug.cgi?id=1373849
    https://bugzilla.redhat.com/show_bug.cgi?id=1375524
    https://bugzilla.redhat.com/show_bug.cgi?id=1377913

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-09-22 15:40:09 +02:00
Jiri Denemark
3b6be3c0c5 cpu: Rework cpuUpdate
The reworked API is now called virCPUUpdate and it should change the
provided CPU definition into a one which can be consumed by the QEMU
command line builder:

    - host-passthrough remains unchanged
    - host-model is turned into custom CPU with a model and features
      copied from host
    - custom CPU with minimum match is converted similarly to host-model
    - optional features are updated according to host's CPU

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-09-22 15:40:09 +02:00
Jiri Denemark
b27adaed37 qemu: Propagate virCapsPtr to virQEMUCapsNewForBinaryInternal
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-09-22 15:40:08 +02:00
Martin Kletzander
0f61d7b5f2 qemu: Abstract shmem socket path preparation
Put it into qemuDomainPrepareShmemChardev() so it can be used later.
Also don't fill in the path unless the server option is enabled.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2016-09-20 15:42:43 +02:00
Peter Krempa
64bc75f756 qemu: domain: Don't infer vcpu state
Use the state information (online, hotpluggable) provided by the monitor
code rather than trying to infer it. This fixes an issue where on
architectures that require hotplug of multiple threads at once the
sub-cores would get updated as offline on daemon restart thus creating
an invalid configuration.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1375783
2016-09-14 12:56:57 +02:00
Jiri Denemark
56258a388f qemu: Don't use query-migrate on destination
When migration fails, we need to poke QEMU monitor to check for a reason
of the failure. We did this using query-migrate QMP command, which is
not supposed to return any meaningful result on the destination side.
Thus if the monitor was still functional when we detected the migration
failure, parsing the answer from query-migrate always failed with the
following error message:

    "info migration reply was missing return status"

This irrelevant message was then used as the reason for the migration
failure replacing any message we might have had.

Let's use harmless query-status for poking the monitor to make sure we
only get an error if the monitor connection is broken.

https://bugzilla.redhat.com/show_bug.cgi?id=1374613

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-09-12 15:56:10 +02:00
Peter Krempa
6e19cc59a6 qemu: domain: Clear startup policy for dropped removable media
When a source image is dropped when missing due to startup policy the
policy needs to be cleared since it was relevant only for the given
storage source. New sources need to update it if needed.
2016-09-12 09:54:36 +02:00
Michal Privoznik
c56cdf2593 conf: Add support for virtio-net.rx_queue_size
https://bugzilla.redhat.com/show_bug.cgi?id=1366989

QEMU added another virtio-net tunable [1]. It basically allows
users to set the size of RX virtio ring. But because virtio-net
uses two separate ring buffers to pass data from/to guest they
named it explicitly rx_queue_size. We should expose it in our XML
too.

1: http://lists.nongnu.org/archive/html/qemu-devel/2016-08/msg02029.html

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-09-09 16:16:59 +02:00
Nikolay Shirokovskiy
c62e79c8ca qemu: Filter cur_balloon ABI check for certain transactions
Since the domain lock is not held during preparation of an external XML
config, it is possible that the value can change resulting in unexpected
failures during ABI consistency checking for some save and migrate
operations.

This patch adds a new flag to skip the checking of the cur_balloon value
and then sets the destination value to the source value to ensure
subsequent checks without the skip flag will succeed.

This way it is protected from forges and is keeped up to date too.

Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
2016-09-02 16:54:42 -04:00
Peter Krempa
9eb9106ea5 qemu: command: Add support for sparse vcpu topologies
Add support for using the new approach to hotplug vcpus using device_add
during startup of qemu to allow sparse vcpu topologies.

There are a few limitations imposed by qemu on the supported
configuration:
- vcpu0 needs to be always present and not hotpluggable
- non-hotpluggable cpus need to be ordered at the beginning
- order of the vcpus needs to be unique for every single hotpluggable
  entity

Qemu also doesn't really allow to query the information necessary to
start a VM with the vcpus directly on the commandline. Fortunately they
can be hotplugged during startup.

The new hotplug code uses the following approach:
- non-hotpluggable vcpus are counted and put to the -smp option
- qemu is started
- qemu is queried for the necessary information
- the configuration is checked
- the hotpluggable vcpus are hotplugged
- vcpus are started

This patch adds a lot of checking code and enables the support to
specify the individual vcpu element with qemu.
2016-08-24 15:44:47 -04:00
Peter Krempa
20ef1232ec qemu: process: Copy final vcpu order information into the vcpu definition
The vcpu order information is extracted only for hotpluggable entities,
while vcpu definitions belonging to the same hotpluggable entity need
to all share the order information.

We also can't overwrite it right away in the vcpu info detection code as
the order is necessary to add the hotpluggable vcpus enabled on boot in
the correct order.

The helper will store the order information in places where we are
certain that it's necessary.
2016-08-24 15:44:47 -04:00
Peter Krempa
48e3d42889 qemu: migration: Prepare for non-contiguous vcpu configurations
Introduce a new migration cookie flag that will be used for any
configurations that are not compatible with libvirt that would not
support the specific vcpu hotplug approach. This will make sure that old
libvirt does not fail to reproduce the configuration correctly.
2016-08-24 15:44:47 -04:00
Peter Krempa
5847bc5c64 conf: Add XML for individual vCPU hotplug
Individual vCPU hotplug requires us to track the state of any vCPU. To
allow this add the following XML:

<domain>
  ...
  <vcpu current='2'>3</vcpu>
  <vcpus>
    <vcpu id='0' enabled='yes' hotpluggable='no' order='1'/>
    <vcpu id='1' enabled='yes' hotpluggable='yes' order='2'/>
    <vcpu id='1' enabled='no' hotpluggable='yes'/>
  </vcpus>
  ...

The 'enabled' attribute allows to control the state of the vcpu.
'hotpluggable' controls whether given vcpu can be hotplugged and 'order'
allows to specify the order to add the vcpus.
2016-08-24 15:44:47 -04:00
Peter Krempa
133be0a9e2 qemu: domain: Prepare for VCPUs vanishing while libvirt is not running
Similarly to devices the guest may allow unplug of the VCPU if libvirt
is down. To avoid problems, refresh the vcpu state on reconnect. Don't
mess with the vcpu state otherwise.
2016-08-24 15:44:47 -04:00
Peter Krempa
6b4a23ff6c qemu: domain: Extract cpu-hotplug related data
Now that the monitor code gathers all the data we can extract it to
relevant places either in the definition or the private data of a vcpu.

As only thread id is broken for TCG guests we may extract the rest of
the data and just skip assigning of the thread id. In case where qemu
would allow cpu hotplug in TCG mode this will make it work eventually.
2016-08-24 15:44:47 -04:00
Peter Krempa
9bbbc88a8f qemu: monitor: Add algorithm for combining query-(hotpluggable-)-cpus data
For hotplug purposes it's necessary to retrieve data using
query-hotpluggable-cpus while the old query-cpus API report thread IDs
and order of hotplug.

This patch adds code that merges the data using a rather non-trivial
algorithm and fills the data to the qemuMonitorCPUInfo structure for
adding to appropriate place in the domain definition.
2016-08-24 15:44:47 -04:00
Peter Krempa
ffa536e0f8 qemu: Forbid config when topology based cpu count doesn't match the config
As of qemu commit:
commit a32ef3bfc12c8d0588f43f74dcc5280885bbdb30
Author: Thomas Huth <thuth@redhat.com>
Date:   Wed Jul 22 15:59:50 2015 +0200

    vl: Add another sanity check to smp_parse() function

v2.4.0-952-ga32ef3b

configuration where the maximum CPU count doesn't match the topology is
rejected. Prior to that only configurations where the topology would
contain more cpus than the maximum count would be rejected.

Use QEMU_CAPS_QUERY_HOTPLUGGABLE_CPUS as a relevant recent enough
witness to avoid breaking old configs.
2016-08-24 15:44:47 -04:00
Peter Krempa
5b5f494a1b qemu: monitor: Return structures from qemuMonitorGetCPUInfo
The function will gradually add more returned data. Return a struct for
every vCPU containing the data.
2016-08-24 15:44:47 -04:00
Andrea Bolognani
31de0fab93 qemu: domain: Drop piix3-ohci controller for migration
Now that the default USB controller model is explicit rather
than implicit for i440fx machines, we have to tweak the
conditions for dropping it in order to keep migration towards
libvirt <= 0.9.4 working.
2016-08-12 17:38:02 +02:00
Andrea Bolognani
f55eaccb0c qemu: domain: Reflect USB controller model in guest XML
When the user doesn't specify any model for a USB controller,
we use an architecture-dependent default, but we don't reflect
it in the guest XML.

Pick the default USB controller model when parsing the guest
XML instead of when creating the QEMU command line, so that
our choice is saved back to disk.
2016-08-12 17:38:02 +02:00
Michal Privoznik
9c1524a01c qemu: Enable secure boot
In qemu, enabling this feature boils down to adding the following
onto the command line:

  -global driver=cfi.pflash01,property=secure,value=on

However, there are some constraints resulting from the
implementation. For instance, System Management Mode (SMM) is
required to be enabled, the machine type must be q35-2.4 or
later, and the guest should be x86_64. While technically it is
possible to have 32 bit guests with secure boot, some non-trivial
CPU flags tuning is required (for instance lm and nx flags must
be prohibited). Given complexity of our CPU driver, this is not
trivial. Therefore I've chosen to forbid 32 bit guests for now.
If there's ever need, we can refine the check later.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-08-04 17:22:20 +02:00
Peter Krempa
041f35340b qemu: domain: Simplify return values of qemuDomainRefreshVcpuInfo
Call the vcpu thread info validation separately to decrease complexity
of returned values by qemuDomainRefreshVcpuInfo.

This function now returns 0 on success and -1 on error. Certain
failures of qemu to report data are still considered as success. Any
error reported now is fatal.
2016-08-04 08:08:40 +02:00
Peter Krempa
2bdc300a34 qemu: domain: Improve vCPU data checking in qemuDomainRefreshVcpu
Validate the presence of the thread id according to state of the vCPU
rather than just checking the vCPU count. Additionally put the new
validation code into a separate function so that the information
retrieval can be split from the validation.
2016-08-04 08:08:31 +02:00
Peter Krempa
8f56b5baaf qemu: domain: Rename qemuDomainDetectVcpuPids to qemuDomainRefreshVcpuInfo
The function will eventually do more useful stuff than just detection of
thread ids.
2016-08-04 08:03:58 +02:00
John Ferlan
dd0dbe1d66 qemu: Make QEMU_DRIVE_HOST_PREFIX more private
Move QEMU_DRIVE_HOST_PREFIX into the qemu_alias.c to dissuade future
callers from using it. Create qemuAliasDiskDriveSkipPrefix in order
to handle the current consumers that desire to check if an alias has
the drive- prefix and "get beyond it" in order to get the disk alias.
2016-08-02 10:11:11 -04:00
Chunyan Liu
c6f0e177a3 qemuDomainDeviceDefPostParse: add USB controller model check
To sync with virDomainControllerModelUSB, we add two models
in qemuControllerModelUSB 'qusb1' and 'qusb2', but those
models are not supported in qemu driver. So add check in
device post parse to report errors if 'qusb1' and 'qusb2'
are specified.

Signed-off-by: Chunyan Liu <cyliu@suse.com>
2016-08-02 14:02:21 +02:00
Martin Kletzander
a2b97a8d91 qemu: Fix support for startupPolicy with volume/pool disks
Until now we simply errored out when the translation from pool+volume
failed.  However, we should instead check whether that disk is needed or
not since there is an option for that.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1168453

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2016-08-02 13:21:01 +02:00
Martin Kletzander
779a4ea906 qemu: Remove unnecessary label and its only reference
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2016-08-02 13:21:01 +02:00
Martin Kletzander
e2705cfb6e qemu: Make qemuDomainCheckDiskStartupPolicy self-contained
There is an error reset following the function and check for
startupPolicy before that.  Let's reflect those things inside that
function so that future code doesn't have to be that complex.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2016-08-02 13:21:01 +02:00
Boris Fiuczynski
230c631917 qemu: remove panic dev models s390 and pseries when migrating
The panic devices with models s390 and pseries are autogenerated.
For backwards compatibility reasons the devices are to be removed
when migrating.

Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2016-08-01 14:15:08 +02:00
Michal Privoznik
1e05846373 conf: Catch invalid memory model earlier
Consider the following XML snippet:

    <memory model=''>
      <target>
        <size unit='KiB'>523264</size>
        <node>0</node>
      </target>
    </memory>

Whats wrong you ask? The @model attribute. This should result in
an error thrown into users faces during virDomainDefine phase.
Except it doesn't. The XML validation catches this error, but if
users chose to ignore that, they will end up with invalid XML.
Well, they won't be able to start the machine - that's when error
is produced currently. But it would be nice if we could catch the
error like this earlier.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-07-29 11:03:24 +02:00
Daniel P. Berrange
a48c714115 storage: remove "luks" storage volume type
The current LUKS support has a "luks" volume type which has
a "luks" encryption format.

This partially makes sense if you consider the QEMU shorthand
syntax only requires you to specify a format=luks, and it'll
automagically uses "raw" as the next level driver. QEMU will
however let you override the "raw" with any other driver it
supports (vmdk, qcow, rbd, iscsi, etc, etc)

IOW the intention though is that the "luks" encryption format
is applied to all disk formats (whether raw, qcow2, rbd, gluster
or whatever). As such it doesn't make much sense for libvirt
to say the volume type is "luks" - we should be saying that it
is a "raw" file, but with "luks" encryption applied.

IOW, when creating a storage volume we should use this XML

  <volume>
    <name>demo.raw</name>
    <capacity>5368709120</capacity>
    <target>
      <format type='raw'/>
      <encryption format='luks'>
        <secret type='passphrase' uuid='0a81f5b2-8403-7b23-c8d6-21ccd2f80d6f'/>
      </encryption>
    </target>
  </volume>

and when configuring a guest disk we should use

  <disk type='file' device='disk'>
    <driver name='qemu' type='raw'/>
    <source file='/home/berrange/VirtualMachines/demo.raw'/>
    <target dev='sda' bus='scsi'/>
    <encryption format='luks'>
      <secret type='passphrase' uuid='0a81f5b2-8403-7b23-c8d6-21ccd2f80d6f'/>
    </encryption>
  </disk>

This commit thus removes the "luks" storage volume type added
in

  commit 318ebb36f1
  Author: John Ferlan <jferlan@redhat.com>
  Date:   Tue Jun 21 12:59:54 2016 -0400

    util: Add 'luks' to the FileTypeInfo

The storage file probing code is modified so that it can probe
the actual encryption formats explicitly, rather than merely
probing existance of encryption and letting the storage driver
guess the format.

The rest of the code is then adapted to deal with
VIR_STORAGE_FILE_RAW w/ VIR_STORAGE_ENCRYPTION_FORMAT_LUKS
instead of just VIR_STORAGE_FILE_LUKS.

The commit mentioned above was included in libvirt v2.0.0.
So when querying volume XML this will be a change in behaviour
vs the 2.0.0 release - it'll report 'raw' instead of 'luks'
for the volume format, but still report 'luks' for encryption
format.  I think this change is OK because the storage driver
did not include any support for creating volumes, nor starting
guets with luks volumes in v2.0.0 - that only since then.
Clearly if we change this we must do it before v2.1.0 though.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-07-27 18:59:15 +01:00
Tomasz Flendrich
1aa5e66cf3 qemu: remove ccwaddrs caching
Dropping the caching of ccw address set.
The cached set is not required anymore, because the set is now being
recalculated from the domain definition on demand, so the cache
can be deleted.
2016-07-26 13:04:46 +02:00
Tomasz Flendrich
19a148b7c8 qemu: remove vioserialaddrs caching
Dropping the caching of virtio serial address set.
The cached set is not required anymore, because the set is now being
recalculated from the domain definition on demand, so the cache
can be deleted.

Credit goes to Cole Robinson.
2016-07-26 13:04:46 +02:00
Ján Tomko
ddd31fd7dc Reserve existing USB addresses
Check if they fit on the USB controllers the domain has,
and error out if two devices try to use the same address.
2016-07-21 08:30:26 +02:00
John Ferlan
a53349e6c6 qemu: Disallow usage of luks encryption if aes secret not possible
Resolves a CI test integration failure with a RHEL6/Centos6 environment.

In order to use a LUKS encrypted device, the design decision was to
generate an encrypted secret based on the master key. However, commit
id 'da86c6c' missed checking for that specifically.

When qemuDomainSecretSetup was implemented, a design decision was made
to "fall back" to a plain text secret setup if the specific cipher was
not available (e.g. virCryptoHaveCipher(VIR_CRYPTO_CIPHER_AES256CBC))
as well as the QEMU_CAPS_OBJECT_SECRET. For the luks encryption setup
there is no fall back to the plaintext secret, thus if that gets set
up by qemuDomainSecretSetup, then we need to fail.

Also, while the qemuxml2argvtest has set the QEMU_CAPS_OBJECT_SECRET
bit, it didn't take into account the second requirement that the
ability to generate the encrypted secret is possible. So modify the
test to not attempt to run the luks-disk if we know we don't have
the encryption algorithm.
2016-07-20 06:07:11 -04:00
John Ferlan
da86c6c226 qemu: Add luks support for domain disk
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1301021

Generate the luks command line using the AES secret key to encrypt the
luks secret. A luks secret object will be in addition to a an AES secret.

For hotplug, check if the encinfo exists and if so, add the AES secret
for the passphrase for the secret object used to decrypt the device.

Modify/augment the fakeSecret* in qemuxml2argvtest in order to handle
find a uuid or a volume usage with a specific path prefix in the XML
(corresponds to the already generated XML tests). Add error message
when the 'usageID' is not 'mycluster_myname'. Commit id '1d632c39'
altered the error message generation to rely on the errors from the
secret_driver (or it's faked replacement).

Add the .args output for adding the LUKS disk to the domain

Signed-off-by: John Ferlan <jferlan@redhat.com>
2016-07-19 09:40:10 -04:00
John Ferlan
b7b3a51e8a qemu: Alter the qemuDomainGetSecretAESAlias to add new arg
Soon we will be adding luks encryption support. Since a volume could require
both a luks secret and a secret to give to the server to use of the device,
alter the alias generation to create a slightly different alias so that
we don't have two objects with the same alias.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2016-07-19 09:40:10 -04:00
Jiri Denemark
08d566a0cf qemu: Drop default channel path during migration
Migration to an older libvirt (pre v1.3.0-175-g7140807) is broken
because older versions of libvirt generated different channel paths and
they didn't drop the default paths when parsing domain XMLs. We'd get
such a nice error message:

    internal error: process exited while connecting to monitor:
    2016-07-08T15:28:02.665706Z qemu-kvm: -chardev socket,
    id=charchannel0,path=/var/lib/libvirt/qemu/channel/target/
    domain-3-nest/org.qemu.guest_agent.0,server,nowait: Failed to bind
    socket to /var/lib/libvirt/qemu/channel/target/domain-3-nest/
    org.qemu.guest_agent.0: No such file or directory

That said, we should not even format the default paths when generating a
migratable XML.

https://bugzilla.redhat.com/show_bug.cgi?id=1320470

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-07-18 09:05:12 +02:00
Jiri Denemark
b1305a6b8f qemu: Copy complete domain def in qemuDomainDefFormatBuf
Playing directly with our live definition, updating it, and reverting it
back once we are done is very nice and it's quite dangerous too. Let's
just make a copy of the domain definition if needed and do all tricks on
the copy.

https://bugzilla.redhat.com/show_bug.cgi?id=1320470

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-07-18 09:05:12 +02:00
Michal Privoznik
6b6e2cf92b qemuDomainObjPrivateFree: Free @masterKey too
This one's a bit more complicated. In qemuProcessPrepareDomain()
a master key for encrypting secret for ciphered disks is created.
This object lives within qemuDomainObjPrivate object. It is freed
in qemuProcessStop(), but if nobody calls it (for instance like
our qemuxml2argvtest does), the key object leaks.

==17078== 32 bytes in 1 blocks are definitely lost in loss record 633 of 707
==17078==    at 0x4C2C070: calloc (vg_replace_malloc.c:623)
==17078==    by 0xAD924DF: virAllocN (viralloc.c:191)
==17078==    by 0x5050BA6: virCryptoGenerateRandom (qemuxml2argvmock.c:166)
==17078==    by 0x453DC8: qemuDomainMasterKeyCreate (qemu_domain.c:678)
==17078==    by 0x47A36B: qemuProcessPrepareDomain (qemu_process.c:4913)
==17078==    by 0x47C728: qemuProcessCreatePretendCmd (qemu_process.c:5542)
==17078==    by 0x433698: testCompareXMLToArgvFiles (qemuxml2argvtest.c:332)
==17078==    by 0x4339AC: testCompareXMLToArgvHelper (qemuxml2argvtest.c:413)
==17078==    by 0x446E7A: virTestRun (testutils.c:179)
==17078==    by 0x445BD9: mymain (qemuxml2argvtest.c:2022)
==17078==    by 0x44886F: virTestMain (testutils.c:969)
==17078==    by 0x445D9B: main (qemuxml2argvtest.c:2036)

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-07-11 16:24:04 +02:00
Daniel P. Berrange
ed1fbd7c5b Fix logic in qemuDomainObjPrivateXMLParseVcpu
The code in qemuDomainObjPrivateXMLParseVcpu for parsing
the 'idstr' string was comparing the overall boolean
result against 0 which was always true

qemu/qemu_domain.c: In function 'qemuDomainObjPrivateXMLParseVcpu':
qemu/qemu_domain.c:1482:59: error: comparison of constant '0' with boolean expression is always false [-Werror=bool-compare]
     if ((idstr && virStrToLong_uip(idstr, NULL, 10, &idx)) < 0 ||
                                                           ^

It was further performing two distinct error checks in
the same conditional and reporting a single error message,
which was misleading in one of the two cases.

This splits the conditional check into two parts with
distinct error messages and fixes the logic error.

Fixes the bug in

  commit 5184f398b4
  Author: Peter Krempa <pkrempa@redhat.com>
  Date:   Fri Jul 1 14:56:14 2016 +0200

    qemu: Store vCPU thread ids in vcpu private data objects

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-07-11 10:32:50 +01:00
Peter Krempa
5184f398b4 qemu: Store vCPU thread ids in vcpu private data objects
Rather than storing them in an external array store them directly.
2016-07-11 10:44:09 +02:00
Peter Krempa
3f57ce4a76 qemu: Add cpu ID to the vCPU pid list in the status XML
Note the vcpu ID so that once we allow non-contiguous vCPU topologies it
will be possible to pair thread id's with the vcpus.
2016-07-11 10:44:09 +02:00
Peter Krempa
b91335afe4 qemu: domain: Extract formating and parsing of vCPU thread ids
Further patches will be adding index and modifying the source variables
so this will make it more clear.
2016-07-11 10:44:09 +02:00
Peter Krempa
2540c93203 qemu: domain: Add vcpu private data structure
Members will be added in follow-up patches.
2016-07-11 10:44:09 +02:00
Jiri Denemark
a16ea1a0f3 qemu: Properly reset spiceMigration flag
Otherwise migration during which we didn't send client_migrate_info QMP
command will get stuck waiting for SPICE migration to finish if libvirtd
sent the QMP command in a previous migration attempt.

Broken by bd7c8a69.

https://bugzilla.redhat.com/show_bug.cgi?id=1151723

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-07-08 13:35:17 +02:00
Andrea Bolognani
cd89d3451b qemu: Memory locking is only required for KVM guests on ppc64
Due to the way the hardware works, KVM on ppc64 always requires
memory locking; however, that is not the case for non-KVM ppc64
guests, eg. ppc64 guests that are running on x86_64 with TCG.

Only require memory locking for ppc64 guests if they are using
KVM or, as it's the case for all architectures, they have host
devices assigned using VFIO.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1350772
2016-07-04 10:46:27 +02:00
John Ferlan
60c40ce3be qemu: Introduce helper qemuDomainSecretDiskCapable
Introduce a helper to help determine if a disk src could be possibly used
for a disk secret... Going to need this for hot unplug.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2016-07-01 15:46:57 -04:00
Jiri Denemark
fa3c558596 qemuDomainDeviceDefValidate: Drop unused qemuCaps
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-06-28 13:33:05 +02:00
Laine Stump
d987f63a45 qemu: forbid setting guest-side IP address/route info of <interface>
libvirt's qemu driver doesn't have direct access to the config on the
guest side of a network interface, and currently doesn't have any
method in place to even inform the guest of the desired config. In the
future, an unenforceable attempt to set the guest-side IP info could
be made by adding a static host entry to the appropriate dnsmasq
configuration (or changing the default dhcp client address on the qemu
commandline for type='user' interfaces), or enhancing the guest agent
to allow setting an IP address, but for now it can't have any effect,
and we don't want to give the illusion that it does.

To prevent the "disappearance" of any existing configs with ip
address/route info (due to parser failure), this check is added in the
newly implemented qemuDomainDeviceDefValidate(), which is only called
when a domain is defined or started, *not* when it is reread from disk
at libvirtd startup.
2016-06-26 19:33:09 -04:00
John Ferlan
8be83eef60 qemu: Remove authdef from secret setup
Rather than pass authdef, pass the 'authdef->username' and the
'&authdef->secdef'

Note that a username may be NULL.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2016-06-24 13:23:02 -04:00
John Ferlan
23c5f1b0a1 qemu: Change protocol parameter for secret setup
Rather than assume/pass the protocol to the qemuDomainSecretPlainSetup
and qemuDomainSecretAESSetup, set and pass the secretUsageType based
on the src->protocol type. This will eventually be used by the
virSecretGetSecretString call

Signed-off-by: John Ferlan <jferlan@redhat.com>
2016-06-24 13:23:02 -04:00
Andrea Bolognani
177ecaa598 qemu: Introduce qemuDomainMachineIsPSeries()
This new function checks for both the architecture and the
machine type, so we can use it instead of writing the same
checks over and over again.
2016-06-24 10:17:59 +02:00
Andrea Bolognani
210acdb1a5 qemu: Add architecture checks to qemuDomainMachineIsVirt()
Remove all external architecture checks that have been
made redundant by this change.
2016-06-24 10:17:59 +02:00
John Ferlan
dc428f450a qemu: Add new secret info type
Add 'encinfo' to the extended disk structure. This will contain the
encryption secret (if present).

Signed-off-by: John Ferlan <jferlan@redhat.com>
2016-06-23 12:30:28 -04:00
John Ferlan
1eca5f6581 secret: Move virStorageSecretType and rename
Move the enum into a new src/util/virsecret.h, rename it to be
virSecretLookupType. Add a src/util/virsecret.h in order to perform
a couple of simple operations on the secret XML and virSecretLookupTypeDef
for clearing and copying.

This includes quite a bit of collateral damage, but the goal is to remove
the "virStorage*" and replace with the virSecretLookupType so that it's
easier to to add new lookups that aren't necessarily storage pool related.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2016-06-23 12:30:27 -04:00
Ján Tomko
8b04ce598d Add newDomain parameter to qemuDomainAssignAddresses
Pass 'true' if we are not dealing with a migration.
2016-06-23 07:45:31 +02:00
Jiri Denemark
d85c3a5451 Report auto convergence throttle rate in migration stats
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-06-22 15:54:21 +02:00
Jiri Denemark
0b224cce8b qemu: Fix reference leak in qemuDomainDefPostParse
The function gets a reference on virQEMUDriverConfig which needs to be
released before returning.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-06-22 10:29:15 +02:00
Ján Tomko
ff52e9d43a Remove separator argument from virBitmapParse
Most the callers pass 0 in one form or another, including
vircapstest which used VIR_ARCH_NONE.
2016-06-20 12:09:52 +02:00
Andrea Bolognani
86a68bdb0c qemu: Permit PCI-free aarch64 mach-virt guests
There has been some progress lately in enabling virtio-pci on
aarch64 guests; however, guest OS support is still spotty at best,
so most guests are going to be using virtio-mmio instead.

Currently, mach-virt guests are closely modeled after q35 guests,
and that includes always adding a dmi-to-pci-bridge that's just
impossible to get rid of. While that's acceptable (if suboptimal)
for q35, where you will always need some kind of PCI device anyway,
mach-virt guests should be allowed to avoid it.
2016-06-17 18:30:04 +02:00
Andrea Bolognani
b31b3eee85 qemu: Fix alignment in virDomainDefAddController() call 2016-06-17 13:06:52 +02:00
Peter Krempa
f8d565bf86 conf: Rename virDomainDefGetMemoryActual to virDomainDefGetMemoryTotal 2016-06-17 10:39:40 +02:00
Peter Krempa
a877a1635b conf: Remove pre-calculation of initial memory size
While we need to know the difference between the total memory stored in
<memory> and the actual size not included in the possible memory modules
we can't pre-calculate it reliably. This is due to the fact that
libvirt's XML is copied via formatting and parsing the XML and the
initial memory size can be reliably calculated only when certain
conditions are met due to backwards compatibility.

This patch removes the storage of 'initial_memory' and fixes the helpers
to recalculate the initial memory size all the time from the total
memory size. This conversion is possible when we also make sure that
memory hotplug accounts properly for the update of the total memory size
and thus the helpers for inserting and removing memory devices need to
be tweaked too.

This fixes a bug where a cold-plug and cold-remove of a memory device
would increase the size reported in <memory> in the XML by the size of
the memory device. This would happen as the persistent definition is
copied before attaching the device and this would lead to the loss of
data in 'initial_memory'.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1344892
2016-06-17 10:36:41 +02:00
Laine Stump
d5fb8f4564 qemu: don't add pci-bridge to Q35/arm domains unless it's needed
Until now, a Q35 domain (or arm/virt, or any other domain that has a
pcie-root bus) would always have a pci-bridge added, so that there
would be a hotpluggable standard PCI slot available to plug in any PCI
devices that might be added. This patch removes the explicit add,
instead relying on the pci-bridge being auto-added during PCI address
assignment (it will add a pci-bridge if there are no free slots).

This doesn't eliminate the dmi-to-pci-bridge controller that is
explicitly added whether or not a standard PCI slot is required (and
that is almost never used as anything other than a converter between
pcie.0's PCIe slots and standard PCI). That will be done separately.
2016-06-16 13:48:25 -04:00
Laine Stump
97b215a450 qemu: don't be as insistent about adding dmi-to-pci-bridge or pci-bridge
Previously there was no way to have a Q35 domain that didn't have
these two controllers. This patch skips their creation as long as
there are some other kinds of pci controllers at index 1 and 2
(e.g. some pcie-root-port controllers).

I'm hoping that soon we won't add them at all, plugging all devices
into auto-added pcie-*-port ports instead, but in the meantime this
makes it easier to experiment with alternative bus hierarchies.
2016-06-16 13:32:11 -04:00
Pavel Hrdina
acc83afe33 vnc: add support for listen type 'socket'
VNC graphics already supports sockets but only via 'socket' attribute.
This patch coverts that attribute into listen type 'socket'.

For backward compatibility we need to handle listen type 'socket' and 'socket'
attribute properly to support old XMLs and new XMLs.  If both are provided they
have to match, if only one of them is provided we need to be able to parse that
configuration too.

To not break migration back to old libvirt if the socket is provided by user we
need to generate migratable XML without the listen element and use only 'socket'
attribute.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2016-06-09 14:42:48 +02:00
Pavel Hrdina
17271d04e7 vnc: rename socketAutogenerated to socketFromConfig
Even though it's auto-generated it's based on qemu.conf option and listen type
address already uses "fromConfig" to carry this information.  Following commits
will convert the socket to listen element so this rename is required because
there will be also an option to get socket auto-generated independently on the
qemu.conf option.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2016-06-09 14:22:30 +02:00
Martin Kletzander
f670008b58 qemu: Move channel path generation out of command creation
Put it into separate function called qemuDomainPrepareChannel() and call
it from the new qemuProcessPrepareDomain().

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2016-06-09 13:23:15 +02:00
Peter Krempa
1e467f6622 qemu: domain: Sanitize return value handling in disk presence checker
One of the functions is returning always 0 and the second one uses
unnecessary labels.
2016-06-08 08:15:11 +02:00
Peter Krempa
91a6eacc8f qemu: domain: Implement helper for one-shot log entries to the VM log file
Along with the virtlogd addition of the log file appending API implement
a helper for logging one-shot entries to the log file including the
fallback approach of using direct file access.

This will be used for noting the shutdown of the qemu proces and
possibly other actions such as VM migration and other critical VM
lifecycle events.
2016-06-07 18:10:29 +02:00
Peter Krempa
5972f185e1 qemu: Move check that validates 'min_guarantee' to qemuDomainDefValidate
Introduce a validation callback for qemu and move checking of
min_guarantee to the new callback.
2016-06-07 13:02:20 +02:00
Peter Krempa
b394af162a conf: Add infrastructure for adding configuration validation
Until now we weren't able to add checks that would reject configuration
once accepted by the parser. This patch adds a new callback and
infrastructure to add such checks. In this patch all the places where
rejecting a now-invalid configuration wouldn't be a good idea are marked
with a new parser flag.
2016-06-07 13:02:20 +02:00
Ján Tomko
5c4b6e8f5f qemu: assume QEMU_CAPS_DEVICE almost everywhere
Remove more checks that are no longer necessary.
2016-05-23 09:39:40 +02:00
John Ferlan
a1344f70a1 qemu: Utilize qemu secret objects for RBD auth/secret
https://bugzilla.redhat.com/show_bug.cgi?id=1182074

If they're available and we need to pass secrets to qemu, then use the
qemu domain secret object in order to pass the secrets for RBD volumes
instead of passing the base64 encoded secret on the command line.

The goal is to make AES secrets the default and have no user interaction
required in order to allow using the AES mechanism. If the mechanism
is not available, then fall back to the current plain mechanism using
a base64 encoded secret.

New APIs:

qemu_domain.c:
  qemuDomainGetSecretAESAlias:
    Generate/return the secret object alias for an AES Secret Info type.
    This will be called from qemuDomainSecretAESSetup.

  qemuDomainSecretAESSetup: (private)
    This API handles the details of the generation of the AES secret
    and saves the pieces that need to be passed to qemu in order for
    the secret to be decrypted. The encrypted secret based upon the
    domain master key, an initialization vector (16 byte random value),
    and the stored secret. Finally, the requirement from qemu is the IV
    and encrypted secret are to be base64 encoded.

qemu_command.c:
  qemuBuildSecretInfoProps: (private)
    Generate/return a JSON properties object for the AES secret to
    be used by both the command building and eventually the hotplug
    code in order to add the secret object. Code was designed so that
    in the future perhaps hotplug could use it if it made sense.

  qemuBuildObjectSecretCommandLine (private)
    Generate and add to the command line the -object secret for the
    secret. This will be required for the subsequent RBD reference
    to the object.

  qemuBuildDiskSecinfoCommandLine (private)
    Handle adding the AES secret object.

Adjustments:

qemu_domain.c:
  The qemuDomainSecretSetup was altered to call either the AES or Plain
  Setup functions based upon whether AES secrets are possible (we have
  the encryption API) or not, we have secrets, and of course if the
  protocol source is RBD.

qemu_command.c:
  Adjust the qemuBuildRBDSecinfoURI API's in order to generate the
  specific command options for an AES secret, such as:

    -object secret,id=$alias,keyid=$masterKey,data=$base64encodedencrypted,
            format=base64
    -drive file=rbd:pool/image:id=myname:auth_supported=cephx\;none:\
           mon_host=mon1.example.org\:6321,password-secret=$alias,...

  where the 'id=' value is the secret object alias generated by
  concatenating the disk alias and "-aesKey0". The 'keyid= $masterKey'
  is the master key shared with qemu, and the -drive syntax will
  reference that alias as the 'password-secret'. For the -drive
  syntax, the 'id=myname' is kept to define the username, while the
  'key=$base64 encoded secret' is removed.

  While according to the syntax described for qemu commit '60390a21'
  or as seen in the email archive:

    https://lists.gnu.org/archive/html/qemu-devel/2016-01/msg04083.html

  it is possible to pass a plaintext password via a file, the qemu
  commit 'ac1d8878' describes the more feature rich 'keyid=' option
  based upon the shared masterKey.

Add tests for checking/comparing output.

NB: For hotplug, since the hotplug code doesn't add command line
    arguments, passing the encoded secret directly to the monitor
    will suffice.
2016-05-20 11:09:05 -04:00
John Ferlan
97868a2b85 qemu: Introduce qemuDomainSecretSetup
Currently just a shim to call qemuDomainSecretPlainSetup, but soon to be more
2016-05-20 11:09:05 -04:00
John Ferlan
238032505f util: Introduce virCryptoGenerateRandom
Move the logic from qemuDomainGenerateRandomKey into this new
function, altering the comments, variable names, and error messages
to keep things more generic.

NB: Although perhaps more reasonable to add soemthing to virrandom.c.
    The virrandom.c was included in the setuid_rpc_client, so I chose
    placement in vircrypto.
2016-05-20 11:09:05 -04:00
Pavel Hrdina
ed7683f4d6 qemu_domain: add a empty listen type address if we remove socket for VNC
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2016-05-20 10:05:55 +02:00
Cole Robinson
5d7314bbcf qemu: Assign device addresses in PostParse
This wires up qemuDomainAssignAddresses into the new
virDomainDefAssignAddressesCallback, so it's always triggered
via virDomainDefPostParse. We are essentially doing this already
with open coded calls sprinkled about.

qemu argv parse output changes slightly since previously it wasn't
hitting qemuDomainAssignAddresses.
2016-05-18 14:33:58 -04:00
Andrea Bolognani
1a012c9a51 qemu: Automatically choose usable GIC version
When the <gic/> element in not present in the domain XML, use the
domain capabilities to figure out what GIC version is usable and
choose that one automatically.

This allows guests to be created on hardware that only supports
GIC v3 without having to update virt-manager and similar tools.

Keep using the default GIC version if the <gic/> element has been
added to the domain XML but no version has been specified, as not
to break existing guests.
2016-05-18 11:27:50 +02:00
John Ferlan
abd2272c02 secret: Alter virSecretGetSecretString
Rather than returning a "char *" indicating perhaps some sized set of
characters that is NUL terminated, alter the function to return 0 or -1
for success/failure and add two parameters to handle returning the
buffer and it's size.

The function no longer encodes the returned secret, rather it returns
the unencoded secret forcing callers to make the necessary adjustments.

Alter the callers to handle the adjusted model.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2016-05-16 12:58:48 +02:00
Peter Krempa
fb1dddfb00 qemu: domain: Fix names for functions that clear security info
They don't free the structure itself so they should be called *Clear
rather than *Free.
2016-05-16 12:58:48 +02:00
Peter Krempa
1d632c3924 secret: util: Refactor virSecretGetSecretString
Call the internal driver callbacks rather than the public APIs to avoid
calling unnecessarily the error dispatching code and don't overwrite
the error messages provided by the APIs. They are good enough to
describe which secret is missing either by UUID or the usage (basically
name).
2016-05-16 12:58:48 +02:00
Shivaprasad G Bhat
be1a7e6d31 leave out the default USB controller only on i440fx during migration
Further followup discussions in list on commit 192a53e concluded
that we should be leaving out the USB controller only for
i440fx machines as default USB can be used by someone on q35
at random slots.

Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
2016-05-13 10:11:00 +02:00
John Ferlan
677b94f487 qemu: Change from SecretIV or _IV to SecretAES or _AES
The preferred name will be AES not IV, change current references

Signed-off-by: John Ferlan <jferlan@redhat.com>
2016-05-12 09:30:08 -04:00
Laine Stump
e5aecc2f80 conf: log error when incorrect PCI root controller is added to domain
libvirt may automatically add a pci-root or pcie-root controller to a
domain, depending on the arch/machinetype, and it hopefully always
makes the right decision about which to add (since in all cases these
controllers are an implicit part of the virtual machine).

But it's always possible that someone will create a config that
explicitly supplies the wrong type of PCI controller for the selected
machinetype. In the past that would lead to an error later when
libvirt was trying to assign addresses to other devices, for example:

  XML error: PCI bus is not compatible with the device at
  0000:00:02.0. Device requires a PCI Express slot, which is not
  provided by bus 0000:00

(that's the error message that appears if you replace the pcie-root
controller in a Q35 domain with a pci-root controller).

This patch adds a check at the same place that the implicit
controllers are added (to ensure that the same logic is used to check
which type of pci root is correct). If a pci controller with index='0'
is already present, we verify that it is of the model that we would
have otherwise added automatically; if not, an error is logged:

  The PCI controller with index='0' must be " model='pcie-root' for
  this machine type, " but model='pci-root' was found instead.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1004602
2016-05-10 17:03:24 -04:00
John Ferlan
fc5c1e7fe9 qemu: Add extra checks for secret destroy API's
Remove the possibility that a NULL hostdev->privateData or a
disk->privateData could crash libvirtd by checking for NULL
before dereferencing for the secinfo structure in the
qemuDomainSecret{Disk|Hostdev}Destroy functions. The hostdevPriv
could be NULL if qemuProcessNetworkPrepareDevices adds a new
hostdev during virDomainNetGetActualHostdev that then gets
inserted via virDomainHostdevInsert. The hostdevPriv was added
by commit id '27726d8' and is currently only used by scsi hostdev.
2016-05-10 15:48:08 -04:00
Peter Krempa
a391a9c5b1 qemu: domain: Don't treat unknown storage type as not having backing chain
qemuDomainCheckDiskPresence has short-circuit code to skip the
determination of the disk backing chain for storage formats that can't
have backing volumes. The code treats VIR_STORAGE_FILE_NONE as not
having backing chain and skips the call to qemuDomainDetermineDiskChain.

This is wrong as qemuDomainDetermineDiskChain is responsible for storage
format detection and has logic to determine the default type if format
detection is disabled.

This allows to storage passed via <disk type="volume"> to circumvent the
enforcement to have correct storage format or that we shall default to
format='raw', since we don't set the default type via the post parse
callback for "volume" backed disks as the translation code could come up
with a better guess.

This resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1328003
2016-05-09 13:40:17 +02:00