727 Commits

Author SHA1 Message Date
Peter Krempa
20ee78bf9b qemu: domain: Properly lookup top of chain in qemuDomainGetStorageSourceByDevstr
When idx is 0, virStorageFileChainLookup returns the base (bottom) of the
backing chain rather than the top, but the callers of
qemuDomainGetStorageSourceByDevstr expect the top.

Add a special case for idx == 0
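
A minimal sketch of the special case, assuming a hypothetical `struct image`
linked list in place of libvirt's real storage source type. `chain_lookup()`
only models the behaviour described above; it is not the actual
virStorageFileChainLookup implementation.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for one image in a backing chain. */
struct image {
    const char *path;
    struct image *backing;   /* next image towards the base, or NULL */
};

/* Models the lookup described above: idx 0 yields the base of the chain,
 * idx N > 0 yields the N-th image below the top (NULL if out of range). */
struct image *
chain_lookup(struct image *top, unsigned int idx)
{
    struct image *cur = top;
    unsigned int i;

    if (!cur)
        return NULL;

    if (idx == 0) {
        while (cur->backing)
            cur = cur->backing;
        return cur;
    }

    for (i = 0; cur && i < idx; i++)
        cur = cur->backing;
    return cur;
}

/* Wrapper with the special case: idx 0 means the top of the chain. */
struct image *
lookup_by_index(struct image *top, unsigned int idx)
{
    if (idx == 0)
        return top;
    return chain_lookup(top, idx);
}
```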
2017-03-29 16:56:05 +02:00
Andrea Bolognani
7e667664d2 qemu: Fix memory locking limit calculation
For guests that use <memoryBacking><locked>, our only option
is to remove the memory locking limit altogether.

Partially-resolves: https://bugzilla.redhat.com/1431793
2017-03-28 10:54:49 +02:00
Andrea Bolognani
1f7661af8c qemu: Remove qemuDomainRequiresMemLock()
Instead of having a separate function, we can simply return
zero from the existing qemuDomainGetMemLockLimitBytes() to
signal the caller that the memory locking limit doesn't need
to be set for the guest.

Having a single function instead of two makes it less likely
that we will use the wrong value, which is exactly what
happened when we started applying the limit that was meant
for VFIO-using guests to <memoryBacking><locked>-using
guests.
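
A rough sketch of the single-function approach, under stated assumptions:
`struct guest`, `MEMLOCK_UNLIMITED` and the 1 GiB headroom for VFIO guests are
illustrative and not taken from this log. The locked case follows the later
"Fix memory locking limit calculation" entry listed above, and a return value
of zero means the limit does not need to be set.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical sentinel meaning "remove the limit altogether". */
#define MEMLOCK_UNLIMITED ULLONG_MAX

/* Hypothetical, trimmed-down view of a guest's configuration. */
struct guest {
    unsigned long long memory_bytes; /* guest RAM size */
    bool locked;                     /* <memoryBacking><locked/> is present */
    bool uses_vfio;                  /* at least one VFIO (or mdev) hostdev */
};

/* Returns 0 when no memory locking limit needs to be set for the guest. */
unsigned long long
get_memlock_limit_bytes(const struct guest *g)
{
    if (g->locked)
        return MEMLOCK_UNLIMITED;

    if (g->uses_vfio)
        return g->memory_bytes + (1ULL << 30); /* assumed 1 GiB headroom */

    return 0;
}
```

With one function, a caller cannot ask "is a limit required?" and then fetch
a value computed for a different kind of guest.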
2017-03-28 10:54:47 +02:00
Andrea Bolognani
4b67e7a377 Revert "qemu: Forbid <memoryBacking><locked> without <memtune><hard_limit>"
This reverts commit c2e60ad0e5.

Turns out this check is excessively strict: there are ways
other than <memtune><hard_limit> to raise the memory locking
limit for QEMU processes, one prominent example being
tweaking /etc/security/limits.conf.

Partially-resolves: https://bugzilla.redhat.com/1431793
2017-03-28 10:44:25 +02:00
Erik Skultety
c8e6775f30 qemu: Bump the memory locking limit for mdevs as well
Since mdevs are just another type of VFIO device, we should increase
the memory locking limit the same way we do for VFIO PCI devices.

Signed-off-by: Erik Skultety <eskultet@redhat.com>
2017-03-27 15:39:35 +02:00
Erik Skultety
de4e8bdbc7 qemu: cgroup: Adjust cgroups' logic to allow mediated devices
As goes for all the other hostdev device types, grant the qemu process
access to /dev/vfio/<mediated_device_iommu_group>.

Signed-off-by: Erik Skultety <eskultet@redhat.com>
2017-03-27 15:39:35 +02:00
Erik Skultety
ec783d7c77 conf: Introduce new hostdev device type mdev
A mediated device will be identified by the UUID of the user's pre-created
mediated device (with 'model' now being a mandatory <hostdev> attribute
that represents the mediated device API). We also need to make sure that
if the user explicitly provides a guest address for an mdev device, the
address type matches the device API supported on that specific mediated
device; otherwise we error out with an incorrect XML message.

The resulting device XML:
<devices>
  <hostdev mode='subsystem' type='mdev' model='vfio-pci'>
    <source>
      <address uuid='c2177883-f1bb-47f0-914d-32a22e3a8804'/>
    </source>
  </hostdev>
</devices>

Signed-off-by: Erik Skultety <eskultet@redhat.com>
2017-03-27 15:39:35 +02:00
Peter Krempa
9b93c4c264 qemu: domain: Add helper to look up disk source by the backing store string
2017-03-27 10:18:16 +02:00
Peter Krempa
4e1618ce72 qemu: domain: Add helper to generate indexed backing store names
The code is currently simple, but if we later add node names, it will be
necessary to generate the names based on the node name. Add a helper so
that there's a central point to fix once we add self-generated node
names.
2017-03-27 09:29:57 +02:00
Peter Krempa
1a5e2a8098 qemu: domain: Add helper to lookup disk by node name
Looks up a disk and its corresponding backing chain element by node
name.
2017-03-27 09:29:57 +02:00
John Ferlan
1a6b6d9a56 qemu: Set up the migration TLS objects for target
If the migration flags indicate this migration will be using TLS,
then set up the destination during the prepare phase once the target
domain has been started to add the TLS objects to perform the migration.

This will create at least an "-object tls-creds-x509,endpoint=server,..."
for TLS credentials and potentially an "-object secret,..." to handle the
passphrase response to access the TLS credentials. The alias/id used for
the TLS objects will contain "libvirt_migrate".

Once the objects are created, the code will set the "tls-creds" and
"tls-hostname" migration parameters to signify usage of TLS.

During the Finish phase we'll be sure to attempt to clear the
migration parameters and delete those objects (whether or not they
were created). We'll also perform the same reset during recovery
if we've reached FINISH3.

If the migration isn't using TLS, then be sure to check if the
migration parameters exist and clear them if so.
2017-03-25 08:19:49 -04:00
Jiri Denemark
fcd56ce866 qemu: Set default values for CPU check attribute
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2017-03-17 11:50:48 +01:00
Michal Privoznik
7b89f857d9 qemu: Namespaces for NVDIMM
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-03-15 17:04:33 +01:00
Michal Privoznik
1bc173199e qemu: Implement NVDIMM
So, the majority of the code is ready as-is. Well, with one
slight change: differentiate between dimm and nvdimm in places
like device alias generation, generating the command line and so
on.

Speaking of the command line, we also need to append 'nvdimm=on'
to the '-machine' argument so that the nvdimm feature is
advertised in the ACPI tables properly.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-03-15 14:16:32 +01:00
Michal Privoznik
b4e8a49f8d Introduce NVDIMM memory model
NVDIMM is a new type of memory introduced in QEMU 2.6. The idea
is that we have a Non-Volatile memory module that keeps the data
persistent across domain reboots.

At the domain XML level, we already have some representation of
'dimm' modules. Long story short, NVDIMM will utilize the
existing <memory/> element that lives under <devices/> by adding
a new value 'nvdimm' for the existing @model attribute and
introducing a new <path/> element for <source/> while reusing other
fields. The resulting XML would appear as:

    <memory model='nvdimm'>
      <source>
        <path>/tmp/nvdimm</path>
      </source>
      <target>
        <size unit='KiB'>523264</size>
        <node>0</node>
      </target>
      <address type='dimm' slot='0'/>
    </memory>

So far, this is just an XML parser/formatter extension. The QEMU
driver implementation is in the next commit.

For more info on NVDIMM visit the following web page:

    http://pmem.io/

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-03-15 13:30:58 +01:00
Michal Privoznik
290a00e41d qemuDomainBuildNamespace: Handle file mount points
https://bugzilla.redhat.com/show_bug.cgi?id=1431112

Yeah, that's right. A mount point doesn't have to be a directory.
It can be a file too. However, the code that tries to preserve
mount points under /dev for the new qemu namespace does not account
for that option.
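
A minimal sketch of the idea, not the actual libvirt code: before a mount
point can be carried over, a target of the same kind has to exist, so the
hypothetical helper `create_mount_target()` creates a directory for directory
mount points and an empty file for everything else.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create an empty target of the same kind (directory or file) as 'src',
 * so that 'src' could later be mounted onto 'dst'.
 * Returns 0 on success, -1 on error (errno is set). */
int
create_mount_target(const char *src, const char *dst)
{
    struct stat sb;
    int fd;

    if (stat(src, &sb) < 0)
        return -1;

    if (S_ISDIR(sb.st_mode))
        return mkdir(dst, 0700);

    /* Anything else (regular file, device node, ...) gets a plain file. */
    fd = open(dst, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return -1;

    return close(fd);
}
```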

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-03-13 13:32:45 +01:00
Michal Privoznik
e915942b05 qemuProcessHandleMonitorEOF: Disable namespace for domain
https://bugzilla.redhat.com/show_bug.cgi?id=1430634

If a qemu process has died, we get EOF on its monitor. At this
point, since the qemu process was the only one running in the
namespace, the kernel has already cleaned the namespace up. Any
attempt of ours to enter it has to fail.

This really happened in the bug linked above. We've tried to
attach a disk to qemu and while we were in the monitor talking to
qemu it just died. Therefore our code tried to do some roll back
(e.g. deny the device in cgroups again, restore labels, etc.).
However, during the roll back (esp. when restoring labels) we
still thought that domain has a namespace. So we used secdriver's
transactions. This failed as there is no namespace to enter.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-03-10 16:02:34 +01:00
Pavel Hrdina
cd4a8b9304 conf: store "autoGenerated" for graphics listen in status XML
When libvirtd is started we call qemuDomainRecheckInternalPaths
to detect whether a domain has VNC socket path generated by libvirt
based on option from qemu.conf.  However if we are parsing status XML
for running domain the existing socket path can be generated also if
the config XML uses the new <listen type='socket'/> element without
specifying any socket.

The current code doesn't distinguish how the socket was generated
and always marks it as "fromConfig".  We need to store the
"autoGenerated" value in the status XML in order to preserve that
information.

The difference between "fromConfig" and "autoGenerated" is important
for migration, because if the socket is based on "fromConfig" we don't
print it into the migratable XML and we assume that user has properly
configured qemu.conf on both hosts.  However if the socket is based
on "autoGenerated" it means that a new feature was used and therefore
we need to leave the socket in migratable XML to make sure that if
this feature is not supported on destination the migration will fail.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2017-03-09 10:22:43 +01:00
John Ferlan
b2e5de96c7 qemu: Rename variable
Rename 'secretUsageType' to 'usageType' since it's superfluous in an
API qemu*Secret*
2017-03-08 14:37:05 -05:00
John Ferlan
7c2b7891cc qemu: Introduce qemuDomainSecretInfoTLSNew
Building upon the qemuDomainSecretInfoNew, create a helper which will
build the secret used for TLS.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2017-03-08 14:31:09 -05:00
John Ferlan
c9a7b7b6ea qemu: Introduce qemuDomainSecretInfoNew
Create a helper which will create the secinfo used for disks, hostdevs,
and chardevs.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2017-03-08 14:31:07 -05:00
Pavel Hrdina
3ffea19acd qemu_domain: cleanup the controller post parse code
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2017-03-07 16:50:35 +01:00
Pavel Hrdina
57404ff7a7 qemu_domain: move controller post parse code into its own function
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2017-03-07 16:50:34 +01:00
Michal Privoznik
4da534c0b9 qemu: Enforce qemuSecurity wrappers
Now that we have some qemuSecurity wrappers over
virSecurityManager APIs, let's make sure everybody sticks with
them. We have them for a reason, and calling a virSecurityManager
API directly instead of the wrapper may lead to accidentally
labelling a file on the host instead of in the namespace.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-03-06 08:54:28 +01:00
Marc Hartmayer
e22de286b1 qemu: Fix deadlock across fork() in QEMU driver
The functions in virCommand() after fork() must be careful with regard
to accessing any mutexes that may have been locked by other threads in
the parent process. It is possible that another thread in the parent
process holds the lock for the virQEMUDriver while fork() is called.
This leads to a deadlock in the child process when
'virQEMUDriverGetConfig(driver)' is called and therefore the handshake
never completes between the child and the parent process. Ultimately
the virDomainObjectPtr will never be unlocked.

It gets much worse if the other thread of the parent process, that
holds the lock for the virQEMUDriver, tries to lock the already locked
virDomainObject. This leads to a completely unresponsive libvirtd.

It's possible to reproduce this case with calling 'virsh start XXX'
and 'virsh managedsave XXX' in a tight loop for multiple domains.

This commit fixes the deadlock in the same way as it is described in
commit 61b52d2e38.

Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
2017-02-21 15:47:32 +01:00
Michal Privoznik
5c74cf1f44 qemu: Allow @rendernode for virgl domains
When enabling virgl, qemu opens /dev/dri/render*. So far, we are
not allowing that in devices CGroup nor creating the file in
domain's namespace and thus requiring users to set the paths in
qemu.conf. This, however, is suboptimal as it allows access to
ALL qemu processes even those which don't have virgl configured.
Now that we have a way to specify render node that qemu will use
we can be more cautious and enable just that.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-20 10:44:22 +01:00
Michal Privoznik
1bb787fdc9 qemuDomainGetHostdevPath: Report /dev/vfio/vfio less frequently
So far, qemuDomainGetHostdevPath has no knowledge of the reason
it is called and thus reports /dev/vfio/vfio for every VFIO
backed device. This is suboptimal, as we want it to:

a) report /dev/vfio/vfio on every addition or domain startup
b) report /dev/vfio/vfio only on last VFIO device being unplugged

If a domain is being stopped then namespace and CGroup die with
it so no need to worry about that. I mean, even when a domain
that's exiting has more than one VFIO device assigned to it,
this function does not clean /dev/vfio/vfio in CGroup nor in the
namespace. But that doesn't matter.
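
Rules a) and b) above can be sketched as a small predicate. The function name
and its parameters are hypothetical; the real function takes different
arguments.

```c
#include <stdbool.h>
#include <stddef.h>

/* Decide whether /dev/vfio/vfio should be reported alongside the device's
 * own /dev/vfio/<group> path.
 *
 * teardown:             true when a hostdev is being unplugged
 * remaining_vfio_devs:  VFIO devices left in the domain after the unplug
 */
bool
report_vfio_control_node(bool teardown, size_t remaining_vfio_devs)
{
    /* a) every addition or domain startup needs the control node */
    if (!teardown)
        return true;

    /* b) on unplug, only the last VFIO device takes the node with it */
    return remaining_vfio_devs == 0;
}
```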

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-02-20 07:21:59 +01:00
Michal Privoznik
b8e659aa98 qemuDomainGetHostdevPath: Create /dev/vfio/vfio iff needed
So far, we are allowing /dev/vfio/vfio in the devices cgroup
unconditionally (and creating it in the namespace too), even if the
domain has no hostdev assignment configured. This is a potential
security hole. Therefore, when starting the domain (or
hotplugging a hostdev) create & allow /dev/vfio/vfio too (if
needed).

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-02-20 07:21:58 +01:00
Michal Privoznik
9d92f533f8 qemuSetupHostdevCgroup: Use qemuDomainGetHostdevPath
Since these two functions are nearly identical (with
qemuSetupHostdevCgroup actually calling virCgroupAllowDevicePath)
we can have one function call the other and thus de-duplicate
some code.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-02-20 07:21:58 +01:00
Michal Privoznik
b57bd206b9 qemu_conf: Check for namespaces availability more wisely
The bare fact that mnt namespace is available is not enough for
us to allow/enable qemu namespaces feature. There are other
requirements: we must copy all the ACL & SELinux labels otherwise
we might grant access that is administratively forbidden or vice
versa.
At the same time, the check for namespace prerequisites is moved
from domain startup time to qemu.conf parser as it doesn't make
much sense to allow users to start misconfigured libvirt just to
find out they can't start a single domain.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-15 12:43:23 +01:00
Andrea Bolognani
ee6ec7824d qemu: Call chmod() after mknod()
mknod() is affected by the current umask, so we're not
guaranteed the newly-created device node will have the
right permissions.

Call chmod(), which is not affected by the current umask,
immediately afterwards to solve the issue.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1421036
2017-02-14 19:23:05 +01:00
Ján Tomko
723fef99c0 qemu: enforce maximum ports value for nec-xhci
This controller only allows up to 15 ports.
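
A sketch of such a check, with a hypothetical validator name; only the limit
of 15 comes from the message above.

```c
#include <stdbool.h>

/* Port limit taken from the commit message above. */
#define NEC_XHCI_MAX_PORTS 15

/* Hypothetical validator: 'ports' is the value requested in the domain XML. */
bool
nec_xhci_ports_valid(unsigned int ports)
{
    return ports <= NEC_XHCI_MAX_PORTS;
}
```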

https://bugzilla.redhat.com/show_bug.cgi?id=1375417
2017-02-13 16:34:09 +01:00
Michal Privoznik
c2130c0d47 qemu_security: Introduce ImageLabel APIs
Just like we need wrappers over other virSecurityManager APIs, we
need one for virSecurityManagerSetImageLabel and
virSecurityManagerRestoreImageLabel. Otherwise we might end up
relabelling device in wrong namespace.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-09 08:04:57 +01:00
Michal Privoznik
b7feabbfdc qemuDomainNamespaceSetupDisk: Simplify disk check
Firstly, instead of checking for next->path the
virStorageSourceIsEmpty() function should be used which also
takes disk type into account.
Secondly, not every disk source passed has the correct type set
(due to our laziness). Therefore, instead of checking for
virStorageSourceIsBlockLocal() and also S_ISBLK() the former can
be refined to just virStorageSourceIsLocalStorage().

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-08 15:56:21 +01:00
Michal Privoznik
786d8d91b4 qemuDomainDiskChainElement{Prepare,Revoke}: manage /dev entry
Again, one missed bit. This time, without this commit there is no
/dev entry in the namespace of the qemu process when doing disk
snapshots or block-copy.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-08 15:56:13 +01:00
Michal Privoznik
18ce9d139d qemuDomainNamespace{Setup,Teardown}Disk: Don't pass pointer to full disk
These functions do not need to see the whole virDomainDiskDef.
Moreover, they are going to be called from places where we don't
have access to the full disk definition. Sticking with
virStorageSource is more than enough.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-08 15:56:05 +01:00
Michal Privoznik
76d491ef14 qemuDomainNamespaceSetupDisk: Drop useless @src variable
Since its introduction in 81df21507b this variable was never
used.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-08 15:55:56 +01:00
Michal Privoznik
8dc867e978 qemu_domain: Don't pass virDomainDeviceDefPtr to ns helpers
There is no need for this. None of the namespace helpers uses it.
Historically it was used when calling secdriver APIs, but we
don't do that anymore.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-08 15:55:52 +01:00
Andrea Bolognani
c2e60ad0e5 qemu: Forbid <memoryBacking><locked> without <memtune><hard_limit>
In order for memory locking to work, the hard limit on memory
locking (and usage) has to be set appropriately by the user.

The documentation mentions the requirement already: with this
patch, it's going to be enforced by runtime checks as well,
by forbidding a non-compliant guest from being defined as well
as edited and started.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1316774
2017-02-07 18:43:10 +01:00
Michal Privoznik
7f0b382522 qemuDomainAttachDeviceMknod: Don't loop endlessly
When working with symlinks it is fairly easy to get into a loop.
Don't.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-07 13:20:19 +01:00
Michal Privoznik
3f5fcacf89 qemuDomainAttachDeviceMknod: Deal with symlinks
Similarly to one of the previous commits, we need to deal
properly with symlinks in hotplug case too.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-07 13:20:17 +01:00
Michal Privoznik
4ac847f93b qemuDomainCreateDevice: Don't loop endlessly
When working with symlinks it is fairly easy to get into a loop.
Don't.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-07 13:18:32 +01:00
Michal Privoznik
54ed672214 qemuDomainCreateDevice: Properly deal with symlinks
Imagine you have a disk with the following source set up:

/dev/disk/by-uuid/$uuid (symlink to) -> /dev/sda

After cbc45525cb the transitive end of the symlink chain is
created (/dev/sda), but we need to create any item in chain too.
Others might rely on that.
In this case, /dev/disk/by-uuid/$uuid comes from domain XML thus
it is this path that secdriver tries to relabel. Not the resolved
one.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-07 13:18:10 +01:00
Michal Privoznik
b621291f5c qemuDomain{Attach,Detach}Device NS helpers: Don't relabel devices
After the previous commit this has become a redundant step.
Also, setting up devices in the namespace and setting their labels
later on are two different steps and should not be done at once.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-02-07 10:40:53 +01:00
Michal Privoznik
572eda12ad qemu: Implement mtu on interface
Not only should we set the MTU on the host end of the device, but
we should also let qemu know what MTU we set.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-26 10:00:01 +01:00
Michal Privoznik
b020cf73fe domain_conf: Introduce <mtu/> to <interface/>
So far we allow setting the MTU for libvirt networks. However, not all
domain interfaces have to be plugged into a libvirt network and
even if they are, they might want to have a different MTU (e.g.
for testing purposes).

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-26 09:59:56 +01:00
Chen Hanxiao
980f2a35c7 qemu_domain: add timestamp in tainting of guests log
We lacked a timestamp in the log of guest tainting,
which makes it hard to track down guest issues,
such as whether a guest powerdown was caused by qemu-monitor-command
or by other issues inside the guest.
With a timestamp in the tainting log,
it is easier to correlate with the guest's /var/log/messages.

Signed-off-by: Chen Hanxiao <chenhanxiao@gmail.com>
2017-01-21 12:34:19 -05:00
Michal Privoznik
57b5e27d3d qemu: set default vhost-user ifname
Based on work of Mehdi Abaakouk <sileht@sileht.net>.

When parsing vhost-user interface XML and no ifname is found we
can try to fill it in in post parse callback. The way this works
is we try to make up interface name from given socket path and
then ask openvswitch whether it knows the interface.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-20 15:42:12 +01:00
Michal Privoznik
d0baf54e53 qemu: Actually unshare() iff running as root
https://bugzilla.redhat.com/show_bug.cgi?id=1413922

While all the code that deals with qemu namespaces correctly
detects whether we are running as root (and turns into a NO-OP for
qemu:///session), the actual unshare() call is not guarded with
such a check. Therefore any attempt to start a domain under
qemu:///session shall fail as unshare() is reserved for root.

The fix consists of moving unshare() call (for which we have a
wrapper called virProcessSetupPrivateMountNS) into
qemuDomainBuildNamespace() where the proper check is performed.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
2017-01-17 13:23:56 +01:00
Michal Privoznik
93a062c3b2 qemu: Copy SELinux labels for namespace too
When creating new /dev/* for qemu, we do chown() and copy ACLs to
create the exact copy from the original /dev. I thought that
copying SELinux labels was not necessary as SELinux would choose
sane defaults. Surprisingly, it does not, leaving the namespace with
the following labels:

crw-rw-rw-. root root system_u:object_r:tmpfs_t:s0     random
crw-------. root root system_u:object_r:tmpfs_t:s0     rtc0
drwxrwxrwt. root root system_u:object_r:tmpfs_t:s0     shm
crw-rw-rw-. root root system_u:object_r:tmpfs_t:s0     urandom

As a result, domain is unable to start:

error: internal error: process exited while connecting to monitor: Error in GnuTLS initialization: Failed to acquire random data.
qemu-kvm: cannot initialize crypto: Unable to initialize GNUTLS library: Failed to acquire random data.

The solution is to copy the SELinux labels as well.

Reported-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-13 14:45:52 +01:00
Michal Privoznik
cbc45525cb qemuDomainCreateDevice: Canonicalize paths
So far the decision whether /dev/* entry is created in the qemu
namespace is really simple: does the path start with "/dev/"?
This can be easily fooled by providing path like the following
(for any considered device like disk, rng, chardev, ..):

  /dev/../var/lib/libvirt/images/disk.qcow2

Therefore, before making the decision the path should be
canonicalized.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-11 18:08:13 +01:00
Michal Privoznik
49f326edc0 qemu: Use namespaces iff available on the host kernel
So far the namespaces were turned on by default unconditionally.
For all non-Linux platforms we provided stub functions that just
ignored whatever namespaces setting there was in qemu.conf and
returned 0 to indicate success. Moreover, we didn't really check
if namespaces are available on the host kernel.

This is suboptimal as we might have ignored user setting.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-11 18:07:43 +01:00
Michal Privoznik
41816751a7 util: Introduce virFileMoveMount
This is a simple wrapper over mount(). However, not every system
out there is capable of moving a mount point. Therefore, instead
of having to deal with this fact in all the places of our code we
can have a simple wrapper and deal with this fact at just one
place.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-11 18:06:30 +01:00
Michal Privoznik
2ff8c30548 qemuDomainSetupAllInputs: Update debug message
Due to a copy-paste error, the debug message reads:

  Setting up disks

It should have been:

  Setting up inputs.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-11 17:39:24 +01:00
Michal Privoznik
269589146c qemu_domain: Move qemuDomainGetPreservedMounts
This function is used only from code compiled on Linux. Therefore
on non-Linux platforms it triggers compilation error:

../../src/qemu/qemu_domain.c:209:1: error: unused function 'qemuDomainGetPreservedMounts' [-Werror,-Wunused-function]

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-10 19:23:49 +01:00
Michal Privoznik
406e390962 qemu: Drop qemuDomainDeleteNamespace
After previous commits, this function is no longer needed.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-10 13:04:57 +01:00
Michal Privoznik
5d198c2b2c qemuDomainCreateNamespace: move mkdir to qemuDomainBuildNamespace
Again, there is no need to create /var/lib/libvirt/$domain.*
directories in CreateNamespace(). It is sufficient to create them
as soon as we need them which is in BuildNamespace. This way we
don't leave them around for the whole lifetime of domain.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-10 13:04:57 +01:00
Michal Privoznik
5d30057695 qemuDomainGetPreservedMounts: Do not special case /dev
The c1140eb9e got me thinking. We don't want to special case /dev
in qemuDomainGetPreservedMounts(), but in all other places in the
code we special case it anyway. I mean,
/var/run/libvirt/$domain.dev path is constructed separately just
so that it is not constructed here. It makes only a little sense
(if any at all).

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-10 13:04:57 +01:00
Michal Privoznik
40ebbf72d5 qemuDomainCreateNamespace: s/unlink/rmdir/
If something goes wrong in this function we try a rollback. That
is unlink all the directories we created earlier. For some weird
reason unlink() was called instead of rmdir().

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-10 13:04:57 +01:00
Martin Kletzander
c1140eb9ed qemu: Remove /dev mount info properly
Just so it doesn't bite us in the future, even though it's unlikely.

And fix the comment above it as well.  Commit e08ee7cd34 took the
info from the function it's calling, but that was a lie itself in the
first place.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2017-01-05 16:24:55 +01:00
Michal Privoznik
e08ee7cd34 qemuDomainGetPreservedMounts: Fetch list of /dev/* mounts dynamically
With my namespace patches, we are spawning qemu in its own
namespace so that we can manage /dev entries ourselves. However,
some filesystems mounted under /dev need to be preserved in
order to be shared with the parent namespace (e.g. /dev/pts).
Currently, the list of mount points to preserve is hardcoded,
which ain't right - on some systems there might be fewer or more
items under the real /dev than on our list. The solution is to parse
/proc/mounts and fetch the list from there.
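
A minimal sketch of picking mount points below /dev out of /proc/mounts,
one line at a time. The helper `mount_below_dev()` is hypothetical and
simplified: it does not unescape sequences such as `\040`, and real code
would more likely iterate over the file with getmntent(3).

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Parses one line in /proc/mounts format
 * ("source target fstype options dump pass").
 * If the mount target lives below /dev/, copies it into 'target'
 * and returns true; otherwise returns false. */
bool
mount_below_dev(const char *line, char *target, size_t target_len)
{
    char mnt[4096];

    /* Skip the source field, keep the mount target. */
    if (sscanf(line, "%*s %4095s", mnt) != 1)
        return false;

    if (strncmp(mnt, "/dev/", 5) != 0)
        return false;

    if (strlen(mnt) >= target_len)
        return false;

    strcpy(target, mnt);
    return true;
}
```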

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-05 16:00:20 +01:00
Michal Privoznik
dd78da09b0 qemuDomainCreateDevice: Be more careful about device path
Again, not something that I'd hit, but there is a chance in
theory that this might bite us. Currently the way we decide
whether or not to create /dev entry for a device is by matching
the first four characters of path with "/dev". This might not be
enough. Just imagine somebody has a disk image stored under
"/devil/path/to/disk". We ought to be matching against "/dev/".

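The difference between the two prefix checks can be sketched as follows; both
function names are made up for the illustration.

```c
#include <stdbool.h>
#include <string.h>

/* Old check: matches "/dev", but also "/devil/..." and "/device/...". */
bool
has_dev_prefix_loose(const char *path)
{
    return strncmp(path, "/dev", 4) == 0;
}

/* New check: the trailing slash restricts the match to entries in /dev. */
bool
has_dev_prefix_strict(const char *path)
{
    return strncmp(path, "/dev/", 5) == 0;
}
```
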
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-04 15:36:42 +01:00
Michal Privoznik
ce01a2b11c qemuDomainAttachDeviceMknodHelper: Don't unlink() so often
Not that I'd encounter any bug here, but the code doesn't look
100% correct. Imagine, somebody is trying to attach a device to a
domain, and the device's /dev entry already exists in the qemu
namespace. This is handled gracefully and the control continues
with setting up ACLs and calling security manager to set up
labels. Now, if any of these steps fails, control jumps to the
'cleanup' label and unlink()s the file straight away. Even when it
was not us who created the file in the first place. This can be
possibly dangerous.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-04 15:36:42 +01:00
Michal Privoznik
3aae99fe71 qemu: Handle EEXIST gracefully in qemuDomainCreateDevice
https://bugzilla.redhat.com/show_bug.cgi?id=1406837

Imagine you have a domain configured in such a way that you are
assigning two PCI devices that fall into the same IOMMU group.
With mount namespace enabled what happens is that for the first
PCI device corresponding /dev/vfio/X entry is created and when
the code tries to do the same for the second mknod() fails as
/dev/vfio/X already exists:

2016-12-21 14:40:45.648+0000: 24681: error :
qemuProcessReportLogError:1792 : internal error: Process exited
prior to exec: libvirt: QEMU Driver error : Failed to make device
/var/run/libvirt/qemu/windoze.dev//vfio/22: File exists

Worse, by default there are some devices that are created in the
namespace regardless of domain configuration (e.g. /dev/null,
/dev/urandom, etc.). If one of them is set as backend for some
guest device (e.g. rng, chardev, etc.) it's the same story as
described above.

Weirdly, in attach code this is already handled.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-01-04 15:36:42 +01:00
John Ferlan
7f7d990483 qemu: Don't assume secret provided for LUKS encryption
https://bugzilla.redhat.com/show_bug.cgi?id=1405269

If a secret was not provided for what was determined to be a LUKS
encrypted disk (during virStorageFileGetMetadata processing when
called from qemuDomainDetermineDiskChain as a result of hotplug
attach qemuDomainAttachDeviceDiskLive), then do not attempt to
look it up (avoiding a libvirtd crash) and do not alter the format
to "luks" when adding the disk; otherwise, the device_add would
fail with a message such as:

   "unable to execute QEMU command 'device_add': Property 'scsi-hd.drive'
    can't find value 'drive-scsi0-0-0-0'"

because of the assumption that with format=luks libvirt would have
provided the secret to decrypt the volume.

Access to unlock the volume will thus be left to the application.
2017-01-03 12:59:18 -05:00
Marc Hartmayer
fb2cd32c9a qemu: qemuDomainDiskChangeSupported: Add missing 'address' check
Disk->info is not live updatable so add a check for this. Otherwise
libvirt reports success even though no data was updated.

Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
2016-12-20 11:22:44 +01:00
Michal Privoznik
ab41ce7f4e qemu: Mark more namespace code linux-only
Some of the functions are not called on non-linux platforms
which makes them useless there.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-16 11:51:06 +00:00
Michal Privoznik
f444faa94a qemu: Enable mount namespace
https://bugzilla.redhat.com/show_bug.cgi?id=1404952

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
661887f558 qemu: Let users opt-out from containerization
Given how intrusive previous patches are, it might happen that
there's a bug or imperfection. Let's give users a way out: if they
set 'namespaces' to an empty array in qemu.conf the feature is
suppressed.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
f95c5c48d4 qemu: Manage /dev entry on RNG hotplug
When attaching a device to a domain that's using separate mount
namespace we must maintain /dev entries in order for qemu process
to see them.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
f5fdf23a68 qemu: Manage /dev entry on chardev hotplug
When attaching a device to a domain that's using separate mount
namespace we must maintain /dev entries in order for qemu process
to see them.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
6e57492839 qemu: Manage /dev entry on hostdev hotplug
When attaching a device to a domain that's using separate mount
namespace we must maintain /dev entries in order for qemu process
to see them.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
81df21507b qemu: Manage /dev entry on disk hotplug
When attaching a device to a domain that's using separate mount
namespace we must maintain /dev entries in order for qemu process
to see them.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
2160f338a7 qemu: Prepare RNGs when starting a domain
When starting a domain and separate mount namespace is used, we
have to create all the /dev entries that are configured for the
domain.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
8ec8a8c5ff qemu: Prepare inputs when starting a domain
When starting a domain and separate mount namespace is used, we
have to create all the /dev entries that are configured for the
domain.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
2c654490f3 qemu: Prepare TPM when starting a domain
When starting a domain and separate mount namespace is used, we
have to create all the /dev entries that are configured for the
domain.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
4e4451019c qemu: Prepare chardevs when starting a domain
When starting a domain and separate mount namespace is used, we
have to create all the /dev entries that are configured for the
domain.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
73267cec46 qemu: Prepare hostdevs when starting a domain
When starting a domain and separate mount namespace is used, we
have to create all the /dev entries that are configured for the
domain.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
054202d020 qemu: Prepare disks when starting a domain
When starting a domain and separate mount namespace is used, we
have to create all the /dev entries that are configured for the
domain.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
bb4e529664 qemu: Spawn qemu under mount namespace
Prime time. When it comes to spawning the qemu process and
relabelling all the devices it's going to touch, there's an inherent
race with other applications in the system (e.g. udev). Instead
of trying to convince udev not to touch libvirt managed devices,
we can create a separate mount namespace for qemu, and mount
our own /dev there. Of course this puts more work onto us as we
have to maintain /dev files on each domain start and device
hot(un-)plug. On the other hand, this also enhances security.

From a technical POV, during domain startup the parent
(libvirtd) creates:

  /var/lib/libvirt/qemu/$domain.dev
  /var/lib/libvirt/qemu/$domain.devpts

The child (which is going to be qemu eventually) calls unshare()
to create a new mount namespace. From now on anything the child
does is invisible to the parent. The child then mounts tmpfs on
$domain.dev (so that it still sees original /dev from the host)
and creates some devices (as explained in one of the previous
patches). The devices have to be created exactly as they are in
the host (including perms, seclabels, ACLs, ...). After that it
moves $domain.dev mount to /dev.
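
The sequence above can be written down as data. This is a minimal sketch
that only lists the steps in order and executes nothing; the function name
and the action labels are hypothetical, not libvirt identifiers, and the
$domain.devpts handling is left out.

```python
import posixpath


def namespace_setup_plan(state_dir, domain):
    """Return the ordered steps the child takes, as (action, args) tuples.

    Nothing is executed here. The action names are illustrative labels
    chosen for this sketch, not libvirt function names.
    """
    dev_path = posixpath.join(state_dir, domain + ".dev")
    return [
        ("unshare", ("CLONE_NEWNS",)),        # new mount namespace for the child
        ("mount_tmpfs", (dev_path,)),         # host /dev is still visible here
        ("create_devices", (dev_path,)),      # same perms, seclabels, ACLs as host
        ("move_mount", (dev_path, "/dev")),   # private /dev replaces the host one
    ]


plan = namespace_setup_plan("/var/lib/libvirt/qemu", "guest1")
```

Mounting tmpfs on a side path first and moving it afterwards is what lets
the child copy device nodes from the host /dev before it is hidden.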

What's the $domain.devpts mount there for then, you ask? QEMU can
create PTYs for some chardevs. And historically we exposed the
host ends in our domain XML allowing users to connect to them.
Therefore we must preserve devpts mount to be shared with the
host's one.

To make this patch as small as possible, creation of the devices
configured for the domain in question is implemented in the next patches.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-15 09:25:16 +01:00
Michal Privoznik
7ed6934f3b virDomainObjGetShortName: take virDomainDef
So far this function takes virDomainObjPtr which:
1) is overkill,
2) might not be available in all the places we will use it.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-12-08 15:45:52 +01:00
Laine Stump
9b0848d523 qemu: propagate virQEMUDriver object to qemuDomainDeviceCalculatePCIConnectFlags
If libvirtd is running unprivileged, it can open a device's PCI config
data in sysfs, but can only read the first 64 bytes. But as part of
determining whether a device is Express or legacy PCI,
qemuDomainDeviceCalculatePCIConnectFlags() will be updated in a future
patch to call virPCIDeviceIsPCIExpress(), which tries to read beyond
the first 64 bytes of the PCI config data and fails with an error log
if the read is unsuccessful.

In order to avoid creating a parallel "quiet" version of
virPCIDeviceIsPCIExpress(), this patch passes a virQEMUDriverPtr down
through all the call chains that initialize the
qemuDomainFillDevicePCIConnectFlagsIterData, and saves the driver
pointer with the rest of the iterdata so that it can be used by
qemuDomainDeviceCalculatePCIConnectFlags(). This pointer isn't used
yet, but will be used in an upcoming patch (that detects Express vs
legacy PCI for VFIO assigned devices) to examine driver->privileged.
2016-11-30 15:28:07 -05:00
Michal Privoznik
c2a5a4e7ea virstring: Unify string list function names
We have a couple of functions that operate over NULL terminated
lists of strings. However, our naming sucks:

virStringJoin
virStringFreeList
virStringFreeListCount
virStringArrayHasString
virStringGetFirstWithPrefix

We can do better:

virStringListJoin
virStringListFree
virStringListFreeCount
virStringListHasString
virStringListGetFirstWithPrefix

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-11-25 13:54:05 +01:00
Nikolay Shirokovskiy
aaf2992d90 qemu: agent: fix unsafe agent access
qemuDomainObjExitAgent is unsafe.

First, it accesses the domain object without the domain lock.
Second, it uses outdated logic that goes back to commit 79533da1 from
2009, when the code was quite different (the unref function, instead
of only unreferencing, unlocked and disposed of the object on the last
reference and left unlocking to the caller otherwise).
Nowadays this logic may lead to disposing of a locked object,
I guess.

Another problem is that the callers of qemuDomainObjEnterAgent
use the domain object again (namely priv->agent) without the domain lock.

This patch addresses these two problems.

qemuDomainGetAgent is dropped as unused.
2016-11-23 11:31:28 +03:00
Nikolay Shirokovskiy
3c1c56781d qemu: drop write-only agentStart
2016-11-23 11:31:14 +03:00
Marc Hartmayer
1c122e737e Refactoring: Use virHostdevIsSCSIDevice()
Use the util function virHostdevIsSCSIDevice() to simplify if
statements.

Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
2016-11-22 14:37:36 +01:00
Marc Hartmayer
505bc9b025 qemu: Fix improper union member access on hostdevs
Add missing checks if a hostdev is a subsystem/SCSI device before accessing
the union member 'subsys'/'scsi'.  Also fix indentation and simplify
qemuDomainObjCheckHostdevTaint().

Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
2016-11-22 14:37:36 +01:00
Peter Krempa
0df2524acb qemu: domain: Refresh vcpu halted state using qemuMonitorGetCpuHalted
Don't use qemuMonitorGetCPUInfo which does a lot of matching to get the
full picture which is not necessary and would be mostly discarded.

Refresh only the vcpu halted state using data from query-cpus.
2016-11-21 17:19:48 +01:00
Peter Krempa
3f71c79768 qemu: monitor: Extract qemu cpu id along with other data
Storing of the ID will allow simpler extraction of data present only in
query-cpus without the need to call qemuMonitorGetCPUInfo in statistics
paths.
2016-11-21 17:19:48 +01:00
Laine Stump
d8bd837669 qemu: add a USB3 controller to Q35 domains by default
Previously we added a set of EHCI+UHCI controllers to Q35 machines to
mimic real hardware as closely as possible, but recent discussions
have pointed out that the nec-usb-xhci (USB3) controller is much more
virtualization-friendly (uses less CPU), so this patch switches the
default for Q35 machinetypes to add an XHCI instead (if it's
supported, which it of course *will* be).

Since none of the existing test cases left out USB controllers in the
input XML, a new Q35 test case was added which has *no* devices, so
ends up with only the defaults always put in by qemu, plus those added
by libvirt.
2016-11-14 14:22:23 -05:00
Laine Stump
807232203a qemu: don't force-add a dmi-to-pci-bridge just on principle
Now that a dmi-to-pci-bridge is automatically added just when it's needed
(when a pci-bridge is being added), we no longer have any need to
force-add one to every single Q35 domain.
2016-11-14 14:21:43 -05:00
Laine Stump
50adb8a660 qemu: new functions qemuDomainMachineHasPCI[e]Root()
These functions provide a simple one line method of learning if the
current domain has a pci-root or pcie-root bus.
2016-11-14 14:03:09 -05:00
Martin Kletzander
acf0ec024a qemu: Save various defaults for shmem
We're keeping some things at default and that's not something we want to
do intentionally.  Let's save some sensible defaults upfront in order to
avoid having problems later.  The details for the defaults (of the newer
implementation) can be found in qemu's commit 5400c02b90bb:

  http://git.qemu.org/?p=qemu.git;a=commit;h=5400c02b90bb

Since we are merely saving the defaults it will not change the guest ABI
and thanks to the fact that we're doing it in the PostParse callback it
will not break the ABI stability checks.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2016-11-02 16:05:39 +01:00
John Ferlan
daf5c651f0 qemu: Add a secret object to/for a char source dev
Add the secret object so the 'passwordid=' can be added to the command line
if there's a secret defined in/on the host for TCP chardev TLS objects.

Preparation for the secret involves adding the secinfo to the char source
device prior to command line processing. There are multiple possibilities
for TCP chardev source backend usage.

Add test for at least a serial chardev as an example.
2016-10-26 07:18:25 -04:00
Viktor Mihajlovski
08f22976b1 qemu: Add domain support for VCPU halted state
Adding a field to the domain's private vcpu object to hold the halted
state information.
Adding two functions in support of the halted state:
- qemuDomainGetVcpuHalted: retrieve the halted state from a
  private vcpu object
- qemuDomainRefreshVcpuHalted: obtain the per-vcpu halted states
  via qemu monitor and store the results in the private vcpu objects

Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Hao QingFeng <haoqf@linux.vnet.ibm.com>
Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
2016-10-24 18:52:36 -04:00
Pavel Hrdina
7c8df1e82f domain: fix migration to older libvirt
Since TLS was introduced hostwide for libvirt 2.3.0 and a domain
configurable haveTLS was implemented for libvirt 2.4.0, we have to
modify the migratable XML for the specific case where the 'tls' attribute
is based on the setting from qemu.conf.

The "tlsFromConfig" is a libvirt-internal attribute and is stored only in
the status XML to ensure that when libvirtd is restarted this internal flag
is not lost by the restart.

That flag is used to decide whether we should put *tls* attribute to
migratable XML or not.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2016-10-24 16:29:26 +02:00
Pavel Hrdina
0298531b29 domain: Add optional 'tls' attribute for TCP chardev
Add an optional "tls='yes|no'" attribute for a TCP chardev.

For QEMU, setting the value to "no" disables the host config 'chardev_tls'
setting for a domain chardev channel. Setting the value to "yes" attempts
to use a host TLS environment when the host config 'chardev_tls' setting
is disabled but a TLS environment is configured via either the host config
'chardev_tls_x509_cert_dir' or 'default_tls_x509_cert_dir'.
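
The precedence described above can be sketched as follows. This is a
minimal illustration, not the libvirt implementation: the function name is
hypothetical, and the sketch does not model whether a certificate
directory is actually configured.

```python
def effective_chardev_tls(domain_tls, cfg_chardev_tls):
    """Decide whether TLS is requested for one TCP chardev.

    domain_tls      -- "yes", "no", or None when the attribute is absent
    cfg_chardev_tls -- bool, the host-wide 'chardev_tls' setting
    """
    if domain_tls is None:
        # No per-domain override: follow the host configuration.
        return cfg_chardev_tls
    if domain_tls not in ("yes", "no"):
        raise ValueError("tls must be 'yes' or 'no'")
    # An explicit attribute wins over the host configuration.
    return domain_tls == "yes"
```

With "yes" and a disabled host setting, TLS can still only work if one of
the certificate directory settings points at a usable environment.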

Signed-off-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2016-10-24 16:05:33 +02:00
John Ferlan
7bd8312e7f conf: Move the privateData from virDomainChrDef to virDomainChrSourceDef
Commit id '5f2a132786' should have placed the data in the host source
def structure since that's also used by smartcard, redirdev, and rng in
order to provide a backend tcp channel.  The data in the private structure
will be necessary in order to provide the secret properly.

This also renames the previous names from "Chardev" to "ChrSource" for
the private data structures and APIs.
2016-10-21 16:42:59 -04:00
John Ferlan
77a12987a4 Introduce virDomainChrSourceDefNew for virDomainChrDefPtr
Change the virDomainChrDef to use a pointer to 'source' and allocate
that pointer during virDomainChrDefNew.

This has tremendous "fallout" in the rest of the code which mainly
has to change source.$field to source->$field.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2016-10-21 14:03:36 -04:00
John Ferlan
5f2a132786 qemu: Introduce qemuDomainChardevPrivatePtr
Modeled after the qemuDomainHostdevPrivatePtr (commit id '27726d8c'),
create a privateData pointer in the _virDomainChardevDef to allow storage
of private data for a hypervisor in order to at least temporarily store
secret data for usage during qemuBuildCommandLine.

NB: Since the qemu_parse_command (qemuParseCommandLine) code is not
expecting to restore the secret data, there's no need to add
code to handle this new structure there.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2016-10-19 15:40:29 -04:00
John Ferlan
6262a9b282 qemu: Remove unnecessary NULL arg check
qemuDomainSecret{Disk|Hostdev}Prepare has a prototype that checks for
ATTRIBUTE_NONNULL(1) for 'conn'.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2016-10-17 15:38:32 -04:00
Pavel Hrdina
fb8f3b1c22 qemu_command: add support to use virtio as secondary video device
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1369633

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2016-10-12 17:46:48 +02:00
Pavel Hrdina
4c029e8cfa qemu_command: properly detect which model to use for video device
This improves commit 706b5b6277 in a way that we check qemu capabilities
instead of what architecture we are running on to detect whether we can
use *virtio-vga* model or not.  This applies not only to arm/aarch64.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2016-10-12 17:46:48 +02:00
Pavel Hrdina
133fb1401f qemu_domain: move video validation out of qemu_command
All definition validation that doesn't depend on qemu capabilities
and was allowed previously as valid definition should be placed into
qemuDomainDefValidate.

The check whether video type is supported or not was based on an enum
that translates type into model.  Use switch to ensure that if new
video type is added, it will be properly handled.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2016-10-12 17:46:47 +02:00
Peter Krempa
043ba4a40a qemu: Reuse virDomainDefGetVcpusTopology to calculate total vcpu count
Rather than multiplying sockets, cores, and threads use the new helper
for getting the vcpu count resulting from the topology.
2016-10-11 13:52:09 +02:00
Michal Privoznik
8cfdd6e4f5 Revert "conf: Skip post parse callbacks when creating copy"
This breaks vCPU hotplug, because when starting a domain, we
create a copy of domain definition (which becomes live XML) and
during the post parse callbacks we might adjust some tunings so
that vCPU hotplug is possible.

This reverts commit 581b7756af.
2016-10-04 18:00:02 +02:00
Michal Privoznik
581b7756af conf: Skip post parse callbacks when creating copy
When creating a copy of virDomainDef we save ourselves the
trouble of writing deep-copy functions and just format and parse
back domain/device XML. However, the XML we are parsing was
already fully formatted - there is no reason to run post parse
callbacks (which fill in blanks - there are none!).

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-09-26 16:50:12 +02:00
Michal Privoznik
4172ae371b qemuDomainDefAssignAddresses: Fetch caps from domain object
Just like we did two commits ago, don't try to fetch capabilities
for non-existing binary. Re-use the ones we have for running
domain.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-09-26 16:50:12 +02:00
Michal Privoznik
1e501043f7 qemuDomainDeviceDefPostParse: Fetch caps from domain object
Just like we did two commits ago, don't try to fetch capabilities
for non-existing binary. Re-use the ones we have for running
domain.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-09-26 16:50:12 +02:00
Michal Privoznik
70b36a7b7e qemuDomainDefPostParse: Fetch qemuCaps from domain object
We can't rely on def->emulator path. It may be provided by user
as we give them opportunity to provide their own XML for
migration. Therefore the path may point to just whatever binary
(or even to a non-existent file). Moreover, this path is meant
for destination, but the capabilities lookup is done on source.
What we can do is to assume same capabilities for post parse
callbacks as the running domain has. They will be used just to
add some default models/controllers/devices/... anyway.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-09-26 16:50:12 +02:00
Michal Privoznik
cf198684a8 conf: Extend virDomainDefAssignAddressesCallback for parseOpaque
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-09-26 16:50:12 +02:00
Michal Privoznik
78ab5dcea0 conf: Extend virDomainDeviceDefPostParse for parseOpaque
Just like virDomainDefPostParseCallback has gained new
parseOpaque argument, we need to follow the logic with
virDomainDeviceDefPostParse.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-09-26 16:50:12 +02:00
Michal Privoznik
2e056b5c51 virDomainDefCopy: Introduce @parseOpaque argument
We want to pass the proper opaque pointer instead of NULL to
virDomainDefParseString.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-09-26 16:50:12 +02:00
Michal Privoznik
c41b989112 virDomainDefParse{File,String}: Introduce @parseOpaque argument
We want to pass the proper opaque pointer instead of NULL to
virDomainDefParse and subsequently virDomainDefParseNode too.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-09-26 16:50:12 +02:00
Michal Privoznik
940d91c55b virDomainDefPostParse: Introduce @parseOpaque argument
Some callers might want to pass yet another pointer to opaque
data to post parse callbacks. The driver generic one is not
enough because two threads executing post parse callback might
want to see different data (e.g. domain object pointer that
domain def belongs to).

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-09-26 16:50:12 +02:00
Jiri Denemark
7ce711a30e qemu: Update guest CPU def in live XML
Storing the updated CPU definition in the live domain definition saves
us from having to update it over and over when we need it. Not to
mention that we will soon further update the CPU definition according to
QEMU once it's started.

A highly wanted side effect of this patch: libvirt will pass all CPU
features explicitly specified in domain XML to QEMU, even those that are
already included in the host model.

This patch should fix the following bugs:
    https://bugzilla.redhat.com/show_bug.cgi?id=1207095
    https://bugzilla.redhat.com/show_bug.cgi?id=1339680
    https://bugzilla.redhat.com/show_bug.cgi?id=1371039
    https://bugzilla.redhat.com/show_bug.cgi?id=1373849
    https://bugzilla.redhat.com/show_bug.cgi?id=1375524
    https://bugzilla.redhat.com/show_bug.cgi?id=1377913

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-09-22 15:40:09 +02:00
Jiri Denemark
3b6be3c0c5 cpu: Rework cpuUpdate
The reworked API is now called virCPUUpdate and it should change the
provided CPU definition into one which can be consumed by the QEMU
command line builder:

    - host-passthrough remains unchanged
    - host-model is turned into custom CPU with a model and features
      copied from host
    - custom CPU with minimum match is converted similarly to host-model
    - optional features are updated according to host's CPU

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-09-22 15:40:09 +02:00
Jiri Denemark
b27adaed37 qemu: Propagate virCapsPtr to virQEMUCapsNewForBinaryInternal
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-09-22 15:40:08 +02:00
Martin Kletzander
0f61d7b5f2 qemu: Abstract shmem socket path preparation
Put it into qemuDomainPrepareShmemChardev() so it can be used later.
Also don't fill in the path unless the server option is enabled.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2016-09-20 15:42:43 +02:00
Peter Krempa
64bc75f756 qemu: domain: Don't infer vcpu state
Use the state information (online, hotpluggable) provided by the monitor
code rather than trying to infer it. This fixes an issue where on
architectures that require hotplug of multiple threads at once the
sub-cores would get updated as offline on daemon restart thus creating
an invalid configuration.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1375783
2016-09-14 12:56:57 +02:00
Jiri Denemark
56258a388f qemu: Don't use query-migrate on destination
When migration fails, we need to poke QEMU monitor to check for a reason
of the failure. We did this using query-migrate QMP command, which is
not supposed to return any meaningful result on the destination side.
Thus if the monitor was still functional when we detected the migration
failure, parsing the answer from query-migrate always failed with the
following error message:

    "info migration reply was missing return status"

This irrelevant message was then used as the reason for the migration
failure replacing any message we might have had.

Let's use harmless query-status for poking the monitor to make sure we
only get an error if the monitor connection is broken.

https://bugzilla.redhat.com/show_bug.cgi?id=1374613

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-09-12 15:56:10 +02:00
Peter Krempa
6e19cc59a6 qemu: domain: Clear startup policy for dropped removable media
When a source image is dropped when missing due to startup policy the
policy needs to be cleared since it was relevant only for the given
storage source. New sources need to update it if needed.
2016-09-12 09:54:36 +02:00
Michal Privoznik
c56cdf2593 conf: Add support for virtio-net.rx_queue_size
https://bugzilla.redhat.com/show_bug.cgi?id=1366989

QEMU added another virtio-net tunable [1]. It basically allows
users to set the size of RX virtio ring. But because virtio-net
uses two separate ring buffers to pass data from/to guest they
named it explicitly rx_queue_size. We should expose it in our XML
too.

1: http://lists.nongnu.org/archive/html/qemu-devel/2016-08/msg02029.html

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-09-09 16:16:59 +02:00
Nikolay Shirokovskiy
c62e79c8ca qemu: Filter cur_balloon ABI check for certain transactions
Since the domain lock is not held during preparation of an external XML
config, it is possible that the value can change resulting in unexpected
failures during ABI consistency checking for some save and migrate
operations.

This patch adds a new flag to skip the checking of the cur_balloon value
and then sets the destination value to the source value to ensure
subsequent checks without the skip flag will succeed.

This way the value is protected from forgery and is kept up to date too.

Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
2016-09-02 16:54:42 -04:00
Peter Krempa
9eb9106ea5 qemu: command: Add support for sparse vcpu topologies
Add support for using the new approach to hotplug vcpus using device_add
during startup of qemu to allow sparse vcpu topologies.

There are a few limitations imposed by qemu on the supported
configuration:
- vcpu0 needs to be always present and not hotpluggable
- non-hotpluggable cpus need to be ordered at the beginning
- order of the vcpus needs to be unique for every single hotpluggable
  entity

Qemu also doesn't really allow querying the information necessary to
start a VM with the vcpus directly on the commandline. Fortunately they
can be hotplugged during startup.

The new hotplug code uses the following approach:
- non-hotpluggable vcpus are counted and put to the -smp option
- qemu is started
- qemu is queried for the necessary information
- the configuration is checked
- the hotpluggable vcpus are hotplugged
- vcpus are started

This patch adds a lot of checking code and enables the support to
specify the individual vcpu element with qemu.
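
The three limitations listed above can be sketched as a check over a
simplified model. This is an assumption-laden illustration: each vcpu is
treated as its own hotpluggable entity, the dictionary keys mirror the XML
attribute names, and the function name is hypothetical.

```python
def check_sparse_vcpus(vcpus):
    """Return a list of violated constraints (empty when the layout is fine).

    Each vcpu is a dict with keys: id, enabled, hotpluggable, order.
    'order' may be None for vcpus that are not enabled.
    """
    errors = []

    # vcpu0 needs to be always present and not hotpluggable.
    by_id = {v["id"]: v for v in vcpus}
    v0 = by_id.get(0)
    if v0 is None or not v0["enabled"] or v0["hotpluggable"]:
        errors.append("vcpu0 must be present and not hotpluggable")

    ordered = [v for v in vcpus if v["enabled"] and v.get("order") is not None]
    fixed = [v["order"] for v in ordered if not v["hotpluggable"]]
    hot = [v["order"] for v in ordered if v["hotpluggable"]]

    # Non-hotpluggable vcpus need to come first in the order.
    if fixed and hot and max(fixed) >= min(hot):
        errors.append("non-hotpluggable vcpus must be ordered first")

    # Every hotpluggable entity needs its own order value.
    if len(hot) != len(set(hot)):
        errors.append("hotpluggable vcpus must have unique order")

    return errors
```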
2016-08-24 15:44:47 -04:00
Peter Krempa
20ef1232ec qemu: process: Copy final vcpu order information into the vcpu definition
The vcpu order information is extracted only for hotpluggable entities,
while vcpu definitions belonging to the same hotpluggable entity need
to all share the order information.

We also can't overwrite it right away in the vcpu info detection code as
the order is necessary to add the hotpluggable vcpus enabled on boot in
the correct order.

The helper will store the order information in places where we are
certain that it's necessary.
2016-08-24 15:44:47 -04:00
Peter Krempa
48e3d42889 qemu: migration: Prepare for non-contiguous vcpu configurations
Introduce a new migration cookie flag that will be used for any
configurations that are not compatible with libvirt versions that do not
support the specific vcpu hotplug approach. This will make sure that old
libvirt does not fail to reproduce the configuration correctly.
2016-08-24 15:44:47 -04:00
Peter Krempa
5847bc5c64 conf: Add XML for individual vCPU hotplug
Individual vCPU hotplug requires us to track the state of any vCPU. To
allow this add the following XML:

<domain>
  ...
  <vcpu current='2'>3</vcpu>
  <vcpus>
    <vcpu id='0' enabled='yes' hotpluggable='no' order='1'/>
    <vcpu id='1' enabled='yes' hotpluggable='yes' order='2'/>
    <vcpu id='2' enabled='no' hotpluggable='yes'/>
  </vcpus>
  ...

The 'enabled' attribute controls the state of the vcpu.
'hotpluggable' controls whether the given vcpu can be hotplugged and 'order'
specifies the order in which to add the vcpus.
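
A minimal sketch of reading such a fragment with Python's standard XML
parser. The fragment here is a self-contained variant with distinct ids 0
to 2 and a closing </domain> tag; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

DOMAIN_XML = """
<domain>
  <vcpu current='2'>3</vcpu>
  <vcpus>
    <vcpu id='0' enabled='yes' hotpluggable='no' order='1'/>
    <vcpu id='1' enabled='yes' hotpluggable='yes' order='2'/>
    <vcpu id='2' enabled='no' hotpluggable='yes'/>
  </vcpus>
</domain>
"""


def summarize_vcpus(xml_text):
    """Return (maximum, current, ids of enabled vcpus) from a domain fragment."""
    root = ET.fromstring(xml_text)
    top = root.find("vcpu")                  # the <vcpu current=...> element
    maximum = int(top.text)
    current = int(top.get("current"))
    entries = root.findall("./vcpus/vcpu")   # the per-vcpu entries
    enabled = [int(v.get("id")) for v in entries if v.get("enabled") == "yes"]
    return maximum, current, enabled


summary = summarize_vcpus(DOMAIN_XML)
```

In this fragment the number of enabled entries equals the 'current' value,
which is the kind of consistency a parser for this XML would want to check.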
2016-08-24 15:44:47 -04:00
Peter Krempa
133be0a9e2 qemu: domain: Prepare for VCPUs vanishing while libvirt is not running
Similarly to devices the guest may allow unplug of the VCPU if libvirt
is down. To avoid problems, refresh the vcpu state on reconnect. Don't
mess with the vcpu state otherwise.
2016-08-24 15:44:47 -04:00
Peter Krempa
6b4a23ff6c qemu: domain: Extract cpu-hotplug related data
Now that the monitor code gathers all the data we can extract it to
relevant places either in the definition or the private data of a vcpu.

As only thread id is broken for TCG guests we may extract the rest of
the data and just skip assigning of the thread id. In case where qemu
would allow cpu hotplug in TCG mode this will make it work eventually.
2016-08-24 15:44:47 -04:00
Peter Krempa
9bbbc88a8f qemu: monitor: Add algorithm for combining query-(hotpluggable-)-cpus data
For hotplug purposes it's necessary to retrieve data using
query-hotpluggable-cpus while the old query-cpus API reports thread IDs
and the order of hotplug.

This patch adds code that merges the data using a rather non-trivial
algorithm and fills the data to the qemuMonitorCPUInfo structure for
adding to appropriate place in the domain definition.
2016-08-24 15:44:47 -04:00
Peter Krempa
ffa536e0f8 qemu: Forbid config when topology based cpu count doesn't match the config
As of qemu commit:
commit a32ef3bfc12c8d0588f43f74dcc5280885bbdb30
Author: Thomas Huth <thuth@redhat.com>
Date:   Wed Jul 22 15:59:50 2015 +0200

    vl: Add another sanity check to smp_parse() function

v2.4.0-952-ga32ef3b

configuration where the maximum CPU count doesn't match the topology is
rejected. Prior to that only configurations where the topology would
contain more cpus than the maximum count would be rejected.

Use QEMU_CAPS_QUERY_HOTPLUGGABLE_CPUS as a relevant recent enough
witness to avoid breaking old configs.
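
The difference between the two behaviours can be sketched as below. The
function and the 'strict' parameter are hypothetical; 'strict' stands in
for "the qemu binary is recent enough", which the patch infers from
QEMU_CAPS_QUERY_HOTPLUGGABLE_CPUS.

```python
def config_accepted(sockets, cores, threads, max_vcpus, strict):
    """Tell whether a topology / maximum vcpu count combination is accepted.

    strict=True  -- newer qemu: the topology must match the maximum exactly
    strict=False -- older qemu: only a topology larger than the maximum fails
    """
    total = sockets * cores * threads
    if strict:
        return total == max_vcpus
    return total <= max_vcpus
```

For example, 1 socket with 2 cores and 2 threads (4 cpus) against a
maximum of 8 passes the old check but not the new one.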
2016-08-24 15:44:47 -04:00
Peter Krempa
5b5f494a1b qemu: monitor: Return structures from qemuMonitorGetCPUInfo
The function will gradually add more returned data. Return a struct for
every vCPU containing the data.
2016-08-24 15:44:47 -04:00
Andrea Bolognani
31de0fab93 qemu: domain: Drop piix3-ohci controller for migration
Now that the default USB controller model is explicit rather
than implicit for i440fx machines, we have to tweak the
conditions for dropping it in order to keep migration towards
libvirt <= 0.9.4 working.
2016-08-12 17:38:02 +02:00
Andrea Bolognani
f55eaccb0c qemu: domain: Reflect USB controller model in guest XML
When the user doesn't specify any model for a USB controller,
we use an architecture-dependent default, but we don't reflect
it in the guest XML.

Pick the default USB controller model when parsing the guest
XML instead of when creating the QEMU command line, so that
our choice is saved back to disk.
2016-08-12 17:38:02 +02:00
Michal Privoznik
9c1524a01c qemu: Enable secure boot
In qemu, enabling this feature boils down to adding the following
onto the command line:

  -global driver=cfi.pflash01,property=secure,value=on

However, there are some constraints resulting from the
implementation. For instance, System Management Mode (SMM) is
required to be enabled, the machine type must be q35-2.4 or
later, and the guest should be x86_64. While technically it is
possible to have 32 bit guests with secure boot, some non-trivial
CPU flags tuning is required (for instance lm and nx flags must
be prohibited). Given the complexity of our CPU driver, this is not
trivial. Therefore I've chosen to forbid 32 bit guests for now.
If there's ever need, we can refine the check later.
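
The constraints named above can be sketched as a validation helper. This
is an illustration under assumptions: machine types are taken to look like
"pc-q35-2.4", an unversioned "q35" alias is accepted as recent enough, and
the function name is hypothetical.

```python
import re


def secure_boot_problems(arch, machine, smm_enabled):
    """Return the reasons why secure boot cannot be enabled (empty if none)."""
    problems = []

    if not smm_enabled:
        problems.append("SMM must be enabled")

    # Assumed naming scheme: optional "pc-" prefix, "q35", optional "-X.Y".
    match = re.fullmatch(r"(?:pc-)?q35(?:-(\d+)\.(\d+))?", machine)
    if match is None:
        problems.append("machine type must be q35")
    elif match.group(1) is not None:
        version = (int(match.group(1)), int(match.group(2)))
        if version < (2, 4):
            problems.append("machine type must be q35-2.4 or later")

    if arch != "x86_64":
        problems.append("guest must be x86_64")

    return problems
```

Comparing the version as a tuple of integers keeps "2.10" newer than
"2.4", which a plain string comparison would get wrong.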

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-08-04 17:22:20 +02:00
Peter Krempa
041f35340b qemu: domain: Simplify return values of qemuDomainRefreshVcpuInfo
Call the vcpu thread info validation separately to decrease complexity
of returned values by qemuDomainRefreshVcpuInfo.

This function now returns 0 on success and -1 on error. Certain
failures of qemu to report data are still considered as success. Any
error reported now is fatal.
2016-08-04 08:08:40 +02:00
Peter Krempa
2bdc300a34 qemu: domain: Improve vCPU data checking in qemuDomainRefreshVcpu
Validate the presence of the thread id according to state of the vCPU
rather than just checking the vCPU count. Additionally put the new
validation code into a separate function so that the information
retrieval can be split from the validation.
2016-08-04 08:08:31 +02:00
Peter Krempa
8f56b5baaf qemu: domain: Rename qemuDomainDetectVcpuPids to qemuDomainRefreshVcpuInfo
The function will eventually do more useful stuff than just detection of
thread ids.
2016-08-04 08:03:58 +02:00
John Ferlan
dd0dbe1d66 qemu: Make QEMU_DRIVE_HOST_PREFIX more private
Move QEMU_DRIVE_HOST_PREFIX into the qemu_alias.c to dissuade future
callers from using it. Create qemuAliasDiskDriveSkipPrefix in order
to handle the current consumers that desire to check if an alias has
the drive- prefix and "get beyond it" in order to get the disk alias.
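
What "getting beyond" the prefix means can be sketched in a few lines.
This is a Python stand-in for the C helper, written only to show the
behaviour described above; the Python function name is hypothetical.

```python
QEMU_DRIVE_HOST_PREFIX = "drive-"


def alias_disk_drive_skip_prefix(alias):
    """Return the disk alias with a leading 'drive-' removed, if present."""
    if alias.startswith(QEMU_DRIVE_HOST_PREFIX):
        return alias[len(QEMU_DRIVE_HOST_PREFIX):]
    # Aliases without the prefix are returned unchanged.
    return alias
```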
2016-08-02 10:11:11 -04:00
Chunyan Liu
c6f0e177a3 qemuDomainDeviceDefPostParse: add USB controller model check
To sync with virDomainControllerModelUSB, we add two models
in qemuControllerModelUSB 'qusb1' and 'qusb2', but those
models are not supported in the qemu driver. So add a check in
device post parse to report an error if 'qusb1' or 'qusb2'
is specified.
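
The kind of check described can be sketched in Python as follows. The function and constant names are hypothetical, and the real check lives in the C post-parse callback; only the two model names come from the message above.

```python
# Models known to the generic parser but, per the message above,
# not supported by the qemu driver.
UNSUPPORTED_USB_MODELS = {"qusb1", "qusb2"}


def check_usb_controller_model(model: str) -> None:
    """Raise if the USB controller model cannot be used with this driver."""
    if model in UNSUPPORTED_USB_MODELS:
        raise ValueError(
            f"USB controller model '{model}' is not supported by this driver"
        )
```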

Signed-off-by: Chunyan Liu <cyliu@suse.com>
2016-08-02 14:02:21 +02:00
Martin Kletzander
a2b97a8d91 qemu: Fix support for startupPolicy with volume/pool disks
Until now we simply errored out when the translation from pool+volume
failed.  However, we should instead check whether that disk is needed or
not since there is an option for that.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1168453

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2016-08-02 13:21:01 +02:00
Martin Kletzander
779a4ea906 qemu: Remove unnecessary label and its only reference
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2016-08-02 13:21:01 +02:00
Martin Kletzander
e2705cfb6e qemu: Make qemuDomainCheckDiskStartupPolicy self-contained
There is an error reset following the function and a check for
startupPolicy before that.  Let's reflect those things inside that
function so that future code doesn't have to be that complex.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2016-08-02 13:21:01 +02:00
Boris Fiuczynski
230c631917 qemu: remove panic dev models s390 and pseries when migrating
The panic devices with models s390 and pseries are autogenerated.
For backwards compatibility reasons the devices are to be removed
when migrating.

Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2016-08-01 14:15:08 +02:00
Michal Privoznik
1e05846373 conf: Catch invalid memory model earlier
Consider the following XML snippet:

    <memory model=''>
      <target>
        <size unit='KiB'>523264</size>
        <node>0</node>
      </target>
    </memory>

What's wrong, you ask? The @model attribute. This should result in
an error thrown into users' faces during the virDomainDefine phase.
Except it doesn't. The XML validation catches this error, but if
users choose to ignore that, they will end up with invalid XML.
Well, they won't be able to start the machine - that's when the error
is currently produced. But it would be nice if we could catch an
error like this earlier.
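
To make the idea of catching this at define time concrete, here is a small Python sketch that rejects the snippet above while parsing it. It is only an illustration: `check_memory_model` is a hypothetical name, and the set of accepted models is an assumption limited to 'dimm'.

```python
import xml.etree.ElementTree as ET

# Assumption for this sketch: 'dimm' is the only accepted model.
KNOWN_MEMORY_MODELS = {"dimm"}

SNIPPET = """
<memory model=''>
  <target>
    <size unit='KiB'>523264</size>
    <node>0</node>
  </target>
</memory>
"""


def check_memory_model(xml_text: str) -> str:
    """Validate the model attribute while parsing, not at startup."""
    model = ET.fromstring(xml_text).get("model")
    if model not in KNOWN_MEMORY_MODELS:
        raise ValueError(f"invalid memory model {model!r}")
    return model
```

With this, the empty model in `SNIPPET` fails as soon as the XML is parsed, rather than when the machine is started.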

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-07-29 11:03:24 +02:00
Daniel P. Berrange
a48c714115 storage: remove "luks" storage volume type
The current LUKS support has a "luks" volume type which has
a "luks" encryption format.

This partially makes sense if you consider the QEMU shorthand
syntax only requires you to specify a format=luks, and it'll
automagically use "raw" as the next level driver. QEMU will
however let you override the "raw" with any other driver it
supports (vmdk, qcow, rbd, iscsi, etc, etc)

IOW the intention though is that the "luks" encryption format
is applied to all disk formats (whether raw, qcow2, rbd, gluster
or whatever). As such it doesn't make much sense for libvirt
to say the volume type is "luks" - we should be saying that it
is a "raw" file, but with "luks" encryption applied.

IOW, when creating a storage volume we should use this XML

  <volume>
    <name>demo.raw</name>
    <capacity>5368709120</capacity>
    <target>
      <format type='raw'/>
      <encryption format='luks'>
        <secret type='passphrase' uuid='0a81f5b2-8403-7b23-c8d6-21ccd2f80d6f'/>
      </encryption>
    </target>
  </volume>

and when configuring a guest disk we should use

  <disk type='file' device='disk'>
    <driver name='qemu' type='raw'/>
    <source file='/home/berrange/VirtualMachines/demo.raw'/>
    <target dev='sda' bus='scsi'/>
    <encryption format='luks'>
      <secret type='passphrase' uuid='0a81f5b2-8403-7b23-c8d6-21ccd2f80d6f'/>
    </encryption>
  </disk>

This commit thus removes the "luks" storage volume type added
in

  commit 318ebb36f1
  Author: John Ferlan <jferlan@redhat.com>
  Date:   Tue Jun 21 12:59:54 2016 -0400

    util: Add 'luks' to the FileTypeInfo

The storage file probing code is modified so that it can probe
the actual encryption formats explicitly, rather than merely
probing the existence of encryption and letting the storage driver
guess the format.

The rest of the code is then adapted to deal with
VIR_STORAGE_FILE_RAW w/ VIR_STORAGE_ENCRYPTION_FORMAT_LUKS
instead of just VIR_STORAGE_FILE_LUKS.

The commit mentioned above was included in libvirt v2.0.0.
So when querying volume XML this will be a change in behaviour
vs the 2.0.0 release - it'll report 'raw' instead of 'luks'
for the volume format, but still report 'luks' for encryption
format.  I think this change is OK because the storage driver
did not include any support for creating volumes, nor starting
guests with luks volumes in v2.0.0 - that was only added since then.
Clearly if we change this we must do it before v2.1.0 though.
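
For a consumer of the volume XML, the change amounts to reading two separate fields. The Python sketch below parses the volume XML shown earlier in this message and returns both; `volume_formats` is a hypothetical helper, not a libvirt API.

```python
import xml.etree.ElementTree as ET

VOLUME_XML = """
<volume>
  <name>demo.raw</name>
  <capacity>5368709120</capacity>
  <target>
    <format type='raw'/>
    <encryption format='luks'>
      <secret type='passphrase' uuid='0a81f5b2-8403-7b23-c8d6-21ccd2f80d6f'/>
    </encryption>
  </target>
</volume>
"""


def volume_formats(xml_text: str):
    """Return (volume format, encryption format or None)."""
    target = ET.fromstring(xml_text).find("target")
    if target is None:
        raise ValueError("volume XML has no <target> element")
    fmt = target.find("format")
    enc = target.find("encryption")
    return (
        fmt.get("type") if fmt is not None else None,
        enc.get("format") if enc is not None else None,
    )
```

For the XML above this yields 'raw' as the volume format and 'luks' as the encryption format, matching the behaviour described for releases after 2.0.0.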

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-07-27 18:59:15 +01:00
Tomasz Flendrich
1aa5e66cf3 qemu: remove ccwaddrs caching
Dropping the caching of ccw address set.
The cached set is not required anymore, because the set is now being
recalculated from the domain definition on demand, so the cache
can be deleted.
2016-07-26 13:04:46 +02:00
Tomasz Flendrich
19a148b7c8 qemu: remove vioserialaddrs caching
Dropping the caching of virtio serial address set.
The cached set is not required anymore, because the set is now being
recalculated from the domain definition on demand, so the cache
can be deleted.

Credit goes to Cole Robinson.
2016-07-26 13:04:46 +02:00
Ján Tomko
ddd31fd7dc Reserve existing USB addresses
Check if they fit on the USB controllers the domain has,
and error out if two devices try to use the same address.
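
A simplified Python sketch of the two checks described: the address must fit on a controller the domain has, and no two devices may share it. The `UsbAddressSet` class is hypothetical, and it models an address as a flat (bus, port) pair, which is an assumption that ignores hubs and nested ports.

```python
class UsbAddressSet:
    """Tracks reserved USB addresses for one domain (illustrative only)."""

    def __init__(self, ports_per_bus):
        # Maps bus number -> number of ports on that controller.
        self._ports = dict(ports_per_bus)
        self._used = set()

    def reserve(self, bus: int, port: int) -> None:
        """Reserve an existing address, or raise if it is not usable."""
        if bus not in self._ports:
            raise ValueError(f"no USB controller for bus {bus}")
        if not 1 <= port <= self._ports[bus]:
            raise ValueError(f"port {port} does not fit on bus {bus}")
        if (bus, port) in self._used:
            raise ValueError(f"USB address {bus}:{port} is already in use")
        self._used.add((bus, port))
```

Usage would be one `reserve()` call per device that already carries an address in the domain definition, with the first failure reported to the user.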
2016-07-21 08:30:26 +02:00