Commit Graph

9608 Commits

Author SHA1 Message Date
Nikolay Shirokovskiy
dd94f36ffb qemu: check iotune params same for all disk in group
Currently it is possible to start a domain which have disks
in same iotune group and at the same time having different iotune
params. Both params set are passed to qemu in command line and the one
that is passed later down command line is get actually set.
Let's prohibit such configurations.

Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-01-29 11:46:51 +01:00
Nikolay Shirokovskiy
e7efffe6cb qemu: propagate iotune settings to all disks in the group
Currently upon successfull call to qemu's implementation of
virDomainSetBlockIoTune iotune settings are changed only for the
disk given in API if the disk is in iotune group while we need
to change the settings for all disks in the group.

Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-01-29 11:46:47 +01:00
Nikolay Shirokovskiy
67ebd6ac26 qemu: Move qemuDiskConfigBlkdeviotuneHas* to conf
And introduce virDomainBlockIoTuneInfoHasAny.

Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-01-29 11:46:17 +01:00
Ján Tomko
d5256cbd55 qemu: eliminate ret in qemuExtDevicesStart
All the callees return either 0 or -1 so there is no need
for propagating the value. And we bail on the first error.

Remove the variable to make the function simpler.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2020-01-28 13:32:27 +01:00
Ján Tomko
e2ca6eb087 qemu: use def instead of vm->def in qemuExtDevicesStart
We have a helper variable to make the code more concise,
use it consistently.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2020-01-28 13:32:27 +01:00
Ján Tomko
f84c7c67d5 qemu: eliminate ret variable in qemuExtTPMStart
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2020-01-28 13:32:27 +01:00
Ján Tomko
06160f6708 qemu: eliminate ret variable in qemuExtTPMStartEmulator
Now that the cleanup section is empty, eliminate the cleanup
label as well as the 'ret' variable.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2020-01-28 13:32:27 +01:00
Ján Tomko
ebe9c31f41 qemu: use g_auto in qemuExtTPMStartEmulator
Use the g_auto macros wherever possible to eliminate the cleanup
section.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2020-01-28 13:32:27 +01:00
Andrea Bolognani
c8a3a5d79b qemu_shim: Update temporary directory template
The template still references libvirt-qemu-shim, which was at one
point the name used to refer to what we now know as virt-qemu-run.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
2020-01-27 17:57:43 +01:00
Andrea Bolognani
7dca28e229 qemu_shim: Fix typos
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
2020-01-27 17:57:08 +01:00
Ján Tomko
c07ef7c563 qemu: snapshot: go through cleanup on error
A recent commit added an error check for too-nested backing chains
followed by a return, even though errors above jump to cleanup.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Fixes: b168fa88b8
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2020-01-27 17:36:05 +01:00
Ján Tomko
26a42e7315 qemu_shim: cosmetic fixes
Remove bogus G_GNUC_UNUSED attribute and add a missing space.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Fixes: d600667278
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2020-01-27 17:36:05 +01:00
Peter Krempa
d9dfc1f7de qemu: checkpoint: Extract calculation of bitmap merging for checkpoint deletion
This will allow some testing before refactoring.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2020-01-27 15:28:49 +01:00
Peter Krempa
6796194a28 qemu: checkpoint: Introduce helper to find checkpoint disk definition in parents
The algorithm is used in two places to find the parent checkpoint object
which contains given disk and then uses data from the disk. Additionally
the code is written in a very non-obvious way. Factor out the lookup of
the disk into a function which also simplifies the callers.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2020-01-27 15:28:49 +01:00
Peter Krempa
180b3422e9 qemu: domain: Remove unused qemuDomainDiskNodeFormatLookup
The function has no users now and there's no need for it as the common
pattern is to look up the whole disk object anyways.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2020-01-27 15:28:49 +01:00
Peter Krempa
f19248a139 qemu: checkpoint: tolerate missing disks on checkpoint deletion
If a disk is unplugged and then the user tries to delete a checkpoint
the code would try to use NULL node name as it was not checked.

Fix this by fetching the whole disk definition object and verifying it
was found.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2020-01-27 15:28:49 +01:00
Peter Krempa
7973f7d792 qemu: checkpoint: Use disk definition directly when creating checkpoint
Lookup the whole disk definition rather than just the node name.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2020-01-27 15:28:49 +01:00
Peter Krempa
f3e0a45a00 qemu: checkpoint: rename disk->chkdisk in qemuCheckpointAddActions
Upcoming patches will also use the domain disk definition. Rename disk
to chkdisk for clarity.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2020-01-27 15:28:49 +01:00
Peter Krempa
a303e8ea47 qemu: checkpoint: rename disk->chkdisk in qemuCheckpointDiscardBitmaps
Upcoming patches will also use the domain disk definition. Rename disk
to chkdisk for clarity.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2020-01-27 15:28:49 +01:00
Peter Krempa
44e1b85717 qemu: checkpoint: split out checkpoint deletion bitmaps
qemuCheckpointDiscard is a massive function that can be separated into
smaller bits. Extract the part that actually modifies the disk from the
metadata handling.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2020-01-27 15:28:49 +01:00
Peter Krempa
606dc66b09 qemu: checkpoint: Store whether deleted checkpoint is current in a variable
Avoid two computations by using a boolean.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2020-01-27 15:28:49 +01:00
Peter Krempa
60b580b949 qemu: capabilities: Add accessor to qemu caps machine types presence
Test code will need to know whether the virQEMUCaps object contains any
machine types already. Add a helper and expose it via 'qemu_capspriv.h'.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-01-27 14:55:03 +01:00
Peter Krempa
3b8feb4793 qemu: capabilities: Replace aliased machine type by copy of the canonical machine
The previous approac of just purging the alias combined with the fact
that we filled in fake machine types in the test data meant that if a
test case used an alias machine type such as 'pc' or 'q35' it would not
properly resolve to the actual data returned by qemu.

This started to be a problem since the CPU driver now looks at the
default CPU reported with the machine type.

This patch replaces the original approach of just removing the alias by
replacing it with a copy of the machine type data which the type would
alias to. This means that we are using the real data while we don't
modify the test output after every qemu upgrade.

Additionally this change will allow us to drop adding the fake machine
types later.

The test fallout is from actually excercising the CPU driver with
actual data.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-01-27 14:55:03 +01:00
Peter Krempa
bb61230992 qemu: capabilities: Extract code from virQEMUCapsStripMachineAliases
Separate out the internals as they will become more complex soon.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-01-27 14:55:03 +01:00
Peter Krempa
0b9d1a8073 qemu: domain: Validate that machine type is supported by qemu
Every supported qemu is able to return the list of machine types it
supports so we can start validating it against that list. The advantage
is a better error message, and the change will also prevent having stale
test data.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-01-27 14:55:03 +01:00
Daniel P. Berrangé
82dd011dbb qemu: fix linking virt-qemu-run on some distros
Debian/Ubuntu linkers are more strict that other distros requiring glib
to be linked explicitly.

macOS needs -export-dynamic instead of -Wl,--export-dynamic

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-27 13:44:56 +00:00
Peter Krempa
b168fa88b8 qemu: snapshot: Prevent too-nested domain XML when doing inactive snapshot
Similarly to 510d154a0b we need to prevent
doing too deeply nested backing chains and reject them with a sane error
message.

Add a loop to go through the snapshots prior to attempting actually
creating them to prevent some possible inconsistent scenarios.

We don't need to do it when reusing backing chains as we'll be
re-detecting the backing chain in that case anyways.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2020-01-27 14:02:01 +01:00
Peter Krempa
8e9e73a984 qemu: snapshot: Always rewrite backingStore data when reusing existing images
Don't adopt the backing store data when reusing images provided by the
user. This will force a backing chain re-probe as users might have
passed in something unexpected in the overlay where our view of the
backing chain would not correspond.

This is done only for inactive snapshots as there we have way less
verification.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2020-01-27 14:02:01 +01:00
Daniel P. Berrangé
d600667278 qemu: introduce a new "virt-qemu-run" program
The previous "QEMU shim" proof of concept was taking an approach of only
caring about initial spawning of the QEMU process. It was then
registered with the libvirtd daemon who took over management of it. The
intent was that later libvirtd would be refactored so that the shim
retained control over the QEMU monitor and libvirt just forwarded APIs
to each shim as needed. This forwarding of APIs would require quite alot
of significant refactoring of libvirtd to achieve.

This impl thus takes a quite different approach, explicitly deciding to
keep the VMs completely separate from those seen & managed by libvirtd.
Instead it uses the new "qemu:///embed" URI scheme to embed the entire
QEMU driver in the shim, running with a custom root directory.

Once the driver is initialization, the shim starts a VM and then waits
to shutdown automatically when QEMU shuts down, or should kill QEMU if
it is terminated itself. This ought to use the AUTO_DESTROY feature but
that is not yet available in embedded mode, so we rely on installing a
few signal handlers to gracefully kill QEMU. This isn't reliable if
we crash of course, but you can restart with the same root dir.

Note this program does not expose any way to manage the QEMU process,
since there's no RPC interface enabled. It merely starts the VM and
cleans up when the guest shuts down at the end. This program is
installed to /usr/bin/virt-qemu-run enabling direct use by end users.
Most use cases will probably want to integrate the concept directly
into their respective application codebases. This standalone binary
serves as a nice demo though, and also provides a way to measure
performance of the startup process quite simply.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-27 11:05:02 +00:00
Daniel P. Berrangé
068efae5b1 qemu: add support for running QEMU driver in embedded mode
This enables support for running QEMU embedded to the calling
application process using a URI:

   qemu:///embed?root=/some/path

Note that it is important to keep the path reasonably short to
avoid risk of hitting the limit on UNIX socket path names
which is 108 characters.

When using the embedded mode with a root=/var/tmp/embed, the
driver will use the following paths:

                logDir: /var/tmp/embed/log/qemu
           swtpmLogDir: /var/tmp/embed/log/swtpm
         configBaseDir: /var/tmp/embed/etc/qemu
              stateDir: /var/tmp/embed/run/qemu
         swtpmStateDir: /var/tmp/embed/run/swtpm
              cacheDir: /var/tmp/embed/cache/qemu
                libDir: /var/tmp/embed/lib/qemu
       swtpmStorageDir: /var/tmp/embed/lib/swtpm
 defaultTLSx509certdir: /var/tmp/embed/etc/pki/qemu

These are identical whether the embedded driver is privileged
or unprivileged.

This compares with the system instance which uses

                logDir: /var/log/libvirt/qemu
           swtpmLogDir: /var/log/swtpm/libvirt/qemu
         configBaseDir: /etc/libvirt/qemu
              stateDir: /run/libvirt/qemu
         swtpmStateDir: /run/libvirt/qemu/swtpm
              cacheDir: /var/cache/libvirt/qemu
                libDir: /var/lib/libvirt/qemu
       swtpmStorageDir: /var/lib/libvirt/swtpm
 defaultTLSx509certdir: /etc/pki/qemu

At this time all features present in the QEMU driver are available when
running in embedded mode, availability matching whether the embedded
driver is privileged or unprivileged.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-27 11:04:03 +00:00
Daniel P. Berrangé
207709a031 libvirt: pass a directory path into drivers for embedded usage
The intent here is to allow the virt drivers to be run directly embedded
in an arbitrary process without interfering with libvirtd. To achieve
this they need to store all their configuration & state in a separate
directory tree from the main system or session libvirtd instances.

This can be useful for doing testing of the virt drivers in "make check"
without interfering with the user's own libvirtd instances.

It can also be used for applications using KVM/QEMU as a piece of
infrastructure to build an service, rather than for general purpose
OS hosting. A long standing example is libguestfs, which would prefer
if its temporary VMs did show up in the main libvirtd VM list, because
this confuses apps such as OpenStack Nova. A more recent example would
be Kata which is using KVM as a technology to build containers.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-27 11:02:16 +00:00
Jonathon Jongsma
dee2218bc8 qemu: explicitly disable virgl when requested
If a domain is configured to have an egl-headless display and a virtio
video device, virgl will be enabled automatically within the guest, even
if the video device is configured with accel3d='no'.

In this case we should explicitly pass 'virgl=off' to qemu.

See https://bugzilla.redhat.com/show_bug.cgi?id=1791236 for more
information.

Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-01-25 07:51:08 +01:00
Han Han
bd51f89c30 qemu: Implement builtin rng backend
Since v4.2-rc0, QEMU introduced a builtin rng backend that uses
getrandom() syscall to generate random. Add it to libvirt with the
backend model 'builtin'.

https://bugzilla.redhat.com/show_bug.cgi?id=1785091

Signed-off-by: Han Han <hhan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-01-24 17:03:35 +01:00
Han Han
6a6d00e168 conf: Add rng backend model builtin
The 'builtin' rng backend model can be used as following:
  <rng model='virtio'>
    <backend model='builtin'/>
  </rng>

Signed-off-by: Han Han <hhan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-01-24 17:03:32 +01:00
Han Han
9378713f56 qemu_capabilities: Introduce QEMU_CAPS_OBJECT_RNG_BUILTIN
It is used to check if qemu is capable of rng-builtin object.

This object is added since qemu-4.2.0-rc0, commit 6c4e9d48.

Signed-off-by: Han Han <hhan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-01-24 17:03:21 +01:00
Michal Privoznik
c76009313f qemu_capabilities: Rework domain caps cache
Since v5.6.0-48-g270583ed98 we try to cache domain capabilities,
i.e. store filled virDomainCaps in a hash table in virQEMUCaps
for future use. However, there's a race condition in the way it's
implemented. We use virQEMUCapsGetDomainCapsCache() to obtain the
pointer to the hash table, then we search the hash table for
cached data and if none is found the domcaps is constructed and
put into the table. Problem is that this is all done without any
locking, so if there are two threads trying to do the same, one
will succeed and the other will fail inserting the data into the
table.

Also, the API looks a bit fishy - obtaining pointer to the hash
table is dangerous.

The solution is to use a mutex that guards the whole operation
with the hash table. Then, the API can be changes to return
virDomainCapsPtr directly.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1791790

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2020-01-24 14:48:44 +01:00
Michal Privoznik
cc361a34c5 qemu_conf: Avoid dereferencing NULL in virQEMUDriverGetHost{NUMACaps,CPU}
When fixing [1] I've ran attached reproducer and had it spawn
1024 threads and query capabilities XML in each one of them. This
lead libvirtd to hit the RLIMIT_NOFILE limit which was kind of
expected. What wasn't expected was a subsequent segfault. It
happened because virCPUProbeHost failed and returned NULL. We've
taken the NULL and passed it to virCapabilitiesHostNUMARef()
which dereferenced it. Code inspection showed the same flas in
virQEMUDriverGetHostNUMACaps(), so I'm fixing both places.

1: https://bugzilla.redhat.com/show_bug.cgi?id=1791790

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2020-01-24 14:48:44 +01:00
Peter Krempa
29d43bf96a qemu: monitor: Improve error message when QEMU reply is too large
Don't use ERANGE as it doesn't make much sense in the error message.
Also point out that the reply from qemu was too large which is not
obvious from the original error:

 error: No complete monitor response found in 10485760 bytes: Numerical result out of range

The new message will read:

 error: internal error: QEMU monitor reply exceeds buffer size (10485760 bytes)

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2020-01-24 13:47:45 +01:00
Peter Krempa
f4e7c792d5 qemu: block: Don't skip creation of 'luks' formatted images
libvirt treats 'luks' images as raw+encryption. The logic in
qemuBlockStorageSourceCreateFormat skipped the creation if the requested
image was raw but didn't take into account the encryption.

This manifested itself e.g. when attempting to do a virsh blockcopy with
the following XML:

    <disk type='file' device='disk'>
      <driver name='qemu' type='raw'/>
      <source file='/tmp/enccpy'>
        <encryption format='luks'>
          <secret type='passphrase' uuid='0a81f5b2-8403-7b23-c8d6-21ccc2f80d6f'/>
        </encryption>
      </source>
    </disk>

Where qemu would report the following error:

 unable to execute QEMU command 'blockdev-add': Volume is not in LUKS format

rather than actually formatting the image first.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2020-01-24 13:46:46 +01:00
Peter Krempa
0c3792a155 qemu: backup: Implement support for backup disk bitmap name configuration
Use the user-configured name of the bitmap when merging the appropriate
bitmaps for an incremental backup so that the user can see it as
configured. Additionally expose the default bitmap name if nothing is
configured.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2020-01-24 13:40:53 +01:00
Peter Krempa
bce4ac55f8 qemu: backup: Implement support for backup disk export name configuration
Pass the exportname as configured when exporting the image via NBD and
fill it with the default if it's not configured.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2020-01-24 13:40:48 +01:00
Peter Krempa
69908db0f6 qemu: Fix value of 'device' argument for block-commit
When using blockdev configurations the 'device' argument of
'blockdev-commit' must correspond to the topmost node in the block node
graph. Libvirt didn't do this properly in case when 'copy_on_read'
option was enabled on the disk.

Use qemuDomainDiskGetTopNodename to fix it when calling block-commit.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2020-01-24 13:40:36 +01:00
Peter Krempa
e3137539a9 qemu: Fix value of 'device' argument for blockdev-mirror
When using blockdev configurations the 'device' argument of
'blockdev-mirror' must correspond to the topmost node in the block node
graph. Libvirt didn't do this properly in case when 'copy_on_read'
option was enabled on the disk.

Use qemuDomainDiskGetTopNodename to fix it for the blockdev-mirror calls
in qemuDomainBlockCopy and the non-shared-storage migration.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2020-01-24 13:40:36 +01:00
Peter Krempa
0b0f389335 qemu: domain: Extract code to determine topmost nodename to qemuDomainDiskGetTopNodename
There are more places which require getting the topmost nodename to be
passed to qemu. Separate it out into a new function.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2020-01-24 13:40:36 +01:00
Peter Krempa
623366d130 qemu: blockcopy: Actually unplug unused images when mirror job fails to start
If a mirror job fails to start in -blockdev mode we'd not unplug the
backing files we added first because the code on the error path checked
the wrong value. 'rc' is used as status of the code which added the
images, but the state of the 'block(dev)-mirror' call is stored in 'ret'
at that point.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2020-01-24 13:40:36 +01:00
Daniel P. Berrangé
6d786f95a3 qemu: fixing auto-detecting binary in domain capabilities
The virConnectGetDomainCapabilities API accepts either a binary path
to the emulator, or desired guest arch. If guest arch is not given,
then the host arch is assumed.

In the case where the binary is not given, the code tried to find the
emulator binary in the existing list of cached emulator capabilities.
This is not valid since we switched to lazy population of the cache in:

  commit 3dd91af01f
  Author: Daniel P. Berrangé <berrange@redhat.com>
  Date:   Mon Dec 2 13:04:26 2019 +0000

    qemu: stop creating capabilities at driver startup

As a result of this change, if there are no persistent guests defined
using the requested guest architecture, virConnectGetDomainCapabilities
will fail to find an emulator binary.

The solution is to stop relying on the cached capabilities to find the
binary and instead use the same logic we use to pick default a binary
per arch when populating capabilities.

Tested-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-23 16:38:59 +00:00
Thomas Huth
e7a65484ba qemu: Refuse to use "ps2" on machines that do not have this bus
The "ps2" bus is only available on certain machines like x86. On
machines like s390x, we should refuse to add a device to this bus
instead of silently ignoring it.

Looking at the QEMU sources, PS/2 is only available if the QEMU binary
has the "i8042" device, so let's check for that and only allow "ps2"
devices if this QEMU device is available, or if we're on x86 anyway
(so we don't have to fake the QEMU_CAPS_DEVICE_I8042 capability in
all the tests that use <input ... bus='ps2'/> in their xml data).

Reported-by: Sebastian Mitterle <smitterl@redhat.com>
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1763191
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-01-23 12:57:03 +01:00
Julio Faracco
c360dbb564 qemu: Converting DHCP and ARP functions to domain conf
QEMU driver has two functions: qemuGetDHCPInterfaces() and
qemuARPGetInterfaces() that are being used inside only one single
function. They can be turned into generic functions that other drivers
can use. This commit move both from QEMU driver tree to domain conf
tree.

Signed-off-by: Julio Faracco <jcfaracco@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-01-23 12:29:33 +01:00
Ján Tomko
d61f95cf6a qemu: end the agent job in qemuDomainSetTimeAgent
This function grabs an agent job but ends a monitor job.
End the agent job instead.

https://bugzilla.redhat.com/show_bug.cgi?id=1792723

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reported-by: Dan Zheng <dzheng@redhat.com>
Fixes: e005c95f56
2020-01-20 07:55:48 +01:00
Pavel Hrdina
894556ca81 secret: move virSecretGetSecretString into virsecret
The function virSecretGetSecretString calls into secret driver and is
used from other hypervisors drivers and as such makes more sense in
util.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-17 15:52:37 +01:00
Daniel P. Berrangé
3caa28dc50 src: replace gmtime_r/localtime_r/strftime with GDateTime
gmtime_r/localtime_r are mostly used in combination with
strftime to format timestamps in libvirt. This can all
be replaced with GDateTime resulting in simpler code
that is also more portable.

There is some boundary condition problem in parsing POSIX
timezone offsets in GLib which tickles our test suite.
The test suite is hacked to avoid the problem. The upsteam
GLib bug report is

  https://gitlab.gnome.org/GNOME/glib/issues/1999

Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-17 10:02:01 +00:00
Daniel P. Berrangé
fa434739a0 src: replace verify(expr) with G_STATIC_ASSERT(expr)
G_STATIC_ASSERT() is a drop-in functional equivalent of
the GNULIB verify() macro.

Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-17 10:02:01 +00:00
Daniel P. Berrangé
7b9645a7d1 util: replace atomic ops impls with g_atomic_int*
Libvirt's original atomic ops impls were largely copied
from GLib's code at the time. The only API difference
was that libvirt's virAtomicIntInc() would return a
value, but g_atomic_int_inc was void. We thus use
g_atomic_int_add(v, 1) instead, though this means
virAtomicIntInc() now returns the original value,
instead of the new value.

This rewrites libvirt's impl in terms of g_atomic_int*
as a short term conversion. The key motivation was to
quickly eliminate use of GNULIB's verify_expr() macro
which is not a direct match for G_STATIC_ASSERT_EXPR.
Long term all the callers should be updated to use
g_atomic_int* directly.

Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-17 10:02:00 +00:00
Jonathon Jongsma
b28bf62b3f Use glib alloc API for virDomainFSInfo
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-01-16 16:35:47 +01:00
Jonathon Jongsma
9a7d618c79 qemu: use glib allocation apis for qemuAgentFSInfo
Switch from old VIR_ allocation APIs to glib equivalents.

Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-01-16 16:35:47 +01:00
Jonathon Jongsma
9e1a8298cd qemu: use glib alloc in qemuAgentGetFSInfoFillDisks()
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-01-16 16:35:47 +01:00
Jonathon Jongsma
3c436c22a4 qemu: remove qemuDomainObjBegin/EndJobWithAgent()
This function potentially grabs both a monitor job and an agent job at
the same time. This is problematic because it means that a malicious (or
just buggy) guest agent can cause a denial of service on the host. The
presence of this function makes it easy to do the wrong thing and hold
both jobs at the same time. All existing uses have already been removed
by previous commits.

Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-01-16 16:35:47 +01:00
Jonathon Jongsma
599ae372d8 qemu: don't access vmdef within qemu_agent.c
In order to avoid holding an agent job and a normal job at the same
time, we want to avoid accessing the domain's definition while holding
the agent job. To achieve this, qemuAgentGetFSInfo() only returns the
raw information from the agent query to the caller. The caller can then
release the agent job and then proceed to look up the disk alias from
the vm definition. This necessitates moving a few helper functions to
qemu_driver.c and exposing the agent data structure (qemuAgentFSInfo) in
the header.

In addition, because the agent function no longer returns the looked-up
disk alias, we can't test the alias within qemuagenttest.  Instead we
simply test that we parse and return the raw agent data correctly.

Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-01-16 16:35:47 +01:00
Jonathon Jongsma
306b4cb070 qemu: Don't store disk alias in qemuAgentDiskInfo
The qemuAgentDiskInfo structure is filled with information received from
the agent command response, except for the 'alias' field, which is
retrieved from the vm definition. Limit this structure only to data that
was received from the agent message.

This is another intermediate step in moving the responsibility for
searching the vmdef from qemu_agent.c to qemu_driver.c so that we can
avoid holding an agent job and a normal job at the same time.

Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-01-16 16:35:46 +01:00
Jonathon Jongsma
bdb8a800b4 qemu: store complete agent filesystem information
In an effort to avoid holding both an agent and normal job at the same
time, we shouldn't access the vm definition from within qemu_agent.c
(i.e. while the agent job is being held). In preparation, we need to
store the full filesystem disk information in qemuAgentDiskInfo.  In a
following commit, we can pass this information back to the caller and
the caller can search the vm definition to match the filsystem disk to
an alias.

Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-01-16 16:35:46 +01:00
Jonathon Jongsma
e888c0f667 qemu: rename qemuAgentGetFSInfoInternalDisk()
The function name doesn't give a good idea of what the function does.
Rename to qemuAgentGetFSInfoFillDisks() to make it more obvious than it
is filling in the disk information in the fsinfo struct.

Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-01-16 16:35:46 +01:00
Daniel P. Berrangé
4cf8dd0c57 qemu: add support for specifying CPU "dies" topology parameter
QEMU since 4.1.0 supports the "dies" parameter for -smp

Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-16 15:11:55 +00:00
Daniel P. Berrangé
fbf27730a3 conf: add support for specifying CPU "dies" parameter
Recently CPU hardware vendors have started to support a new structure
inside the CPU package topology known as a "die". Thus the hierarchy
is now:

  sockets > dies > cores > threads

This adds support for "dies" in the XML parser, with the value
defaulting to 1 if not specified for backwards compatibility.

For example a system with 64 logical CPUs might report

   <topology sockets="4" dies="2" cores="4" threads="2"/>

Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-16 15:11:42 +00:00
Jiri Denemark
bd04d63ad9 qemu: Don't emit SUSPENDED_POSTCOPY event on destination
When pause-before-switchover QEMU capability is enabled, we get STOP
event before MIGRATION event with postcopy-active state. To properly
handle post-copy migration and emit correct events commit
v4.10.0-rc1-4-geca9d21e6c added a hack to
qemuProcessHandleMigrationStatus which translates the paused state
reason to VIR_DOMAIN_PAUSED_POSTCOPY and emits
VIR_DOMAIN_EVENT_SUSPENDED_POSTCOPY event when migration state changes
to post-copy.

However, the code was effective on both sides of migration resulting in
a confusing VIR_DOMAIN_EVENT_SUSPENDED_POSTCOPY event on the destination
host, where entering post-copy mode is already properly advertised by
VIR_DOMAIN_EVENT_RESUMED_POSTCOPY event.

https://bugzilla.redhat.com/show_bug.cgi?id=1791458

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-01-16 15:12:19 +01:00
Michal Privoznik
4c581527d4 qemu: Stop domain on failed restore
When resuming a domain from a save file, we read the domain XML
from the file, add it onto our internal list of domains, start
the qemu process, let it load the incoming migration stream and
resume its vCPUs afterwards. If anything goes wrong, the domain
object is removed from the list of domains and error is returned
to the caller. However, the qemu process might be left behind -
if resuming vCPUs fails (e.g. because qemu is unable to acquire
write lock on a disk) then due to a bug the qemu process is not
killed but the domain object is removed from the list.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1718707

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2020-01-16 09:17:07 +01:00
Michal Privoznik
3203ad6cfd qemu: Use g_autoptr() for qemuDomainSaveCookie
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2020-01-16 09:17:07 +01:00
Michal Privoznik
82e127e343 qemuDomainSaveImageStartVM: Use g_autoptr() for virCommand
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2020-01-16 09:17:07 +01:00
Michal Privoznik
1c16f261d0 qemuDomainSaveImageStartVM: Use VIR_AUTOCLOSE for @intermediatefd
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2020-01-16 09:17:07 +01:00
Julio Faracco
a4a5827c9f qemu: Implement virDomainGetHostnameFlags
We have to keep the default - querying the agent if no flag is
set.

Signed-off-by: Julio Faracco <jcfaracco@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
2020-01-16 09:02:35 +01:00
Thomas Huth
bfd5f69d60 qemu_capabilities: Do not report USB as subsystem type if it is not available
libvirt currently always reports that USB is available as a bus subsystem
type when running "virsh domcapabilities". However, this is not always
true, for example the qemu-system-s390x binary normally never has support
for USB. Thus we should only report that USB is available if there is
also a USB host controller available where we can attach USB devices.

Reported-by: Sebastian Mitterle <smitterl@redhat.com>
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1759849
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-01-13 13:53:23 +01:00
Peter Krempa
3f2d167d9c conf: Always format storage source auth and encryption under <source> for backing files
Historically there are two places where we format authentication and
encryption for a disk. The logich which formats it for backing files was
flawed though and didn't format it at all. This worked if the image
became a backing file through the means of a snapshot but not directly.

Force formatting of the source and encryption for any non-disk case to
fix the issue.

This caused problems in many places as we use the formatter to copy the
definition. Effectively any copy lost the secret definition.

https://bugzilla.redhat.com/show_bug.cgi?id=1789310
https://bugzilla.redhat.com/show_bug.cgi?id=1788898

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2020-01-13 12:53:58 +01:00
Michael Weiser
5373f63b30 qemu: Warn of restore with managed save being risky
Internal snapshots of a non-running domain do not carry any memory state
and restoring such a snapshot will not replace existing saved memory
state. This allows a scenario, where a user first suspends a domain into
managedsave, restores a non-running snapshot and then resumes the domain
from managedsave. After that, the guest system will run with its
previous memory state atop a different disk state. The most obvious
possible fallout from this is extensive file system corruption. Swap
content and RAID bitmaps might also be off.

This has been discussed[1] and fixed[2] from the end-user perspective for
virt-manager.

This patch marks the restore operation as risky at the libvirt level,
requiring the user to remove the saved memory state first or force the
operation.

[1] https://www.redhat.com/archives/virt-tools-list/2019-November/msg00011.html
[2] https://www.redhat.com/archives/virt-tools-list/2019-December/msg00049.html

Signed-off-by: Michael Weiser <michael.weiser@gmx.de>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-01-10 10:36:33 +01:00
Jiri Denemark
e0127260fb qemu: Don't use NULL path from qemuDomainGetHostdevPath
Commit v5.10.0-290-g3a4787a301 refactored qemuDomainGetHostdevPath to
return a single path rather than an array of paths. When the function is
called on a missing device, it will now return NULL in @path rather than
a NULL array with zero items and the callers need to be adapted
properly.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-01-10 09:12:57 +01:00
Ján Tomko
264ec9da43 qemu: fix implicit fallthrough warning
src/qemu/qemu_domain_address.c:680:13: error: this statement may fall through [-Werror=implicit-fallthrough=]
             switch ((virDomainFSModel) dev->data.fs->model) {

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Fixes: f363af7e35
2020-01-08 10:41:11 +01:00
Michal Privoznik
8fcee47807 qemu_firmware: Accept int in qemuFirmwareOSInterfaceTypeFromOsDefFirmware()
The point of this function is to translate virDomainOsDefFirmware
enum to qemuFirmwareOSInterface enum. However, with my commit
v5.10.0-507-g8e1804f9f6 we are passing a variable type of
virDomainLoader enum. Make the function accept both enums and
make the enum members correspond to each other.

This fixes clang build.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-01-08 10:14:55 +01:00
Ján Tomko
f6d7d8612d qemu: command: take fsdriver type into account
Split the formatting by fsdriver type to allow adding a new type.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-08 09:51:01 +01:00
Ján Tomko
f363af7e35 qemu: address: take fsdriver type into account
Split the switch by fsdriver type to allow adding a new one.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-08 09:51:01 +01:00
Ján Tomko
83f046458e qemu: pass private data to qemuBuildFilesystemCommandLine
This will be used by a future patch.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-08 09:51:01 +01:00
Ján Tomko
801e6da29c qemu: add private data to virDomainFSDef
Wire up the allocation and disposal of private data.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-08 09:51:01 +01:00
Ján Tomko
adadc342c3 qemu: rename gluster_debug_entry
Remove the 'gluster' part and decouple the return from
the gluster_debug_level parsing to allow adding more options
to this section.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-08 09:51:00 +01:00
Peter Krempa
c314222a01 qemu: backup: Move capability check after inactive check
Inactive VM doesn't have qemuCaps set thus we'd never properly report
that VM backups are supported only for running VMs.

Move the capability check after the active check.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
2020-01-08 07:10:46 +01:00
Daniel Henrique Barboza
21ad56e932 qemu: remove unneeded labels
Remove unneeded, easy to remove goto labels (cleanup|error|done|...).

Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
2020-01-07 16:40:41 +01:00
Michal Privoznik
8e1804f9f6 qemu_firmware: Try to autofill for old style UEFI specification
While we discourage people to use the old style of specifying
UEFI for their domains (the old style is putting path to the FW
image under /domain/os/loader/ whilst the new one is using
/domain/os/@firmware), some applications might have not adapted
yet. They still rely on libvirt autofilling NVRAM path and
figuring out NVRAM template when using the old way (notably
virt-install does this). We must preserve backcompat for this
previously supported config approach. However, since we really
want distro maintainers to leave --with-loader-nvram configure
option and rely on JSON descriptors, we need to implement
autofilling of NVRAM template for the old way too.

Fedora: https://bugzilla.redhat.com/show_bug.cgi?id=1782778
RHEL: https://bugzilla.redhat.com/show_bug.cgi?id=1776949

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-07 16:26:47 +01:00
Michal Privoznik
7c5264d2be src: Introduce and use virDomainDefHasOldStyleUEFI() and virDomainDefHasOldStyleROUEFI()
These functions are meant to replace verbose check for the old
style of specifying UEFI with a simple function call.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-07 16:26:47 +01:00
Michal Privoznik
57f9067ca3 qemu_firmware: Introduce @want variable to qemuFirmwareMatchDomain()
This simplifies condition when matching FW interface by having a
single line condition instead of multiline one. Also, it prepares
the code for future expansion.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-07 16:26:47 +01:00
Michal Privoznik
50d7465f3d qemu_firmware: Pass virDomainDef into qemuFirmwareFillDomain()
This function needs domain definition really, we don't need to
pass the whole domain object. This saves couple of dereferences
and characters esp. in more checks to come.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-07 16:26:47 +01:00
Peter Krempa
a4877192a1 qemu: backup: roll-back checkpoint metadata if the checkpoint wasn't taken
We insert the checkpoint metadata into the list of checkpoints prior to
actually creating the on-disk bits. If the 'transaction' or any other
steps done between inserting the checkpoint and creating the on-disk
data fail we'd end up with an unusable checkpoint that would vanish
after libvirtd restart.

Prevent this by rolling back the metadata if we didn't actually take and
record the checkpoint.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-07 15:20:14 +01:00
Peter Krempa
54dd75ec8d qemu: checkpoint: Extract and export rollback of checkpoint metadata storing
If we are certain that the checkpoint creation failed we remove the
metadata from the list. To allow reusing this in the backup code add a
new helper and export it.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-07 15:19:58 +01:00
Wang Huaqiang
65a63d8864 Introduce command 'virsh domstats --memory' for reporting memory BW
Introduce an option '--memory' for showing memory related
information. The memory bandwidth infomatio is listed as:

Domain: 'libvirt-vm'
 memory.bandwidth.monitor.count=4
 memory.bandwidth.monitor.0.name=vcpus_0-4
 memory.bandwidth.monitor.0.vcpus=0-4
 memory.bandwidth.monitor.0.node.count=2
 memory.bandwidth.monitor.0.node.0.id=0
 memory.bandwidth.monitor.0.node.0.bytes.total=10208067584
 memory.bandwidth.monitor.0.node.0.bytes.local=4807114752
 memory.bandwidth.monitor.0.node.1.id=1
 memory.bandwidth.monitor.0.node.1.bytes.total=8693735424
 memory.bandwidth.monitor.0.node.1.bytes.local=5850161152
 memory.bandwidth.monitor.1.name=vcpus_7
 memory.bandwidth.monitor.1.vcpus=7
 memory.bandwidth.monitor.1.node.count=2
 memory.bandwidth.monitor.1.node.0.id=0
 memory.bandwidth.monitor.1.node.0.bytes.total=853811200
 memory.bandwidth.monitor.1.node.0.bytes.local=290701312
 memory.bandwidth.monitor.1.node.1.id=1
 memory.bandwidth.monitor.1.node.1.bytes.total=406044672
 memory.bandwidth.monitor.1.node.1.bytes.local=229425152

Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com>
2020-01-06 14:04:10 +00:00
Wang Huaqiang
5d876f25bd util, resctrl: using 64bit interface instead of 32bit for counters
The underlying resctrl monitoring is actually using 64 bit counters,
not the 32bit one. Correct this by using 64bit data type for reading
hardware value.

To keep the interface consistent, the result of CPU last level cache
that occupied by vcpu processors of specific restrl monitor group is
still reported with a truncated 32bit data type. because, in silicon
world, CPU cache size will never exceed 4GB.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com>
2020-01-06 13:30:03 +00:00
Peter Krempa
5632ed8bad qemu: process: Terminate backup job on VM destroy
Commit d75f865fb9 caused a job-deadlock if
a VM is running the backup job and being destroyed as it removed the
cleanup of the async job type and there was nothing to clean up the
backup job.

Add an explicit cleanup of the backup job when destroying a VM.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-06 10:15:36 +01:00
Peter Krempa
bc8b159cb1 qemu: backup: Properly propagate async job type when cancelling the job
When cancelling the blockjobs as part of failed backup job startup
recover we didn't pass in the correct async job type. Luckily the block
job handler and cancellation code paths use no block job at all
currently so those were correct.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-06 10:15:36 +01:00
Peter Krempa
3a98fe9db3 qemu: blockjob: Remove infrastructure for remembering to delete image
Now that we delete the images elsewhere it's not required. Additionally
it's safe to do as we never released an upstream version which required
this being in place.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-06 10:15:36 +01:00
Peter Krempa
40485059ab qemu: backup: Move deletion of backup images to job termination
While qemu is running both locations are identical in semantics, but the
move will allow us to fix the scenario when the VM is destroyed or
crashes where we'd leak the images.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-06 10:15:35 +01:00
Peter Krempa
d6b994bafd qemu: backup: Configure backup store image with backing file
In contrast to snapshots the backup job does not complain when the
backup job's store file has backing pre-configured. It's actually
required so that the NBD server exposes all the data properly.

Remove our fake termination and use the existing disk source as backing.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-06 10:15:35 +01:00
Peter Krempa
728b993c8a qemu: Reset the node-name allocator in qemuDomainObjPrivateDataClear
qemuDomainObjPrivateDataClear clears state which become invalid after VM
stopped running and the node name allocator belongs there.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-06 10:15:35 +01:00
Peter Krempa
bae81b8e76 qemu: block: Use proper asyncJob when waiting for completion of blockdev-create
The waiting loop used QEMU_ASYNC_JOB_NONE rather than 'asyncJob' passed
from the caller.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-06 10:15:35 +01:00
Daniel P. Berrangé
8812163124 src: remove unused imports of dirname.h
A few places were importing dirname.h without actually using it.

Reviewed-by: Fabiano Fidêncio <fidencio@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-03 15:42:13 +00:00
Daniel P. Berrangé
bf7d2a26a3 src: replace mdir_name() with g_path_get_dirname()
Reviewed-by: Fabiano Fidêncio <fidencio@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-03 15:42:13 +00:00
Daniel P. Berrangé
472cc3941b util: replace IS_ABSOLUTE_FILE_NAME with g_path_is_absolute
Reviewed-by: Fabiano Fidêncio <fidencio@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-03 15:42:13 +00:00
Daniel P. Berrangé
f5e9bdb87f src: replace clock_gettime()/gettimeofday() with g_get_real_time()
g_get_real_time() returns the time since epoch in microseconds.
It uses gettimeofday() internally while libvirt used clock_gettime
because it is declared async signal safe. In practice gettimeofday
is also async signal safe *provided* the timezone parameter is
NULL. This is indeed the case in g_get_real_time().

Reviewed-by: Fabiano Fidêncio <fidencio@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-03 15:42:13 +00:00
Daniel P. Berrangé
f7df985684 src: switch from fnmatch to g_pattern_match_simple
The g_pattern_match function_simple is an acceptably close
approximation of fnmatch for libvirt's needs.

In contrast to fnmatch(), the '/' character can be matched
by the wildcards, there are no '[...]' character ranges and
'*' and '?' can not be escaped to include them literally in
a pattern.

Reviewed-by: Fabiano Fidêncio <fidencio@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-03 15:42:13 +00:00
Daniel P. Berrangé
d0312c584f src: use g_lstat() instead of lstat()
The GLib g_lstat() function provides a portable impl for
Win32.

Reviewed-by: Fabiano Fidêncio <fidencio@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-03 15:42:13 +00:00
Nikolay Shirokovskiy
6c6d93bc62 qemu: hide details of fake reboot
If we use fake reboot then domain goes thru running->shutdown->running
state changes with shutdown state only for short period of time.  At
least this is implementation details leaking into API. And also there is
one real case when this is not convinient. I'm doing a backup with the
help of temporary block snapshot (with the help of qemu's API which is
used in the newly created libvirt's backup API). If guest is shutdowned
I want to continue to backup so I don't kill the process and domain is
in shutdown state. Later when backup is finished I want to destroy qemu
process. So I check if it is in shutdowned state and destroy it if it
is. Now if instead of shutdown domain got fake reboot then I can destroy
process in the middle of fake reboot process.

After shutdown event we also get stop event and now as domain state is
running it will be transitioned to paused state and back to running
later. Though this is not critical for the described case I guess it is
better not to leak these details to user too. So let's leave domain in
running state on stop event if fake reboot is in process.

Reconnection code handles this patch without modification. It detects
that qemu is not running due to shutdown and then calls qemuProcessShutdownOrReboot
which reboots as fake reboot flag is set.

Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
2019-12-24 09:22:40 +03:00
Daniel P. Berrangé
42b3e5b9e4 qemu: store the emulator name in the capabilities XML
We don't need this for any functional purpose, but when debugging hosts
it is useful to know what binary a given capabilities XML document is
associated with.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2019-12-23 16:39:38 +00:00
Daniel P. Berrangé
0fcc78d51b qemu: add qemu caps constructor which takes binary name
Simplify repeated code patterns by providing a new constructor taking
the QEMU binary name.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2019-12-23 16:39:36 +00:00
Daniel P. Berrangé
25db737471 qemu: add explicit flag to skip qemu caps invalidation
Currently if the binary path is NULL in the qemu capabilities object,
cache invalidation is skipped. A future patch will ensure that the
binary path is always non-NULL, so a way to explicitly skip invalidation
is required.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2019-12-23 16:39:20 +00:00
Daniel Henrique Barboza
7a7d36055c qemu_process.c: remove 'cleanup' label from qemuProcessCreatePretendCmd()
The 'cleanup' flag is doing no cleaup in this function. We can
remove it and return NULL on error or qemuBuildCommandLine().

Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2019-12-20 18:31:51 -05:00
Daniel Henrique Barboza
d8eb3ab9e1 qemu_process.c: remove cleanup labels after g_auto*() changes
The g_auto*() changes made by the previous patches made a lot
of 'cleanup' labels obsolete. Let's remove them.

Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2019-12-20 18:31:51 -05:00
Daniel Henrique Barboza
d234efc59a qemu_process.c: use g_autoptr()
Change all feasible pointers to use g_autoptr().

Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2019-12-20 18:31:51 -05:00
Daniel Henrique Barboza
906d653297 qemu_domain.h: add G_DEFINE_AUTOPTR_CLEANUP_FUNC for qemuDomainLogContext
This will allow us to g_autoptr qemuDomainLogContext pointers
in the following patch.

Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2019-12-20 18:31:51 -05:00
Daniel Henrique Barboza
982ea95142 qemu_process.c: use g_autofree
Change all feasible strings and scalar pointers to use g_autofree.

Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2019-12-20 18:31:51 -05:00
Fabiano Fidêncio
2c38781792 qemu: Don't check the output of virGetUserRuntimeDirectory()
virGetUserRuntimeDirectory() *never* *ever* returns NULL, making the
checks for it completely unnecessary.

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-12-20 09:38:43 +01:00
Fabiano Fidêncio
c1a1c75952 qemu: Don't check the output of virGetUserConfigDirectory()
virGetUserConfigDirectory() *never* *ever* returns NULL, making the
checks for it completely unnecessary.

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-12-20 09:38:43 +01:00
Fabiano Fidêncio
2db0583c73 qemu: Don't check the output of virGetUserCacheDirectory()
virGetUserCacheDirectory() *never* *ever* returns NULL, making the
checks for it completely unnecessary.

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-12-20 09:38:43 +01:00
Fabiano Fidêncio
d0e1c6a6ae qemu: Don't check the output of virGetUserDirectory()
virGetUserDirectory() *never* *ever* returns NULL, making the checks for
it completely unnecessary.

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-12-20 09:38:43 +01:00
Daniel Henrique Barboza
ae2edb39b9 qemu: handle unassigned PCI hostdevs in command line
Previous patch made it possible for the QEMU driver to check if
a given PCI hostdev is unassigned, by checking if dev->info->type is
VIR_DOMAIN_DEVICE_ADDRESS_TYPE_UNASSIGNED, meaning that this device
shouldn't be part of the actual guest launch.

Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2019-12-18 13:08:28 -05:00
Daniel Henrique Barboza
96999404cb Introducing new address type='unassigned' for PCI hostdevs
This patch introduces a new PCI hostdev address type called
'unassigned'. This new type gives users the option to add
PCI hostdevs to the domain XML in an 'unassigned' state, meaning
that the device exists in the domain, is managed by Libvirt
like any regular PCI hostdev, but the guest does not have
access to it.

This adds extra options for managing PCI device binding
inside Libvirt, for example, making all the managed PCI hostdevs
declared in the domain XML to be detached from the host and bind
to the chosen driver and, at the same time, allowing just a
subset of these devices to be usable by the guest.

Next patch will use this new address type in the QEMU driver to
avoid adding unassigned devices to the QEMU launch command line.

Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2019-12-18 13:08:27 -05:00
Daniel Henrique Barboza
94f6e2f9fc qemu: command: move validation of vmcoreinfo to qemu_domain.c
Move the validation of vmcoreinfo from qemuBuildVMCoreInfoCommandLine()
to qemuDomainDefValidateFeatures(), allowing for validation
at domain define time.

qemuxml2xmltest.c was changed to account for this caps being
now validated at this earlier stage.

Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2019-12-18 13:01:36 -05:00
Daniel Henrique Barboza
a15de75dc5 qemu: command: move qemuBuildSmartcardCommandLine validation to qemu_domain.c
Move smartcard validation being done by qemuBuildSmartcardCommandLine()
to the existing qemuDomainSmartcardDefValidate() function. This
function is called by qemuDomainDeviceDefValidate(), allowing smartcard
validation in domain define time.

Tests were adapted to consider the new caps being needed in
this earlier stage.

Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2019-12-18 13:01:30 -05:00
Daniel Henrique Barboza
379e955eb8 qemu: command: move qemuBuildGraphicsEGLHeadlessCommandLine validation to qemu_domain.c
Move EGL Headless validation from qemuBuildGraphicsEGLHeadlessCommandLine()
to qemuDomainDeviceDefValidateGraphics(). This function is called by
qemuDomainDefValidate(), validating the graphics parameters in domain
define time.

Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2019-12-18 12:54:56 -05:00
Daniel Henrique Barboza
2acbbd821b qemu: command: move NVDIMM validation to qemu_domain.c
Move the NVDIMM validation from qemuBuildMachineCommandLine()
to a new function in qemu_domain.c, qemuDomainDeviceDefValidateMemory(),
which is called by qemuDomainDeviceDefValidate(). This allows
NVDIMM validation to occur in domain define time.

It also increments memory hotplug validation, which can be seen
by the failures in the hotplug tests in qemuxml2xmltest.c that
needed to be adjusted after the move.

Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2019-12-18 12:54:56 -05:00
Daniel Henrique Barboza
aed9bcd11b qemu_command: tidy up qemuBuildHostdevCommandLine loop
The current 'for' loop with 5 consecutive 'ifs' inside
qemuBuildHostdevCommandLine can be a bit smarter:

- all 5 'ifs' fails if hostdev->mode is not equal to
VIR_DOMAIN_HOSTDEV_MODE_SUBSYS. This check can be moved to the
start of the loop, failing to the next element immediately
in case it fails;

- all 5 'ifs' checks for a specific subsys->type to build the proper
command line argument (virHostdevIsSCSIDevice and virHostdevIsMdevDevice
do that but within a helper). Problem is that the code will keep
checking for matches even if one was already found, and there is
no way a hostdev will fit more than one 'if' (i.e. a hostdev can't
have 2+ different types). This means that a SUBSYS_TYPE_USB will
create its command line argument in the first 'if', then all other
conditionals will surely fail but will end up being checked anyway.

All of this can be avoided by moving the hostdev->mode comparing
to the start of the loop and using a switch statement with
subsys->type to execute the proper code for a given hostdev
type.

Suggested-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2019-12-18 16:02:08 +01:00
Michal Privoznik
39a7dff726 qemu: Don't leak hostcpu or hostnuma on driver cleanup
When freeing qemu driver struct members, we forgot to free
@hostcpu and @hostnuma members.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2019-12-18 14:28:48 +01:00
Michal Privoznik
7cf76d4e3a qemu: Reorder cleanup in qemuStateCleanup()
This function is supposed to clean up virQEMUDriver structure and
free individual members. However, it's doing that in random order
which makes it hard to track which members are being freed and
which are not. Do the free in reverse order than the structure
definition - assuming that the most important members (like
mutex) are declared first and freed last.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2019-12-18 14:28:48 +01:00
Laine Stump
6c17606b7c qemu: homogenize MAC address in live & config when hotplugging a netdev
Prior to commit 55ce656463 (first in libvirt 4.6.0), the XML sent to
virDomainAttachDeviceFlags() was parsed only once, and the results of
that parse were inserted into both the live object of the running
domain and into the persistent config. Thus, if MAC address was
omitted from in XML for a network device (<interface>), both the live
and config object would have the same MAC address.

Commit 55ce656463 changed the code to parse the incoming XML twice -
once for live and once for config. This does eliminate the problem of
PCI (/scsi/sata) address conflicts caused by allocating an address
based on existing devices in live object, but then inserting the
result into the config (which may already have a device using that
address), BUT it also means that when the MAC address of a network
device hasn't been specified in the XML, each copy will get a
different auto-generated MAC address.

This results in the MAC address of the device changing the next time
the domain is shutdown and restarted, which creates havoc with the
guest OS's network config.

There have been several discussions about this in the last > 1 year,
attempting to find the ideal solution to this problem that makes MAC
addresses consistent and accounts for all sorts of corner cases with
PCI/scsi/sata addresses. All of these discussions fizzled out because
every proposal was either too difficult to implement or failed to fix
some esoteric case someone thought up.

So, in the interest of solving the MAC address problem while not
making the "other address" situation any worse than before, this patch
simply adds a qemuDomainAttachDeviceLiveAndConfigHomogenize() function
that (for now) copies the MAC address from the config object to the
live object (if the original xml had <mac address='blah'/> then this
will be an effective NOP (as the macs already match)).

Any downstream libvirt containing upstream commit
55ce656463 should have this patch as well.

https://bugzilla.redhat.com/1783411

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2019-12-17 21:21:09 -05:00
Michal Privoznik
7be63dbe25 qemuGetDHCPInterfaces: Switch to GLib
If we use glib alloc functions, we can drop the 'cleanup' label
and @rv variable and also simplify the code a bit.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
2019-12-17 16:58:42 +01:00
Michal Privoznik
c06f4b48fe qemuGetDHCPInterfaces: Move some variables inside the loop
Some variables are not used outside of the for() loop. Move their
declaration to clean up the code a bit.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
2019-12-17 16:58:42 +01:00
Michal Privoznik
dae430ccbc qemu: Don't use dom->conn to lookup virNetwork
When using the monolithic daemon, then dom->conn has all driver
tables filled in properly and thus it's safe to call an API other
than virDomain*(). However, when using split daemons then
dom->conn has only hypervisor driver table set
(dom->conn->driver) and the rest is NULL. Therefore, if we want
to call a non-domain API (virNetworkLookupByName() in this case),
we have obtain the cached connection object accessible via
virGetConnectNetwork().

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
2019-12-17 16:58:42 +01:00
Michal Privoznik
5910b180ca qemu_driver: Push qemuDomainInterfaceAddresses() a few lines down
If we place qemuDomainInterfaceAddresses() a few lines below the
two functions its using then we can drop forward declarations of
those functions.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
2019-12-17 16:58:42 +01:00
Pavel Mores
b036505279 qemu: use g_autofree instead of VIR_FREE in qemuMonitorTextCreateSnapshot()
While at bugfixing, convert the whole function to the new-style memory
allocation handling.

Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Pavel Mores <pmores@redhat.com>
2019-12-17 10:49:30 -05:00
Michal Privoznik
430715604f qemu_hotplug: Prepare NVMe disks on hotplug
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
2019-12-17 10:04:44 +01:00
Michal Privoznik
6edb4321b2 qemu: Allow forcing VFIO when computing memlock limit
With NVMe disks, one can start a blockjob with a NVMe disk
that is not visible in domain XML (at least right away). Usually,
it's fairly easy to override this limitation of
qemuDomainGetMemLockLimitBytes() - for instance for hostdevs we
temporarily add the device to domain def, let the function
calculate the limit and then remove the device. But it's not so
easy with virStorageSourcePtr - in some cases they don't
necessarily are attached to a disk. And even if they are it's
done later in the process and frankly, I find it too complicated
to be able to use the simple trick we use with hostdevs.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
2019-12-17 10:04:44 +01:00
Michal Privoznik
da27be1b09 qemu: Don't leak storage perms on failure in qemuDomainAttachDiskGeneric
At the very beginning of the attach function the
qemuDomainStorageSourceChainAccessAllow() is called which
modifies CGroups, locks and seclabels for new disk and its
backing chain. This must be followed by a counterpart which
reverts back all the changes if something goes wrong. This boils
down to calling qemuDomainStorageSourceChainAccessRevoke() which
is done under 'error' label. But not all failure branches jump
there. They just jump onto 'cleanup' label where no revoke is
done. Such mistake is easy to do because 'cleanup' label does
exist. Therefore, dissolve 'error' block in 'cleanup' and have
everything jump onto 'cleanup' label.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
2019-12-17 10:04:44 +01:00
Michal Privoznik
1038505420 qemu_monitor_text: Catch IOMMU/VFIO related errors in qemuMonitorTextAddDrive
Because this is a HMP we're dealing with, there is nothing like
class of reply message, so we have to do some string comparison
to guess if the command fails. Well, with NVMe disks whole new
class of errors comes to play because qemu needs to initialize
IOMMU and VFIO for them. You can see all the messages it may
produce in qemu_vfio_init_pci().

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
2019-12-17 10:04:44 +01:00
Michal Privoznik
8e2026cc18 qemu: Generate command line of NVMe disks
Now, that we have everything prepared, we can generate command
line for NVMe disks.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
2019-12-17 10:04:44 +01:00
Michal Privoznik
c4062d5620 qemu_capabilities: Introduce QEMU_CAPS_DRIVE_NVME
This capability tracks if qemu is capable of:

  -drive file.driver=nvme

The feature was added in QEMU's commit of v2.12.0-rc0~104^2~2.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
2019-12-17 10:04:44 +01:00
Michal Privoznik
c988a39c7b qemu: Allow NVMe disk in CGroups
If a domain has an NVMe disk configured, then we need to allow it
on devices CGroup so that qemu can access it. There is one caveat
though - if an NVMe disk is read only we need CGroup to allow
write too. This is because when opening the device, qemu does
couple of ioctl()-s which are considered as write.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
2019-12-17 10:04:44 +01:00
Michal Privoznik
329a680297 qemu: Mark NVMe disks as 'need VFIO'
There are couple of places where a domain with a VFIO device gets
special treatment: in CGroups when enabling/disabling access to
/dev/vfio/vfio, and when creating/removing nodes in domain mount
namespace. Well, a NVMe disk is a VFIO device too. Fortunately,
we have this qemuDomainNeedsVFIO() function which is the only
place that needs adjustment.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
2019-12-17 10:04:44 +01:00
Michal Privoznik
a80ebd2a2a qemu: Create NVMe disk in domain namespace
If a domain has an NVMe disk configured, then we need to create
/dev/vfio/* paths in domain's namespace so that qemu can open
them.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
2019-12-17 10:04:43 +01:00
Michal Privoznik
d3f06dcdb5 qemu: Take NVMe disks into account when calculating memlock limit
We have this beautiful function that does crystal ball
divination. The function is named
qemuDomainGetMemLockLimitBytes() and it calculates the upper
limit of how much locked memory is given guest going to need. The
function bases its guess on devices defined for a domain. For
instance, if there is a VFIO hostdev defined then it adds 1GiB to
the guessed maximum. Since NVMe disks are pretty much VFIO
hostdevs (but not quite), we have to do the same sorcery.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
ACKed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
2019-12-17 10:04:43 +01:00
Michal Privoznik
8943ca11b2 qemu: prepare NVMe devices too
The qemu driver has its own wrappers around virHostdev module (so
that some arguments are filled in automatically). Extend these to
include NVMe devices too.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
ACKed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
2019-12-17 10:04:43 +01:00
Michal Privoznik
8cd7196974 conf: Format and parse NVMe type disk
To simplify implementation, some restrictions are added. For
instance, an NVMe disk can't go to any bus but virtio and has to
be type of 'disk' and can't have startupPolicy set.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
2019-12-17 10:04:43 +01:00
Michal Privoznik
1ee471960b qemuMigrationSrcIsSafe: Rework slightly
There are going to be more disk types that are considered unsafe
with respect to migration. Therefore, move the error reporting
call outside of if() body and rework if-else combo to switch().

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
2019-12-17 10:04:43 +01:00
Michal Privoznik
081a12aba9 virpci: Introduce and use virPCIDeviceAddressGetIOMMUGroupDev
Sometimes, we have a PCI address and not fully allocated
virPCIDevice and yet we still want to know its /dev/vfio/N path.
Introduce virPCIDeviceAddressGetIOMMUGroupDev() function exactly
for that.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
2019-12-17 10:04:43 +01:00
Michal Privoznik
cfce298042 qemu: Drop some 'cleanup' labels
Previous patches rendered some of 'cleanup' labels needless.
Drop them.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
2019-12-17 10:04:43 +01:00
Michal Privoznik
3a4787a301 qemuDomainGetHostdevPath: Don't include /dev/vfio/vfio in returned paths
Now that all callers of qemuDomainGetHostdevPath() handle
/dev/vfio/vfio on their own, we can safely drop handling in this
function. In near future the decision whether domain needs VFIO
file is going to include more device types than just
virDomainHostdev.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
2019-12-17 10:04:43 +01:00
Michal Privoznik
f976516542 qemuDomainGetHostdevPath: Use more g_autoptr()/g_autofree
There are several variables which could be automatically freed
upon return from the function. I'm not changing @tmpPaths (which
is a string list) because it is going to be removed in next
commit.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
2019-12-17 10:04:43 +01:00
Michal Privoznik
6f43c505d9 qemu: Explicitly add/remove /dev/vfio/vfio to/from NS/CGroups
In near future, the decision what to do with /dev/vfio/vfio with
respect to domain namespace and CGroup is going to be moved out
of qemuDomainGetHostdevPath() because there will be some other
types of devices than hostdevs that need access to VFIO.

All functions that I'm changing (except qemuSetupHostdevCgroup())
assume that hostdev we are adding/removing to VM is not in the
definition yet (because of how qemuDomainNeedsVFIO() is written).
Fortunately, this assumption is true.

For qemuSetupHostdevCgroup(), the worst thing that may happen is
that we allow /dev/vfio/vfio which was already allowed.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
2019-12-17 10:04:43 +01:00
Daniel Henrique Barboza
6f894a29d8 qemu: command: move sound codec validation to qemu_domain.c
qemuBuildSoundCodecStr() validates if a given QEMU binary
supports the sound codec. This validation can be moved to
qemu_domain.c to be executed in domain define time.

The codec validation was moved to the existing
qemuDomainDeviceDefValidateSound() function.

Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2019-12-16 18:12:40 -05:00