Commit Graph

12134 Commits

Author SHA1 Message Date
Jiri Denemark
73b81fc55f qemu: Add support for postcopy-recover QEMU migration state
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2022-06-07 17:40:20 +02:00
Jiri Denemark
5dd2d11ec0 qemu: Handle 'postcopy-paused' migration state
When connection breaks during post-copy migration, QEMU enters
'postcopy-paused' state. We need to handle this state and make the
situation visible to upper layers.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2022-06-07 17:40:20 +02:00
Jiri Denemark
458fa0e2bf qemu: Use switch in qemuProcessHandleMigrationStatus
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2022-06-07 17:40:20 +02:00
Jiri Denemark
425886ea99 qemu: Fetch paused migration stats
Even though a migration is paused, we still want to see the amount of
data transferred so far and that the migration is indeed not progressing
any further.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2022-06-07 17:40:20 +02:00
Jiri Denemark
918c14ee06 qemu: Don't wait for migration job when migration is running
Migration is a job which takes some time and if it succeeds, there's
nothing to call another migration on. If a migration fails, it might
make sense to rerun it with different arguments, but this would only be
done once the first migration fails rather than while it is still
running.

If this was not enough, the migration job now stays active even if
post-copy migration fails and anyone possibly retrying the migration
would be waiting for the job timeout just to get a suboptimal error
message.

So let's special case getting a migration job when another one is
already active.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2022-06-07 17:40:20 +02:00
Jiri Denemark
96db9dcfe9 qemu: Drop forward declarations in migration code
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2022-06-07 17:40:20 +02:00
Jiri Denemark
7c36e5004c qemu: Restore async job start timestamp on reconnect
Jobs that are supposed to remain active even when libvirt daemon
restarts were reported as started at the time the daemon was restarted.
This is not very helpful, we should restore the original timestamp.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2022-06-07 17:40:20 +02:00
Jiri Denemark
013d3091e0 qemu: Restore failed migration job on reconnect
Since we keep the migration job active when post-copy migration fails,
we need to restore it when reconnecting to running domains.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2022-06-07 17:40:20 +02:00
Jiri Denemark
c2f6a6a726 qemu: Abort failed post-copy when we haven't called Finish yet
When migration fails after it already switched to post-copy phase on the
source, but early enough that we haven't called Finish on the
destination yet, we know the vCPUs were not started on the destination
and the source host still has a complete state of the domain. Thus we
can just ignore the fact post-copy phase started and normally abort the
migration and resume vCPUs on the source.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2022-06-07 17:40:20 +02:00
Jiri Denemark
88a59fbbf3 qemu: Keep migration job active after failed post-copy
When post-copy migration fails, we can't just abort the migration and
resume the domain on the source host as it is already running on the
destination host and no host has a complete state of the domain memory.
Instead of the current approach of just marking the domain on both ends
as paused/running with a post-copy failed sub state, we will keep the
migration job active (even though the migration API will return failure)
so that the state is more visible and we can better control what APIs
can be called on the domains and even allow for resuming the migration.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2022-06-07 17:40:20 +02:00
Jiri Denemark
6637880b3c qemu: Add qemuDomainObjRestoreAsyncJob
The code for setting up a previously active backup job in
qemuProcessRecoverJob is generalized into a dedicated function so that
it can be later reused in other places.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2022-06-07 17:40:20 +02:00
Jiri Denemark
d6d1c4980d qemu: Rename qemuDomainObjRestoreJob as qemuDomainObjPreserveJob
It is used for saving job out of domain object. Just like
virErrorPreserveLast is used for errors. Let's make the naming
consistent as Restore would suggest different semantics.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2022-06-07 17:40:20 +02:00
Jiri Denemark
7c1840fa37 qemu: Introduce qemuProcessCleanupMigrationJob
The function can be used as a callback for qemuDomainCleanupAdd to
automatically clean up a migration job when a domain is destroyed.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2022-06-07 17:40:20 +02:00
Jiri Denemark
6ca0ff90ac qemu: Make qemuDomainCleanupAdd return void
The function never returns anything but zero.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2022-06-07 17:40:20 +02:00
Jiri Denemark
3abe9c496c qemu: Explicitly emit events on post-copy failure
The events would normally be triggered only if we're changing domain
state. But most of the time the domain is already in the right state and
we're just changing its substate from {PAUSED,RUNNING}_POSTCOPY to
*_POSTCOPY_FAILED. Let's emit lifecycle events explicitly when post-copy
migration fails to make the failure visible without polling.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2022-06-07 17:40:20 +02:00
Jiri Denemark
13b43c22b7 qemu: Keep domain running on dst on failed post-copy migration
There's no need to artificially pause a domain when post-copy fails
from our point of view unless QEMU connection is broken too as migration
may still be progressing well.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2022-06-07 17:40:20 +02:00
Jiri Denemark
aab9d64d4d Introduce VIR_DOMAIN_RUNNING_POSTCOPY_FAILED
This new "post-copy failed" reason for the running state will be used on
the destination host when post-copy migration fails while the domain is
already running there.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2022-06-07 17:40:20 +02:00
Jiri Denemark
8d00f3e801 qemu: Add debug messages to job recovery code
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2022-06-07 17:40:19 +02:00
Peter Krempa
95bd137216 qemu: Fix crash in qemuBuildDeviceCommandlineHandleOverrides
'STREQ' is used to compare the override alias with the device alias.
While the parser ensures that the override alias is non-NULL, the device
alias may be NULL and STREQ doesn't handle that.

Fixes: 38ab5c9ead
Resolves: https://gitlab.com/libvirt/libvirt/-/issues/321
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-06-06 10:41:38 +02:00
Peter Krempa
8d3a807a4a qemu: fd: Fix monitor usage of qemuFDPassDirectGetPath
We need to use the 'name' variable and just overwrite it with the FD
number when FDs are passed on the monitor. Otherwise we will read NULL
path if the FD is accessed before being passed on the monitor. The idea
of this helper is to simplify the monitor code so it would be
counterproductive to have other behaviour.

Fixes the following symptom:

 $ virsh attach-interface cd network default --model virtio
 error: Failed to attach interface
 error: internal error: unable to execute QEMU command 'netdev_add': File descriptor named '(null)' has not been found

Fixes: bca9047906
Resolves: https://gitlab.com/libvirt/libvirt/-/issues/318
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2092856
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-06-06 09:42:58 +02:00
Michal Privoznik
8d2567bccf qemu: Restore label to temp file in qemuDomainScreenshot()
Obtaining a screenshot via virDomainScreenshot() works like this:
  1) we create a temp file, label it, then
  2) tell QEMU to store the screenshot into it, and
  3) finally, open the file for transfer via virStream

Since the file is just temporary and even explicitly unlinked at
the end, no seclabel restoration is done. This makes perfect
sense for security models which attach a label to file itself
(DAC, SELinux) because the label is gone with the file. However,
for models where a list of files and allowed actions is kept on a
side (AppArmor) this approach means we just append files into the
profile and never remove them. In turn, the file grows and policy
update takes longer with each entry.

Restore the seclabel for AppArmor's sake.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2022-06-02 17:03:43 +02:00
Michal Privoznik
215b2466cd virStorageSourceGetActualType: Change type of retval
The virStorageSourceGetActualType() function returns either
virStorageSource->type (which is of type virStorageType), or
virStorageSourcePoolDef->type, which really stores a value of the
same enum. Thus, the latter struct can be changed so that the
virStorageSourceGetActualType() function can return correct type
instead of generic int.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
2022-06-01 14:54:59 +02:00
Michal Privoznik
2307f06cb2 Drop needless typecast to virStorageType enum
There are three places (two in domain_conf.c and one in
qemu_migration.c) where a virStorageSource->type is typecasted to
virStorageType (for the purpose of catching missing enum member
in a switch() statement at compile time). This is needless,
because as of v8.2.0-rc1~120 the struct member is of proper type.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
2022-06-01 14:54:59 +02:00
Tim Wiederhake
1b05f2e50b Fix typos
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-06-01 09:55:48 +02:00
Michal Privoznik
2597296ea6 qemu_capspriv: Drop needless declarations
There are two functions declared in qemu_capspriv.h:
1) virQEMUCapsInitHostCPUModel() which is not used anywhere but
   qemu_capabilities.c,

2) virQEMUCapsSetSEVCapabilities() which is my personal favorite
   but despite that it's never implemented nor called.

Drop them.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
2022-06-01 09:45:40 +02:00
Michal Privoznik
f344005547 lib: Be consistent about vm->pid
The virDomainObj struct has @pid member where the domain's
hypervisor PID is stored (e.g. QEMU/bhyve/libvirt_lxc/... PID).
However, we are not consistent when it comes to shutoff state.
Initially, because virDomainObjNew() uses g_new0() the @pid is
initialized to 0. But when domain is shut off, some functions set
it to -1 (virBhyveProcessStop, virCHProcessStop, qemuProcessStop,
..).

In other places, the @pid is tested to be 0, on some other places
it's tested for being negative and in the rest for being
positive.

To solve this inconsistency we can stick with either value, -1 or
0. I've chosen the latter as it's safer IMO. For instance if by
mistake we'd kill(vm->pid, SIGTERM) we would kill ourselves
instead of init's process group.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
2022-06-01 09:35:26 +02:00
Michal Privoznik
14bd5036e4 qemuProcessStop: Don't try to remove QoS on already removed TAP
When cleaning up after stopped domain, one of the things we do is
attempt to clear QoS settings on OVS type interfaces. Well, this
is needless because they were removed just a couple of lines
above. As a result, the attempt fails and a warning is printed
into logs, polluting them needlessly.

Closes: https://gitlab.com/libvirt/libvirt/-/issues/313
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-05-30 09:14:41 +02:00
Scott Davis
8c6fa38efc qemu: fix null string specifier argument in qemuDomainBlockJobAbort
Detected by gcc 11 -Wformat-overflow:
../../src/qemu/qemu_driver.c: In function ‘qemuDomainBlockJobAbort’:
../../src/util/virerror.h:176:5: warning: ‘%s’ directive argument is null [-Wformat-overflow=]
  176 |     virReportErrorHelper(VIR_FROM_THIS, code, __FILE__, \
      |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  177 |                          __FUNCTION__, __LINE__, __VA_ARGS__)
      |                          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../../src/qemu/qemu_driver.c:14475:17: note: in expansion of macro ‘virReportError’
14475 |                 virReportError(VIR_ERR_OPERATION_FAILED,
      |                 ^~~~~~~~~~~~~~
../../src/qemu/qemu_driver.c:14476:73: note: format string is defined here
14476 |                                _("block job '%s' failed while pivoting: %s"),
      |                                                                         ^~

Signed-off-by: Scott Davis <scott.davis@starlab.io>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-05-26 10:14:40 +02:00
Jiri Denemark
76baf935aa qemu: Do not pass unused opaque pointer to monitor callbacks
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-05-24 16:26:04 +02:00
Jiri Denemark
88f3727e71 qemu: Do not use opaque pointer in QEMU monitor callbacks
It always points to QEMU driver, which is quite redundant as all
callbacks also get a pointer to a vm object. Let's get the driver
pointer from there instead.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-05-24 16:26:04 +02:00
Jiri Denemark
64d5d06c56 qemu: Drop driver parameter from qemuProcessEventSubmit
We can easily get it from the vm object.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-05-24 16:26:04 +02:00
Jiri Denemark
7b5046ff6c qemu: Make vm parameter of qemuProcessEventSubmit mandatory
All callers (QMP event handlers) always pass non-NULL vm pointer. Let's
make the parameter mandatory.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-05-24 16:26:04 +02:00
Jiri Denemark
3ccd69f8c0 qemu: Pass arguments to qemuProcessEventSubmit directly
Allocating and filling qemuProcessEvent structure is a repeated pattern
before all calls to qemuProcessEventSubmit. We can move the allocation
inside this function and let callers pass all arguments directly.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-05-24 16:26:04 +02:00
Jiri Denemark
b4662bbd1f qemu: Avoid unlocked access to vm object in monitor callbacks
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-05-24 16:26:04 +02:00
Michal Privoznik
1c23123732 qemu_tpm: Make APIs work over a single virDomainTPMDef
In qemu_extdevice.c lives code that handles helper daemons that
are required for some types of devices (e.g. virtiofsd,
vhost-user-gpu, swtpm, etc.). These devices have their own
handling code in separate files, with only a very basic functions
exposed (e.g. for starting/stopping helper process, placing it
into given CGroup, etc.). And these functions all work over a
single instance of device (virDomainVideoDef *, virDomainFSDef *,
etc.), except for TPM handling code which takes virDomainDef *
and iterates over it inside its module.

Remove this oddness and make qemuExtTPM*() functions look closer
to the rest of the code.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2022-05-24 16:15:29 +02:00
Michal Privoznik
2df6849d78 qemu_hotplug: Deny changing @rss and @rss_hash_report attributes of virtio vNICs
We have virDomainUpdateDeviceFlags() API that allows changing of
some attributes of a device whilst domain is still running (e.g.
setting different QoS, link state change on vNICs). But only very
limited set of attributes can be changed and we have to check
whether user isn't trying to sneak in a change that's not
allowed. Well, in case of a virtio vNIC we forgot to check for
@rss and @rss_hash_report attributes of <driver/>.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2082540
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
2022-05-24 09:02:00 +02:00
Daniel Henrique Barboza
8ccb4f463e qemu_capspriv.h: fix indentation
Fix identation of virQEMUCapsUpdateHostCPUModel() params.

Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-05-23 19:24:28 -03:00
Jiri Denemark
9c495f8fcb qemu: Do not return NULL when qemuMigrationSrcBegin succeeds
My recent commit v8.3.0-201-gc500955e95 tried to fix a regression which
would cause the function to return success even if virCloseCallbacksSet
failed. But due to a strange code flow in the function introduced an
opposite regression. The function would return NULL on success when
called without VIR_MIGRATE_CHANGE_PROTECTION flag.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-05-23 16:59:05 +02:00
Boris Fiuczynski
b41163005c util: make reuse of ccw device address format constant
Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-05-23 16:31:45 +02:00
Boris Fiuczynski
45a8e3988f util: refactor virDomainDeviceCCWAddress into virccw.h
Refactor ccw data structure virDomainDeviceCCWAddress into util virccw.h
and rename it as virCCWDeviceAddress.

Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-05-23 16:31:40 +02:00
Jiri Denemark
c500955e95 qemu: Fix error propagation in qemuMigrationBegin
Commit v8.3.0-152-g49ef0f95c6 removed explicit VIR_FREE from
qemuMigrationBegin, effectively reverting v1.2.14-57-g77ddd0bba2

The xml variable was used to hold the return value and thus had to be
unset when an error happened after xml was already non-NULL. Such code
may be quite confusing though and we usually avoid it by not storing
anything to a return variable until everything succeeded.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2022-05-23 13:13:37 +02:00
Marc-André Lureau
53905292f9 qemu: add -chardev dbus support
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-05-19 12:36:37 +02:00
Marc-André Lureau
7648e40da5 conf: add <serial type='dbus'>
Like a Spice port, a dbus serial must specify an associated channel name.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-05-19 12:36:35 +02:00
Marc-André Lureau
1ce258a570 qemu: add audio type 'dbus'
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-05-19 12:36:32 +02:00
Marc-André Lureau
a062f5f777 conf: add <audio type='dbus'> support
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-05-19 12:36:28 +02:00
Marc-André Lureau
bde66322e8 qemu: add -display dbus support
By default, libvirt will start a private bus and tell QEMU to connect to
it. Instead, a D-Bus "address" to connect to can be specified, or the
p2p mode enabled.

D-Bus display works best with GL & a rendernode, which can be specified
with <gl> child element.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-05-19 12:36:20 +02:00
Marc-André Lureau
5c1e203a80 qemu: start the D-Bus daemon for the display
Start the daemon if necessary (it is already stopped in qemuProcessStop)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-05-19 12:36:17 +02:00
Marc-André Lureau
88ba34f5a0 conf: add <graphics type='dbus'>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-05-19 12:36:09 +02:00
Marc-André Lureau
14f45e5d8d qemu: add -display dbus capability check
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2022-05-19 12:35:26 +02:00
Peter Krempa
579403ba2e virclosecallbacks: Don't pass opqaue pointer to callback invocation
Remove the argument from the function prototypes and the callback
handler.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2022-05-17 19:31:08 +02:00