When connection breaks during post-copy migration, QEMU enters
'postcopy-paused' state. We need to handle this state and make the
situation visible to upper layers.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Even though a migration is paused, we still want to see the amount of
data transferred so far and that the migration is indeed not progressing
any further.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Migration is a job which takes some time and if it succeeds, there's
nothing to call another migration on. If a migration fails, it might
make sense to rerun it with different arguments, but this would only be
done once the first migration fails rather than while it is still
running.
If this was not enough, the migration job now stays active even if
post-copy migration fails and anyone possibly retrying the migration
would be waiting for the job timeout just to get a suboptimal error
message.
So let's special case getting a migration job when another one is
already active.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Jobs that are supposed to remain active even when libvirt daemon
restarts were reported as started at the time the daemon was restarted.
This is not very helpful, we should restore the original timestamp.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Since we keep the migration job active when post-copy migration fails,
we need to restore it when reconnecting to running domains.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
When migration fails after it already switched to post-copy phase on the
source, but early enough that we haven't called Finish on the
destination yet, we know the vCPUs were not started on the destination
and the source host still has a complete state of the domain. Thus we
can just ignore the fact post-copy phase started and normally abort the
migration and resume vCPUs on the source.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
When post-copy migration fails, we can't just abort the migration and
resume the domain on the source host as it is already running on the
destination host and no host has a complete state of the domain memory.
Instead of the current approach of just marking the domain on both ends
as paused/running with a post-copy failed sub state, we will keep the
migration job active (even though the migration API will return failure)
so that the state is more visible and we can better control what APIs
can be called on the domains and even allow for resuming the migration.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The code for setting up a previously active backup job in
qemuProcessRecoverJob is generalized into a dedicated function so that
it can be later reused in other places.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
It is used for saving job out of domain object. Just like
virErrorPreserveLast is used for errors. Let's make the naming
consistent as Restore would suggest different semantics.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The function can be used as a callback for qemuDomainCleanupAdd to
automatically clean up a migration job when a domain is destroyed.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The function never returns anything but zero.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The events would normally be triggered only if we're changing domain
state. But most of the time the domain is already in the right state and
we're just changing its substate from {PAUSED,RUNNING}_POSTCOPY to
*_POSTCOPY_FAILED. Let's emit lifecycle events explicitly when post-copy
migration fails to make the failure visible without polling.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
There's no need to artificially pause a domain when post-copy fails
from our point of view unless QEMU connection is broken too as migration
may still be progressing well.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This new "post-copy failed" reason for the running state will be used on
the destination host when post-copy migration fails while the domain is
already running there.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
'STREQ' is used to compare the override alias with the device alias.
While the parser ensures that the override alias is non-NULL, the device
alias may be NULL and STREQ doesn't handle that.
Fixes: 38ab5c9ead
Resolves: https://gitlab.com/libvirt/libvirt/-/issues/321
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
We need to use the 'name' variable and just overwrite it with the FD
number when FDs are passed on the monitor. Otherwise we will read NULL
path if the FD is accessed before being passed on the monitor. The idea
of this helper is to simplify the monitor code so it would be
counterproductive to have other behaviour.
Fixes the following symptom:
$ virsh attach-interface cd network default --model virtio
error: Failed to attach interface
error: internal error: unable to execute QEMU command 'netdev_add': File descriptor named '(null)' has not been found
Fixes: bca9047906
Resolves: https://gitlab.com/libvirt/libvirt/-/issues/318
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2092856
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Obtaining a screenshot via virDomainScreenshot() works like this:
1) we create a temp file, label it, then
2) tell QEMU to store the screenshot into it, and
3) finally, open the file for transfer via virStream
Since the file is just temporary and even explicitly unlinked at
the end, no seclabel restoration is done. This makes perfect
sense for security models which attach a label to file itself
(DAC, SELinux) because the label is gone with the file. However,
for models where a list of files and allowed actions is kept on a
side (AppArmor) this approach means we just append files into the
profile and never remove them. In turn, the file grows and policy
update takes longer with each entry.
Restore the seclabel for AppArmor's sake.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
The virStorageSourceGetActualType() function returns either
virStorageSource->type (which is of type virStorageType), or
virStorageSourcePoolDef->type, which really stores a value of the
same enum. Thus, the latter struct can be changed so that the
virStorageSourceGetActualType() function can return correct type
instead of generic int.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
There are three places (two in domain_conf.c and one in
qemu_migration.c) where a virStorageSource->type is typecasted to
virStorageType (for the purpose of catching missing enum member
in a switch() statement at compile time). This is needless,
because as of v8.2.0-rc1~120 the struct member is of proper type.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
There are two functions declared in qemu_capspriv.h:
1) virQEMUCapsInitHostCPUModel() which is not used anywhere but
qemu_capabilities.c,
2) virQEMUCapsSetSEVCapabilities() which is my personal favorite
but despite that it's never implemented nor called.
Drop them.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
The virDomainObj struct has @pid member where the domain's
hypervisor PID is stored (e.g. QEMU/bhyve/libvirt_lxc/... PID).
However, we are not consistent when it comes to shutoff state.
Initially, because virDomainObjNew() uses g_new0() the @pid is
initialized to 0. But when domain is shut off, some functions set
it to -1 (virBhyveProcessStop, virCHProcessStop, qemuProcessStop,
..).
In other places, the @pid is tested to be 0, on some other places
it's tested for being negative and in the rest for being
positive.
To solve this inconsistency we can stick with either value, -1 or
0. I've chosen the latter as it's safer IMO. For instance if by
mistake we'd kill(vm->pid, SIGTERM) we would kill ourselves
instead of init's process group.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
When cleaning up after stopped domain, one of the things we do is
attempt to clear QoS settings on OVS type interfaces. Well, this
is needless because they were removed just a couple of lines
above. As a result, the attempt fails and a warning is printed
into logs, polluting them needlessly.
Closes: https://gitlab.com/libvirt/libvirt/-/issues/313
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Detected by gcc 11 -Wformat-overflow:
../../src/qemu/qemu_driver.c: In function ‘qemuDomainBlockJobAbort’:
../../src/util/virerror.h:176:5: warning: ‘%s’ directive argument is null [-Wformat-overflow=]
176 | virReportErrorHelper(VIR_FROM_THIS, code, __FILE__, \
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
177 | __FUNCTION__, __LINE__, __VA_ARGS__)
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../../src/qemu/qemu_driver.c:14475:17: note: in expansion of macro ‘virReportError’
14475 | virReportError(VIR_ERR_OPERATION_FAILED,
| ^~~~~~~~~~~~~~
../../src/qemu/qemu_driver.c:14476:73: note: format string is defined here
14476 | _("block job '%s' failed while pivoting: %s"),
| ^~
Signed-off-by: Scott Davis <scott.davis@starlab.io>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
It always points to QEMU driver, which is quite redundant as all
callbacks also get a pointer to a vm object. Let's get the driver
pointer from there instead.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
All callers (QMP event handlers) always pass non-NULL vm pointer. Let's
make the parameter mandatory.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Allocating and filling qemuProcessEvent structure is a repeated pattern
before all calls to qemuProcessEventSubmit. We can move the allocation
inside this function and let callers pass all arguments directly.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
In qemu_extdevice.c lives code that handles helper daemons that
are required for some types of devices (e.g. virtiofsd,
vhost-user-gpu, swtpm, etc.). These devices have their own
handling code in separate files, with only a very basic functions
exposed (e.g. for starting/stopping helper process, placing it
into given CGroup, etc.). And these functions all work over a
single instance of device (virDomainVideoDef *, virDomainFSDef *,
etc.), except for TPM handling code which takes virDomainDef *
and iterates over it inside its module.
Remove this oddness and make qemuExtTPM*() functions look closer
to the rest of the code.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
We have virDomainUpdateDeviceFlags() API that allows changing of
some attributes of a device whilst domain is still running (e.g.
setting different QoS, link state change on vNICs). But only very
limited set of attributes can be changed and we have to check
whether user isn't trying to sneak in a change that's not
allowed. Well, in case of a virtio vNIC we forgot to check for
@rss and @rss_hash_report attributes of <driver/>.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2082540
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
Fix identation of virQEMUCapsUpdateHostCPUModel() params.
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
My recent commit v8.3.0-201-gc500955e95 tried to fix a regression which
would cause the function to return success even if virCloseCallbacksSet
failed. But due to a strange code flow in the function introduced an
opposite regression. The function would return NULL on success when
called without VIR_MIGRATE_CHANGE_PROTECTION flag.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Refactor ccw data structure virDomainDeviceCCWAddress into util virccw.h
and rename it as virCCWDeviceAddress.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Commit v8.3.0-152-g49ef0f95c6 removed explicit VIR_FREE from
qemuMigrationBegin, effectively reverting v1.2.14-57-g77ddd0bba2
The xml variable was used to hold the return value and thus had to be
unset when an error happened after xml was already non-NULL. Such code
may be quite confusing though and we usually avoid it by not storing
anything to a return variable until everything succeeded.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Like a Spice port, a dbus serial must specify an associated channel name.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
By default, libvirt will start a private bus and tell QEMU to connect to
it. Instead, a D-Bus "address" to connect to can be specified, or the
p2p mode enabled.
D-Bus display works best with GL & a rendernode, which can be specified
with <gl> child element.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Start the daemon if necessary (it is already stopped in qemuProcessStop)
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Remove the argument from the function prototypes and the callback
handler.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>