This is a special job for operations that need to modify domain state
during an active migration. The modification must not affect any state
that could conflict with the migration code. This is useful mainly for
event handlers that need to be processed during migration and which
could otherwise time out on acquiring a normal MODIFY job.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This phase marks a migration protocol as broken in a post-copy phase.
Libvirt is no longer actively watching the migration in this phase as
the migration API that started the migration failed.
This may either happen when post-copy migration really fails (QEMU
enters postcopy-paused migration state) or when the migration still
progresses between both QEMU processes, but libvirt lost control of it
because the connection between libvirt daemons (in p2p migration) or a
daemon and client (non-p2p migration) was closed. For example, when one
of the daemons was restarted.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
When recovering from a failed post-copy migration, we need to go through
all migration phases again, but don't need to repeat all the steps in
each phase. Let's create a new set of migration phases dedicated to
post-copy recovery so that we can easily distinguish between normal and
recovery code.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
When libvirt daemon is restarted during an active post-copy migration,
we do not always mark the migration as broken. In this phase libvirt is
not really needed for migration to finish successfully. In fact the
migration could have even finished while libvirt was not running or it
may still be happily running.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
So far migration could only be completed while a migration API was
running and waiting for the migration to finish. In case such API could
not be called (the connection that initiated the migration is broken)
the migration would just be aborted or left in a "don't know what to do"
state. But this will change soon and we will be able to successfully
complete such migration once we get the corresponding event from QEMU.
This is specific to post-copy migration when vCPUs are already running
on the destination and we're only waiting for all memory pages to be
transferred. Such post-copy migration (which no-one is actively
watching) is called unattended migration.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Normally migrationPort is released in the Finish phase, but we need to
make sure it is properly released also in case qemuMigrationDstFinish is
not called at all. Currently the only callback which is called in this
situation qemuMigrationDstPrepareCleanup which already releases
migrationPort. This patch adds similar handling to additional callbacks
which will be used in the future.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
When connection breaks during post-copy migration, QEMU enters
'postcopy-paused' state. We need to handle this state and make the
situation visible to upper layers.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Jobs that are supposed to remain active even when libvirt daemon
restarts were reported as started at the time the daemon was restarted.
This is not very helpful, we should restore the original timestamp.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Since we keep the migration job active when post-copy migration fails,
we need to restore it when reconnecting to running domains.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
When migration fails after it already switched to post-copy phase on the
source, but early enough that we haven't called Finish on the
destination yet, we know the vCPUs were not started on the destination
and the source host still has a complete state of the domain. Thus we
can just ignore the fact post-copy phase started and normally abort the
migration and resume vCPUs on the source.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The code for setting up a previously active backup job in
qemuProcessRecoverJob is generalized into a dedicated function so that
it can be later reused in other places.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
It is used for saving job out of domain object. Just like
virErrorPreserveLast is used for errors. Let's make the naming
consistent as Restore would suggest different semantics.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The function can be used as a callback for qemuDomainCleanupAdd to
automatically clean up a migration job when a domain is destroyed.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
There's no need to artificially pause a domain when post-copy fails
from our point of view unless QEMU connection is broken too as migration
may still be progressing well.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The virDomainObj struct has @pid member where the domain's
hypervisor PID is stored (e.g. QEMU/bhyve/libvirt_lxc/... PID).
However, we are not consistent when it comes to shutoff state.
Initially, because virDomainObjNew() uses g_new0() the @pid is
initialized to 0. But when domain is shut off, some functions set
it to -1 (virBhyveProcessStop, virCHProcessStop, qemuProcessStop,
..).
In other places, the @pid is tested to be 0, on some other places
it's tested for being negative and in the rest for being
positive.
To solve this inconsistency we can stick with either value, -1 or
0. I've chosen the latter as it's safer IMO. For instance if by
mistake we'd kill(vm->pid, SIGTERM) we would kill ourselves
instead of init's process group.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
When cleaning up after stopped domain, one of the things we do is
attempt to clear QoS settings on OVS type interfaces. Well, this
is needless because they were removed just a couple of lines
above. As a result, the attempt fails and a warning is printed
into logs, polluting them needlessly.
Closes: https://gitlab.com/libvirt/libvirt/-/issues/313
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
It always points to QEMU driver, which is quite redundant as all
callbacks also get a pointer to a vm object. Let's get the driver
pointer from there instead.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
All callers (QMP event handlers) always pass non-NULL vm pointer. Let's
make the parameter mandatory.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Allocating and filling qemuProcessEvent structure is a repeated pattern
before all calls to qemuProcessEventSubmit. We can move the allocation
inside this function and let callers pass all arguments directly.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Like a Spice port, a dbus serial must specify an associated channel name.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
By default, libvirt will start a private bus and tell QEMU to connect to
it. Instead, a D-Bus "address" to connect to can be specified, or the
p2p mode enabled.
D-Bus display works best with GL & a rendernode, which can be specified
with <gl> child element.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Remove the argument from the function prototypes and the callback
handler.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Access the 'driver' struct from the private data rather than the passed
opaque pointer in preparation to remove the opaque pointer.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Unix socket chardevs with FD passing need to use the direct mode so we
need to convert it to use qemuFDPassDirect.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Now that we store the state of the host FIPS mode setting in the qemu
driver object, we don't need to outsource the logic into
'qemuCheckFips'.
Additionally since we no longer support very old qemu's which would not
yet have --enable-fips we can drop the part of the comment about very
old qemus.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Improve the debug log inside 'qemuBuildCommandLine' to include the name
from the definition and remove useless data such as the pointer to the
qemuDriver object or qemuCaps.
Additionally remove the non-specific debug statements:
VIR_DEBUG("Building emulator command line");
from the two callers of qemuBuildCommandLine.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Introduce 'qemuBuildCommandLineFlags' and use it instead of specific
flag booleans.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Both callers populate the variable when qemuInterfacePrepareSlirp
returned 1. We can save the hassle in the callers by just doing it right
away.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
While 'add-fd' qmp command gives the possibility to find an unused fdset
ID when hot-adding fdsets, such usage is extremely inconvenient.
This patch allows us to track the used fdset id so that we can avoid the
need to check results and thus employ simpler code flow when hot-adding
devices which use FD passing.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
It's effectively replaced by checks in qemuFDPassTransfer. This will
simplify cleanup paths on constructing the qemuFDPass object when FDs
are being handled.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Every running QEMU process we are willing to reconnect (i.e., at least
3.1.0) supports migration events and we can assume the capability is
already enabled since last time libvirt daemon connected to its monitor.
Well, it's not guaranteed though. If libvirt 1.2.17 or older was used to
start QEMU 3.1.0 or newer, migration events would not be enabled. And if
the user decides to upgrade libvirt from 1.2.17 to 8.4.0 while the QEMU
process is still running, they would not be able to migrate the domain
because of disabled migration events. I think we do not really need to
worry about this scenario as libvirt 1.2.17 is 7 years old while QEMU
3.1.0 was released only 3.5 years ago. Thus a chance someone would be
running such configuration should be fairly small and a combination with
upgrading 1.2.17 to 8.4.0 (or newer) with running domains should get it
pretty much to zero. The issue would disappear ff the ancient libvirt is
first upgraded to something older than 8.4.0 and then to the current
libvirt.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Add the ability to configure a qemu-vdagent in guest domains. This
device is similar to the spice vdagent channel except that qemu handles
the spice-vdagent protocol messages itself rather than routing them over
a spice protocol channel.
The qemu-vdagent device has two notable configuration options which
determine whether qemu will handle particular vdagent features:
'clipboard' and 'mouse'.
The 'clipboard' option allows qemu to synchronize its internal clipboard
manager with the guest clipboard, which enables client<->guest clipboard
synchronization for non-spice guests such as vnc.
The 'mouse' option allows absolute mouse positioning to be sent over the
vdagent channel rather than using a usb or virtio tablet device.
Sample configuration:
<channel type='qemu-vdagent'>
<target type='virtio' name='com.redhat.spice.0'/>
<source>
<clipboard copypaste='yes'/>
<mouse mode='client'/>
</source>
</channel>
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
This is a quite an old (created at 2016) patch fixing an issue for at
that time contemporary Fedora 23. virsh reboot returns success (yet
after hanging for a while), VM is rebooted sucessfully too but then
shutdown from inside guest causes reboot and not shutdown.
VM has agent installed. So virsh reboot first tries to reboot VM thru
the agent. The agent calls 'shutdown -r' command. Typically it returns
instantly but on this distro for some reason it takes time. I did not
investigate the cause but the command waits in dbus client code,
probably waits for reply. The libvirt waits 60s for agent command to
execute and then errors out. Next reboot API falls back to ACPI shutdown
which returns successfully thus the reboot command return success too.
Yet shutdown command in guest eventually successfull and guest is truly
rebooted. So libvirt does not receive SHUTDOWN event and fake reboot
flag which is armed on fallback path stays armed. Thus next shutdown
from guest leads to reboot.
The issue has 100% repro on Fedora 23. On modern distros I can't
reproduce it at all. Shutdown command is asynchronous and returns
immediately even if I start some service that ignores TERM signal and
thus shutdown procedure waits for 90s (if I not mistaken) before sending
KILL.
Yet I guess it is nice to have this patch to be more robust.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Nikolay Shirokovskiy <nikolay.shirokovskiy@openvz.org>
Acquiring job introduced in commit [1] to fix a race described in the
commit. Actually it does not help because we get domain in create API
before acuiring job. Then [2] fixed the race but [1] was not reverted even
it is does not required by [2] to work properly.
[1] commit b629c64e5e0a32ef439b8eeb3a697e2cd76f3248
Author: Martin Kletzander <mkletzan@redhat.com>
Date: Thu Oct 30 14:38:35 2014 +0100
qemu: avoid rare race when undefining domain
[2] commit c7d1c139ca3402e875002753952e80ce8054374e
Author: Martin Kletzander <mkletzan@redhat.com>
Date: Thu Dec 11 11:14:08 2014 +0100
qemu: avoid rare race when undefining domain
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@openvz.org>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
SPICE ports cleanup looks overly complicated. We can just set *reserved
flags whenever port is reserved (auto or non auto).
Also *Reserved flags are not cleared on stop in case of reconnect with
autoport (flags are set on reconnect in qemuProcessGraphicsReservePorts
call). Yeah config is freed in the end of stopping domain but still.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@openvz.org>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
VNC websocket port cleanup looks a bit repetetive. Let's set websocketReserved
flag whenever we reserve port (auto or not).
Also websocketReserved flag is not cleared on stop in case of reconnect with
auto port (flags is set on reconnect in qemuProcessGraphicsReservePorts
call). Yeah config is freed in the end of stopping domain but still.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@openvz.org>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Scenario is with two domains with same VNC websocket port.
- start first domain
- start second, it will fail as port is occupied
As a result port will be released which breaks port reservation logic.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@openvz.org>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Code to release VNC port looks repetitive. The reason is there were
originally 2 functions to release ports - for auto and non-auto cases.
Also portReserved flag is not cleared on stop in case of reconnect with
auto port (flags is set on reconnect in qemuProcessGraphicsReservePorts call).
Yeah config is freed in the end of stopping domain but still.
Let's use this flag whenever we reserve port (auto or not). This makes
things clearer.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@openvz.org>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
All QEMU releases currently supported by libvirt already understand
"-incoming defer". We can drop the code handling "-incoming URI".
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The aim of 'restrictive' numatune mode is to rely solely on
CGroups to have QEMU running on configured NUMA nodes. However,
we were never setting the cpuset controller when a domain was
starting up. We are doing so only when
virDomainSetNumaParameters() is called (aka live pinning).
This is obviously wrong. Fortunately, fix is simple as
'restrictive' is similar to 'strict' - every location where
VIR_DOMAIN_NUMATUNE_MEM_STRICT occurs can be audited and
VIR_DOMAIN_NUMATUNE_MEM_RESTRICTIVE case can be added.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2070380
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Since its introduction in v1.3.2-43-gef1fa55e46 there is a dead
code in virDomainCgroupSetupGlobalCpuCgroup() (well,
qemuSetupGlobalCpuCgroup() back then). The code formats NUMA
nodeset but never sets it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>