In qemuMigrationSrcRun, we already checked for non-NULL mig
and then dereferenced it. It's only possible for mig to be
NULL in the error section.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
This partially reverts 82592551cb.
When migrating a domain, qemuMigrationDstPrepareAny() is called
which eventually calls qemuProcessLaunch(conn = NULL, flags =
VIR_QEMU_PROCESS_START_AUTODESTROY). But the very first thing
that qemuProcessLaunch does is check whether the AUTODESTROY flag is
set and @conn is not NULL. Well, it is NULL.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
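The failing combination described above can be illustrated with a minimal sketch. The names sketchConn, SKETCH_START_AUTODESTROY and sketchLaunchPrecheck are hypothetical stand-ins, not the real libvirt definitions; the sketch only assumes that the check rejects AUTODESTROY when no connection is supplied.

```c
#include <stddef.h>

/* Hypothetical stand-ins; not the real libvirt types or flag values. */
typedef struct sketchConn sketchConn;
enum { SKETCH_START_AUTODESTROY = 1 << 0 };

/* Returns 0 if the launch may proceed, -1 if the precheck rejects it. */
int
sketchLaunchPrecheck(const sketchConn *conn, unsigned int flags)
{
    /* AUTODESTROY together with a NULL connection is refused, which
     * is exactly the combination the migration destination passes. */
    if ((flags & SKETCH_START_AUTODESTROY) && !conn)
        return -1;
    return 0;
}
```

With such a check in place, an incoming migration that passes conn = NULL together with the autodestroy flag fails before the process is even launched.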
https://bugzilla.redhat.com/show_bug.cgi?id=1494454
If a domain disk is stored on a local filesystem (e.g. ext4) but is
not being migrated, it is very likely that the domain will not be able
to run on the destination, regardless of share/cache mode.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
The range check in virPortAllocatorSetUsed is no longer useful
now that we manage ports for the entire range of unsigned short values.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Ensure all enum cases are listed in switch statements, or cast away
enum type in places where we don't wish to cover all cases.
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
While reading the migration code, it is very difficult to tell
whether a particular function is being called on the src side
or the dst side, or either. Putting "Src" or "Dst" in the method names will
make this much more obvious. "Any" is used in a few helpers which can be
called from both sides.
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
These APIs are not required anywhere outside the migration code so need
not be exported to the rest of the QEMU driver.
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The qemuMigrationPrecreateStorage method needs a connection
to access the storage driver. Instead of passing it around,
open it at time of use.
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
When setting up graphics, we sometimes need to resolve networks,
requiring the caller to pass in a virConnectPtr, except sometimes they
pass in NULL. Use virGetConnectNetwork() to acquire the connection to
the network driver when it is needed.
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
During domain startup there are many places where we need to acquire
secrets. Currently the code passes around a virConnectPtr, except in the
places where we pass in NULL. So there are a few codepaths where starting
guests that use secrets will fail. Change to acquire a handle to
the secret driver when needed.
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
There is a long standing hack to pass a virConnectPtr into the
qemuMonitorStartCPUs method, so that when the text monitor prompts
for a disk password, we can lookup virSecretPtr objects. This causes
us to have to pass a virConnectPtr around through countless methods
up the call chain....except some places don't have any virConnectPtr
available so have always just passed NULL. We can finally fix this
disastrous design by using virGetConnectSecret() to open a connection
to the secret driver at time of use.
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Loadable drivers must never depend on each other. Over time some usage
mistakenly crept in for the storage and network drivers, but now that this
is eliminated, the syntax-check rules can enforce this separation once more.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The storagePoolLookupByTargetPath() method in the storage driver is used
by the QEMU driver during block migration. If there's a valid use case
for this in the QEMU driver, then external apps likely have similar
needs. Exposing it in the public API removes the direct dependency from
the QEMU driver to the storage driver.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Convert the stats field in _qemuDomainJobInfo to be a union. This
will allow for the collection of various different types of stats
in the same field.
When starting the async job that will end up being used for stats,
set the @statsType value appropriately. The @mirrorStats are
special and are used with stats.mig in order to generate the
returned job stats for a migration.
Using NONE should avoid the possibility that some random
async job would try to return stats for migration even though
a migration is not in progress.
For now a migration and a save job will use the same statsType.
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
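A minimal sketch of the tagged-union layout described above. All type and field names here (sketchStatsJobInfo, ram_transferred, transferred, and so on) are hypothetical and are not the real _qemuDomainJobInfo members; the sketch only assumes that a NONE type must never be read as migration stats and that mirror stats are combined with stats.mig.

```c
/* Hypothetical names throughout; not the real libvirt definitions. */
typedef enum {
    SKETCH_JOB_STATS_TYPE_NONE = 0,
    SKETCH_JOB_STATS_TYPE_MIGRATION,
} sketchJobStatsType;

typedef struct {
    unsigned long long ram_transferred;
} sketchMigStats;

typedef struct {
    unsigned long long transferred;
} sketchMirrorStats;

typedef struct {
    sketchJobStatsType statsType;  /* says which union member is valid */
    union {
        sketchMigStats mig;        /* further stats types would go here */
    } stats;
    sketchMirrorStats mirrorStats; /* only meaningful alongside stats.mig */
} sketchStatsJobInfo;

/* Returns 0 and fills *total for a migration-type job, -1 otherwise. */
int
sketchJobInfoTotal(const sketchStatsJobInfo *info, unsigned long long *total)
{
    if (info->statsType != SKETCH_JOB_STATS_TYPE_MIGRATION)
        return -1;
    *total = info->stats.mig.ram_transferred + info->mirrorStats.transferred;
    return 0;
}
```

Checking the type tag before touching the union is what keeps an unrelated async job from reporting migration numbers.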
In my first approach in 4b480d1076 I overlooked the comment in
qemuMigrationRunIncoming stating that during actual migration the
qemuMigrationRunIncoming does not wait until the migration is complete
but rather offloads that to the Finish phase of migration.
This means that during actual migration qemuProcessRefreshState was
called prior to qemu actually transferring the full state and thus the
queries did not get the correct information. The approach worked only
for restore, where we wait for the migration to finish during qemu
startup.
Fix the issue by calling qemuProcessRefreshState both from
qemuProcessStart if there's no incoming migration and from
qemuMigrationFinish so that the code actually works as expected.
When migrating a shutoff domain (i.e., offline migration), we have no
statistics to report and thus jobInfo will be NULL in
qemuMigrationFinish.
Broken by me in v3.10.0-183-ge8784e7868.
https://bugzilla.redhat.com/show_bug.cgi?id=1536351
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Libvirt 3.7.0 and earlier reported a migration job as completed
immediately after QEMU finished sending migration data at which point
migration was not really complete yet. Commit v3.7.0-29-g3f2d6d829e
fixed this, but caused a regression in reporting statistics for
completed jobs which started reporting the job as still running. This
happened because the completed job statistics including the job status
are copied from the running job before we finally mark it as completed.
Let's make sure QEMU_DOMAIN_JOB_STATUS_COMPLETED is always set in the
completed job info even when the job has not finished yet.
https://bugzilla.redhat.com/show_bug.cgi?id=1523036
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
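The fix described above can be sketched as follows, assuming the completed job info is a plain copy of the running job's info. The type and function names (sketchCompletedInfo, sketchSaveCompletedJob) are hypothetical and not taken from the libvirt sources.

```c
/* Hypothetical names; not the real libvirt definitions. */
typedef enum {
    SKETCH_STATUS_ACTIVE,
    SKETCH_STATUS_QEMU_COMPLETED,
    SKETCH_STATUS_COMPLETED,
} sketchStatus;

typedef struct {
    sketchStatus status;
    unsigned long long timeElapsed;
} sketchCompletedInfo;

/* Copy the running job's info into the completed slot. The running job
 * may not have been marked completed yet, so the copied status is
 * overridden unconditionally. */
void
sketchSaveCompletedJob(const sketchCompletedInfo *current,
                       sketchCompletedInfo *completed)
{
    *completed = *current;
    completed->status = SKETCH_STATUS_COMPLETED;
}
```

The override is what keeps the statistics for a completed job from being reported with a stale "still running" status.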
Right-aligning backslashes when defining macros or using complex
commands in Makefiles looks cute, but as soon as any change is
required to the code you end up with either distractingly broken
alignment or unnecessarily big diffs where most of the changes
are just pushing all backslashes a few characters to one side.
Generated using
$ git grep -El '[[:blank:]][[:blank:]]\\$' | \
grep -E '*\.([chx]|am|mk)$$' | \
while read f; do \
sed -Ei 's/[[:blank:]]*[[:blank:]]\\$/ \\/g' "$f"; \
done
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
The parameters used a "migrate" prefix, which is pretty redundant, and the
qemuMonitorMigrationParams structure is our internal representation of
QEMU migration parameters, so it is supposed to use names which match
QEMU names.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
QEMU identified a race condition between the device state serialization
and the end of storage migration. Both QEMU and libvirt need to be
updated to fix this.
Our migration workflow is modified so that after starting the migration
we wait for QEMU to enter "pre-switchover", "postcopy-active", or
"completed" state. Once there, we cancel all block jobs as usual. But if
QEMU is in "pre-switchover", we need to resume the migration afterwards
and wait again for the real end (either "postcopy-active" or
"completed" state).
Old QEMU will just enter either "postcopy-active" or "completed"
directly, which is still correctly handled even by new libvirt. The
"pre-switchover" state will only be entered if QEMU supports it and the
pause-before-switchover capability was enabled. Thus all combinations of
libvirt and QEMU will work, but only new QEMU with new libvirt will
avoid the race condition.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
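The workflow above can be sketched as a small simulation. Everything here is hypothetical (the sketchMigration struct, the scripted list of states standing in for QEMU events, and the function names); it only mirrors the order of steps the message describes, not the real libvirt code.

```c
#include <stddef.h>

/* Hypothetical names; a scripted state list stands in for QEMU. */
typedef enum {
    SKETCH_MIG_ACTIVE,
    SKETCH_MIG_PRE_SWITCHOVER,
    SKETCH_MIG_POSTCOPY_ACTIVE,
    SKETCH_MIG_COMPLETED,
    SKETCH_MIG_FAILED,
} sketchMigState;

typedef struct {
    const sketchMigState *states; /* states QEMU will report, in order */
    size_t nstates;
    size_t pos;
    int blockJobsCancelled;
    int resumed;
} sketchMigration;

/* Consume reported states until one we are waiting for shows up. */
sketchMigState
sketchWaitForState(sketchMigration *m, int stopAtPreSwitchover)
{
    while (m->pos < m->nstates) {
        sketchMigState s = m->states[m->pos++];
        if (s == SKETCH_MIG_POSTCOPY_ACTIVE || s == SKETCH_MIG_COMPLETED)
            return s;
        if (stopAtPreSwitchover && s == SKETCH_MIG_PRE_SWITCHOVER)
            return s;
    }
    return SKETCH_MIG_FAILED; /* script ended without a final state */
}

int
sketchMigrationRun(sketchMigration *m)
{
    sketchMigState s = sketchWaitForState(m, 1);
    if (s == SKETCH_MIG_FAILED)
        return -1;

    m->blockJobsCancelled = 1;    /* cancel all block jobs as usual */

    if (s == SKETCH_MIG_PRE_SWITCHOVER) {
        m->resumed = 1;           /* resume, then wait for the real end */
        if (sketchWaitForState(m, 0) == SKETCH_MIG_FAILED)
            return -1;
    }
    return 0;
}
```

An old QEMU never reports the pre-switchover state, so the same function finishes without the resume step, which matches the compatibility argument made above.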
This new capability enables a pause before device state serialization so
that we can finish all block jobs without racing with the end of the
migration. The pause is indicated by "pre-switchover" state. Once we're
done QEMU enters "device" migration state.
This patch just defines the new capability and QEMU migration states and
their mapping to our job states.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Instead of enumerating all states which need to be turned into
QEMU_DOMAIN_JOB_STATUS_FAILED (and failing to add all of them), it's
better to mention just the one which needs to be left alone.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Almost every failure in qemuMigrationRun while we are talking to QEMU
monitor results in a jump to exit_monitor label. The only exception is
removed by this patch.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
The "ret" variable is used for storing the return value of a function
and should not be used as a temporary variable.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Merge cancel and cancelPostCopy sections with the generic error section,
where we can easily decide whether canceling the ongoing migration is
required.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Let cleanup only do things common to both failure and success paths and
move error handling code inside the new "error" section.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Some code which was supposed to be executed only when migration
succeeded was buried inside the cleanup code.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
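The control flow that the three cleanup-related changes above aim for can be sketched like this. The function, its parameters and the sketchTrace struct are hypothetical and only illustrate the split between success-only code, a shared cleanup section and an error section that decides whether cancellation is needed.

```c
/* Hypothetical names; records which sections ran. */
typedef struct {
    int started;     /* migration was started in QEMU */
    int successOnly; /* work that must run only on success */
    int cancelled;   /* ongoing migration was cancelled */
    int released;    /* common cleanup ran */
} sketchTrace;

int
sketchRunWithSections(int failBeforeStart, int failAfterStart, sketchTrace *t)
{
    int ret = -1;

    if (failBeforeStart)
        goto error;
    t->started = 1;

    if (failAfterStart)
        goto error;

    /* Success-only work lives here, not in cleanup. */
    t->successOnly = 1;
    ret = 0;

 cleanup:
    /* Only things common to both the success and failure paths. */
    t->released = 1;
    return ret;

 error:
    /* One place to decide whether cancelling is required. */
    if (t->started)
        t->cancelled = 1;
    goto cleanup;
}
```

Keeping the cancel decision in a single error section avoids separate cancel labels for each kind of failure.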
When adding a new job state it's useful to let the compiler complain
about places where we need to think about what to do with the new
state.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
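As an illustration of the point above, a switch over an enum with no default label lets GCC and Clang warn (via -Wswitch, which is part of -Wall) when a newly added enumerator is not handled. The enum and function below are hypothetical and are not the real job status type.

```c
/* Hypothetical enum; not the real libvirt job status type. */
typedef enum {
    SKETCH_JOB_NONE,
    SKETCH_JOB_ACTIVE,
    SKETCH_JOB_POSTCOPY,
    SKETCH_JOB_COMPLETED,
    SKETCH_JOB_FAILED,
} sketchJobState;

/* Returns 1 if the job has reached a final state, 0 otherwise. */
int
sketchJobIsFinished(sketchJobState state)
{
    /* No default label on purpose: adding an enumerator without
     * adding a case here triggers a compiler warning. */
    switch (state) {
    case SKETCH_JOB_NONE:
    case SKETCH_JOB_ACTIVE:
    case SKETCH_JOB_POSTCOPY:
        return 0;
    case SKETCH_JOB_COMPLETED:
    case SKETCH_JOB_FAILED:
        return 1;
    }
    return 0; /* only reached for values outside the enum */
}
```

A default label would silence that warning, which is why the fallback return sits after the switch instead.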
All calls to qemuMonitorGetMigrationCapability in QEMU driver are
replaced with qemuMigrationCapsGet.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
We need to send allowReboot in the migration cookie to ensure the same
behavior of the virDomainSetLifecycleAction() API on the destination.
Consider this scenario:
1. On the source the domain is started with:
<on_poweroff>destroy</on_poweroff>
<on_reboot>restart</on_reboot>
<on_crash>destroy</on_crash>
2. User calls an API to set "destroy" for <on_reboot>:
<on_poweroff>destroy</on_poweroff>
<on_reboot>destroy</on_reboot>
<on_crash>destroy</on_crash>
3. The guest is migrated to a different host
4a. Without the allowReboot in the migration cookie the QEMU
process on destination would be started with -no-reboot
which would prevent using the virDomainSetLifecycleAction() API
for the rest of the guest lifetime.
4b. With the allowReboot in the migration cookie the QEMU process
on destination is started without -no-reboot like it was started
on the source host and the virDomainSetLifecycleAction() API
continues to work.
The following patch adds a QEMU implementation of the
virDomainSetLifecycleAction() API and that implementation disallows
using the API if all actions are set to "destroy" because we add
"-no-reboot" on the QEMU command line. Changing the lifecycle action
is in this case pointless because the QEMU process is always terminated.
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
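The scenario above can be sketched as follows. The types and functions are hypothetical; the sketch assumes that, without a cookie value, the destination derives allowReboot from the current lifecycle actions, and that "all actions are destroy" is what leads to -no-reboot.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names; not the real libvirt definitions. */
typedef enum {
    SKETCH_ACTION_DESTROY,
    SKETCH_ACTION_RESTART,
} sketchAction;

typedef struct {
    sketchAction onPoweroff;
    sketchAction onReboot;
    sketchAction onCrash;
} sketchLifecycle;

/* -no-reboot is used (allowReboot is false) only if every action
 * is "destroy". */
bool
sketchAllowRebootFromConfig(const sketchLifecycle *lc)
{
    return !(lc->onPoweroff == SKETCH_ACTION_DESTROY &&
             lc->onReboot == SKETCH_ACTION_DESTROY &&
             lc->onCrash == SKETCH_ACTION_DESTROY);
}

/* cookieAllowReboot is NULL when the cookie carries no value. */
bool
sketchDstAllowReboot(const bool *cookieAllowReboot, const sketchLifecycle *lc)
{
    if (cookieAllowReboot)
        return *cookieAllowReboot; /* keep the source's behavior (4b) */
    return sketchAllowRebootFromConfig(lc); /* recomputed (4a) */
}
```

In step 4a the recomputed value is false because all actions are "destroy" by then, while in step 4b the value sent by the source wins.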
When migration fails, QEMU may provide a description of the error in
the reply to query-migrate QMP command. We can fetch this error and use
it instead of the generic "unexpectedly failed" message.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Pass flags to the function rather than just whether we have incoming
migration. This also enforces correct startup policy for USB devices
when reverting from a snapshot.
qemuMigrationPrepareAny called several of the functions that start the
qemu process for incoming migration, passing the flags explicitly each
time. Extract them to a variable so that they can be easily used for other
calls or changed in the future.
Seeing a log message saying 'flags=93' is ambiguous & confusing unless
you happen to know that libvirt always prints flags as hex. Change our
debug messages so that they always add a '0x' prefix when printing flags,
and '0' prefix when printing mode. A few other misc places gain a '0x'
prefix in error messages too.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
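A minimal sketch of the formatting convention described above, using plain snprintf rather than libvirt's logging macros. The helper name sketchFormatFlags is hypothetical.

```c
#include <stdio.h>

/* Format flags as hex with an explicit "0x" prefix and mode as octal
 * with an explicit "0" prefix, so the base is never ambiguous.
 * Returns the number of characters snprintf would have written. */
int
sketchFormatFlags(char *buf, size_t buflen, unsigned int flags,
                  unsigned int mode)
{
    return snprintf(buf, buflen, "flags=0x%x mode=0%o", flags, mode);
}
```

For flags 0x93 and mode 0644 this yields "flags=0x93 mode=0644". Writing the prefix literally, instead of using the "%#x" conversion, keeps it present even when the value is zero.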
In case of real migration (not migrating to a file on save, dump, etc.),
migration info is not complete at the time qemu finishes migration
in normal (non-postcopy) mode. We still need to update disk stats,
downtime info, etc. Thus let's not expose this job status as
completed.
To achieve this, let's set the status to 'qemu completed' after
qemu reports that migration is finished. It is not visible to clients
as a completed job. Cookie code in the confirm phase will finally turn
the job into completed. As nothing more needs to be done when
migrating to a file, the status is set to 'completed' as before
in this case.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
When getting job info while the mirror has not yet reached the ready
phase, fetch mirror stats from qemu. Otherwise the mirror stats are
already saved in the current job.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Instead of checking stat.status let's set status to migrating
as soon as the migrate command is sent (waiting for completion
is a good place too).
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Setting status to none has little value - getting job status
will not return even elapsed time.
After this patch getting job stats stays correct in the sense
that it will not fetch migration stats, because it consults
stats.status before doing the fetch.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
qemuMigrationFetchJobStatus is rather inconvenient. Some of its
callers don't need status to be updated, some don't need to update
elapsed time right away. So let's update status or elapsed time
in callers instead.
This patch drops updating job status on getting job stats by
client. This way we will not provide status 'completed' while
it is not yet updated by migration routine.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
This way we get stats only in one place. The former code basically waits
for complete/postcopy status and doesn't need to mess with stats.
The patch drops raising an error on stats update failure. This
does not make much sense anyway.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Let's introduce QEMU_DOMAIN_JOB_STATUS_POSTCOPY state for job.current->status
instead of checking job.current->stats.status. The latter can be changed
when fetching migration statistics. Moving the state function out of the
variable and leaving it only the store function seems more manageable.
This patch removes all state checking usage of stats except for
qemuDomainGetJobStatsInternal. This place will be handled separately.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
This patch simply switches code from using VIR_DOMAIN_JOB_* to the newly
introduced QEMU_DOMAIN_JOB_STATUS_*. Later this gives us freedom
to introduce states for postcopy and mirroring phases.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
At some places we either already have a synchronous job or we just
released it. Also, some APIs might want to use this code without
having to release their job. Anyway, the job acquire code is
moved out to qemuDomainRemoveInactiveJob so that
qemuDomainRemoveInactive does just what it promises.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
At present shared disks can be migrated with either readonly or cache=none. But
cache=directsync should be safe for migration too, because neither cache=directsync nor
cache=none uses the host page cache, and cache=directsync writes through the qemu block layer cache.
Signed-off-by: Peng Hao <peng.hao2@zte.com.cn>
Reviewed-by: Wang Yechao <wang.yechao255@zte.com.cn>
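The resulting rule for shared disks can be sketched as a small predicate. The enum and function names are hypothetical and do not match the libvirt sources; the cache mode list is only there to give the comparison some context.

```c
#include <stdbool.h>

/* Hypothetical enum of disk cache modes. */
typedef enum {
    SKETCH_CACHE_DEFAULT,
    SKETCH_CACHE_NONE,
    SKETCH_CACHE_WRITETHROUGH,
    SKETCH_CACHE_WRITEBACK,
    SKETCH_CACHE_DIRECTSYNC,
    SKETCH_CACHE_UNSAFE,
} sketchDiskCache;

/* A shared disk is considered safe to migrate if it is read-only or
 * uses a cache mode that bypasses the host page cache. */
bool
sketchSharedDiskMigrationSafe(bool readonly, sketchDiskCache cache)
{
    return readonly ||
           cache == SKETCH_CACHE_NONE ||
           cache == SKETCH_CACHE_DIRECTSYNC;
}
```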
While qemuProcessIncomingDefNew takes an fd argument and stores it in
qemuProcessIncomingDef structure, the caller is still responsible for
closing the file descriptor.
Introduced by commit v1.2.21-140-ge7c6f4575.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
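The ownership rule described above can be sketched as follows for a POSIX system. The struct and function names are hypothetical stand-ins for qemuProcessIncomingDef and its helpers; the sketch assumes only that the structure stores the descriptor without taking ownership of it.

```c
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical stand-in for the incoming migration definition. */
typedef struct {
    int fd; /* borrowed: the structure never closes it */
} sketchIncomingDef;

sketchIncomingDef *
sketchIncomingDefNew(int fd)
{
    sketchIncomingDef *inc = calloc(1, sizeof(*inc));
    if (inc)
        inc->fd = fd;
    return inc;
}

void
sketchIncomingDefFree(sketchIncomingDef *inc)
{
    free(inc); /* intentionally does not close inc->fd */
}

/* The caller opened (or received) fd, so the caller closes it. */
int
sketchPrepareIncoming(int fd)
{
    int ret = -1;
    sketchIncomingDef *inc = sketchIncomingDefNew(fd);

    if (!inc)
        goto cleanup;
    /* ... the incoming process would be started using inc->fd here ... */
    ret = 0;

 cleanup:
    sketchIncomingDefFree(inc);
    if (close(fd) < 0)
        ret = -1;
    return ret;
}
```

Forgetting the close in the caller is the kind of leak the message describes, since freeing the structure alone leaves the descriptor open.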