Some code which was supposed to be executed only when migration
succeeded was buried inside the cleanup code.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
When adding a new job state it's useful to let the compiler complain
about places where we need to think about what to do with the new
state.
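A minimal sketch of the idea, assuming a hypothetical job_status enum (not libvirt's actual type): leaving out the "default:" label lets -Wswitch (enabled by -Wall in GCC and Clang) flag every switch that misses a newly added enumerator.

```c
#include <stdio.h>

/* Hypothetical job states, named only for illustration. */
typedef enum {
    JOB_STATUS_NONE,
    JOB_STATUS_ACTIVE,
    JOB_STATUS_POSTCOPY,
    JOB_STATUS_COMPLETED,
    JOB_STATUS_FAILED,
} job_status;

/* No "default:" label on purpose: adding a new enumerator above makes
 * the compiler warn here until the new state is handled. */
const char *job_status_name(job_status status)
{
    switch (status) {
    case JOB_STATUS_NONE:
        return "none";
    case JOB_STATUS_ACTIVE:
        return "active";
    case JOB_STATUS_POSTCOPY:
        return "postcopy";
    case JOB_STATUS_COMPLETED:
        return "completed";
    case JOB_STATUS_FAILED:
        return "failed";
    }
    /* Reached only for values outside the enum. */
    return "unknown";
}
```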
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
All calls to qemuMonitorGetMigrationCapability in QEMU driver are
replaced with qemuMigrationCapsGet.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
We need to send allowReboot in the migration cookie to ensure the same
behavior of the virDomainSetLifecycleAction() API on the destination.
Consider this scenario:
  1. On the source the domain is started with:
       <on_poweroff>destroy</on_poweroff>
       <on_reboot>restart</on_reboot>
       <on_crash>destroy</on_crash>
  2. User calls an API to set "destroy" for <on_reboot>:
       <on_poweroff>destroy</on_poweroff>
       <on_reboot>destroy</on_reboot>
       <on_crash>destroy</on_crash>
  3. The guest is migrated to a different host
  4a. Without the allowReboot in the migration cookie the QEMU
      process on destination would be started with -no-reboot
      which would prevent using the virDomainSetLifecycleAction() API
      for the rest of the guest lifetime.
  4b. With the allowReboot in the migration cookie the QEMU process
      on destination is started without -no-reboot like it was started
      on the source host and the virDomainSetLifecycleAction() API
      continues to work.
The following patch adds a QEMU implementation of the
virDomainSetLifecycleAction() API and that implementation disallows
using the API if all actions are set to "destroy" because we add
"-no-reboot" on the QEMU command line. Changing the lifecycle action
is in this case pointless because the QEMU process is always terminated.
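The scenario above can be sketched as follows. This is an illustration under assumptions, not the driver code: the lifecycle struct and both function names are hypothetical, and the only rule taken from this message is that -no-reboot is used when all actions are "destroy".

```c
#include <stdbool.h>

/* Hypothetical lifecycle actions and configuration. */
typedef enum { ACTION_DESTROY, ACTION_RESTART } lifecycle_action;

typedef struct {
    lifecycle_action on_poweroff;
    lifecycle_action on_reboot;
    lifecycle_action on_crash;
} lifecycle;

/* Startup decision: reboot is disallowed (-no-reboot) only when every
 * action is "destroy". */
bool compute_allow_reboot(const lifecycle *lc)
{
    return !(lc->on_poweroff == ACTION_DESTROY &&
             lc->on_reboot == ACTION_DESTROY &&
             lc->on_crash == ACTION_DESTROY);
}

/* Destination side: a value carried in the migration cookie wins over
 * one recomputed from the (possibly changed) lifecycle actions. */
bool dest_allow_reboot(const lifecycle *lc, bool have_cookie,
                       bool cookie_allow_reboot)
{
    return have_cookie ? cookie_allow_reboot : compute_allow_reboot(lc);
}
```

With the actions from step 2, recomputing on the destination yields false (case 4a), while the cookie value from the source keeps it true (case 4b).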
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
When migration fails, QEMU may provide a description of the error in
the reply to query-migrate QMP command. We can fetch this error and use
it instead of the generic "unexpectedly failed" message.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Pass flags to the function rather than just whether we have incoming
migration. This also enforces correct startup policy for USB devices
when reverting from a snapshot.
qemuMigrationPrepareAny called several of the functions that start the
qemu process for incoming migration, passing the flags explicitly each time.
Extract them to a variable so that they can be easily used for other
calls or changed in the future.
Seeing a log message saying 'flags=93' is ambiguous & confusing unless
you happen to know that libvirt always prints flags as hex. Change our
debug messages so that they always add a '0x' prefix when printing flags,
and '0' prefix when printing mode. A few other misc places gain a '0x'
prefix in error messages too.
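A small sketch of the resulting format, using a hypothetical format_debug helper rather than libvirt's logging macros: "0x%x" makes the hexadecimal base of flags explicit and "0%o" does the same for an octal mode.

```c
#include <stdio.h>

/* Hypothetical helper: formats flags as prefixed hex and mode as
 * prefixed octal, returning the snprintf() result. */
int format_debug(char *buf, size_t len, unsigned int flags, unsigned int mode)
{
    return snprintf(buf, len, "flags=0x%x mode=0%o", flags, mode);
}
```

For flags 0x93 and mode 0600 this produces "flags=0x93 mode=0600", which can no longer be misread as decimal 93.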
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
In case of a real migration (not migrating to a file on save, dump, etc.)
the migration info is not complete at the time qemu finishes migration
in normal (non-postcopy) mode. We still need to update disk stats,
downtime info, etc. Thus let's not expose this job status as
completed.
To achieve this, let's set the status to 'qemu completed' after
qemu reports that migration is finished. It is not visible to clients
as a completed job. The cookie code in the confirm phase will finally
turn the job into completed. As there is nothing more to do when
migrating to a file, the status is set to 'completed' as before
in that case.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
When getting job info while the mirror has not yet reached the ready
phase, fetch mirror stats from qemu. Otherwise the mirror stats are
already saved in the current job.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Instead of checking stat.status let's set the status to migrating
as soon as the migrate command is sent (waiting for completion
is a good place too).
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Setting status to none has little value - getting job status
will not return even elapsed time.
After this patch getting job stats stays correct in the sense that
it will not fetch migration stats because it consults
stats.status before doing the fetch.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
qemuMigrationFetchJobStatus is rather inconvenient. Some of its
callers don't need status to be updated, some don't need to update
elapsed time right away. So let's update status or elapsed time
in callers instead.
This patch drops updating job status on getting job stats by
client. This way we will not provide status 'completed' while
it is not yet updated by migration routine.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
This way we get stats only in one place. The former code basically waits
for complete/postcopy status and doesn't need to mess with stats.
The patch drops raising an error on stats update failure. This
does not make much sense anyway.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Let's introduce the QEMU_DOMAIN_JOB_STATUS_POSTCOPY state for job.current->status
instead of checking job.current->stats.status. The latter can be changed
when fetching migration statistics. Moving the state function away from that
variable and leaving it only the store function seems more manageable.
This patch removes all state checking usage of stats except for
qemuDomainGetJobStatsInternal. This place will be handled separately.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
This patch simply switches the code from using VIR_DOMAIN_JOB_* to the
newly introduced QEMU_DOMAIN_JOB_STATUS_*. Later this gives us freedom
to introduce states for postcopy and mirroring phases.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
At some places we either already have synchronous job or we just
released it. Also, some APIs might want to use this code without
having to release their job. Anyway, the job acquire code is
moved out to qemuDomainRemoveInactiveJob so that
qemuDomainRemoveInactive does just what it promises.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
At present shared disks can be migrated with either readonly or cache=none. But
cache=directsync should be safe for migration too, because neither cache=directsync
nor cache=none uses the host page cache, and cache=directsync writes through the
qemu block layer cache.
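The resulting rule can be sketched as a single predicate. The enum and function below are hypothetical names for illustration, not the identifiers used in the driver.

```c
#include <stdbool.h>

/* Hypothetical cache modes mirroring the values of the disk cache
 * attribute. */
typedef enum {
    CACHE_DEFAULT,
    CACHE_NONE,
    CACHE_WRITETHROUGH,
    CACHE_WRITEBACK,
    CACHE_DIRECTSYNC,
    CACHE_UNSAFE,
} cache_mode;

/* A shared disk is treated as safe to migrate when it is read-only or
 * when its cache mode bypasses the host page cache. */
bool shared_disk_migration_safe(bool readonly, cache_mode cache)
{
    return readonly || cache == CACHE_NONE || cache == CACHE_DIRECTSYNC;
}
```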
Signed-off-by: Peng Hao <peng.hao2@zte.com.cn>
Reviewed-by: Wang Yechao <wang.yechao255@zte.com.cn>
While qemuProcessIncomingDefNew takes an fd argument and stores it in
qemuProcessIncomingDef structure, the caller is still responsible for
closing the file descriptor.
Introduced by commit v1.2.21-140-ge7c6f4575.
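The ownership rule can be illustrated with a minimal POSIX sketch. The incoming_def type and its helpers are hypothetical stand-ins for qemuProcessIncomingDef and its constructor; the point is only that storing the descriptor does not transfer ownership.

```c
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical stand-in structure; only the fd matters here. */
typedef struct {
    int fd;
} incoming_def;

/* Stores the descriptor but does not take ownership of it. */
incoming_def *incoming_def_new(int fd)
{
    incoming_def *def = calloc(1, sizeof(*def));
    if (def)
        def->fd = fd;
    return def;
}

/* Frees the structure only; the stored descriptor is left open. */
void incoming_def_free(incoming_def *def)
{
    free(def);
}

/* Caller side: returns 0 when everything was released cleanly. */
int run_incoming(void)
{
    int fds[2];
    int ret = -1;
    incoming_def *def = NULL;

    if (pipe(fds) < 0)
        return -1;

    def = incoming_def_new(fds[0]);
    if (def)
        ret = 0; /* a real caller would start the incoming process here */

    incoming_def_free(def);

    /* The caller still owns both descriptors and must close them. */
    if (close(fds[0]) < 0)
        ret = -1;
    if (close(fds[1]) < 0)
        ret = -1;
    return ret;
}
```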
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Most places which want to check ABI stability for an active domain need
to call this API rather than the original
qemuDomainDefCheckABIStability. The only exception is in snapshots where
we need to decide what to do depending on the saved image data.
https://bugzilla.redhat.com/show_bug.cgi?id=1460952
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Use ATTRIBUTE_FALLTHROUGH, introduced by commit
5d84f5961b8e28e802f600bb2d2c6903e219092e, instead of comments to
indicate that the fall through is an intentional behavior.
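A sketch of how such a macro can be used. MY_FALLTHROUGH is a hypothetical stand-in defined here for illustration; libvirt's actual ATTRIBUTE_FALLTHROUGH definition may differ.

```c
/* Hypothetical macro: expands to the statement attribute on GCC 7 or
 * newer and to an empty statement elsewhere. */
#if defined(__GNUC__) && __GNUC__ >= 7
# define MY_FALLTHROUGH __attribute__((fallthrough))
#else
# define MY_FALLTHROUGH do {} while (0)
#endif

/* Counts the steps that remain from a given phase; each earlier phase
 * intentionally falls through into the later ones. */
int count_steps(int phase)
{
    int steps = 0;

    switch (phase) {
    case 0:
        steps++;
        MY_FALLTHROUGH;
    case 1:
        steps++;
        MY_FALLTHROUGH;
    case 2:
        steps++;
        break;
    default:
        break;
    }
    return steps;
}
```

Unlike a comment, the attribute is checked by the compiler, so -Wimplicit-fallthrough stays quiet only where the fall through is marked.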
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
If QEMU is new enough and we have the live updated CPU definition in
either save or migration cookie, we can use it to enforce ABI. The
original guest CPU from domain XML will be stored in private data.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Since the domain XML sent during migration uses the original guest CPU
definition but we still want the destination to enforce ABI if it is new
enough, we send the live updated CPU definition in a migration cookie.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
When persistent migration of a transient domain is requested but no
custom XML is passed to the migration API we would just let the
destination daemon make a persistent definition from the live definition
itself. This is not a problem now, but once the destination daemon
starts replacing the original CPU definition with the one from migration
cookie before starting a domain, it would need to add more ugly hacks to
reverse the operation. Let's just always send the persistent definition
in the cookie to make things a bit cleaner.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The destination host may not be able to start a domain using the live
updated CPU definition because either libvirt or QEMU may not be new
enough. Thus we need to send the original guest CPU definition.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
While fixing a bug with incorrectly freed memory in commit
v3.1.0-399-g5498aa29a, I accidentally broke persistent migration of
transient domains. Before adding qemuDomainDefCopy in the path, the code
just took NULL from vm->newDef and used it as the persistent def, which
resulted in no persistent XML being sent in the migration cookie. This
scenario is perfectly valid and the destination correctly handles it by
using the incoming live definition and storing it as the persistent one.
After the mentioned commit libvirtd would just segfault in the described
scenario.
https://bugzilla.redhat.com/show_bug.cgi?id=1446205
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
When creating v3.2.0-77-g8be3ccd04 commit, I completely forgot that one
migration capability is very special. It's the "events" capability which
tells QEMU to report "MIGRATION" events. Since libvirt always wants the
events, it is enabled in qemuConnectMonitor and the rest of the code
should not touch it.
https://bugzilla.redhat.com/show_bug.cgi?id=1439841
https://bugzilla.redhat.com/show_bug.cgi?id=1441165
Messed-up-by: Jiri Denemark <jdenemar@redhat.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Not all async jobs are visible via virDomainGetJobStats (either they are
too fast or getting the stats is not allowed during the job), but
forcing all of them to advertise the operation is easier than hunting
the jobs for which fetching statistics is allowed. And we won't need to
think about this when we add support for getting stats for more jobs.
https://bugzilla.redhat.com/show_bug.cgi?id=1441563
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Commit 0feebab2 added a call to qemuBlockNodeNamesDetect for a completed
job when updating block jobs. This affects the drive mirror cancelling
logic because that function drops the vm lock. Now we have to recheck
all disks preceding the disk with the completed block job before going
to wait for block job events.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
While peer-to-peer migration enters the Confirm phase even if the
Perform phase fails, the client which initiated a non-p2p migration will
never call virDomainMigrateConfirm* API if the Perform phase failed.
Thus we need to explicitly reset migration before reporting a failure
from the Perform phase API.
https://bugzilla.redhat.com/show_bug.cgi?id=1425003
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Since the disks are copied by qemu, there's no need to enforce
cache=none. Thankfully the code that added qemuMigrateDisk did not break
existing configs, since if you don't select any disk to migrate
explicitly the code behaves sanely.
The logic for determining whether a disk should be migrated is
open-coded since using qemuMigrateDisk twice would be semantically
incorrect.
So far only QEMU_MONITOR_MIGRATION_CAPS_POSTCOPY was reset, but only in
a single code path leaving post-copy enabled in quite a few cases.
https://bugzilla.redhat.com/show_bug.cgi?id=1425003
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
It's only called from qemuMigrationReset now so it doesn't need to be
exported and {tls,sec}Alias are always NULL.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
This new API is supposed to reset all migration parameters to make sure
future migrations won't accidentally use them. This patch makes the
first step and moves qemuMigrationResetTLS call inside
qemuMigrationReset.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Migration parameters are either reset by the main migration code path or
from qemuProcessRecoverMigration* in case libvirtd is restarted during
migration.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Finished qemuMigrationRun does not mean the migration itself finished
(it might have just switched to post-copy mode). While resetting TLS
parameters is probably OK at this point even if migration is still
running, we want to consolidate the code which resets various migration
parameters. Thus qemuMigrationResetTLS will be called from the Confirm
phase (or at the end of the Perform phase in case of v2 protocol), when
migration is either canceled or finished.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
So far our code is full of the following pattern:
  dom = virGetDomain(conn, name, uuid);
  if (dom)
      dom->id = 42;
There is no reason why it couldn't be just:
  dom = virGetDomain(conn, name, uuid, id);
After all, client domain representation consists of tuple (name,
uuid, id).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Creating a copy of the definition we want to add in a migration cookie
makes the code cleaner and less prone to memory leaks or double free
errors.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>