The migration code was using a few blockdev bits to allow TLS with NBD
even before blockdev was fully integrated.
Since we now always use blockdev we can remove the check.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
We no longer need the arguments which were conditionally filled based
on the presence of the QEMU_CAPS_BLOCKDEV feature.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Assume that QEMU_CAPS_BLOCKDEV is present and remove all code executed
when it's not.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The only caller doesn't check the return value and doesn't have one of
its own either. Remove the return value and adjust the return
statements.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Since we started handling the monitor EOF event inside a job, any code
using virDomainObjWait would no longer properly abort if the VM crashed
during the wait.
This is because virDomainObjWait uses virDomainObjIsActive, which checks
'vm->def->id' to see if the VM is still active. Unfortunately the domain
id is cleared only in qemuProcessStop, which runs inside the job.
To fix this we can use the 'beingDestroyed' flag stored in the VM
private data, which is set to true around the time the condition is
signalled.
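To illustrate (stand-in types below, not the real libvirt structs; only
the shape of the check matters):

    #include <stdbool.h>

    /* Stand-ins for virDomainObj / qemuDomainObjPrivate. */
    typedef struct { int id; } Def;                /* cleared only in qemuProcessStop */
    typedef struct { bool beingDestroyed; } Priv;  /* set before the condition is signalled */
    typedef struct { Def *def; Priv *priv; } Dom;

    /* The wait helper must treat a domain that is being destroyed as no
     * longer active, because def->id is cleared too late (inside the job). */
    static bool
    stillActive(Dom *vm)
    {
        return vm->def->id != -1 && !vm->priv->beingDestroyed;
    }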
Reported-by: Pavel Hrdina <phrdina@redhat.com>
Fixes: 8c9ff9960b
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The qemu code will need to check other qemu-private conditions when
reporting success for waiting. Thus we must replace all uses of
virDomainObjWait with a qemu-specific helper. For now the helper
forwards directly to virDomainObjWait.
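For now the helper is just a pass-through (sketch, argument list
abbreviated):

    int
    qemuDomainObjWait(virDomainObj *vm)
    {
        /* Only forwards for now; qemu-private conditions will be
         * checked here later. */
        return virDomainObjWait(vm);
    }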
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This call to qemuMigrationSrcIsAllowedHostdev() (which unconditionally
fails the migration if there is any PCI or mdev hostdev device in the
domain) in the destination side of migration prep was noticed once the
call to the same function was removed from the source side migration
prep (commit 25883cd5).
According to jdenemar, for the V2 migration protocol prep of the
destination is the first step, so this *was* the proper place to do the
check, but for V3 migration it is in a way redundant, since the check
will have already been done on the source side (updated by 25883cd5 to
query QEMU rather than fail unconditionally).
Of course it's possible that the source supports migration of a
particular VFIO device while the destination doesn't. But the current
check on the destination side is worthless even in that case, since it
just *always* fails rather than querying QEMU; and QEMU can't be
queried at the point where the destination check happens, since it
isn't running yet.
Anyway, QEMU should complain when it's started if it's going to fail,
so removing this check just moves the failure a bit later. The best
solution to this problem is therefore to simply remove the hardcoded
check/fail from qemuMigrationDstPrepareFresh() and rely on QEMU to fail
if it needs to.
Fixes: 25883cd5f0
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Commit v8.4.0-287-gd4d3bb8130 tried to make sure the original
pre-migration memory locking limit is restored at the end of migration,
but it missed the case when the libvirt daemon is restarted during a
migration which then needs to be aborted on reconnect.
And if that was not enough, I forgot to actually save the status XML
after setting the field in priv (in the commit mentioned above and also
in v8.4.0-291-gd375993ab3).
https://bugzilla.redhat.com/show_bug.cgi?id=2107424
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
We keep the original values of migration parameters so that we can
restore them at the end of migration to make sure a later migration
does not use some random values. However, this does not really work
when the libvirt daemon is restarted on the source host because we
failed to explicitly save the status XML after getting the migration
parameters from QEMU.
Actually it might work if the status XML is written later for some other
reason such as a domain state change, but that's not how it should work.
https://bugzilla.redhat.com/show_bug.cgi?id=2107892
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Commit 6262752460 added acquiring a job, but the async job is not
always VIR_ASYNC_JOB_MIGRATION_OUT, so the code fails when doing a save
or anything else. Correct the async job by passing it from the caller
as another parameter.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
libvirt currently will block migration for any vfio-assigned device
unless it is a network device that is associated with a virtio-net
failover device (i.e. if the hostdev object has a teaming->type ==
VIR_DOMAIN_NET_TEAMING_TYPE_TRANSIENT).
In the future there will be other vfio devices that can be migrated,
so we don't want to rely on this hardcoded block. QEMU 6.0+ will
anyway inform us of any devices that will block migration (as a part
of qemuDomainGetMigrationBlockers()), so we only need to do the
hardcoded check in the case of old QEMU that can't provide that
information.
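Roughly (the capability name below is illustrative and may not match
the exact one used in the tree):

    /* Fall back to the hardcoded hostdev check only when QEMU is too
     * old to report migration blockers itself. */
    if (!virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_MIGRATION_BLOCKED_REASONS)) {
        if (!qemuMigrationSrcIsAllowedHostdev(vm->def))
            return false;
    }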
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
The new code that queries QEMU about migration blockers was put at the
top of qemuMigrationSrcIsAllowed(), but that function can also be
called in the case of offline migration (i.e. when the domain is
inactive / QEMU isn't running). This check should have been put inside
the "if (!(flags & VIR_MIGRATE_OFFLINE))" conditional, so let's move
it there.
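In other words (sketch):

    if (!(flags & VIR_MIGRATE_OFFLINE)) {
        /* QEMU is running only in this branch, so the query for its
         * migration blockers belongs here, not at the top of the
         * function. */
    }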
Fixes: 156e99f686
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
The code is run with an async job and thus needs to make sure a nested
job is acquired before entering the monitor.
While touching the code in qemuMigrationSrcIsAllowed I also fixed the
grammar which was accidentally broken by v8.5.0-140-g2103807e33.
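The resulting pattern looks roughly like this (argument lists
abbreviated and may not match the current signatures exactly):

    /* A thread holding only the async job must use the *Async variant,
     * which acquires a nested job before talking to the monitor. */
    if (qemuDomainObjEnterMonitorAsync(driver, vm, asyncJob) < 0)
        return -1;

    /* ... monitor calls ... */

    qemuDomainObjExitMonitor(vm);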
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
vDPA devices will be migratable soon, so we shouldn't unconditionally
block migration of any domain with a vDPA device. Instead, we should
rely on QEMU to make the decision when that info is available from the
query-migrate QMP command (QEMU versions too old to have that info in
the results of query-migrate don't support migration of vDPA devices,
so in that case we will continue to unconditionally block migration).
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
Since QEMU 6.0, if QEMU knows that a migration would fail,
'query-migrate' will return an array of error strings describing the
migration blockers. This can be used to check whether there are any
devices/conditions blocking migration.
This patch adds a call to this query at the top of
qemuMigrationSrcIsAllowed().
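For illustration only (not the actual libvirt code): given the
NULL-terminated list of strings from 'blocked-reasons', the reasons can
be joined into a single error message and the migration refused:

    #include <glib.h>

    /* Join all blocker reasons reported by 'query-migrate' into one
     * string suitable for an error message; NULL means no blockers. */
    static char *
    joinBlockers(char **blockers)
    {
        if (!blockers || !blockers[0])
            return NULL;
        return g_strjoinv("; ", blockers);
    }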
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
This patch moves qemuDomainJobObj into hypervisor/ as generalized
virDomainJobObj along with generalized private job callbacks as
virDomainObjPrivateJobCallbacks.
Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The 'max-bandwidth' field was added as an argument of
'migrate-set-parameters' in qemu-2.8, thus all qemu versions supported
by libvirt already use the new code path.
This patch assumes its presence and removes the legacy code paths.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
When resuming post-copy migration users may want to limit the bandwidth
used by the migration and use a value that is different from the one
specified when the migration was originally started.
Resolves: https://gitlab.com/libvirt/libvirt/-/issues/333
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
So that we can apply selected migration parameters even when resuming
post-copy migration.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The flags will later be used to determine which parameters should
actually be applied.
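One possible shape (purely illustrative names, not the ones used in the
code): each parameter carries a bitmask of situations in which it may
be applied and the apply step filters on the current one:

    typedef enum {
        APPLY_ON_START  = 1 << 0,   /* normal migration start */
        APPLY_ON_RESUME = 1 << 1,   /* post-copy resume */
    } ApplyFlags;

    /* Apply a parameter only when its flags cover the situation. */
    static int
    shouldApply(unsigned int paramFlags, unsigned int situation)
    {
        return (paramFlags & situation) != 0;
    }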
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
My original commit v8.4.0-288-gf01fc4d119 failed to fix
both instances of the same problem. While it fixed the destination side
of migration, the source one remained broken.
However, that commit was also wrong in saying the issue could have
caused unlimited memory locking to be allowed for QEMU when RDMA
migration was used. It could not, because the code would refuse to even
think about starting RDMA migration if hard_limit was not set. But
avoiding the "mem.hard_limit > 0" check is useful anyway.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Our documentation says RDMA migration requires hard_limit to be set so
that we know how big a memory locking limit should be set for the
domain during migration. But since commit v1.2.13-71-gcf521fc8ba (which
changed the default hard_limit value from 0 to
VIR_DOMAIN_MEMORY_PARAM_UNLIMITED) we were actually setting the memlock
limit to unlimited if hard_limit was not set.
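Sketch of the distinction the code needs, assuming a helper along the
lines of virMemoryLimitIsSet() that treats both 0 and
VIR_DOMAIN_MEMORY_PARAM_UNLIMITED as "not set":

    /* 'hard_limit != 0' is no longer a usable "is it configured?" test
     * once the default became VIR_DOMAIN_MEMORY_PARAM_UNLIMITED. */
    if (!virMemoryLimitIsSet(vm->def->mem.hard_limit)) {
        /* hard_limit not configured: do not derive the memlock limit
         * from it (which would end up as "unlimited"). */
    }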
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
For RDMA migration we update memory locking limit, but never set it back
once migration finishes (on the destination host) or aborts (on the
source host).
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This helper will not try to set the limit if it is already big enough,
which may be useful when the libvirt daemon is running in a containerized
environment and is not allowed to change memory locking limit.
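A minimal sketch of the idea using Linux prlimit() directly (this is
not the libvirt implementation, just the principle):

    #define _GNU_SOURCE
    #include <sys/types.h>
    #include <sys/resource.h>

    /* Raise RLIMIT_MEMLOCK of the given process only when needed;
     * succeed silently when the current limit is already big enough,
     * e.g. in a container that forbids changing it but grants a large
     * limit up front. */
    static int
    setMaxMemLock(pid_t pid, unsigned long long bytes)
    {
        struct rlimit lim;

        if (prlimit(pid, RLIMIT_MEMLOCK, NULL, &lim) < 0)
            return -1;

        if (lim.rlim_cur == RLIM_INFINITY || lim.rlim_cur >= bytes)
            return 0;   /* nothing to do */

        lim.rlim_cur = bytes;
        lim.rlim_max = bytes;
        return prlimit(pid, RLIMIT_MEMLOCK, &lim, NULL);
    }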
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Convert all the cases where we can unconditionally free
the virURI at the end of scope.
In libxlDomainMigrationDstPrepare, uri is only filled
if uri_in was present, so moving the virURIFree out of
the condition is safe.
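The shape of the conversion, sketched (assuming the usual
G_DEFINE_AUTOPTR_CLEANUP_FUNC(virURI, virURIFree) declaration is in
place):

    g_autoptr(virURI) uri = NULL;

    if (uri_in &&
        !(uri = virURIParse(uri_in)))
        return -1;

    /* ... use uri ...; no explicit virURIFree() on any return path,
     * it is freed automatically at the end of the scope. */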
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
They were constructed from two separate strings using "%s: %s", which
is ugly and does not work well with translations.
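Illustration with a made-up message (not one from the tree):

    /* Before: two independently translated fragments glued together. */
    virReportError(VIR_ERR_OPERATION_FAILED, "%s: %s",
                   _("migration was unsuccessful"), _("connection closed"));

    /* After: one complete translatable sentence. */
    virReportError(VIR_ERR_OPERATION_FAILED, "%s",
                   _("migration was unsuccessful: connection closed"));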
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
This is a special job for operations that need to modify domain state
during an active migration. The modification must not affect any state
that could conflict with the migration code. This is useful mainly for
event handlers that need to be processed during migration and which
could otherwise time out on acquiring a normal MODIFY job.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Every single call to qemuMigrationJobContinue needs to register a
cleanup callback in case the migrating domain dies between phases or
when migration is paused due to a failure in postcopy mode.
Let's integrate registering the callback in qemuMigrationJobContinue to
make sure the current thread does not release a migration job without
setting a cleanup callback.
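A sketch of the resulting shape (the exact body in the tree may
differ):

    void
    qemuMigrationJobContinue(virDomainObj *vm,
                             qemuDomainCleanupCallback cleanup)
    {
        /* Registering the cleanup callback is now part of releasing
         * the job, so no caller can forget it. */
        qemuDomainCleanupAdd(vm, cleanup);
        qemuDomainObjReleaseAsyncJob(vm);
    }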
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The callback will properly cleanup non-p2p migration job in case the
migrating domain dies between Begin and Perform while the client which
controls the migration is not cooperating (normally the API for the next
migration phase would handle this).
The same situation can happen even after Prepare and Perform phases, but
they both already register a suitable callback, so no fix is needed
there.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Normally the structure is created once the source reports completed
migration, but with post-copy migration we can get here even after
the libvirt daemon was restarted. It doesn't make sense to preserve the
structure in our status XML as we're going to rewrite almost all of it
while refreshing the stats anyway. So we just create the structure here
if it doesn't exist to make sure we can properly report statistics of a
completed migration.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Everything was already done in the normal Finish phase and vCPUs are
running. We just need to wait for all remaining data to be transferred.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The QEMU process is already running, all we need to do is call the
migrate-recover QMP command. Except for some checks and cookie handling,
of course.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The non-postcopy case talks to the QEMU monitor and thus needs to
create a nested job. Since qemuMigrationAnyConnectionClosed is called
when there's no thread processing a migration API, we need to make the
current thread a temporary owner of the migration job to avoid "This
thread doesn't seem to be the async job owner: 0". This is done by
starting a migration phase.
While no monitor interaction happens in the postcopy case and just
setting the phase (to indicate a broken postcopy migration) would be
enough, being consistent and setting the owner does not hurt anything.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
To prepare the code for handling incoming migration too.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The function is now called qemuMigrationAnyConnectionClosed to make it
clear it is supposed to handle a broken connection during migration. It
will soon be used on both sides of migration so the "Src" part was changed
to "Any" to avoid renaming the function twice in a row.
The original *Cleanup name could easily be confused with cleanup
callbacks called when a domain is destroyed.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Offline migration jumps over a big part of qemuMigrationDstPrepareFresh.
Let's move that part into a new qemuMigrationDstPrepareActive function
to make the code easier to follow.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Moves most of the code from qemuMigrationDstPrepareAny to a new
qemuMigrationDstPrepareFresh so that qemuMigrationDstPrepareAny can be
shared with post-copy resume.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>