When a domain is killed on the source host while it is being migrated
and libvirtd is waiting for the migration to finish (waiting for the
domain condition in qemuMigrationSrcWaitForCompletion), the run-time
state including priv->job.current may already be freed once
virDomainObjWait returns with -1. Thus the priv->job.current pointer
cached in jobInfo is no longer valid and setting jobInfo->status may
crash the daemon.
https://bugzilla.redhat.com/show_bug.cgi?id=1593137
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Once we called qemuDomainObjEnterRemote to talk to the destination
daemon during a peer to peer migration, the vm lock is released and we
only hold an async job. If the source domain dies at this point the
monitor EOF callback is allowed to do its job and (among other things)
clear all private data irrelevant for stopped domain. Thus when we call
qemuDomainObjExitRemote, the domain may already be gone and we should
avoid touching runtime private data (such as current job info).
In other words after acquiring the lock in qemuDomainObjExitRemote, we
need to check the domain is still alive. Unless we're doing offline
migration.
https://bugzilla.redhat.com/show_bug.cgi?id=1589730
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The variable is used to store the offline migration capability of the
destination daemon. Let's call it 'dstOffline' so that we can later use
'offline' to indicate whether we were asked to do offline migration.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
And replace all calls with virObjectEventStateQueue such that:
qemuDomainEventQueue(driver, event);
becomes:
virObjectEventStateQueue(driver->domainEventState, event);
And remove NULL checking from all callers.
Signed-off-by: Anya Harter <aharter@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Replace instances where we previously called virGetLastError just to
either get the code or to check if an error exists with
virGetLastErrorCode to avoid a validity pre-check.
Signed-off-by: Ramy Elkest <ramyelkest@gmail.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
The alias of the secret for decrypting the TLS passphrase is useless
besides for TLS setup. Stop passing it around.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This has been broken since commit v4.0.0-165-g93412bb827 which added
jobInfo->statsType enum to distinguish various statistics types. During
migration the type will always be QEMU_DOMAIN_JOB_STATS_TYPE_MIGRATION,
however the destination code consuming the statistics data from
migration cookie failed to properly set the type. So even though
everything was filled in, the type remained *_NONE and any attempt to
fetch the statistics data of a completed migration on the destination
host failed.
https://bugzilla.redhat.com/show_bug.cgi?id=1584071
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Implement the secure way to transport non-shared storage data across
migrations. The new approach uses blockdev-add to create the NBD client
so that the TLS secret object can be specified.
https://bugzilla.redhat.com/show_bug.cgi?id=1300772
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Separate the code relevant for this approach so that we can later add a
second implementation without making the function messy.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Drop the mention of 'drive mirror' from the function names and mention
NBD. This will help when adding the 'blockdev mirror' migration code
which will allow using TLS.
Additionally fix some of the function comments to make more sense
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
The initiation of a synchronous block job in the NBD storage migration
code was placed after entering the monitor thus after the lock on the VM
object was unlocked. Thankfully nothing bad could happen in this
situation since the migration job prevents any disk detaches or other
modifications of the domain object.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
The pointer to the qemu driver is already included in domain object's
private data, so does not need to be passed as yet another parameter
when the domain object is already passed.
Also removes parameter 'driver' from functions which had it just because of
qemuBlockJobUpdate.
Signed-off-by: Roland Schulz <schullzroll@gmail.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
When adding a new object to the domain object list, there should
have been 2 virObjectRef calls made one for each list into which
the object was placed to match the 2 virObjectUnref calls that
would occur during Remove as part of virHashRemoveEntry when
virObjectFreeHashData is called when the element is removed from
the hash table as set up in virDomainObjListNew.
Some drivers (libxl, lxc, qemu, and vz) handled this inconsistency
by calling virObjectRef upon successful return from virDomainObjListAdd
in order to use virDomainObjEndAPI when done with the returned @vm.
While others (bhyve, openvz, test, and vmware) handled this via only
calling virObjectUnlock upon successful return from virDomainObjListAdd.
This patch will "unify" the approach to use virDomainObjEndAPI
for any @vm successfully returned from virDomainObjListAdd.
Because list removal is so tightly coupled with list addition,
this patch fixes the list removal algorithm to return the object
as entered - "locked and reffed". This way, the callers can then
decide how to uniformly handle add/remove success and failure.
This removes the onus on the caller to "specially handle" the
@vm during removal processing.
The Add/Remove logic allows for some logic simplification such
as in libxl where we can Remove the @vm directly rather than
needing to set a @remove_dom boolean and removing after the
libxlDomainObjEndJob completes as the @vm is locked/reffed.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Use the TLS env for migration when starting the NBD server if TLS is
enabled for migration.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
To allow encryption of the non-shared storage migration NBD connection
we will need to instantiated the NBD server with the TLS env.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
When a VM is destroyed while being migrated (waiting in
qemuMigrationSrcWaitForCompletion) the private object cleanup code frees
the 'current' job info. Since the migration code attempts to setup
various aspects of the current job even on failure this results into a
crash.
Job data is cleared in qemuDomainObjPrivateDataClear since commit
888aa4b6b9db
Fix this by skipping all of the code which requires the qemu process to
be alive if the VM is not active any more.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Since libvirt is currently not able to setup the NBD migration stream
secured by TLS we should not allow such migration since data would be
transferred unencrypted.
This will break compatibility of TLS migration if non-shared storage is
requested but the security implications are more severe.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Trying to delete the non-existent TLS objects results in ugly error
messages in the log, which could easily confuse users. Let's avoid this
confusion by not trying to delete the objects if we were not asked to
enable TLS migration and thus we didn't created the objects anyway.
This patch restores the behavior to the state before "qemu: Reset all
migration parameters".
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
We store the flags passed to the API which started the migration. Let's
use them instead of a separate bool to check if post-copy migration was
requested.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
When an async job is running, we sometimes need to know how it was
started to distinguish between several types of the job, e.g., post-copy
vs. normal migration. So far we added a specific bool item to
qemuDomainJobObj for such cases, which doesn't scale very well and
storing such bools in status XML would be painful so we didn't do it.
A better approach is to store the flags passed to the API which started
the async job, which can be easily stored in status XML.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
When an always-on migration capability is supposed to be enabled on both
sides of migration, each side can only enable the feature if it is
enabled by the other side.
Thus the source host sends a list of supported migration capabilities in
the migration cookie generated in the Begin phase. The destination host
consumes the list in the Prepare phase and decides what capabilities can
be enabled when starting a QEMU process for incoming migration. Once
done the destination sends the list of supported capabilities back to
the source where it is used during the Perform phase to determine what
capabilities can be automatically enabled.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Some migration capabilities may be enabled automatically, but only if
both sides of migration support them. Thus we need to be able transfer
the list of supported migration capabilities in migration cookie.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Since every parameter or capability set in qemuMigrationCompression
structure is now reflected in qemuMigrationParams structure, we can
replace qemuMigrationAnyCompressionDump with a new API which will work
on qemuMigrationParams.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
There's no need to call this API explicitly in the migration code. We
can pass the compression parameters to qemuMigrationParamsFromFlags and
it can internally call qemuMigrationParamsSetCompression to apply them
to the qemuMigrationParams structure.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Propagate the calls up the stack to the point where
qemuMigrationParamsFromFlags is called. The end goal achieved in the
following few patches is to merge compression parameters into the
general migration parameters code.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Most migration capabilities are directly connected with
virDomainMigrateFlags so qemuMigrationParamsFromFlags can automatically
enable them.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Some migration capabilities are always enabled if QEMU supports them. We
can just drop the explicit code for them and let
qemuMigrationParamsCheck automatically set such capabilities.
QEMU_MONITOR_MIGRATION_CAPS_EVENTS would normally be one of the always
on features, but it is the only feature we want to enable even for other
jobs which internally use migration (such as save and snapshot). Hence
this capability is set very early after libvirtd connects to QEMU
monitor.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
It's just a tiny wrapper around qemuMigrationParamsSetCapability and
setting priv->job.postcopyEnabled is not something qemuMigrationParams
code should be doing anyway so let the callers do it.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Every migration entry point in qemu_driver is supposed to call
qemuMigrationParamsFromFlags to transform flags and parameters into
qemuMigrationParams structure and pass the result to qemuMigration*
APIs.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Instead of checking each capability at the time we want to set it in
qemuMigrationParamsSetCapability we can check all of them at once in
qemuMigrationParamsCheck.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
We reached the point when qemuMigrationParamsApply is the only API which
sends migration parameters and capabilities to QEMU. Thus all but the
TLS parameters can be set before we ask QEMU for the current values of
all parameters in qemuMigrationParamsCheck.
Supported migration capabilities are queried as soon as libvirt connects
to QEMU monitor so we can check them anytime.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
We reached the point when qemuMigrationParamsApply is the only API which
sends migration parameters and capabilities to QEMU. Thus all but the
TLS parameters can be set before we ask QEMU for the current values of
all parameters in qemuMigrationParamsCheck.
Supported migration capabilities are queried as soon as libvirt connects
to QEMU monitor so we can check them anytime.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Prefer xbzrle-cache-size migration parameter over the special
migrate-set-cache-size QMP command.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Migration capabilities are closely related to migration parameters and
it makes sense to keep them in a single data structure. Similarly to
migration parameters the capabilities are all send to QEMU at once in
qemuMigrationParamsApply, all other APIs operate on the
qemuMigrationParams structure.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The new name is qemuMigrationParamsApply and it will soon become the
only API which will send all requested migration parameters and
capabilities to QEMU. All other qemuMigrationParams* APIs will just
operate on the qemuMigrationParams structure.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
There's no real reason for qemuMigrationParamsEnableTLS to require the
callers to pass a valid virQEMUDriverConfigPtr, it can just call
virQEMUDriverGetConfig.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The function checks whether QEMU supports TLS migration and stores the
original value of tls-creds parameter to priv->migTLSAlias. This is no
longer needed because we already have the original value stored in
priv->migParams.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The code can be merged directly in qemuMigrationParamsAddTLSObjects.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Restore the original values of all migration parameters we store in
qemuDomainJobObj instead of explicitly resting only a limited set of
them.
The result is not strictly equivalent to the previous code wrt reseting
TLS state because the previous code would only reset it if we changed it
before while the new code will reset it always if QEMU supports TLS
migration. This is not a problem for the parameters themselves, but it
can cause spurious errors about missing TLS objects being logged at the
end of non-TLS migration. This issue will be fixed ~50 patches later.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Any job which touches migration parameters will first store their
original values (i.e., QEMU defaults) to qemuDomainJobObj to make it
easier to reset them back once the job finishes.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
When connection to the client which controls a non-p2p migration gets
closed between Perform and Confirm phase, we don't know whether the
domain was successfully migrated or not. Thus, we have to leave the
domain paused and just cleanup the migration job and reset migration
parameters.
Previously we didn't reset the parameters and future save or snapshot
operations would see wrong environment (and could fail because of it) in
case the domain stayed running on the source host.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Currently migration parameters are stored in a structure which mimics
the QEMU migration parameters handled by query-migrate-parameters and
migrate-set-parameters. The new structure will become a libvirt's
abstraction on top of QEMU migration parameters, capabilities, and
related stuff.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>