In libxlDomainMigrationConfirm(), a transient domain is removed
from the domain list after successful migration. Later in cleanup,
the domain object is unlocked, resulting in a crash
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fb4208ed700 (LWP 12044)]
0x00007fb4267251e6 in virClassIsDerivedFrom (klass=0xdeadbeef,
parent=0x7fb42830d0c0) at util/virobject.c:169
169 if (klass->magic == parent->magic)
(gdb) bt
0 0x00007fb4267251e6 in virClassIsDerivedFrom (klass=0xdeadbeef,
parent=0x7fb42830d0c0) at util/virobject.c:169
1 0x00007fb42672591b in virObjectIsClass (anyobj=0x7fb4100082b0,
klass=0x7fb42830d0c0) at util/virobject.c:365
2 0x00007fb42672583c in virObjectUnlock (anyobj=0x7fb4100082b0)
at util/virobject.c:338
3 0x00007fb41a8c7d7a in libxlDomainMigrationConfirm (driver=0x7fb4100404c0,
vm=0x7fb4100082b0, flags=1, cancelled=0) at libxl/libxl_migration.c:583
Fix by setting the virDomainObjPtr to NULL after removing it from
the domain list.
During migration, the libxl driver starts a modify job in the
begin phase, ending the job in the confirm phase. This is
essentially VIR_MIGRATE_CHANGE_PROTECTION semantics, but the
driver does not support that flag. Without CHANGE_PROTECTION
support, the job would never be terminated in error conditions
where migrate confirm phase is not executed. Further attempts
to modify the domain would result in failure to acquire a job
after LIBXL_JOB_WAIT_TIME.
Similar to the qemu driver, end the job in the begin phase.
Protecting the domain object across all phases of migration can
be done in a future patch adding CHANGE_PROTECTION support.
In libxlDomainMigrationPrepare(), a new virDomainObj is created
from the incoming domain def and added to the driver's domain
list, but never removed if there are subsequent failures during
the prepare phase.
targethost# virsh list --all
sourcehost# virsh migrate --live dom xen+ssh://targethost/system
error: operation failed: Fail to create socket for incoming migration.
targethost# virsh list --all
error: Failed to list domains
error: name in virGetDomain must not be NULL
After adding code to remove the domain on prepare failure, noticed
that libvirtd crashed due to double free of the virDomainDef. Similar
to the qemu driver, pass a pointer to virDomainDefPtr so it can be set
to NULL once a virDomainObj is created from it.
Migration code specifies the problematic non-cooperative resume mode
which is a known issue with Xen's libxl [1]. Instead, use the better
supported cooperative mode.
Without this, guests BUG() in xen_irq_resume after failing to bind
still-bound event channels.
[1] http://bugs.xenproject.org/xen/bug/30
This patch adds initial migration support to the libxl driver,
using the VIR_DRV_FEATURE_MIGRATION_PARAMS family of migration
functions.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>