This patch exports qemuMigrationIsAllowed and adds a new parameter to it
to denote if it's a remote migration or a local migration. Local
migrations are used in snapshots and saving of the machine state and
have fewer restrictions. This patch also adjusts callers of the function
and tweaks some error messages to be more universal.
False positive, but it breaks the build with gcc-4.6.3.
qemu/qemu_migration.c:2931:37: error: 'offline' may be used
uninitialized in this function [-Werror=uninitialized]
qemu/qemu_migration.c:2887:10: note: 'offline' was declared here
Offline migration transfers inactive definition of a domain (which may
or may not be active). After successful completion, the domain remains
in its current state on source host and is defined but inactive on
destination host. It's a bit more clever than virDomainGetXMLDesc() on
source host followed by virDomainDefineXML() on destination host, as
offline migration will run pre-migration hook to update the domain XML
on destination host. Currently, copying non-shared storage is not
supported during offline migration.
Offline migration can be requested with a new migration flag called
VIR_MIGRATE_OFFLINE (which has to be combined with
VIR_MIGRATE_PERSIST_DEST flag).
Remove the obsolete 'qemud' naming prefix and underscore
based type name. Introduce virQEMUDriverPtr as the replacement,
in common with LXC driver naming style
Currently, if user calls virDomainAbortJob we just issue
'migrate_cancel' and hope for the best. However, if user calls
the API in wrong phase when migration hasn't been started yet
(perform phase) the cancel request is just ignored. With this
patch, the request is remembered and as soon as perform phase
starts, migration is cancelled.
The libvirt coding standard is to use 'function(...args...)'
instead of 'function (...args...)'. A non-trivial number of
places did not follow this rule and are fixed in this patch.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
For now, disk migration via block copy job is not implemented in
libvirt. But when we do implement it, we have to deal with the
fact that qemu does not yet provide an easy way to re-start a qemu
process with mirroring still intact. Paolo has proposed an idea
for a persistent dirty bitmap that might make this possible, but
until that design is complete, it's hard to say what changes
libvirt would need. Even something like 'virDomainSave' becomes
hairy, if you realize the implications that 'virDomainRestore'
would be stuck with recreating the same mirror layout.
But if we step back and look at the bigger picture, we realize that
the initial client of live storage migration via disk mirroring is
oVirt, which always uses transient domains, and that if a transient
domain is destroyed while a mirror exists, oVirt can easily restart
the storage migration by creating a new domain that visits just the
source storage, with no loss in data.
We can make life a lot easier by being cowards for now, forbidding
certain operations on a domain. This patch guarantees that we
never get in a state where we would have to restart a domain with
a mirroring block copy, by preventing saves, snapshots, migration,
hot unplug of a disk in use, and conversion to a persistent domain
(thankfully, it is still relatively easy to 'virsh undefine' a
running domain to temporarily make it transient, run tests on
'virsh blockcopy', then 'virsh define' to restore the persistence).
Later, if the qemu design is enhanced, we can relax our code.
The change to qemudDomainDefine looks a bit odd for undoing an
assignment, rather than probing up front to avoid the assignment,
but this is because of how virDomainAssignDef combines both a
lookup and assignment into a single function call.
* src/conf/domain_conf.h (virDomainHasDiskMirror): New prototype.
* src/conf/domain_conf.c (virDomainHasDiskMirror): New function.
* src/libvirt_private.syms (domain_conf.h): Export it.
* src/qemu/qemu_driver.c (qemuDomainSaveInternal)
(qemuDomainSnapshotCreateXML, qemuDomainRevertToSnapshot)
(qemuDomainBlockJobImpl, qemudDomainDefine): Prevent dangerous
actions while block copy is already in action.
* src/qemu/qemu_hotplug.c (qemuDomainDetachDiskDevice): Likewise.
* src/qemu/qemu_migration.c (qemuMigrationIsAllowed): Likewise.
Transport Open vSwitch per-port data during live
migration by using the utility functions
virNetDevOpenvswitchGetMigrateData() and
virNetDevOpenvswitchSetMigrateData().
Signed-off-by: Kyle Mestery <kmestery@cisco.com>
Add the ability for the Qemu V3 migration protocol to
include transporting network configuration. A generic
framework is proposed with this patch to allow for the
transfer of opaque data.
Signed-off-by: Kyle Mestery <kmestery@cisco.com>
Signed-off-by: Laine Stump <laine@laine.org>
Having hostuuid in migration cookie is a nice bonus since it provides an
easy way of detecting migration to the same host. However, requiring it
breaks backward compatibility with older libvirt releases.
Recently, patches were added support for (managed)saving, restoring, and
migrating domains with host USB devices. However, qemu driver would
still forbid migration of such domains because qemuMigrationIsAllowed
was not updated.
When p2p migration fails early because qemuMigrationIsAllowed or
qemuMigrationIsSafe say migration should be cancelled, we fail to clear
the migration-out async job. As a result of that, further APIs called
for the same domain may fail with Timed out during operation: cannot
acquire state change lock.
Reported by Guido Winkelmann.
Save/restore with passed through USB devices currently only works if the
USB device can be found at the same USB address where it used to be
before saving a domain. This makes sense in case a user explicitly
configure the USB address in domain XML. However, if the device was
found automatically by vendor/product identification, we should try to
search for that device when restoring the domain and use any device we
find as long as there is only one available. In other words, the USB
device can now be removed and plugged again or the host can be rebooted
between saving and restoring the domain.
Using VIR_DOMAIN_XML_MIGRATABLE flag, one can request domain's XML
configuration that is suitable for migration or save/restore. Such XML
may contain extra run-time stuff internal to libvirt and some default
configuration may be removed for better compatibility of the XML with
older libvirt releases.
This flag may serve as an easy way to get the XML that can be passed
(after desired modifications) to APIs that accept custom XMLs, such as
virDomainMigrate{,ToURI}2 or virDomainSaveFlags.
Recently, there have been some improvements made to qemu so it
supports seamless migration or something very close to it.
However, it requires libvirt interaction. Once qemu is migrated,
the SPICE server needs to send its internal state to the destination.
Once it's done, it fires SPICE_MIGRATE_COMPLETED event and this
fact is advertised in 'query-spice' output as well.
We must not kill qemu until SPICE server finishes the transfer.
https://www.gnu.org/licenses/gpl-howto.html recommends that
the 'If not, see <url>.' phrase be a separate sentence.
* tests/securityselinuxhelper.c: Remove doubled line.
* tests/securityselinuxtest.c: Likewise.
* globally: s/; If/. If/
The current qemu capabilities are stored in a virBitmapPtr
object, whose type is exposed to callers. We want to store
more data besides just the flags, so we need to move to a
struct type. This object will also need to be reference
counted, since we'll be maintaining a cache of data per
binary. This change introduces a 'qemuCapsPtr' virObject
class. Most of the change is just renaming types and
variables in all the callers
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
We were failing to react to allocation failure when initializing
a snapshot object list. Changing things to store a pointer
instead of a complete object adds one more possible point of
allocation failure, but at the same time, will make it easier to
react to failure now, as well as making it easier for a future
patch to split all virDomainSnapshotPtr handling into a separate
file, as I continue to add even more snapshot code.
Luckily, there was only one client outside of domain_conf.c that
was actually peeking inside the object, and a new wrapper function
was easy.
* src/conf/domain_conf.h (_virDomainObj): Use a pointer.
(virDomainSnapshotObjListInit): Rename.
(virDomainSnapshotObjListFree, virDomainSnapshotForEach): New
declarations.
(_virDomainSnapshotObjList): Move definitions...
* src/conf/domain_conf.c: ...here.
(virDomainSnapshotObjListInit, virDomainSnapshotObjListDeinit):
Rename...
(virDomainSnapshotObjListNew, virDomainSnapshotObjListFree): ...to
these.
(virDomainSnapshotForEach): New function.
(virDomainObjDispose, virDomainListPopulate): Adjust callers.
* src/qemu/qemu_domain.c (qemuDomainSnapshotDiscard)
(qemuDomainSnapshotDiscardAllMetadata): Likewise.
* src/qemu/qemu_migration.c (qemuMigrationIsAllowed): Likewise.
* src/qemu/qemu_driver.c (qemuDomainSnapshotLoad)
(qemuDomainUndefineFlags, qemuDomainSnapshotCreateXML)
(qemuDomainSnapshotListNames, qemuDomainSnapshotNum)
(qemuDomainListAllSnapshots)
(qemuDomainSnapshotListChildrenNames)
(qemuDomainSnapshotNumChildren)
(qemuDomainSnapshotListAllChildren)
(qemuDomainSnapshotLookupByName, qemuDomainSnapshotGetParent)
(qemuDomainSnapshotGetXMLDesc, qemuDomainSnapshotIsCurrent)
(qemuDomainSnapshotHasMetadata, qemuDomainRevertToSnapshot)
(qemuDomainSnapshotDelete): Likewise.
* src/libvirt_private.syms (domain_conf.h): Export new function.
When entering "confirm" phase, we are interested in the value of
cancelled rather then ret variable which was interesting before "finish"
phase and didn't change since then.
Previously, qemu did not respond to monitor commands during migration if
the limit was too high. This prevented us from raising the limit
earlier. The qemu issue seems to be fixed (according to my testing) and
we may remove the 32Mb/s limit.
Switch virDomainObjPtr to use the virObject APIs for reference
counting. The main change is that virObjectUnref does not return
the reference count, merely a bool indicating whether the object
still has any refs left. Checking the return value is also not
mandatory.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This converts the following public API datatypes to use the
virObject infrastructure:
virConnectPtr
virDomainPtr
virDomainSnapshotPtr
virInterfacePtr
virNetworkPtr
virNodeDevicePtr
virNWFilterPtr
virSecretPtr
virStreamPtr
virStorageVolPtr
virStoragePoolPtr
The code is significantly simplified, since the mutex in the
virConnectPtr object now only needs to be held when accessing
the per-connection virError object instance. All other operations
are completely lock free.
* src/datatypes.c, src/datatypes.h, src/libvirt.c: Convert
public datatypes to use virObject
* src/conf/domain_event.c, src/phyp/phyp_driver.c,
src/qemu/qemu_command.c, src/qemu/qemu_migration.c,
src/qemu/qemu_process.c, src/storage/storage_driver.c,
src/vbox/vbox_tmpl.c, src/xen/xend_internal.c,
tests/qemuxml2argvtest.c, tests/qemuxmlnstest.c,
tests/sexpr2xmltest.c, tests/xmconfigtest.c: Convert
to use virObjectUnref/virObjectRef
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Per the FSF address could be changed from time to time, and GNU
recommends the following now: (http://www.gnu.org/licenses/gpl-howto.html)
You should have received a copy of the GNU General Public License
along with Foobar. If not, see <http://www.gnu.org/licenses/>.
This patch removes the explicit FSF address, and uses above instead
(of course, with inserting 'Lesser' before 'General').
Except a bunch of files for security driver, all others are changed
automatically, the copyright for securify files are not complete,
that's why to do it manually:
src/security/security_selinux.h
src/security/security_driver.h
src/security/security_selinux.c
src/security/security_apparmor.h
src/security/security_apparmor.c
src/security/security_driver.c
Introduce new members in the virMacAddr 'class'
- virMacAddrSet: set virMacAddr from a virMacAddr
- virMacAddrSetRaw: setting virMacAddr from raw 6 byte MAC address buffer
- virMacAddrGetRaw: writing virMacAddr into raw 6 byte MAC address buffer
- virMacAddrCmp: comparing two virMacAddr
- virMacAddrCmpRaw: comparing a virMacAddr with a raw 6 byte MAC address buffer
then replace raw MAC addresses by replacing
- 'unsigned char *' with virMacAddrPtr
- 'unsigned char ... [VIR_MAC_BUFLEN]' with virMacAddr
and introduce usage of above functions where necessary.
QEMU (and librbd) flush the cache on the source before the
destination starts, and the destination does not read any
changeable data before that, so live migration with rbd caching
is safe.
This makes 'virsh migrate' work with rbd and caching without the
--unsafe flag.
Reported-by: Vladimir Bashkirtsev <vladimir@bashkirtsev.com>
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Now that domain listing is a thin wrapper around child listing,
it's easier to have a common entry point. This restores the
hashForEach optimization lost in the previous patch when there
are no snapshots being filtered out of the entire list.
* src/conf/domain_conf.h (virDomainSnapshotObjListGetNames)
(virDomainSnapshotObjListNum): Add parameter.
(virDomainSnapshotObjListGetNamesFrom)
(virDomainSnapshotObjListNumFrom): Delete.
* src/libvirt_private.syms (domain_conf.h): Drop deleted functions.
* src/conf/domain_conf.c (virDomainSnapshotObjListGetNames):
Merge, and (re)add an optimization.
* src/qemu/qemu_driver.c (qemuDomainUndefineFlags)
(qemuDomainSnapshotListNames, qemuDomainSnapshotNum)
(qemuDomainSnapshotListChildrenNames)
(qemuDomainSnapshotNumChildren): Update callers.
* src/qemu/qemu_migration.c (qemuMigrationIsAllowed): Likewise.
* src/conf/virdomainlist.c (virDomainListPopulate): Likewise.
If we migrate to fd, spec->fwdType is not MIGRATION_FWD_DIRECT,
we will close spec->dest.fd.local in qemuMigrationRun(). So we
should set spec->dest.fd.local to -1 in qemuMigrationRun().
Bug present since 0.9.5 (commit 326176179).
When we added the default USB controller into domain XML, we efficiently
broke migration to older versions of libvirt that didn't support USB
controllers at all (0.9.4 and earlier) even for domains that don't use
anything that the older libvirt can't provide. We still want to present
the default USB controller in any XML seen by a user/app but we can
safely remove it from the domain XML used during migration. If we are
migrating to a new enough libvirt, it will add the controller XML back,
while older libvirt won't be confused with it although it will still
tell qemu to create the controller.
Similar approach can be used in the future whenever we find out we
always enabled some kind of device without properly advertising it in
domain XML.
Once qemu monitor reports migration has completed, we just closed our
end of the pipe and let migration tunnel die. This generated bogus error
in case we did so before the thread saw EOF on the pipe and migration
was aborted even though it was in fact successful.
With this patch we first wake up the tunnel thread and once it has read
all data from the pipe and finished the stream we close the
filedescriptor.
A small additional bonus of this patch is that real errors reported
inside qemuMigrationIOFunc are not overwritten by virStreamAbort any
more.
When QEMU reported failed or canceled migration, we correctly detected
it but didn't really consider it as an error condition and migration
protocol just went on. Luckily, some of the subsequent steps eventually
failed end we reported an (unrelated and mostly random) error back to
the caller.
In some cases (spotted with broken connection during tunneled migration)
we were overwriting the original error with worse or even misleading
errors generated when we were cleaning up after failed migration.
Currently, we have 3 boolean arguments we have to pass
to qemuProcessStart(). As libvirt grows it is harder and harder
to remember them and their position. Therefore we should
switch to flags instead.
This patch adds a netlink callback when migrating a VEPA enabled
virtual machine. It fixes a Bug where a VM would not request a port
association when it was cleared by lldpad.
This patch requires the latest git version of lldpad to work.
Signed-off-by: D. Herrendoerfer <d.herrendoerfer@herrendoerfer.name>
In case an API fails with "cannot acquire state change lock", searching
for the API that possibly forgot to end its job is not always easy.
Let's keep track of the job owner and print it out for easier
identification.
The code is splattered with a mix of
sizeof foo
sizeof (foo)
sizeof(foo)
Standardize on sizeof(foo) and add a syntax check rule to
enforce it
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
In the current V3 migration protocol, Libvirt does not
check the result of the function
qemuMigrationVPAssociatePortProfiles
This means that it is possible for a migration to complete
successfully even when the VM loses network connectivity on
the destination host.
With this change libvirt aborts the migration
(during the "finish" step) when the above function fails, that
is to say when at least one of the port profile associations fails.
Signed-off by: Christian Benvenuti <benve@cisco.com>
Since we defined a custom virURIPtr type, we should use a
virURIFree method instead of assuming it will always be
a typedef for xmlURIPtr
* src/util/viruri.c, src/util/viruri.h, src/libvirt_private.syms:
Add a virURIFree method
* src/datatypes.c, src/esx/esx_driver.c, src/libvirt.c,
src/qemu/qemu_migration.c, src/vmx/vmx.c, src/xen/xend_internal.c,
tests/viruritest.c: s/xmlFreeURI/virURIFree/
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When a client which started non-p2p migration dies in a bad time, the
source libvirtd never clears the migration job and almost nothing can be
done with the domain without restarting the daemon. This patch makes use
of connection close callbacks and ensures that migration job is properly
discarded when the client disconnects.
Destination daemon should not rely on the client or source daemon
(depending on the type of migration) to call Finish when migration
fails, because the client may crash before it can do so. The domain
prepared for incoming migration is set to be destroyed (and migration
job cleaned up) when connection with the client closes but this is not
enough. If the associated qemu process crashes after Prepare step and
the domain is cleaned up before the connection gets closed, autodestroy
is not called for the domain and migration jobs remains set. In case the
domain is defined on destination host (i.e., it is not completely
removed once destroyed) we keep the job set for ever. To fix this, we
register a cleanup callback which is responsible to clean migration-in
job when a domain dies anywhere between Prepare and Finish steps. Note
that we can't blindly clean any job when spotting EOF on monitor since
normally an API is running at that time.
This reverts commit 61f2b6ba5f and most of
commit d8916dc8e2, which effectively
brings back commit ef1065cf5a written by
Jim Fehlig:
The qemu migration speed default is 32MiB/s as defined in migration.c
/* Migration speed throttling */
static int64_t max_throttle = (32 << 20);
There's no need to throttle migration when targeting a file, so set
migration speed to unlimited prior to migration, and restore to libvirt
default value after migration.
Default units is MB for migrate_set_speed monitor command, so
(INT64_MAX / (1024 * 1024)) is used for unlimited migration speed.
This was reverted because migration to file could not be canceled and
even monitored since qemu was not processing any monitor commands until
the migration finished. This is now different as we make sure the
file descriptor we pass to qemu is able to properly report EAGAIN.
Recent qemu changes might have helped as well.
I tested managedsave with this patch in and indeed, it is 10x faster
while I can still monitor its progress.
When host-model and host-passthrouh CPU modes were introduced, qemu
driver was properly modify to update guest CPU definition during
migration so that we use the right CPU at the destination. However,
similar treatment is needed for (managed)save and snapshots since they
need to save the exact CPU so that a domain can be properly restored.
To avoid repetition of such situation, all places that need live XML
share the code which generates it.
As a side effect, this patch fixes error reporting from
qemuDomainSnapshotWriteMetadata().
Currently, startupPolicy='requisite' was determining cold boot
by migrateFrom != NULL. That means, if domain was started up
with migrateFrom set we didn't require disk source path and allowed
it to be dropped. However, on snapshot-revert domain wasn't migrated
but according to documentation, requisite should drop disk source
as well.
This patch includes the following changes to virnetdevmacvlan.c and
virnetdevvportprofile.c:
- removes some netlink functions which are now available in
virnetdev.c
- Adds a vf argument to all port profile functions.
For 802.1Qbh devices, the port profile calls can use a vf argument if
passed by the caller. If the vf argument is -1 it will try to derive the vf
if the device passed is a virtual function.
For 802.1Qbg devices, This patch introduces a null check for the device
argument because during port profile assignment on a hostdev, this argument
can be null.
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
No matter what cache mode is used, readonly disks are always safe wrt
migration. Shared disks are required to be readonly or to disable
host-side cache, which makes them safe as well.
Add de-association handling for 802.1qbg (vepa) via lldpad
netlink messages. Also adds the possibility to perform an
association request without waiting for a confirmation.
Signed-off-by: D. Herrendoerfer <d.herrendoerfer@herrendoerfer.name>
Function xmlParseURI does not remove square brackets around IPv6
address when parsing. One of the solutions is making wrappers around
functions working with xmlURI*. This assures that uri->server will be
always properly assigned and it doesn't have to be changed when used
on some new place in the code.
For this purpose, functions virParseURI and virSaveURI were
added. These function are wrappers around xmlParseURI and xmlSaveUri
respectively.
Also there is one new syntax check function to prohibit these functions
anywhere else.
File changes:
- src/util/viruri.h -- declaration
- src/util/viruri.c -- definition
- src/libvirt_private.syms -- symbol export
- src/Makefile.am -- added source and header files
- cfg.mk -- added sc_prohibit_xmlURI
- all others -- ID name and include fixes
Migrating domains with disks using cache != none is unsafe unless the
disk images are stored on coherent clustered filesystem. Thus we forbid
migrating such domains unless VIR_MIGRATE_UNSAFE flags is used.
When migrating a qemu domain, we enter the monitor, send some commands,
try to connect to destination qemu, send other commands, end exit the
monitor. However, if we couldn't connect to destination qemu we forgot
to exit the monitor.
Bug introduced by commit d9d518b1c8.
Calling qemuDomainMigrateGraphicsRelocate notifies spice clients to
connect to destination qemu so that they can seamlessly switch streams
once migration is done. Unfortunately, current qemu is not able to
accept any connections while incoming migration connection is open.
Thus, we need to delay opening the migration connection to the point
spice client is already connected to the destination qemu.
Commit 5d784bd6d7 was a nice attempt to
clarify the semantics by requiring domain name from dxml to either match
original name or dname. However, setting dxml domain name to dname
doesn't really work since destination host needs to know the original
domain name to be able to use it in migration cookies. This patch
requires domain name in dxml to match the original domain name. The
change should be safe and backward compatible since migration would fail
just a bit later in the process.
When sVirt is integrated with the LXC driver, it will be neccessary
to invoke the security driver APIs using only a virDomainDefPtr
since the lxc_container.c code has no virDomainObjPtr available.
Aside from two functions which want obj->pid, every bit of the
security driver code only touches obj->def. So we don't need to
pass a virDomainObjPtr into the security drivers, a virDomainDefPtr
is sufficient. Two functions also gain a 'pid_t pid' argument.
* src/qemu/qemu_driver.c, src/qemu/qemu_hotplug.c,
src/qemu/qemu_migration.c, src/qemu/qemu_process.c,
src/security/security_apparmor.c,
src/security/security_dac.c,
src/security/security_driver.h,
src/security/security_manager.c,
src/security/security_manager.h,
src/security/security_nop.c,
src/security/security_selinux.c,
src/security/security_stack.c: Change all security APIs to use a
virDomainDefPtr instead of virDomainObjPtr
A generic error code was returned, if the user aborted a migration job.
This made it hard to distinguish between a user requested abort and an
error that might have occured. This patch introduces a new error code,
which is returned in the specific case of a user abort, while leaving
all other failures with their existing code. This makes it easier to
distinguish between failure while mirgrating and an user requested
abort.
* include/libvirt/virterror.h: - add new error code
* src/util/virterror.c: - add message for the new error code
* src/qemu/qemu_migration.h: - Emit operation aborted error instead of
operation failed, on migration abort
The virTimestamp and virTimeMs functions in src/util/util.h
duplicate functionality from virtime.h, in a non-async signal
safe manner. Remove them, and convert all code over to the new
APIs.
* src/util/util.c, src/util/util.h: Delete virTimeMs and virTimestamp
* src/lxc/lxc_driver.c, src/qemu/qemu_domain.c,
src/qemu/qemu_driver.c, src/qemu/qemu_migration.c,
src/qemu/qemu_process.c, src/util/event_poll.c: Convert to use
virtime APIs
If a connection to destination host is lost during peer-to-peer
migration (because keepalive protocol timed out), we won't be able to
finish the migration and it doesn't make sense to wait for qemu to
transmit all data. This patch automatically cancels such migration
without waiting for virDomainAbortJob to be called.
Rename the macvtap.c file to virnetdevmacvlan.c to reflect its
functionality. Move the port profile association code out into
virnetdevvportprofile.c. Make the APIs available unconditionally
to callers
* src/util/macvtap.h: rename to src/util/virnetdevmacvlan.h,
* src/util/macvtap.c: rename to src/util/virnetdevmacvlan.c
* src/util/virnetdevvportprofile.c, src/util/virnetdevvportprofile.h:
Pull in vport association code
* src/Makefile.am, src/conf/domain_conf.h, src/qemu/qemu_conf.c,
src/qemu/qemu_conf.h, src/qemu/qemu_driver.c: Update include
paths & remove conditional compilation
In preparation for code re-organization, rename the Macvtap
management APIs to have the following patterns
virNetDevMacVLanXXXXX - macvlan/macvtap interface management
virNetDevVPortProfileXXXX - virtual port profile management
* src/util/macvtap.c, src/util/macvtap.h: Rename APIs
* src/conf/domain_conf.c, src/network/bridge_driver.c,
src/qemu/qemu_command.c, src/qemu/qemu_command.h,
src/qemu/qemu_driver.c, src/qemu/qemu_hotplug.c,
src/qemu/qemu_migration.c, src/qemu/qemu_process.c,
src/qemu/qemu_process.h: Update for renamed APIs
This reverts commit ef1065cf5ac; see also this bug report:
https://bugzilla.redhat.com/show_bug.cgi?id=751900
In qemu 0.15.1 and earlier, during migration to file, the
qemu_savevm_state_begin and qemu_savevm_state_iterate methods
will both process as much migration data as possible until either
1. The file descriptor returns EAGAIN
2. The bandwidth rate limit is reached
If we set the rate limit to ULONG_MAX, test 2 never becomes true. We're
passing a plain file descriptor to QEMU and POSIX does not support EAGAIN on
regular files / block devices, so test 1 never becomes true either.
In the 'virsh save --bypass-cache' case, we pass a pipe instead of a
regular fd, but using a pipe adds I/O overhead, so always passing a
pipe just so qemu can see EAGAIN doesn't seem nice.
The ultimate fix needs to come from qemu - background migration must
respect asynchronous abort requests, or else periodically return
control to the main handling loop without an EAGAIN and without
waiting to hit an insanely large amount of data. But until a
version of qemu is fixed to support "unlimited" data rates while
still allowing cancellation, the best we can do is avoid the
automatic use of unlimited rates from within libvirt (users can
still explicitly change the migration rates, if they are aware that
they are giving up the ability to cancel a job).
Reverting the lone use of QEMU_DOMAIN_FILE_MIG_BANDWIDTH_MAX is
the simplest patch; this slows migration back down to a default
32M/sec cap, but also ensures that the main qemu processing loop
will still be responsive to cancellation requests. Hopefully
upstream qemu will provide us a means of safely using unlimited
speed, including a runtime probe of that capability.
* src/qemu/qemu_migration.c (qemuMigrationToFile): Revert attempt
to use unlimited migration bandwidth when migrating to file.
Signed-off-by: Daniel Veillard <veillard@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
- changed some return 1's to return -1
- changed if (rc) error checks to if (rc < 0)
- fixed some other minor convention violations
I might have missed some. Can fix in another patch or can respin
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Reported-by: Eric Blake <eblake@redhat.com>
Reported-by: Laine Stump <laine@laine.org>
Signed-off-by: Eric Blake <eblake@redhat.com>
<domainsnapshot> is the first public instance of <domain> being
used as a sub-element, although we have two other private uses
(runtime state, and migration cookie). Although indentation has
no effect on XML parsing, using it makes the output more consistent.
This uses virBuffer auto-indentation to obtain the effect, for all
but the portions of <domain> that are not generated a line at a
time into the same virBuffer. Further patches will clean up the
remaining problems.
* src/conf/domain_conf.h (virDomainDefFormatInternal): New prototype.
* src/conf/domain_conf.c (virDomainDefFormatInternal): Export.
(virDomainObjFormat, virDomainSnapshotDefFormat): Update callers.
* src/libvirt_private.syms (domain_conf.h): Add new export.
* src/qemu/qemu_migration.c (qemuMigrationCookieXMLFormat): Use
new function.
(qemuMigrationCookieXMLFormatStr): Update caller.
Destination libvirtd remembers the original name in the prepare phase
and clears it in the finish phase. The original name is used when
comparing domain name in migration cookie.
* src/qemu/qemu_migration.c: if 'vmdef' is NULL, the function
virDomainSaveConfig still dereferences it, it doesn't make
sense, so should add return value check to make sure 'vmdef'
is non-NULL before calling virDomainSaveConfig, in addition,
in order to debug later, also should record error information
into log.
Signed-off-by: Alex Jia <ajia@redhat.com>
If a domain has inactive XML we want to transfer it to destination
when migrating with VIR_MIGRATE_PERSIST_DEST. In order to harm
the migration protocol as least as possible, a optional cookie was
chosen.
Commit 282fe1f0 documented that transient domains will auto-delete
any snapshot metadata when the last reference to the domain is
removed, and that management apps are in charge of grabbing any
snapshot metadata prior to that point. However, this was not
actually implemented for qemu until now.
* src/qemu/qemu_driver.c (qemudDomainCreate)
(qemuDomainDestroyFlags, qemuDomainSaveInternal)
(qemudDomainCoreDump, qemuDomainRestoreFlags, qemudDomainDefine)
(qemuDomainUndefineFlags, qemuDomainMigrateConfirm3)
(qemuDomainRevertToSnapshot): Clean up snapshot metadata.
* src/qemu/qemu_migration.c (qemuMigrationPrepareAny)
(qemuMigrationPerformJob, qemuMigrationPerformPhase)
(qemuMigrationFinish): Likewise.
* src/qemu/qemu_process.c (qemuProcessHandleMonitorEOF)
(qemuProcessReconnect, qemuProcessReconnectHelper)
(qemuProcessAutoDestroyDom): Likewise.
Adjust qemuMigrationRun() to use migMaxBandwidth in qemuDomainObjPrivate
structure when setting qemu migration speed. Caller-specified 'resource'
parameter overrides migMaxBandwidth.
The qemu migration speed default is 32MiB/s as defined in migration.c
/* Migration speed throttling */
static int64_t max_throttle = (32 << 20);
There's no need to throttle migration when targeting a file, so set migration
speed to unlimited prior to migration, and restore to libvirt default value
after migration.
Default units is MB for migrate_set_speed monitor command, so
(INT64_MAX / (1024 * 1024)) is used for unlimited migration speed.
Tested with both json and text monitors.
Commit 498d783 cleans up some of virtual file names for parsing strings
in memory. This patch cleans up (hopefuly) the rest forgotten by the
first patch.
This patch also changes all of the previously modified "filenames" to
valid URI's replacing spaces for underscores.
Changes to v1:
- Replace all spaces for underscores, so that the strings form valid
URI's
- Replace spaces in places changed by commit 498d783
Migration is another case of stranding metadata. And since
snapshot metadata is arbitrarily large, there's no way to
shoehorn it into the migration cookie of migration v3.
This patch consolidates two existing locations for migration
validation into one helper function, then enhances that function
to also do the new checks. If we could always trust the source
to validate migration, then the destination would not have to
do anything; but since older servers that did not do checking
can migrate to newer destinations, we have to repeat some of
the same checks on the destination; meanwhile, we want to
detect failures as soon as possible. With migration v2, this
means that validation will reject things at Prepare on the
destination if the XML exposes the problem, otherwise at Perform
on the source; with migration v3, this means that validation
will reject things at Begin on the source, or if the source
is old and the XML exposes the problem, then at Prepare on the
destination.
This patch is necessarily over-strict. Once a later patch
properly handles auto-cleanup of snapshot metadata on the
death of a transient domain, then the only time we actually
need snapshots to prevent migration is when using the
--undefinesource flag on a persistent source domain.
It is possible to recreate snapshot metadata on the destination
with VIR_DOMAIN_SNAPSHOT_CREATE_REDEFINE and
VIR_DOMAIN_SNAPSHOT_CREATE_CURRENT. But for now, that is limited,
since if we delete the snapshot metadata prior to migration,
then we won't know the name of the current snapshot to pass
along; and if we delete the snapshot metadata after migration
and use the v3 migration cookie to pass along the name of the
current snapshot, then we need a way to bypass the fact that
this patch refuses migration with snapshot metadata present.
So eventually, we may have to introduce migration protocol v4
that allows feature negotiation and an arbitrary number of
handshake exchanges, so as to pass as many rpc calls as needed
to transfer all the snapshot xml hierarchy.
But all of that is thoughts for the future; for now, the best
course of action is to quit early, rather than get into a
funky state of stale metadata; then relax restrictions later.
* src/qemu/qemu_migration.h (qemuMigrationIsAllowed): Make static.
* src/qemu/qemu_migration.c (qemuMigrationIsAllowed): Alter
signature, and allow checks for both outgoing and incoming.
(qemuMigrationBegin, qemuMigrationPrepareAny)
(qemuMigrationPerformJob): Update callers.
In a SELinux or root-squashing NFS environment, libvirt has to go
through some hoops to create a new file that qemu can then open()
by name. Snapshots are a case where we want to guarantee an empty
file that qemu can open; also, reopening a save file to convert it
from being marked partial to complete requires a reopen to avoid
O_DIRECT headaches. Refactor some existing code to make it easier
to reuse in later patches.
* src/qemu/qemu_migration.h (qemuMigrationToFile): Drop parameter.
* src/qemu/qemu_migration.c (qemuMigrationToFile): Let cgroup do
the stat, rather than asking caller to do it and pass info down.
* src/qemu/qemu_driver.c (qemuOpenFile): New function, pulled from...
(qemuDomainSaveInternal): ...here.
(doCoreDump, qemuDomainSaveImageOpen): Use it here as well.
Commit 3261761 made it possible to use pipes instead of sockets
for outgoing tunneled migration; however, it caused a regression
because the pipe was never given a SELinux label.
* src/qemu/qemu_migration.c (doTunnelMigrate): Label outgoing pipe.
When a user migrates a domain by command as
libvirt saves vm's domain XML config in destination host after migration.
But it saves vm->def. Then, the saved XML contains some garbage.
<domain type='kvm' id='50'>
^^^^^^^^
...
<console type='pty' tty='/dev/pts/5'>
^^^^^^^^^^^^^^^^^
Avoid saving unnecessary things by saving persistent vm definition.
Changing the current vm, and writing that change to the file
system, all before a new qemu starts, is risky; it's hard to
roll back if starting the new qemu fails for some reason.
Instead of abusing vm->current_snapshot and making the command
line generator decide whether the current snapshot warrants
using -loadvm, it is better to just directly pass a snapshot all
the way through the call chain if it is to be loaded.
This frees up the last use of snapshot->def->active for qemu's
use, so the next patch can repurpose that field for tracking
which snapshot is current.
* src/qemu/qemu_command.c (qemuBuildCommandLine): Don't use active
field of snapshot.
* src/qemu/qemu_process.c (qemuProcessStart): Add a parameter.
* src/qemu/qemu_process.h (qemuProcessStart): Update prototype.
* src/qemu/qemu_migration.c (qemuMigrationPrepareAny): Update
callers.
* src/qemu/qemu_driver.c (qemudDomainCreate)
(qemuDomainSaveImageStartVM, qemuDomainObjStart)
(qemuDomainRevertToSnapshot): Likewise.
(qemuDomainSnapshotSetCurrentActive)
(qemuDomainSnapshotSetCurrentInactive): Delete unused functions.
* src/qemu/qemu_migration.c: avoid dead 'ret' assignment and silence
clang warning.
Detected by ccc-analyzer:
CC libvirt_driver_qemu_la-qemu_migration.lo
qemu/qemu_migration.c:2046:5: warning: Value stored to 'ret' is never read
ret = qemuMigrationConfirm(driver, sconn, vm,
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
By opening a connection to remote qemu process ourselves and passing the
socket to qemu we get much better errors than just "migration failed"
when the connection is opened by qemu.
The core of these two functions is very similar and most of it is even
exactly the same. Factor out the core functionality into a separate
function to remove code duplication and make further changes easier.
Warning detected by Coverity. No need for the NULL check, and
removing it silences the warning without any semantic change.
* src/qemu/qemu_migration.c (qemuMigrationFinish): All entries to
endjob had non-NULL vm.
Currently, we attempt to run sync job and async job at the same time. It
means that the monitor commands for two jobs can be run in any order.
In the function qemuDomainObjEnterMonitorInternal():
if (priv->job.active == QEMU_JOB_NONE && priv->job.asyncJob) {
if (qemuDomainObjBeginNestedJob(driver, obj) < 0)
We check whether the caller is an async job by priv->job.active and
priv->job.asynJob. But when an async job is running, and a sync job is
also running at the time of the check, then priv->job.active is not
QEMU_JOB_NONE. So we cannot check whether the caller is an async job
in the function qemuDomainObjEnterMonitorInternal(), and must instead
put the burden on the caller to tell us when an async command wants
to do a nested job.
Once the burden is on the caller, then only async monitor enters need
to worry about whether the VM is still running; for sync monitor enter,
the internal return is always 0, so lots of ignore_value can be dropped.
* src/qemu/THREADS.txt: Reflect new rules.
* src/qemu/qemu_domain.h (qemuDomainObjEnterMonitorAsync): New
prototype.
* src/qemu/qemu_process.h (qemuProcessStartCPUs)
(qemuProcessStopCPUs): Add parameter.
* src/qemu/qemu_migration.h (qemuMigrationToFile): Likewise.
(qemuMigrationWaitForCompletion): Make static.
* src/qemu/qemu_domain.c (qemuDomainObjEnterMonitorInternal): Add
parameter.
(qemuDomainObjEnterMonitorAsync): New function.
(qemuDomainObjEnterMonitor, qemuDomainObjEnterMonitorWithDriver):
Update callers.
* src/qemu/qemu_driver.c (qemuDomainSaveInternal)
(qemudDomainCoreDump, doCoreDump, processWatchdogEvent)
(qemudDomainSuspend, qemudDomainResume, qemuDomainSaveImageStartVM)
(qemuDomainSnapshotCreateActive, qemuDomainRevertToSnapshot):
Likewise.
* src/qemu/qemu_process.c (qemuProcessStopCPUs)
(qemuProcessFakeReboot, qemuProcessRecoverMigration)
(qemuProcessRecoverJob, qemuProcessStart): Likewise.
* src/qemu/qemu_migration.c (qemuMigrationToFile)
(qemuMigrationWaitForCompletion, qemuMigrationUpdateJobStatus)
(qemuMigrationJobStart, qemuDomainMigrateGraphicsRelocate)
(doNativeMigrate, doTunnelMigrate, qemuMigrationPerformJob)
(qemuMigrationPerformPhase, qemuMigrationFinish)
(qemuMigrationConfirm): Likewise.
* src/qemu/qemu_hotplug.c: Drop unneeded ignore_value.
Once it's plugged in, the <listen> element will be an optional
replacement for the "listen" attribute that graphics elements already
have. If the <listen> element is type='address', it will have an
attribute called 'address' which will contain an IP address or dns
name that the guest's display server should listen on. If, however,
type='network', the <listen> element should have an attribute called
'network' that will be set to the name of a network configuration to
get the IP address from.
* docs/schemas/domain.rng: updated to allow the <listen> element
* docs/formatdomain.html.in: document the <listen> element and its
attributes.
* src/conf/domain_conf.[hc]:
1) The domain parser, formatter, and data structure are modified to
support 0 or more <listen> subelements to each <graphics>
element. The old style "legacy" listen attribute is also still
accepted, and will be stored internally just as if it were a
separate <listen> element. On output (i.e. format), the address
attribute of the first <listen> element of type 'address' will be
duplicated in the legacy "listen" attribute of the <graphic>
element.
2) The "listenAddr" attribute has been removed from the unions in
virDomainGRaphicsDef for graphics types vnc, rdp, and spice.
This attribute is now in the <listen> subelement (aka
virDomainGraphicsListenDef)
3) Helper functions were written to provide simple access
(both Get and Set) to the listen elements and their attributes.
* src/libvirt_private.syms: export the listen helper functions
* src/qemu/qemu_command.c, src/qemu/qemu_hotplug.c,
src/qemu/qemu_migration.c, src/vbox/vbox_tmpl.c,
src/vmx/vmx.c, src/xenxs/xen_sxpr.c, src/xenxs/xen_xm.c
Modify all these files to use the listen helper functions rather
than directly referencing the (now missing) listenAddr
attribute. There can be multiple <listen> elements to a single
<graphics>, but the drivers all currently only support one, so all
replacements of direct access with a helper function indicate index
"0".
* tests/* - only 3 of these are new files added explicitly to test the
new <listen> element. All the others have been modified to reflect
the fact that any legacy "listen" attributes passed in to the domain
parse will be saved in a <listen> element (i.e. one of the
virDomainGraphicsListenDefs), and during the domain format function,
both the <listen> element as well as the legacy attributes will be
output.
Make MIGRATION_OUT use the new helper methods.
This also introduces new protection to migration v3 process: the
migration job is held from Begin to Confirm to avoid changes to a domain
during migration (esp. between Begin and Perform phases). This change is
automatically applied to p2p and tunneled migrations. For normal
migration, this requires support from a client. In other words, if an
old (pre 0.9.4) client starts normal migration of a domain, the domain
will not be protected against changes between Begin and Perform steps.
The qemu driver accesses fields in the virDomainNetDef directly, but
with the advent of the virDomainActualNetDef, some pieces of
information may be found in a different place (the ActualNetDef) if
the network connection is of type='network' and that network is of
forward type='bridge|private|vepa|passthrough'. The previous patch
added functions to mask this difference from callers - they hide the
decision making process and just pick the value from the proper place.
This patch uses those functions in the qemu driver as a first step in
making qemu work with the new network types. At this point, the
virDomainActualNetDef is guaranteed always NULL, so the GetActualX()
function will return exactly what the def->X that's being replaced
would have returned (ie bisecting is not compromised).
There is one place (in qemu_driver.c) where the internal details of
the NetDef are directly manipulated by the code, so the GetActual
functions cannot be used there without extra additional code; that
file will be treated in a separate patch.
The virtPortProfile in the domain interface struct is now a separately
allocated object *pointed to by* (rather than contained in) the main
virDomainNetDef object. This is done to make it easier to figure out
when a virtualPortProfile has/hasn't been specified in a particular
config.
Otherwise, an ABI mismatch gives error messages attributing the target
xml string as current, and the current domain state as the new xml.
* src/qemu/qemu_migration.c (qemuMigrationBegin): Use correct
argument order.
Starting/ending jobs when closing the connection may reset any
error which was reported earlier in p2p migration. We must
save the original error before doing so. This means we can also
just call virConnectClose as normal, instead of virUnrefConnect
* src/qemu/qemu_migration.c: Preserve errors in p2p migration
Commit f548480b broke migration v3 on qemu, because the driver
passed flags on through to qemu_migration even though
qemu_migration wasn't using those flags.
* src/qemu/qemu_migration.h (QEMU_MIGRATION_FLAGS): New define.
* src/qemu/qemu_driver.c: Simplify all migration callbacks.
* src/qemu/qemu_migration.c (qemuMigrationConfirm): Fix regression.
Continuation of commit 313ac7fd, and enforce things with a syntax
check.
Technically, virNetServerClientCalculateHandleMode is not printing
a mode_t, but rather a collection of VIR_EVENT_HANDLE_* bits;
however, these bits are < 8, so there is no different in the
output, and that was the easiest way to silence the new syntax check.
* cfg.mk (sc_flags_debug): New syntax check.
(exclude_file_name_regexp--sc_flags_debug): Add exemptions.
* src/fdstream.c (virFDStreamOpenFileInternal): Print flags in
hex, mode_t in octal.
* src/libvirt-qemu.c (virDomainQemuMonitorCommand)
(virDomainQemuAttach): Likewise.
* src/locking/lock_driver_nop.c (virLockManagerNopInit): Likewise.
* src/locking/lock_driver_sanlock.c (virLockManagerSanlockInit):
Likewise.
* src/locking/lock_manager.c: Likewise.
* src/qemu/qemu_migration.c: Likewise.
* src/qemu/qemu_monitor.c: Likewise.
* src/rpc/virnetserverclient.c
(virNetServerClientCalculateHandleMode): Print mode with %o.
Most of the code in these two functions is supposed to be identical but
currently it isn't (which is natural since the code is duplicated).
Let's move common parts of these functions into qemuMigrationPrepareAny.
This also fixes qemuMigrationPrepareTunnel which didn't store received
lockState in the domain object.
Query commands are safe to be called during long running jobs (such as
migration). This patch makes them all work without the need to
special-case every single one of them.
The patch introduces new job.asyncCond condition and associated
job.asyncJob which are dedicated to asynchronous (from qemu monitor
point of view) jobs that can take arbitrarily long time to finish while
qemu monitor is still usable for other commands.
The existing job.active (and job.cond condition) is used all other
synchronous jobs (including the commands run during async job).
Locking schema is changed to use these two conditions. While asyncJob is
active, only allowed set of synchronous jobs is allowed (the set can be
different according to a particular asyncJob) so any method that
communicates to qemu monitor needs to check if it is allowed to be
executed during current asyncJob (if any). Once the check passes, the
method needs to normally acquire job.cond to ensure no other command is
running. Since domain object lock is released during that time, asyncJob
could have been started in the meantime so the method needs to recheck
the first condition. Then, normal jobs set job.active and asynchronous
jobs set job.asyncJob and optionally change the list of allowed job
groups.
Since asynchronous jobs only set job.asyncJob, other allowed commands
can still be run when domain object is unlocked (when communicating to
remote libvirtd or sleeping). To protect its own internal synchronous
commands, the asynchronous job needs to start a special nested job
before entering qemu monitor. The nested job doesn't check asyncJob, it
only acquires job.cond and sets job.active to block other jobs.
The LXC and UML drivers can both make use of auditing. Move
the qemu_audit.{c,h} files to src/conf/domain_audit.{c,h}
* src/conf/domain_audit.c: Rename from src/qemu/qemu_audit.c
* src/conf/domain_audit.h: Rename from src/qemu/qemu_audit.h
* src/Makefile.am: Remove qemu_audit.{c,h}, add domain_audit.{c,h}
* src/qemu/qemu_audit.h, src/qemu/qemu_cgroup.c,
src/qemu/qemu_command.c, src/qemu/qemu_driver.c,
src/qemu/qemu_hotplug.c, src/qemu/qemu_migration.c,
src/qemu/qemu_process.c: Update for changed audit API names
The drivers were accepting domain configs without checking if those
were actually meant for them. For example the LXC driver happily
accepts configs with type QEMU.
Add a check for the expected domain types to the virDomainDefParse*
functions.
If virDomainSaveConfig() failed, we will return NULL to source,
and the vm is still available to restart during confirm() step in
v3 protocol. So we should kill it off in qemuMigrationFinish().
In v2 protocol, we should not set vm to NULL, because we hold
a reference of vm and should unrefernce it.
Detected by Coverity. qemuDomainEventQueue requires a non-NULL
pointer; most callers silently drop the event if we encountered
and OOM situation trying to create the event.
* src/qemu/qemu_migration.c (qemuMigrationFinish): Check for OOM.
The virSecurityManagerSetFDLabel method is used to label
file descriptors associated with disk images. There will
shortly be a need to label other file descriptors in a
different way. So the current name is ambiguous. Rename
the method to virSecurityManagerSetImageFDLabel to clarify
its purpose
* src/libvirt_private.syms,
src/qemu/qemu_migration.c, src/qemu/qemu_process.c,
src/security/security_apparmor.c, src/security/security_dac.c,
src/security/security_driver.h, src/security/security_manager.c,
src/security/security_manager.h, src/security/security_selinux.c,
src/security/security_stack.c: s/FDLabel/ImageFDLabel/
The qemuMigrationPrepareDirect/PrepareTunnel methods accidentally
set the domain job to QEMU_JOB_MIGRATION_OUT when it should have
been QEMU_JOB_MIGRATION_IN. This didn't have any ill-effect, but
it is none-the-less wrong.
* src/qemu/qemu_migration.c: Fix job type
If an application is using libvirt + KVM as a piece of its
internal infrastructure to perform a specific task, it can
be desirable to guarentee the VM dies when the virConnectPtr
disconnects from libvirtd. This ensures the app can't leak
any VMs it was using. Adding VIR_DOMAIN_START_AUTOKILL as
a flag when starting guests enables this to be done.
* include/libvirt/libvirt.h.in: All VIR_DOMAIN_START_AUTOKILL
* src/qemu/qemu_driver.c: Support automatic killing of guests
upon connection close
* tools/virsh.c: Add --autokill flag to 'start' and 'create'
commands
Migration is a multi-step process
1. Begin(src)
2. Prepare(dst)
3. Perform(src)
4. Finish(dst)
5. Confirm(src)
At step 2, a QEMU process is lauched in the destination to
accept the incoming migration. Occasionally the process
that is controlling the migration workflow aborts, and fails
to call step 4, Finish. This leaves a QEMU process running
on the target (albeit with paused CPUs). Unfortunately because
step 2 actives a job on the QEMU process, it is unkillable by
normal means.
By registering the VM for autokill against the src virConnectPtr
in step 2, we can ensure that the guest is forcefully killed off
if the connection is closed without step 4 being invoked
* src/qemu/qemu_migration.c: Register autokill in PrepareDirect
and PrepareTunnel. Unregister autokill on successful run
of Finish
* src/qemu/qemu_process.c: Unregister autokill when stopping a
process
Sometimes it is useful to be able to automatically destroy a guest when
a connection is closed. For example, kill an incoming migration if
the client managing the migration dies. This introduces a map between
guest 'uuid' strings and virConnectPtr objects. When a connection is
closed, any associated guests are killed off.
* src/qemu/qemu_conf.h: Add autokill hash table to qemu driver
* src/qemu/qemu_process.c, src/qemu/qemu_process.h: Add APIs
for performing autokill of guests associated with a connection
* src/qemu/qemu_driver.c: Initialize autodestroy map
When peer-2-peer migration was invoked by a client supporting
v3, but where the target server only supported v2, we'd not
correctly shutdown the guest.
* src/qemu/qemu_migration.c: Ensure guest is shutdown in
v2 peer 2 peer migration
The v2 migration protocol doesn't use cookies, so we should not
be raising an error if the cookie parameters are NULL.
* src/qemu/qemu_migration.c: Don't raise error if cookie is NULL
In v3 migration, once migration is completed, the VM needs
to be left in a paused state until after Finish3 has been
executed on the target. Only then will the VM be killed
off. When using non-JSON QEMU monitor though, we don't
receive any 'STOP' event from QEMU, so we need to manually
set our state offline & thus release lock manager leases.
It doesn't hurt to run this on the JSON case too, just in
case the event gets lost somehow
* src/qemu/qemu_migration.c: Explicitly set VM state to
paused when migration completes
Some lock managers associate state with leases, allowing a process
to temporarily release its leases, and re-acquire them later, safe
in the knowledge that no other process has acquired + released the
leases in between.
This is already used between suspend/resume operations, and must
also be used across migration. This passes the lockstate in the
migration cookie. If the lock manager uses lockstate, then it
becomes compulsory to use the migration v3 protocol to get the
cookie support.
* src/qemu/qemu_driver.c: Validate that migration v2 protocol is
not used if lock manager needs state transfer
* src/qemu/qemu_migration.c: Transfer lock state in migration
cookie XML
Update the qemuDomainMigrateBegin method so that it accepts
an optional incoming XML document. This will be validated
for ABI compatibility against the current domain config,
and if this check passes, will be passed back out for use
by the qemuDomainMigratePrepare method on the target
* src/qemu/qemu_domain.c, src/qemu/qemu_domain.h,
src/qemu/qemu_migration.c: Allow custom XML to be passed
Originally most of libvirt domain-specific calls were blocking
during a migration.
A new mechanism to allow specific calls (blkstat/blkinfo) to be
executed in such condition has been implemented.
In the long term it'd be desirable to get a more general
solution to mark further APIs as migration safe, without needing
special case code.
* src/qemu/qemu_migration.c: add some additional job signal
flags for doing blkstat/blkinfo during a migration
* src/qemu/qemu_domain.c: add a condition variable that can be
used to efficiently wait for the migration code to clear the
signal flag
* src/qemu/qemu_driver.c: execute blkstat/blkinfo using the
job signal flags during migration
The current virDomainMigrateFinish3 method signature attempts to
distinguish two types of errors, by allowing return with ret== 0,
but ddomain == NULL, to indicate a failure to start the guest.
This is flawed, because when ret == 0, there is no way for the
virErrorPtr details to be sent back to the client.
Change the signature of virDomainMigrateFinish3 so it simply
returns a virDomainPtr, in the same way as virDomainMigrateFinish2
The disk locking code will protect against the only possible
failure mode this doesn't account for (loosing conenctivity to
libvirtd after Finish3 starts the CPUs, but before the client
sees the reply for Finish3).
* src/driver.h, src/libvirt.c, src/libvirt_internal.h: Change
virDomainMigrateFinish3 to return a virDomainPtr instead of int
* src/remote/remote_driver.c, src/remote/remote_protocol.x,
daemon/remote.c, src/qemu/qemu_driver.c, src/qemu/qemu_migration.c:
Update for API change
When doing migration, if an error occurs in Perform, it must not
be overwritten during Finish/Confirm steps. If an error occurs
in Finish, it must not be overwritten in Confirm.
Previous commit a9d12c2444 added
code to qemudDomainMigrateFinish2 to preserve the error. This
is not the right place, because it is not applicable in non-p2p
migration. The src/libvirt.c virDomainMigrateV2/3 methods need
code to preserve errors for non-p2p migration, while the
doPeer2PeerMigrate2 and doPeer2PeerMigrate3 methods contain
code to preverse errors for p2p migration.
Remove the bogus error preservation from qemudDomainMigrateFinish2
and qemudDomainMigrateFinish3.
Fix virDomainMigrateV3 and doPeer2PeerMigrate3 so that they
preserve any error hit during the Finish3 step, before invoking
Confirm3.
Finally if qemuMigrationFinish fails to resume the CPUs, it must
preserve the error before tearing down the VM, so that VM cleanup
doesn't overwrite it.
* src/libvirt.c: Preserve error before invoking Confirm3
* src/qemu/qemu_driver.c: Remove bogus error preservation
code in qemudDomainMigrateFinish2/qemudDomainMigrateFinish3
* src/qemu/qemu_migration.c: Preserve error before invoking Confirm3
and after resume fails in qemuMigrationFinish.
* src/libvirt.c: Add further debug lines in helper APIs for
migration
* src/qemu/qemu_migration.c: Add debug lines for all internal
migration API parameters
Even when failing to start CPUs, the finish method was returning
a success result. Fix this so that the QEMU process is killed
off when finish fails under v3 protocol. Also rename the
killOnFinish boolean to 'v3proto' to make it clearer that this
is a tunable based on the migration protocol version
* src/qemu/qemu_driver.c: Update for API change
* src/qemu/qemu_migration.c, src/qemu/qemu_migration.h: Kill
VM in qemuMigrationFinish if failing to start CPUs
The SPICE seamless migration process requires data to be passed
back from the target host, to the source host via a cookie.
The cookie includes the target host's hostname, but this was not
stored, merely validated. This patch explicitly records the
remote hostname after parsing the cookie, and uses it when
initiating the SPICE migration
* qemu/qemu_migration.c: Fix SPICE seamless migration hostname
Before running perform in peer-2-peer migration, the current
guest state must be recorded, so that non-live migration can
currently unpause a running guest on completion.
* src/qemu/qemu_migration.c: Move check for offline guest
to fix non-live migration
The virDomainMigratePerform3 currently has a single URI parameter
whose meaning varies. It is either
- A QEMU migration URI (normal migration)
- A libvirtd connection URI (peer2peer migration)
Unfortunately when using peer2peer migration, without also
using tunnelled migration, it is possible that both URIs are
required.
This adds a second URI parameter to the virDomainMigratePerform3
method, to cope with this scenario. Each parameter how has a fixed
meaning.
NB, there is no way to actually take advantage of this yet,
since virDomainMigrate/virDomainMigrateToURI do not have any
way to provide the 2 separate URIs
* daemon/remote.c, src/remote/remote_driver.c,
src/remote/remote_protocol.x, src/remote_protocol-structs: Add
the second URI parameter to perform3 message
* src/driver.h, src/libvirt.c, src/libvirt_internal.h: Add
the second URI parameter to Perform3 method
* src/libvirt_internal.h, src/qemu/qemu_migration.c,
src/qemu/qemu_migration.h: Update to handle URIs correctly
This extends the v3 migration protocol such that the
virDomainMigrateBegin3 and virDomainMigratePerform3
methods accept an application supplied XML config for
the target VM.
If the 'xmlin' parameter is NULL, then Begin3 uses the
current guest XML as normal. A driver implementing the
Begin3 method should either reject all non-NULL 'xmlin'
parameters, or strictly validate that the app supplied
XML does not change guest ABI.
The Perform3 method also needed the xmlin parameter to
cope with the Peer2Peer migration sequence.
NB it is not yet possible to use this capability since
neither of the public virDomainMigrate/virDomainMigrateToURI
methods have a way to pass in XML.
* daemon/remote.c, src/remote/remote_driver.c,
src/remote/remote_protocol.x, src/remote_protocol-structs:
Add 'remote_string xmlin' parameter to begin3/perform3
RPC messages
* src/libvirt.c, src/driver.h, src/libvirt_internal.h: Add
'const char *xmlin' parameter to Begin3/Perform3 methods
* src/qemu/qemu_driver.c, src/qemu/qemu_migration.c,
src/qemu/qemu_migration.h: Pass xmlin parameter around
migration methods
The qemuMigrationConfirm method shouldn't deal with final VM
cleanup, since it can be called from the peer2peer migration,
which expects to still use the 'vm' object afterwards.
Push the cleanup code out of qemuMigrationConfirm, into its
caller, qemuDomainMigrateConfirm3
* src/qemu/qemu_driver.c: Add VM cleanup code to
qemuDomainMigrateConfirm3
* src/qemu/qemu_migration.c, src/qemu/qemu_migration.h: Remove
job handling cleanup from qemuMigrationConfirm
To allow new mandatory migration cookie data to be introduced,
add support for checking supported feature flags when parsing
migration cookie.
* src/qemu/qemu_migration.c: Feature flag checking in migration
cookie parsing
When generating a cookie for a guest with no data, the
QEMU_MIGRATION_COOKIE_GRAPHICS flag was set even if no
graphics data was added. Avoid setting the flag unless
it was needed, also add a safety check for mig->graphics
being non-NULL
* src/qemu/qemu_migration.c: Avoid cookie crash for guest
with no graphics
By running the doTunnelSendAll code in a separate thread, the
main thread can do qemuMigrationWaitForCompletion as with
normal migration. This in turn ensures that job signals work
correctly and that progress monitoring can be done
* src/qemu/qemu_migration.c: Run tunnelled migration in
separate thread
Cancelling the QEMU migration may cause QEMU to flush pending
data on the migration socket. This may in turn block QEMU if
nothing reads from the other end of the socket. Closing the
socket before cancelling QEMU migration avoids this possible
deadlock.
* src/qemu/qemu_migration.c: Close sockets before cancelling
migration on failure
The 'nbytes' variable was not re-initialized to the
buffer size on each iteration of the tunnelled migration
loop. While saferead() will ensure a full read, except
on EOF, it is clearer to use the real buffer size
* src/qemu/qemu_migration.c: Always read full buffer of data
The qemuMigrationWaitForCompletion method contains a loop which
repeatedly queries QEMU to check migration progress, and also
processes job signals (pause, setspeed, setbandwidth, cancel).
The tunnelled migration loop does not currently support this
functionality, but should. Refactor the code to allow it to
be used with tunnelled migration.
Implement the v3 migration protocol, which has two extra
steps, 'begin' on the source host and 'confirm' on the
source host. All other methods also gain both input and
output cookies to allow bi-directional data passing at
all stages.
The QEMU peer2peer migration method gains another impl
to provide the v3 migration. This finally allows migration
cookies to work with tunnelled migration, which is required
for Spice seamless migration & the lock manager transfer
* src/qemu/qemu_driver.c: Wire up migrate v3 APIs
* src/qemu/qemu_migration.c, src/qemu/qemu_migration.h: Add
begin & confirm methods, and peer2peer impl of v3
Merge the doNonTunnelMigrate2 and doTunnelMigrate2 methods
into one doPeer2PeerMigrate2 method, since they are substantially
the same. With the introduction of v3 migration, this will be
even more important, to avoid massive code duplication.
* src/qemu/qemu_migration.c: Merge tunnel & non-tunnel migration
To facilitate the introduction of the v3 migration protocol,
the doTunnelMigrate method is refactored into two pieces. One
piece is intended to mirror the flow of virDomainMigrateVersion2,
while the other is the helper for setting up sockets and processing
the data.
Previously socket setup would be done before the 'prepare' step,
so errors could be dealt with immediately, avoiding need to shut
off the destination QEMU. In the new split, socket setup is done
after the 'prepare' step. This is not a serious problem, since
the control flow already requires calling 'finish' to tear down
the destination QEMU upon several errors.
* src/qemu/qemu_migration.c:
Use the graphics information from the QEMU migration cookie to
issue a 'client_migrate_info' monitor command to QEMU. This causes
the SPICE client to automatically reconnect to the target host
when migration completes
* src/qemu/qemu_migration.c: Set data for SPICE client relocation
before starting migration on src
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h: Add
new qemuMonitorGraphicsRelocate() command
Extend the QEMU migration cookie structure to allow information
about the destination host graphics setup to be passed by to
the source host. This will enable seamless migration of any
connected graphics clients
* src/qemu/qemu_migration.c: Add graphics info to migration
cookies
* daemon/libvirtd.c: Always initialize gnutls to enable
x509 cert parsing in QEMU
The migration protocol has support for a 'cookie' parameter which
is an opaque array of bytes as far as libvirt is concerned. Drivers
may use this for passing around arbitrary extra data they might
need during migration. The QEMU driver needs to do a few things:
- Pass hostname/uuid to allow strict protection against localhost
migration attempts
- Pass SPICE/VNC server port from the target back to the source to
allow seamless relocation of client sessions
- Pass lock driver state from source to destination
This patch introduces the basic glue for handling cookies
but only includes the host/guest UUID & name.
* src/libvirt_private.syms: Export virXMLParseStrHelper
* src/qemu/qemu_migration.c, src/qemu/qemu_migration.h: Parsing
and formatting of migration cookies
* src/qemu/qemu_driver.c: Pass in cookie parameters where possible
* src/remote/remote_protocol.h, src/remote/remote_protocol.x: Change
cookie max length to 16384 bytes
The qemuMigrationPrepareTunnel method should not unlock the
qemu driver, since that is the caller's job.
* src/qemu/qemu_migration.c: Fix qemuMigrationPrepareTunnel
unlocking of QEMU driver
Only in drivers which use virDomainObj, drivers that query hypervisor
for domain status need to be updated separately in case their hypervisor
supports this functionality.
The reason is also saved into domain state XML so if a domain is not
running (i.e., no state XML exists) the reason will be lost by libvirtd
restart. I think this is an acceptable limitation.
These VIR_XXXX0 APIs make us confused, use the non-0-suffix APIs instead.
How do these coversions works? The magic is using the gcc extension of ##.
When __VA_ARGS__ is empty, "##" will swallow the "," in "fmt," to
avoid compile error.
example: origin after CPP
high_level_api("%d", a_int) low_level_api("%d", a_int)
high_level_api("a string") low_level_api("a string")
About 400 conversions.
8 special conversions:
VIR_XXXX0("") -> VIR_XXXX("msg") (avoid empty format) 2 conversions
VIR_XXXX0(string_literal_with_%) -> VIR_XXXX(%->%%) 0 conversions
VIR_XXXX0(non_string_literal) -> VIR_XXXX("%s", non_string_literal)
(for security) 6 conversions
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
The two ends of the pipe used for feeding QEMU tunnelled
migration data were interchanged, so QEMU got given the
"write" end instead of the "read" end.
The qemuMigrationPrepareTunnel method was also immediately
closing the "write" end of the pipe, so the stream failed
to actually write anything.
* src/qemu/qemu_migration.c: Swap tunnelled migration
pipe FDs & don't close pipe given to stream
My earlier testing for commit 34fa0de0 was done while starting
just-built libvirt from an unconfined_t shell, where the fds happened
to work when transferring to qemu. But when installed and run under
virtd_t, failure to label the raw file (with no compression) or the
pipe (with compression) triggers SELinux failures when passing fds
over SCM_RIGHTS to svirt_t qemu.
* src/qemu/qemu_migration.c (qemuMigrationToFile): When passing
FDs, make sure they are labeled.
Spawn the compressor ourselves, instead of requiring the shell.
* src/qemu/qemu_migration.c (qemuMigrationToFile): Spawn
compression helper process when needed.
SELinux labeling and cgroup ACLs aren't required if we hand a
pre-opened fd to qemu. All the more reason to love fd: migration.
* src/qemu/qemu_migration.c (qemuMigrationToFile): Skip steps
that are irrelevant in fd migration.
This points out that core dumps (still) don't work for root-squash
NFS, since the fd is not opened correctly. This patch should not
introduce any functionality change, it is just a refactoring to
avoid duplicated code.
* src/qemu/qemu_migration.h (qemuMigrationToFile): New prototype.
* src/qemu/qemu_migration.c (qemuMigrationToFile): New function.
* src/qemu/qemu_driver.c (qemudDomainSaveFlag, doCoreDump): Use
it.
Enhance the QEMU migration monitoring loop, so that it can get
a signal to change migration speed on the fly
* src/qemu/qemu_domain.h: Add signal for changing speed on the fly
* src/qemu/qemu_driver.c: Wire up virDomainMigrateSetSpeed driver
* src/qemu/qemu_migration.c: Support signal for changing speed
Outgoing migration still uses a Unix socket and or exec netcat until
the next patch.
* src/qemu/qemu_migration.c (qemuMigrationPrepareTunnel):
Replace Unix socket with simpler pipe.
Suggested by Paolo Bonzini.
This is done for two reasons:
- we are getting very close to 64 flags which is the maximum we can use
with unsigned long long
- by using LL constants in enum we already violates C99 constraint that
enum values have to fit into int
The introduction of the v3 migration protocol, along with
support for migration cookies, will significantly expand
the size of the migration code. Move it all to a separate
file to make it more manageable
The functions are not moved 100%. The API entry points
remain in the main QEMU driver, but once the public
virDomainPtr is resolved to the internal virDomainObjPtr,
all following code is moved.
This will allow the new v3 API entry points to call into the
same shared internal migration functions
* src/qemu/qemu_domain.c, src/qemu/qemu_domain.h: Add
qemuDomainFormatXML helper method
* src/qemu/qemu_driver.c: Remove all migration code
* src/qemu/qemu_migration.c, src/qemu/qemu_migration.h: Add
all migration code.