mirror of
https://gitlab.com/libvirt/libvirt.git
synced 2025-01-11 23:37:42 +00:00
258 lines
7.4 KiB
HTML
258 lines
7.4 KiB
HTML
|
<html>
|
||
|
<body>
|
||
|
<h1>Resource Lock Manager</h1>
|
||
|
|
||
|
<ul id="toc"></ul>
|
||
|
|
||
|
<p>
|
||
|
This page describes the design of the resource lock manager
|
||
|
that is used for locking disk images, to ensure exclusive
|
||
|
access to content.
|
||
|
</p>
|
||
|
|
||
|
<h2><a name="goals">Goals</a></h2>
|
||
|
|
||
|
<p>
|
||
|
The high level goal is to prevent the same disk image being
|
||
|
used by more than one QEMU instance at a time (unless the
|
||
|
disk is marked as sharable, or readonly). The scenarios
|
||
|
to be prevented are thus:
|
||
|
</p>
|
||
|
|
||
|
<ol>
|
||
|
<li>
|
||
|
Two different guests running configured to point at the
|
||
|
same disk image.
|
||
|
</li>
|
||
|
<li>
|
||
|
One guest being started more than once on two different
|
||
|
machines due to admin mistake
|
||
|
</li>
|
||
|
<li>
|
||
|
One guest being started more than once on a single machine
|
||
|
due to libvirt driver bug on a single machine.
|
||
|
</li>
|
||
|
</ol>
|
||
|
|
||
|
<h2><a name="requirement">Requirements</a></h2>
|
||
|
|
||
|
<p>
|
||
|
The high level goal leads to a set of requirements
|
||
|
for the lock manager design
|
||
|
</p>
|
||
|
|
||
|
<ol>
|
||
|
<li>
|
||
|
A lock must be held on a disk whenever a QEMU process
|
||
|
has the disk open
|
||
|
</li>
|
||
|
<li>
|
||
|
The lock scheme must allow QEMU to be configured with
|
||
|
readonly, shared write, or exclusive writable disks
|
||
|
</li>
|
||
|
<li>
|
||
|
A lock handover must be performed during the migration
|
||
|
process where 2 QEMU processes will have the same disk
|
||
|
open concurrently.
|
||
|
</li>
|
||
|
<li>
|
||
|
The lock manager must be able to identify and kill the
|
||
|
process accessing the resource if the lock is revoked.
|
||
|
</li>
|
||
|
<li>
|
||
|
Locks can be acquired for arbitrary VM related resources,
|
||
|
as determined by the management application.
|
||
|
</li>
|
||
|
</ol>
|
||
|
|
||
|
<h2><a name="design">Design</a></h2>
|
||
|
|
||
|
<p>
|
||
|
Within a lock manager the following series of operations
|
||
|
will need to be supported.
|
||
|
</p>
|
||
|
|
||
|
<ul>
|
||
|
<li>
|
||
|
<strong>Register object</strong>
|
||
|
Register the identity of an object against which
|
||
|
locks will be acquired
|
||
|
</li>
|
||
|
<li>
|
||
|
<strong>Add resource</strong>
|
||
|
Associate a resource with an object for future
|
||
|
lock acquisition / release
|
||
|
</li>
|
||
|
<li>
|
||
|
<strong>Acquire locks</strong>
|
||
|
Acquire the locks for all resources associated
|
||
|
with the object
|
||
|
</li>
|
||
|
<li>
|
||
|
<strong>Release locks</strong>
|
||
|
Release the locks for all resources associated
|
||
|
with the object
|
||
|
</li>
|
||
|
<li>
|
||
|
<strong>Inquire locks</strong>
|
||
|
Get a representation of the state of the locks
|
||
|
for all resources associated with the object
|
||
|
</li>
|
||
|
</ul>
|
||
|
|
||
|
<h2><a name="impl">Plugin Implementations</a></h2>
|
||
|
|
||
|
<p>
|
||
|
Lock manager implementations are provided as LGPLv2+
|
||
|
licensed, dlopen()able library modules. The plugins
|
||
|
will be loadable from the following location:
|
||
|
</p>
|
||
|
|
||
|
<pre>
|
||
|
/usr/{lib,lib64}/libvirt/lock_manager/$NAME.so
|
||
|
</pre>
|
||
|
|
||
|
<p>
|
||
|
The lock manager plugin must export a single ELF
|
||
|
symbol named <code>virLockDriverImpl</code>, which is
|
||
|
a static instance of the <code>virLockDriver</code>
|
||
|
struct. The struct is defined in the header file
|
||
|
</p>
|
||
|
|
||
|
<pre>
|
||
|
#include <libvirt/plugins/lock_manager.h>
|
||
|
</pre>
|
||
|
|
||
|
<p>
|
||
|
All callbacks in the struct must be initialized
|
||
|
to non-NULL pointers. The semantics of each
|
||
|
callback are defined in the API docs embedded
|
||
|
in the previously mentioned header file
|
||
|
</p>
|
||
|
|
||
|
<h2><a name="qemuIntegrate">QEMU Driver integration</a></h2>
|
||
|
|
||
|
<p>
|
||
|
With the QEMU driver, the lock plugin will be set
|
||
|
in the <code>/etc/libvirt/qemu.conf</code> configuration
|
||
|
file by specifying the lock manager name.
|
||
|
</p>
|
||
|
|
||
|
<pre>
|
||
|
lockManager="sanlock"
|
||
|
</pre>
|
||
|
|
||
|
<p>
|
||
|
By default the lock manager will be a 'no op' implementation
|
||
|
for backwards compatibility
|
||
|
</p>
|
||
|
|
||
|
<h2><a name="usagePatterns">Lock usage patterns</a></h2>
|
||
|
|
||
|
<p>
|
||
|
The following psuedo code illustrates the common
|
||
|
patterns of operations invoked on the lock
|
||
|
manager plugin callbacks.
|
||
|
</p>
|
||
|
|
||
|
<h3><a name="usageLockAcquire">Lock acquisition</a></h3>
|
||
|
|
||
|
<p>
|
||
|
Initial lock acquisition will be performed from the
|
||
|
process that is to own the lock. This is typically
|
||
|
the QEMU child process, in between the fork+exec
|
||
|
pairing. When adding further resources on the fly,
|
||
|
to an existing object holding locks, this will be
|
||
|
done from the libvirtd process.
|
||
|
</p>
|
||
|
|
||
|
<pre>
|
||
|
virLockManagerParam params[] = {
|
||
|
{ .type = VIR_LOCK_MANAGER_PARAM_TYPE_UUID,
|
||
|
.key = "uuid",
|
||
|
},
|
||
|
{ .type = VIR_LOCK_MANAGER_PARAM_TYPE_STRING,
|
||
|
.key = "name",
|
||
|
.value = { .str = dom->def->name },
|
||
|
},
|
||
|
{ .type = VIR_LOCK_MANAGER_PARAM_TYPE_UINT,
|
||
|
.key = "id",
|
||
|
.value = { .i = dom->def->id },
|
||
|
},
|
||
|
{ .type = VIR_LOCK_MANAGER_PARAM_TYPE_UINT,
|
||
|
.key = "pid",
|
||
|
.value = { .i = dom->pid },
|
||
|
},
|
||
|
};
|
||
|
mgr = virLockManagerNew(lockPlugin,
|
||
|
VIR_LOCK_MANAGER_TYPE_DOMAIN,
|
||
|
ARRAY_CARDINALITY(params),
|
||
|
params,
|
||
|
0)));
|
||
|
|
||
|
foreach (initial disks)
|
||
|
virLockManagerAddResource(mgr,
|
||
|
VIR_LOCK_MANAGER_RESOURCE_TYPE_DISK,
|
||
|
$path, 0, NULL, $flags);
|
||
|
|
||
|
if (virLockManagerAcquire(lock, NULL, 0) < 0);
|
||
|
...abort...
|
||
|
</pre>
|
||
|
|
||
|
<h3><a name="usageLockAttach">Lock release</a></h3>
|
||
|
|
||
|
<p>
|
||
|
The locks are all implicitly released when the process
|
||
|
that acquired them exits, however, a process may
|
||
|
voluntarily give up the lock by running
|
||
|
</p>
|
||
|
|
||
|
<pre>
|
||
|
char *state = NULL;
|
||
|
virLockManagerParam params[] = {
|
||
|
{ .type = VIR_LOCK_MANAGER_PARAM_TYPE_UUID,
|
||
|
.key = "uuid",
|
||
|
},
|
||
|
{ .type = VIR_LOCK_MANAGER_PARAM_TYPE_STRING,
|
||
|
.key = "name",
|
||
|
.value = { .str = dom->def->name },
|
||
|
},
|
||
|
{ .type = VIR_LOCK_MANAGER_PARAM_TYPE_UINT,
|
||
|
.key = "id",
|
||
|
.value = { .i = dom->def->id },
|
||
|
},
|
||
|
{ .type = VIR_LOCK_MANAGER_PARAM_TYPE_UINT,
|
||
|
.key = "pid",
|
||
|
.value = { .i = dom->pid },
|
||
|
},
|
||
|
};
|
||
|
mgr = virLockManagerNew(lockPlugin,
|
||
|
VIR_LOCK_MANAGER_TYPE_DOMAIN,
|
||
|
ARRAY_CARDINALITY(params),
|
||
|
params,
|
||
|
0)));
|
||
|
|
||
|
foreach (initial disks)
|
||
|
virLockManagerAddResource(mgr,
|
||
|
VIR_LOCK_MANAGER_RESOURCE_TYPE_DISK,
|
||
|
$path, 0, NULL, $flags);
|
||
|
|
||
|
virLockManagerRelease(mgr, & state, 0);
|
||
|
</pre>
|
||
|
|
||
|
<p>
|
||
|
The returned state string can be passed to the
|
||
|
<code>virLockManagerAcquire</code> method to
|
||
|
later re-acquire the exact same locks. This
|
||
|
state transfer is commonly used when performing
|
||
|
live migration of virtual machines. By validating
|
||
|
the state the lock manager can ensure no other
|
||
|
VM has re-acquire the same locks on a different
|
||
|
host. The state can also be obtained without
|
||
|
releasing the locks, by calling the
|
||
|
<code>virLockManagerInquire</code> method.
|
||
|
</p>
|
||
|
|
||
|
</body>
|
||
|
</html>
|