QEMU Driver Threading: The Rules
================================

This document describes how thread safety is ensured throughout
the QEMU driver. The criteria for this model are:

- Objects must never be exclusively locked for any prolonged time
- Code which sleeps must be able to time out after a suitable period
- Must be safe against dispatch of asynchronous events from the monitor


Basic locking primitives
------------------------

There are a number of locks on various objects:

* virQEMUDriverPtr

The qemu_conf.h file has inline comments describing the locking
needs for each field. Any field marked immutable or self-locking
can be accessed without the driver lock. For other fields there
are typically helper APIs in qemu_conf.c that provide serialized
access to the data. No code outside qemu_conf.c should ever
acquire this lock.

* virDomainObjPtr

Will be locked after calling any of the virDomainFindBy{ID,Name,UUID}
methods.

Lock must be held when changing/reading any variable in the
virDomainObjPtr.

If the lock needs to be dropped & then re-acquired for a short period
of time, the reference count must be incremented first using
virDomainObjRef().

This lock must not be held for anything which sleeps/waits (i.e.
monitor commands).

* qemuDomainObjPrivatePtr: Job conditions

Since the virDomainObjPtr lock must not be held during sleeps, the job
conditions provide additional protection for code making updates.

The QEMU driver uses two kinds of job conditions: asynchronous and
normal.

The asynchronous job condition is used for long-running jobs (such as
migration) that consist of several monitor commands and for which it
is desirable to allow calling a limited set of other monitor commands
while the job is running. This allows clients to, e.g., query
statistical data, cancel the job, or change its parameters.

The normal job condition is used by all other jobs to get exclusive
access to the monitor, and also by every monitor command issued by an
asynchronous job. When acquiring the normal job condition, the job
must specify what kind of action it is about to take; this is checked
against the allowed set of jobs in case an asynchronous job is
running. If the job is incompatible with the current asynchronous job,
it needs to wait until the asynchronous job ends and then try to
acquire the job condition again.

Immediately after acquiring the virDomainObjPtr lock, any method
which intends to update state must acquire either the asynchronous or
the normal job condition. The virDomainObjPtr lock is released while
blocking on these condition variables. Once the job condition is
acquired, a method can safely release the virDomainObjPtr lock
whenever it hits a piece of code which may sleep/wait, and re-acquire
it after the sleep/wait. Whenever an asynchronous job wants to talk
to the monitor, it needs to acquire a nested job (a special kind of
normal job) to obtain exclusive access to the monitor.

Since the virDomainObjPtr lock was dropped while waiting for the
job condition, it is possible that the domain is no longer active
when the condition is finally obtained. The monitor lock is only
safe to grab after verifying that the domain is still active.

* qemuMonitorPtr: Mutex

Lock to be used when invoking any monitor command, to ensure safety
with respect to any asynchronous events that may be dispatched from
the monitor. It should be acquired before running a command.

The job condition *MUST* be held before acquiring the monitor lock.

The virDomainObjPtr lock *MUST* be held before acquiring the monitor
lock.

The virDomainObjPtr lock *MUST* then be released when invoking the
monitor command.

Helper methods
--------------

To lock the virDomainObjPtr:

virObjectLock()
  - Acquires the virDomainObjPtr lock

virObjectUnlock()
  - Releases the virDomainObjPtr lock

To acquire the normal job condition:

qemuDomainObjBeginJob()
  - Increments the reference count on the virDomainObjPtr
  - Waits until the job is compatible with the current async job or no
    async job is running
  - Waits on the job.cond condition variable while 'job.active != 0',
    using the virDomainObjPtr mutex
  - Rechecks if the job is still compatible and repeats waiting if it
    isn't
  - Sets job.active to the job type

qemuDomainObjEndJob()
  - Sets job.active to 0
  - Signals on the job.cond condition variable
  - Decrements the reference count on the virDomainObjPtr

To acquire the asynchronous job condition:

qemuDomainObjBeginAsyncJob()
  - Increments the reference count on the virDomainObjPtr
  - Waits until no async job is running
  - Waits on the job.cond condition variable while 'job.active != 0',
    using the virDomainObjPtr mutex
  - Rechecks if any async job was started while waiting on job.cond
    and repeats waiting in that case
  - Sets job.asyncJob to the asynchronous job type

qemuDomainObjEndAsyncJob()
  - Sets job.asyncJob to 0
  - Broadcasts on the job.asyncCond condition variable
  - Decrements the reference count on the virDomainObjPtr

To acquire the QEMU monitor lock:

qemuDomainObjEnterMonitor()
  - Acquires the qemuMonitorObjPtr lock
  - Releases the virDomainObjPtr lock

qemuDomainObjExitMonitor()
  - Releases the qemuMonitorObjPtr lock
  - Acquires the virDomainObjPtr lock

These functions must not be used by an asynchronous job.

To acquire the QEMU monitor lock as part of an asynchronous job:

qemuDomainObjEnterMonitorAsync()
  - Validates that the right async job is still running
  - Acquires the qemuMonitorObjPtr lock
  - Releases the virDomainObjPtr lock
  - Validates that the VM is still active

qemuDomainObjExitMonitor()
  - Releases the qemuMonitorObjPtr lock
  - Acquires the virDomainObjPtr lock

These functions are for use inside an asynchronous job; the caller
must check for a return of -1 (VM not running, so nothing to exit).
Helper functions may also call this with QEMU_ASYNC_JOB_NONE when
used from a sync job (such as when first starting a domain).

To keep a domain alive while waiting on a remote command:

qemuDomainObjEnterRemote()
  - Increments the reference count on the virDomainObjPtr
  - Releases the virDomainObjPtr lock

qemuDomainObjExitRemote()
  - Acquires the virDomainObjPtr lock
  - Decrements the reference count on the virDomainObjPtr

Design patterns
---------------

* Accessing something directly to do with a virDomainObjPtr

  virDomainObjPtr obj;

  obj = virDomainFindByUUID(driver->domains, dom->uuid);

  ...do work...

  virObjectUnlock(obj);

* Updating something directly to do with a virDomainObjPtr

  virDomainObjPtr obj;

  obj = virDomainFindByUUID(driver->domains, dom->uuid);

  qemuDomainObjBeginJob(obj, QEMU_JOB_TYPE);

  ...do work...

  qemuDomainObjEndJob(obj);
  virObjectUnlock(obj);

* Invoking a monitor command on a virDomainObjPtr

  virDomainObjPtr obj;
  qemuDomainObjPrivatePtr priv;

  obj = virDomainFindByUUID(driver->domains, dom->uuid);
  priv = obj->privateData;

  qemuDomainObjBeginJob(obj, QEMU_JOB_TYPE);

  ...do prep work...

  if (virDomainObjIsActive(obj)) {
      qemuDomainObjEnterMonitor(obj);
      qemuMonitorXXXX(priv->mon);
      qemuDomainObjExitMonitor(obj);
  }

  ...do final work...

  qemuDomainObjEndJob(obj);
  virObjectUnlock(obj);

* Running an asynchronous job

  virDomainObjPtr obj;
  qemuDomainObjPrivatePtr priv;

  obj = virDomainFindByUUID(driver->domains, dom->uuid);

  qemuDomainObjBeginAsyncJob(obj, QEMU_ASYNC_JOB_TYPE);
  qemuDomainObjSetAsyncJobMask(obj, allowedJobs);

  ...do prep work...

  if (qemuDomainObjEnterMonitorAsync(driver, obj,
                                     QEMU_ASYNC_JOB_TYPE) < 0) {
      /* domain died in the meantime */
      goto error;
  }
  ...start qemu job...
  qemuDomainObjExitMonitor(driver, obj);

  while (!finished) {
      if (qemuDomainObjEnterMonitorAsync(driver, obj,
                                         QEMU_ASYNC_JOB_TYPE) < 0) {
          /* domain died in the meantime */
          goto error;
      }
      ...monitor job progress...
      qemuDomainObjExitMonitor(driver, obj);

      virObjectUnlock(obj);
      sleep(aWhile);
      virObjectLock(obj);
  }

  ...do final work...

  qemuDomainObjEndAsyncJob(obj);
  virObjectUnlock(obj);

* Coordinating with a remote server for migration

  virDomainObjPtr obj;
  qemuDomainObjPrivatePtr priv;

  obj = virDomainFindByUUID(driver->domains, dom->uuid);

  qemuDomainObjBeginAsyncJob(obj, QEMU_ASYNC_JOB_TYPE);

  ...do prep work...

  if (virDomainObjIsActive(obj)) {
      qemuDomainObjEnterRemote(obj);
      ...communicate with remote...
      qemuDomainObjExitRemote(obj);
      /* domain may have been stopped while we were talking to remote */
      if (!virDomainObjIsActive(obj)) {
          qemuReportError(VIR_ERR_INTERNAL_ERROR, "%s",
                          _("guest unexpectedly quit"));
      }
  }

  ...do final work...

  qemuDomainObjEndAsyncJob(obj);
  virObjectUnlock(obj);