2009-11-03 13:26:32 -05:00

     QEMU Driver Threading: The Rules
     =================================

This document describes how thread safety is ensured throughout
the QEMU driver. The criteria for this model are:

 - Objects must never be exclusively locked for any prolonged time
 - Code which sleeps must be able to time out after a suitable period
 - Must be safe against dispatch of asynchronous events from the monitor
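
The second criterion (sleeping code must be able to time out) can be
illustrated with plain POSIX threads. This is only a sketch under stated
assumptions: struct waiter and wait_ready() are hypothetical names and are
not part of the QEMU driver, which has its own wrappers around these
primitives.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical object: a flag guarded by a mutex and signalled via a
 * condition variable. None of these names come from the QEMU driver. */
struct waiter {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    bool ready;
};

/* Wait until w->ready becomes true or timeout_ms milliseconds elapse.
 * Returns 0 if the flag was set, -1 on timeout or error. */
static int wait_ready(struct waiter *w, long timeout_ms)
{
    struct timespec deadline;
    bool ok;
    int rc = 0;

    assert(timeout_ms >= 0);

    /* With default attributes, pthread_cond_timedwait() expects an
     * absolute deadline measured against CLOCK_REALTIME. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec++;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&w->lock);
    /* Loop to cope with spurious wakeups. The mutex is released while
     * the thread is blocked, so other threads can make progress. */
    while (!w->ready && rc == 0)
        rc = pthread_cond_timedwait(&w->cond, &w->lock, &deadline);
    ok = w->ready;
    pthread_mutex_unlock(&w->lock);

    return ok ? 0 : -1;
}
```

A caller that gets -1 back has to treat the operation as failed and clean
up, rather than keep waiting indefinitely.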


Basic locking primitives
------------------------

There are a number of locks on various objects:

  * struct qemud_driver: RWLock

    This is the top level lock on the entire driver. Every API call in
    the QEMU driver is blocked while this is held, though some internal
    callbacks may still run asynchronously. This lock must never be held
    for anything which sleeps/waits (i.e. monitor commands).

    When obtaining the driver lock, under *NO* circumstances must
    any lock be held on a virDomainObjPtr. This *WILL* result in
    deadlock.


  * virDomainObjPtr: Mutex

    The object is returned locked by any of the
    virDomainFindBy{ID,Name,UUID} methods.

    The lock must be held when changing/reading any variable in the
    virDomainObjPtr.

    Once the lock is held, you must *NOT* try to lock the driver. You must
    release all virDomainObjPtr locks before locking the driver, or deadlock
    *WILL* occur.

    If the lock needs to be dropped & then re-acquired for a short period of
    time, the reference count must be incremented first using virDomainObjRef().
    If the reference count is incremented in this way, it is not necessary
    to have the driver locked when re-acquiring the dropped lock, since the
    reference count prevents the object being freed by another thread.

    This lock must not be held for anything which sleeps/waits (i.e. monitor
    commands).


  * qemuDomainObjPrivatePtr: Job conditions

    Since the virDomainObjPtr lock must not be held during sleeps, the job
    conditions provide additional protection for code making updates.

    The QEMU driver uses two kinds of job conditions: asynchronous and
    normal.

    The asynchronous job condition is used for long running jobs (such as
    migration) that consist of several monitor commands, where it is
    desirable to allow a limited set of other monitor commands to be
    called while the job is running. This allows clients to, e.g., query
    statistical data, cancel the job, or change parameters of the job.

    The normal job condition is used by all other jobs to get exclusive
    access to the monitor, and also by every monitor command issued by an
    asynchronous job. When acquiring the normal job condition, the job must
    specify what kind of action it is about to take. This is checked
    against the allowed set of jobs in case an asynchronous job is
    running. If the job is incompatible with the current asynchronous job,
    it needs to wait until the asynchronous job ends and try to acquire
    the job again.

    Immediately after acquiring the virDomainObjPtr lock, any method
    which intends to update state must acquire either the asynchronous or
    the normal job condition. The virDomainObjPtr lock is released while
    blocking on these condition variables. Once the job condition is
    acquired, a method can safely release the virDomainObjPtr lock
    whenever it hits a piece of code which may sleep/wait, and
    re-acquire it after the sleep/wait. Whenever an asynchronous job
    wants to talk to the monitor, it needs to acquire a nested job (a
    special kind of normal job) to obtain exclusive access to the
    monitor.

    Since the virDomainObjPtr lock was dropped while waiting for the
    job condition, it is possible that the domain is no longer active
    when the condition is finally obtained. The monitor lock is only
    safe to grab after verifying that the domain is still active.


  * qemuMonitorPtr: Mutex

    Lock to be used when invoking any monitor command to ensure safety
    with respect to any asynchronous events that may be dispatched from
    the monitor. It should be acquired before running a command.

    The job condition *MUST* be held before acquiring the monitor lock.

    The virDomainObjPtr lock *MUST* be held before acquiring the monitor
    lock.

    The virDomainObjPtr lock *MUST* then be released when invoking the
    monitor command.

    The driver lock *MUST* be released when invoking the monitor commands.

    This ensures that the virDomainObjPtr & driver are both unlocked while
    sleeping/waiting for the monitor response.
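
The ordering rule between the driver lock and the virDomainObjPtr lock
(driver first, then domain, and the lookup hands back a locked object)
can be sketched with POSIX primitives. Everything here is an assumption
made for illustration: struct driver, struct domain, find_by_id() and
lookup() are hypothetical stand-ins, and the flat array replaces whatever
container the real driver uses.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for virDomainObj and struct qemud_driver. */
struct domain {
    pthread_mutex_t lock;
    int id;
};

struct driver {
    pthread_rwlock_t lock;      /* top level lock on the whole driver */
    struct domain *domains;
    size_t ndomains;
};

/* Caller must hold the driver lock. Returns the matching domain with
 * its own mutex already held, or NULL if there is no such domain. */
static struct domain *find_by_id(struct driver *drv, int id)
{
    assert(drv != NULL);
    for (size_t i = 0; i < drv->ndomains; i++) {
        if (drv->domains[i].id == id) {
            pthread_mutex_lock(&drv->domains[i].lock);
            return &drv->domains[i];
        }
    }
    return NULL;
}

/* Lock order: driver first, then domain. The driver lock is dropped as
 * soon as the lookup is done, so it is never held across slow work. */
static struct domain *lookup(struct driver *drv, int id)
{
    struct domain *dom;

    pthread_rwlock_rdlock(&drv->lock);
    dom = find_by_id(drv, id);
    pthread_rwlock_unlock(&drv->lock);

    return dom;     /* still locked; the caller must unlock it */
}
```

Taking the locks in the opposite order (domain, then driver) in any code
path is what makes deadlock possible, which is why the rule is absolute.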


Helper methods
--------------

To lock the driver

  qemuDriverLock()
    - Acquires the driver lock

  qemuDriverUnlock()
    - Releases the driver lock


To lock the virDomainObjPtr

  virDomainObjLock()
    - Acquires the virDomainObjPtr lock

  virDomainObjUnlock()
    - Releases the virDomainObjPtr lock


To acquire the normal job condition

  qemuDomainObjBeginJob()           (if driver is unlocked)
    - Increments ref count on virDomainObjPtr
    - Waits until the job is compatible with current async job or no
      async job is running
    - Waits on the job.cond condition while 'job.active != 0', using the
      virDomainObjPtr mutex
    - Rechecks if the job is still compatible and repeats waiting if it
      isn't
    - Sets job.active to the job type

  qemuDomainObjBeginJobWithDriver() (if driver needs to be locked)
    - Increments ref count on virDomainObjPtr
    - Unlocks driver
    - Waits until the job is compatible with current async job or no
      async job is running
    - Waits on the job.cond condition while 'job.active != 0', using the
      virDomainObjPtr mutex
    - Rechecks if the job is still compatible and repeats waiting if it
      isn't
    - Sets job.active to the job type
    - Unlocks virDomainObjPtr
    - Locks driver
    - Locks virDomainObjPtr

    NB: this variant is required in order to comply with lock ordering
    rules for virDomainObjPtr vs driver


  qemuDomainObjEndJob()
    - Sets job.active to 0
    - Signals on job.cond condition
    - Decrements ref count on virDomainObjPtr
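
The begin/end steps for a normal job map onto a mutex plus a condition
variable. The following is a minimal sketch, assuming no asynchronous job
is involved and leaving out timeouts; struct dom, begin_job(), end_job()
and second_job() are hypothetical names, with the fields merely mirroring
job.cond and job.active from the list above.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical, simplified domain object with its job state. */
struct dom {
    pthread_mutex_t lock;       /* role of the virDomainObjPtr mutex */
    pthread_cond_t job_cond;    /* role of job.cond */
    int job_active;             /* 0 means no job, like job.active */
    int refs;                   /* reference count */
    int jobs_run;               /* demo counter, not in the real driver */
};

/* Caller must hold d->lock. Blocks while another job is active; the
 * mutex is released for the duration of each wait. */
static void begin_job(struct dom *d, int job_type)
{
    assert(job_type != 0);
    d->refs++;
    while (d->job_active != 0)
        pthread_cond_wait(&d->job_cond, &d->lock);
    d->job_active = job_type;
}

/* Caller must hold d->lock and own the current job. */
static void end_job(struct dom *d)
{
    assert(d->job_active != 0);
    d->job_active = 0;
    pthread_cond_signal(&d->job_cond);
    d->refs--;
}

/* Thread body: runs one complete job of (hypothetical) type 2. */
static void *second_job(void *arg)
{
    struct dom *d = arg;

    pthread_mutex_lock(&d->lock);
    begin_job(d, 2);            /* waits until any earlier job has ended */
    d->jobs_run++;
    end_job(d);
    pthread_mutex_unlock(&d->lock);
    return NULL;
}
```

Because the job flag stays set while the mutex is dropped, a thread that
owns the job can unlock the object around a sleep without another thread
starting a conflicting update.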


To acquire the asynchronous job condition

  qemuDomainObjBeginAsyncJob()            (if driver is unlocked)
    - Increments ref count on virDomainObjPtr
    - Waits until no async job is running
    - Waits on the job.cond condition while 'job.active != 0', using the
      virDomainObjPtr mutex
    - Rechecks if any async job was started while waiting on job.cond
      and repeats waiting in that case
    - Sets job.asyncJob to the asynchronous job type

  qemuDomainObjBeginAsyncJobWithDriver()  (if driver needs to be locked)
    - Increments ref count on virDomainObjPtr
    - Unlocks driver
    - Waits until no async job is running
    - Waits on the job.cond condition while 'job.active != 0', using the
      virDomainObjPtr mutex
    - Rechecks if any async job was started while waiting on job.cond
      and repeats waiting in that case
    - Sets job.asyncJob to the asynchronous job type
    - Unlocks virDomainObjPtr
    - Locks driver
    - Locks virDomainObjPtr

    NB: this variant is required in order to comply with lock ordering
    rules for virDomainObjPtr vs driver


  qemuDomainObjEndAsyncJob()
    - Sets job.asyncJob to 0
    - Broadcasts on job.asyncCond condition
    - Decrements ref count on virDomainObjPtr
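
The compatibility check between a normal job and a running asynchronous
job can be pictured as a bitmask test. This is a sketch only: the enum
values, the JOB_MASK() macro, struct job_state and both functions are
hypothetical and chosen for illustration; the driver's real job types and
mask handling may differ.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical job kinds; the real driver defines its own set. */
enum job { JOB_NONE = 0, JOB_QUERY, JOB_DESTROY, JOB_MODIFY, JOB_LAST };

/* One bit per job kind, starting with the first real job at bit 0. */
#define JOB_MASK(job) (1u << ((job) - 1))

struct job_state {
    enum job active;        /* normal job now running, like job.active */
    int async_job;          /* non-zero while an async job is running */
    unsigned int mask;      /* normal jobs allowed during the async job */
};

/* True if a normal job of this kind is compatible with the current
 * asynchronous job, or if no asynchronous job is running at all. */
static bool job_allowed(const struct job_state *s, enum job job)
{
    assert(job > JOB_NONE && job < JOB_LAST);
    return s->async_job == 0 || (s->mask & JOB_MASK(job)) != 0;
}

/* True if the job could start right now without waiting: it must be
 * compatible and no other normal job may be active. */
static bool job_can_start(const struct job_state *s, enum job job)
{
    return job_allowed(s, job) && s->active == JOB_NONE;
}
```

A job for which job_allowed() is false has to wait for the asynchronous
job to end; one that is allowed but finds another normal job active only
has to wait for that normal job.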


To acquire the QEMU monitor lock

  qemuDomainObjEnterMonitor()
    - Acquires the qemuMonitorPtr lock
    - Releases the virDomainObjPtr lock

  qemuDomainObjExitMonitor()
    - Releases the qemuMonitorPtr lock
    - Acquires the virDomainObjPtr lock

    NB: caller must take care to drop the driver lock if necessary

  These functions automatically begin/end a nested job if called inside an
  asynchronous job. The caller must then check the return value of
  qemuDomainObjEnterMonitor to detect if the domain died while waiting on
  the nested job.
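
The hand-over between the virDomainObjPtr lock and the monitor lock can
be sketched as follows. The names are hypothetical (struct monitor,
struct dom, enter_monitor(), exit_monitor()), nested jobs are left out,
and the early return for an inactive domain is an assumption made for
this example rather than a description of the real function.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the monitor and the domain object. */
struct monitor {
    pthread_mutex_t lock;
};

struct dom {
    pthread_mutex_t lock;
    struct monitor *mon;
    int job_active;         /* non-zero while this thread owns a job */
    bool active;            /* false once the guest has gone away */
};

/* Caller holds d->lock and a job. On success the monitor lock is held
 * and d->lock has been released. Returns -1, with d->lock still held,
 * if the domain is no longer active. */
static int enter_monitor(struct dom *d)
{
    assert(d->job_active != 0);     /* job must be held first */
    if (!d->active)
        return -1;
    pthread_mutex_lock(&d->mon->lock);
    pthread_mutex_unlock(&d->lock);
    return 0;
}

/* Reverse of enter_monitor(): drop the monitor lock, then re-take the
 * domain lock before touching any domain state again. */
static void exit_monitor(struct dom *d)
{
    pthread_mutex_unlock(&d->mon->lock);
    pthread_mutex_lock(&d->lock);
}
```

Between the two calls the thread holds only the monitor lock, so a slow
monitor reply does not block other threads that need the domain object.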


To acquire the QEMU monitor lock with the driver lock held

  qemuDomainObjEnterMonitorWithDriver()
    - Acquires the qemuMonitorPtr lock
    - Releases the virDomainObjPtr lock
    - Releases the driver lock

  qemuDomainObjExitMonitorWithDriver()
    - Releases the qemuMonitorPtr lock
    - Acquires the driver lock
    - Acquires the virDomainObjPtr lock

    NB: caller must take care to drop the driver lock if necessary

  These functions automatically begin/end a nested job if called inside an
  asynchronous job. The caller must then check the return value of
  qemuDomainObjEnterMonitorWithDriver to detect if the domain died while
  waiting on the nested job.


To keep a domain alive while waiting on a remote command, starting
with the driver lock held

  qemuDomainObjEnterRemoteWithDriver()
    - Increments ref count on virDomainObjPtr
    - Releases the virDomainObjPtr lock
    - Releases the driver lock

  qemuDomainObjExitRemoteWithDriver()
    - Acquires the driver lock
    - Acquires the virDomainObjPtr lock
    - Decrements ref count on virDomainObjPtr
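
The enter/exit remote steps combine reference counting with the lock
ordering rule. A minimal sketch, assuming plain mutexes for both locks;
struct driver, struct domain, enter_remote(), exit_remote(),
talk_to_remote() and guest_quits() are hypothetical names used only for
this illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins; the real driver lock is not a plain mutex. */
struct driver {
    pthread_mutex_t lock;
};

struct domain {
    pthread_mutex_t lock;
    int refs;               /* guarded by lock */
    bool active;            /* guarded by lock */
};

/* Caller holds drv->lock and dom->lock. The extra reference keeps the
 * object from being freed while neither lock is held. */
static void enter_remote(struct driver *drv, struct domain *dom)
{
    assert(dom->refs > 0);
    dom->refs++;
    pthread_mutex_unlock(&dom->lock);
    pthread_mutex_unlock(&drv->lock);
}

/* Re-acquire in the required order: driver first, then domain. */
static void exit_remote(struct driver *drv, struct domain *dom)
{
    pthread_mutex_lock(&drv->lock);
    pthread_mutex_lock(&dom->lock);
    dom->refs--;
}

/* Runs a blocking remote call with no locks held. Returns 0 if the
 * domain is still active afterwards, -1 if it went away meanwhile. */
static int talk_to_remote(struct driver *drv, struct domain *dom,
                          void (*remote_call)(struct domain *))
{
    enter_remote(drv, dom);
    remote_call(dom);
    exit_remote(drv, dom);
    return dom->active ? 0 : -1;
}

/* Stand-in for a remote call during which the guest quits. It can take
 * the domain lock only because enter_remote() released it. */
static void guest_quits(struct domain *dom)
{
    pthread_mutex_lock(&dom->lock);
    dom->active = false;
    pthread_mutex_unlock(&dom->lock);
}
```

The re-check of the active flag after exit_remote() matters because any
state read before the locks were dropped may be stale by then.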


Design patterns
---------------


 * Accessing or updating something with just the driver

     qemuDriverLock(driver);

     ...do work...

     qemuDriverUnlock(driver);


 * Accessing something directly to do with a virDomainObjPtr

     virDomainObjPtr obj;

     qemuDriverLock(driver);
     obj = virDomainFindByUUID(driver->domains, dom->uuid);
     qemuDriverUnlock(driver);

     ...do work...

     virDomainObjUnlock(obj);


 * Accessing something directly to do with a virDomainObjPtr and driver

     virDomainObjPtr obj;

     qemuDriverLock(driver);
     obj = virDomainFindByUUID(driver->domains, dom->uuid);

     ...do work...

     virDomainObjUnlock(obj);
     qemuDriverUnlock(driver);


 * Updating something directly to do with a virDomainObjPtr

     virDomainObjPtr obj;

     qemuDriverLockRO(driver);
     obj = virDomainFindByUUID(driver->domains, dom->uuid);
     qemuDriverUnlock(driver);

     qemuDomainObjBeginJob(obj, QEMU_JOB_TYPE);

     ...do work...

     qemuDomainObjEndJob(obj);

     virDomainObjUnlock(obj);


 * Invoking a monitor command on a virDomainObjPtr

     virDomainObjPtr obj;
     qemuDomainObjPrivatePtr priv;

     qemuDriverLockRO(driver);
     obj = virDomainFindByUUID(driver->domains, dom->uuid);
     qemuDriverUnlock(driver);

     /* per-domain private data, which holds the monitor */
     priv = obj->privateData;

     qemuDomainObjBeginJob(obj, QEMU_JOB_TYPE);

     ...do prep work...

     /* the domain may have stopped while waiting for the job */
     if (virDomainObjIsActive(obj)) {
         ignore_value(qemuDomainObjEnterMonitor(obj));
         qemuMonitorXXXX(priv->mon);
         qemuDomainObjExitMonitor(obj);
     }

     ...do final work...

     qemuDomainObjEndJob(obj);
     virDomainObjUnlock(obj);


 * Invoking a monitor command on a virDomainObjPtr with driver locked too

     virDomainObjPtr obj;
     qemuDomainObjPrivatePtr priv;

     qemuDriverLock(driver);
     obj = virDomainFindByUUID(driver->domains, dom->uuid);

     /* per-domain private data, which holds the monitor */
     priv = obj->privateData;

     qemuDomainObjBeginJobWithDriver(obj, QEMU_JOB_TYPE);

     ...do prep work...

     /* the domain may have stopped while waiting for the job */
     if (virDomainObjIsActive(obj)) {
         ignore_value(qemuDomainObjEnterMonitorWithDriver(driver, obj));
         qemuMonitorXXXX(priv->mon);
         qemuDomainObjExitMonitorWithDriver(driver, obj);
     }

     ...do final work...

     qemuDomainObjEndJob(obj);
     virDomainObjUnlock(obj);
     qemuDriverUnlock(driver);


 * Running asynchronous job

     virDomainObjPtr obj;
     qemuDomainObjPrivatePtr priv;

     qemuDriverLock(driver);
     obj = virDomainFindByUUID(driver->domains, dom->uuid);

     qemuDomainObjBeginAsyncJobWithDriver(obj, QEMU_ASYNC_JOB_TYPE);
     qemuDomainObjSetAsyncJobMask(obj, allowedJobs);

     ...do prep work...

     if (qemuDomainObjEnterMonitorWithDriver(driver, obj) < 0) {
         /* domain died in the meantime */
         goto error;
     }
     ...start qemu job...
     qemuDomainObjExitMonitorWithDriver(driver, obj);

     while (!finished) {
         if (qemuDomainObjEnterMonitorWithDriver(driver, obj) < 0) {
             /* domain died in the meantime */
             goto error;
         }
         ...monitor job progress...
         qemuDomainObjExitMonitorWithDriver(driver, obj);

         /* neither the object nor the driver may stay locked while
          * sleeping; re-acquire in order: driver first, then object */
         virDomainObjUnlock(obj);
         qemuDriverUnlock(driver);
         sleep(aWhile);
         qemuDriverLock(driver);
         virDomainObjLock(obj);
     }

     ...do final work...

     qemuDomainObjEndAsyncJob(obj);
     virDomainObjUnlock(obj);
     qemuDriverUnlock(driver);


 * Coordinating with a remote server for migration

     virDomainObjPtr obj;
     qemuDomainObjPrivatePtr priv;

     qemuDriverLock(driver);
     obj = virDomainFindByUUID(driver->domains, dom->uuid);

     qemuDomainObjBeginAsyncJobWithDriver(obj, QEMU_ASYNC_JOB_TYPE);

     ...do prep work...

     if (virDomainObjIsActive(obj)) {
         qemuDomainObjEnterRemoteWithDriver(driver, obj);
         ...communicate with remote...
         qemuDomainObjExitRemoteWithDriver(driver, obj);
         /* domain may have been stopped while we were talking to remote */
         if (!virDomainObjIsActive(obj)) {
             qemuReportError(VIR_ERR_INTERNAL_ERROR, "%s",
                             _("guest unexpectedly quit"));
         }
     }

     ...do final work...

     qemuDomainObjEndAsyncJob(obj);
     virDomainObjUnlock(obj);
     qemuDriverUnlock(driver);


Summary
-------

  * Respect lock ordering rules: never lock driver if anything else is
    already locked

  * Don't hold locks in code which sleeps: unlock driver & virDomainObjPtr
    when using monitor