vmm: Avoid deadlock from waiting on paused device worker threads

The destination VM of a live upgrade or live migration can deadlock
while waiting on paused device worker threads. For example, if a
serialization error occurs after the `DeviceManager` struct is restored
(at which point all virtio device worker threads are spawned but still
paused/parked), `DeviceManager::drop()` deadlocks because it blocks
waiting for those worker threads to join.

This patch ensures that we wake up all device (mostly virtio) worker
threads before we block waiting for them to join.

Signed-off-by: Bo Chen <chen.bo@intel.com>
Bo Chen 2024-03-13 15:14:35 -07:00
parent f898e660b6
commit 1363891df6


@@ -4977,6 +4977,12 @@ impl BusDevice for DeviceManager {
impl Drop for DeviceManager {
    fn drop(&mut self) {
        // Wake up the DeviceManager threads (mainly virtio device workers),
        // to avoid deadlock on waiting for paused/parked worker threads.
        if let Err(e) = self.resume() {
            error!("Error resuming DeviceManager: {:?}", e);
        }
        for handle in self.virtio_devices.drain(..) {
            handle.virtio_device.lock().unwrap().shutdown();
        }