libvirt/src/hypervisor
Daniel Henrique Barboza 8c9a600457 virhostdev.c: remove missing PCI devs from hostdev manager
virHostdevReAttachPCIDevices() is called when we want to re-attach
a list of hostdevs back to the host, either on the shutdown path or
via a 'virsh detach-device' call.  This function always counts on the
existence of the device in the host to work, but this can lead to
problems. For example, a SR-IOV device can be removed by an admin via
"echo 0 > /sys/bus/pci/devices/<addr>/sriov_numvfs", making the kernel
fire an eventfd_signal() to the process, asking the process to
release the device. The result might vary depending on the device driver
and OS/arch, but two possible outcomes are:

1) the hypervisor driver will detach the device from the VM, issuing a
delete event to Libvirt. This can be observed in QEMU;

2) the 'echo 0 > ...' will hang waiting for the device to be unplugged.
This means that the VM process failed/refused to release the hostdev back
to the host, and the hostdev will be detached during VM shutdown.

Today we don't behave well in either case. We'll fail to remove the PCI device
reference from mgr->activePCIHostdevs and mgr->inactivePCIHostdevs because
we rely on the existence of the PCI device config file in sysfs. Attempting
to reuse the same device (assuming it is present in the host again)
can result in an error like this:

$ ./run tools/virsh start vm1-sriov --console
error: Failed to start domain vm1-sriov
error: Requested operation is not valid: PCI device 0000:01:00.2 is in use by driver QEMU, domain vm1-sriov

For (1), a VM destroy/start cycle is needed to re-use the VF in the guest.
For (2), the effect is more nefarious, requiring a libvirtd restart
to use the VF again in any guest.

We can make it a bit better by checking, during virHostdevReAttachPCIDevices(),
whether any missing PCI device would be left behind in the activePCIHostdevs
and inactivePCIHostdevs lists. Remove any missing device found from both lists,
unconditionally, matching the current state of the host. This change affects
the code path in (1) (processDeviceDeletedEvent into qemuDomainRemoveDevice, all
the way back to qemuHostdevReAttachPCIDevices) and also in (2) (qemuProcessStop
into qemuHostdevReAttachDomainDevices).
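
The pruning idea can be sketched in isolation as below. This is only an
illustration, not the code in virhostdev.c: struct dev_list,
pci_dev_present() and prune_missing() are hypothetical names, the list is
a plain fixed-size array rather than libvirt's PCI device list type, and
the sysfs root is passed as a parameter so the sketch can be exercised
without real hardware. A device is assumed to be "missing" when its
config file under the given root cannot be found.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for one hostdev manager list (not libvirt's type). */
#define MAX_DEVS 16
#define ADDR_LEN 16   /* enough for "dddd:bb:ss.f" plus the NUL byte */

struct dev_list {
    char addrs[MAX_DEVS][ADDR_LEN];
    size_t count;
};

/* Return true if <sysfs_root>/<addr>/config exists.
 * On a real host sysfs_root would be "/sys/bus/pci/devices". */
static bool pci_dev_present(const char *sysfs_root, const char *addr)
{
    char path[256];
    int n = snprintf(path, sizeof(path), "%s/%s/config", sysfs_root, addr);

    if (n < 0 || (size_t)n >= sizeof(path))
        return false;   /* path did not fit: treat the device as missing */
    return access(path, F_OK) == 0;
}

/* Drop every entry whose device is no longer present, compacting the
 * array in place. Returns the number of entries removed. */
static size_t prune_missing(struct dev_list *list, const char *sysfs_root)
{
    size_t kept = 0;
    size_t removed;

    assert(list != NULL && list->count <= MAX_DEVS);

    for (size_t i = 0; i < list->count; i++) {
        if (!pci_dev_present(sysfs_root, list->addrs[i]))
            continue;   /* device vanished from the host: forget it */
        if (kept != i)
            memcpy(list->addrs[kept], list->addrs[i], ADDR_LEN);
        kept++;
    }

    removed = list->count - kept;
    list->count = kept;
    return removed;
}
```

Under these assumptions, a caller would run prune_missing() once on each
of the two lists (active and inactive) during re-attach, so that neither
keeps a reference to a device the host no longer has.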

NB: Although this patch enables the possibility of 'outside Libvirt' SR-IOV
hotunplug of PCI devices, if the hypervisor and the PCI driver cope with it,
our goal is to mitigate what is still considered a user oopsie. For all
supported purposes, the admin must remove the SR-IOV VFs from all running domains
before removing the VFs from the host.

Resolves:  https://gitlab.com/libvirt/libvirt/-/issues/72
Reviewed-by: Laine Stump <laine@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2021-03-01 12:25:33 -03:00
..
domain_cgroup.c virsh: include virutil.h where used 2020-02-24 23:15:50 +01:00
domain_cgroup.h domain_cgroup.c: add virDomainCgroupSetMemoryLimitParameters() 2020-02-23 14:02:24 +01:00
domain_driver.c domain_driver.c: use g_auto* in virDomainDriverNodeDeviceDetachFlags() 2021-02-17 15:56:39 -03:00
domain_driver.h qemu, libxl, hypervisor: use virDomainDriverNodeDeviceDetachFlags() helper 2021-02-17 15:56:27 -03:00
meson.build scripts/check-aclrules.py: check ACL for domain_driver.c ACL callers 2021-02-17 15:56:53 -03:00
virclosecallbacks.c util: hash: Retire 'virHashTable' in favor of 'GHashTable' 2020-11-06 10:40:51 +01:00
virclosecallbacks.h virclosecallbacks: move to src/hypervisor 2020-02-24 16:47:21 +01:00
virhostdev.c virhostdev.c: remove missing PCI devs from hostdev manager 2021-03-01 12:25:33 -03:00
virhostdev.h virhostdev.c: add virHostdevIsPCIDevice() helper 2021-03-01 12:25:33 -03:00