244 Commits

Author SHA1 Message Date
Alyssa Ross
50bac1694f vmm: support PCI I/O regions on all architectures
While non-Intel CPU architectures don't have a special concept of IO
address space, support for PCI I/O regions is still needed to be able
to handle PCI devices that use them.

With this change, I'm able to pass through an e1000e device from QEMU
to a cloud-hypervisor VM on aarch64 and use it in the cloud-hypervisor
guest.  Previously, it would hit the unimplemented!().

Signed-off-by: Alyssa Ross <hi@alyssa.is>
2024-12-14 14:12:00 +00:00
Rob Bradford
0d6cef4521 pci: vfio: Release memory slots upon unmap
This prevents starvation of the limited set of memory slots in the
kernel.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-11-24 10:45:15 +00:00
Rob Bradford
81f8a27ef6 pci: vfio: Use MemorySlotAllocator for allocating memory slots
Adapt the existing code to replace the closure with the new of the new
MemorySlotAllocator.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-11-24 10:45:15 +00:00
Ruoqing He
0aab960bf1 misc: Elide needless lifetimes
As clippy of rust-toolchain version 1.83.0-beta.1 suggests, elide
needless lifetimes to `'_`.

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2024-10-18 17:46:39 +00:00
Ruoqing He
b41daddce1 misc: Remove manual implementation of is_power_of_two
As clippy of rust-toolchain version 1.83.0-beta.1 suggests, remove
manual implementation of `is_power_of_two` to improve readability.

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2024-10-18 17:46:39 +00:00
Ruoqing He
61e57e1cb1 misc: Further improve imports styling
By introducing `imports_granularity="Module"` format strategy,
effectively groups imports from the same module into one line or block,
improving maintainability and readability.

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2024-09-29 16:13:48 +00:00
Rob Bradford
88a9f79944 misc: Adapt consistent import style formatting
Historically the Cloud Hypervisor coding style has been to ensure that
all imports are ordered and placed in a single group. Unfortunately
cargo fmt has no support for ensuring that all imports are in a single
group so if whitespace lines were added as part of the import statements
then they would only be odered correctly in the group.

By adopting "group_imports="StdExternalCrate" we can enforce a style
where imports are placed in at most three groups for std, external
crates and the crate itself. Choosing a style enforceable by the tooling
reduces the reviewer burden.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-09-29 13:08:12 +01:00
Rob Bradford
60165bbcfa pci: Fix #[cfg(target_arch)] guards for port I/O
Port I/O is only supported on x86_64 - use inverted conditional logic to
match the other architectures rather than calling them out specifically.
This will be relevant when RISC-V support is added.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-09-20 16:17:43 +00:00
Bo Chen
60c8a72e29 misc: Fix various warnings from clippy 0.1.82
An example warning output is:

error: first doc comment paragraph is too long
   --> virtio-devices/src/lib.rs:158:1
    |
158 | / /// Convert an absolute address into an address space (GuestMemory)
159 | | /// to a host pointer and verify that the provided size define a valid
160 | | /// range within a single memory region.
161 | | /// Return None if it is out of bounds or if addr+size overlaps a single region.
    | |_
    |
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#too_long_first_doc_paragraph
    = note: `-D clippy::too-long-first-doc-paragraph` implied by `-D warnings`
    = help: to override `-D warnings` add `#[allow(clippy::too_long_first_doc_paragraph)]`

Signed-off-by: Bo Chen <chen.bo@intel.com>
2024-09-07 09:40:20 +00:00
Wei Liu
78a30012fb pci: validate index before accessing MSI-X arrays
The index is derived from the access offset, so it is controlled by the
guest. Check it before accessing internal data structures.

Since Rust enforces strict bound check even in release builds, the VMM
process will crash if the guest misbehaves. There is no security issue
since the guest can only DoS itself.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2024-08-14 08:01:25 +00:00
Yuanchu Xie
4bf2d4f7dd pci: Remove BusDevice requirement from PciDevice
The BusDevice requirement is not needed, only Send is required.

Signed-off-by: Yuanchu Xie <yuanchu@google.com>
2024-08-05 22:41:56 +00:00
Yuanchu Xie
954f3dd057 vm-device: generalize BusDevice to use a shared reference
BusDevice trait functions currently holds a mutable reference to self,
and exclusive access is guaranteed by taking a Mutex when dispatched by
the Bus object. However, this prevents individual devices from serving
accesses that do not require an mutable reference or is better served
with different synchronization primitives. We switch Bus to dispatch via
BusDeviceSync, which holds a shared reference, and delegate locking to
the BusDeviceSync trait implementation for Mutex<BusDevice>.

Other changes are made to make use of the dyn BusDeviceSync
trait object.

Signed-off-by: Yuanchu Xie <yuanchu@google.com>
2024-08-05 22:41:56 +00:00
Josh Soref
42e9632c53 misc: Fix spelling issues
Misspellings were identified by:
  https://github.com/marketplace/actions/check-spelling

* Initial corrections based on forbidden patterns from the action
* Additional corrections by Google Chrome auto-suggest
* Some manual corrections
* Adding markdown bullets to readme credits section

Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
2024-06-08 16:31:30 +00:00
Wei Liu
f6cd3bd86d block, pci, rate_limiter, vm-allocator: drop legacy numeric constants
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2024-04-30 07:32:08 +00:00
Wei Liu
0816c8292f pci: drop a useless check
The end_addr is a 64-bit value which cannot possibly be larger than
u64::max_value().

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2024-04-30 07:32:08 +00:00
Rob Bradford
10ab87d6a3 misc: Migrate away from versionize
Replace with serde instead.

Fixes: #6370

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-04-22 17:10:55 +00:00
Ruslan Mstoi
5e9886bba4 build: add REUSE Compliance Check
In accordance with reuse requirements:
- Place each license file in the LICENSES/ directory
- Add missing SPDX-License-Identifier to files.
- Add .reuse/dep5 to bulk-license files

Fixes: #5887

Signed-off-by: Ruslan Mstoi <ruslan.mstoi@intel.com>
2024-04-19 17:35:45 +00:00
Andrew Carp
045964deee virtio-devices: Map mmio over virtio-iommu
Add infrastructure to lookup the host address for mmio regions on
external dma mapping requests. This specifically resolves vfio
passthrough for virtio-iommu, allowing for nested virtualization to pass
external devices through.

Fixes #6110

Signed-off-by: Andrew Carp <acarp@crusoeenergy.com>
2024-04-01 09:16:30 +00:00
Andrew Carp
a5e2460d95 virtio-devices: Move VfioDmaMapping to be in the pci crate
VfioUserDmaMapping is already in the pci crate, this moves
VfioDmaMapping to match the behavior. This is a necessary change to
allow the VfioDmaMapping trait to have access to MmioRegion memory
without creating a circular dependency. The VfioDmaMapping trait
needs to have access to mmio regions to map external devices over
mmio (a follow-up commit).

Signed-off-by: Andrew Carp <acarp@crusoeenergy.com>
2024-04-01 09:16:30 +00:00
Thomas Barrett
b750c332aa vmm: add NVIDIA GPUDirect P2P support
On platforms where PCIe P2P is supported, inject a PCI capability into
NVIDIA GPU to indicate support.

Signed-off-by: Thomas Barrett <tbarrett@crusoeenergy.com>
2024-02-29 09:26:29 +00:00
Rob Bradford
adb318f4cd misc: Remove redundant "use" imports
With the nightly toolchain (2024-02-18) cargo check will flag up
redundant imports either because they are pulled in by the prelude on
earlier match.

Remove those redundant imports.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-02-19 17:54:30 +00:00
Thomas Barrett
c9f94be7ab pci: vfio: naturally align bar
According to PCIe specification, a 64-bit MMIO BAR should be
naturally aligned. In addition to being more compliant with
the specification, natural aligned BARs are mapped with
the largest possible page size by the host iommu driver, which
should speed up boot time and reduce IOTLB thrashing for virtual
machines with VFIO devices.

Signed-off-by: Thomas Barrett <tbarrett@crusoeenergy.com>
2024-02-16 08:24:56 +00:00
Thomas Barrett
f0c1f8d079 pci: vfio: VFIO_IOMMU_MAP_DMA mmio regions
For all VFIO devices, map all non-emulated MMIO regions to
the vfio container to allow PCIe P2P between all VFIO devices
on the virtual machine. This is required for a wide variety of
advanced GPU workloads such as GPUDirect P2P (DMA between two
GPUs), GPUDirect RDMA (DMA between a GPU and an IB device).

Signed-off-by: Thomas Barrett <tbarrett@crusoeenergy.com>
2024-02-08 10:02:58 -08:00
Thomas Barrett
45b01d592a vmm: assign each pci segment 32-bit mmio allocator
Signed-off-by: Thomas Barrett <tbarrett@crusoeenergy.com>
2023-11-20 15:33:50 -08:00
Philipp Schuster
7bf0cc1ed5 misc: Fix various spelling errors using typos
This fixes all typos found by the typos utility with respect to the config file.

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
2023-09-09 10:46:21 +01:00
Rob Bradford
363b478040 pci: vfio: Don't assume MSI-X is enabled
The fixup_msix_region() function added in
a718716831
made the assumption that MSI-X was always available. This is the case
with many VFIO devices and all our virtio devices but created regression
with MSI devices.

Simply return the existing region size if MSI-X is not supported by the
device.

Fixes: #5649

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2023-08-07 07:41:15 +01:00
Yong He
0149e65081 vm-device: support batch update interrupt source group GSI
Split interrupt source group restore into two steps, first restore
the irqfd for each interrupt source entry, and second restore the
GSI routing of the entire interrupt source group.

This patch will reduce restore latency of interrupt source group,
and in a 200-concurrent restore test, the patch reduced the
average IOAPIC restore time from 15ms to 1ms.

Signed-off-by: Yong He <alexyonghe@tencent.com>
2023-08-03 15:58:36 +01:00
Ravi kumar Veeramally
5b51024ef7 pci: Remove "from_over_into" clippy
According the std docs implementing From<..> is preferred since it
gives you Into<..> for free where the reverse isn’t true.

Signed-off-by: Ravi kumar Veeramally <ravikumar.veeramally@intel.com>
2023-06-26 10:37:58 -07:00
Jianyong Wu
a718716831 vfio: fix vfio device fail to initialize issue for 64k page size
Currently, vfio device fails to initialize as the msix-cap region in BAR
is mapped as RW region.

To resolve the initialization issue, this commit avoids mapping the
msix-cap region in the BAR. However, this solution introduces another
problem where aligning the msix table offset in the BAR to the page
size may cause overlap with the MMIO RW region, leading to reduced
performance. By enlarging the entire region in the BAR and relocating
the msix table to achieve page size alignment, this problem can be
overcomed effectively.

Fixes: #5292
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2023-06-19 10:29:23 +08:00
Jianyong Wu
eca75dcfc9 vfio: align memory region size and address to PAGE_SIZE
In current implementation, memory region used in vfio is assumed to
align to 4k which may cause error when the PAGE_SIZE is not 4k, like on
Arm, it can be 16k and 64k.

Remove this assumption and align memory resource used by vfio to
PAGE_SIZE then vfio can run on host with 64k PAGE_SIZE.

Fixes: #5292
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2023-05-22 13:25:52 +01:00
Rob Bradford
5e52729453 misc: Automatically fix cargo clippy issues added in 1.65 (stable)
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-12-14 14:27:19 +00:00
Sebastien Boeuf
748018ace3 vm-migration: Don't store the id as part of Snapshot structure
The information about the identifier related to a Snapshot is only
relevant from the BTreeMap perspective, which is why we can get rid of
the duplicated identifier in every Snapshot structure.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-12-09 10:26:06 +01:00
Sebastien Boeuf
5b3bcfa233 vm-migration: Snapshot should have a unique SnapshotDataSection
There's no reason to carry a HashMap of SnapshotDataSection per
Snapshot. And given we now provide at most one SnapshotDataSection per
Snapshot, there's no need to keep the id part of the SnapshotDataSection
structure.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-12-09 10:26:06 +01:00
Sebastien Boeuf
30e421d2e5 pci: Remove unused restore() implementations
Now that VirtioPciDevice, VfioPciDevice and VfioUserPciDevice have all
been moved to the new restore design, there's no need to keep the old
way around, therefore the restore() implementations for MsiConfig,
MsixConfig and PciConfiguration can be removed.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-11-29 13:46:30 +01:00
Sebastien Boeuf
cc3706afe1 pci, vmm: Move VfioPciDevice and VfioUserPciDevice to new restore design
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-11-29 13:46:30 +01:00
Sebastien Boeuf
1eac37bd5f pci: msi: Move MsiConfig to the new restore design
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-11-29 13:46:30 +01:00
Sebastien Boeuf
d6bf1f5eb0 pci: Move VfioCommon creation to a dedicated function
This is some preliminatory work for moving both VfioUser and Vfio to the
new restore design.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-11-29 13:46:30 +01:00
Sebastien Boeuf
eae8043890 pci, virtio-devices: Move VirtioPciDevice to the new restore design
The code for restoring a VirtioPciDevice has been updated, including the
dependencies VirtioPciCommonConfig, MsixConfig and PciConfiguration.

It's important to note that both PciConfiguration and MsixConfig still
have restore() implementations because Vfio and VfioUser devices still
rely on the old way for restore.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-11-23 18:37:40 +00:00
Wei Liu
c5bd8cabc4 pci: add safety comments
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-11-18 12:50:01 +00:00
Bo Chen
a9ec0f33c0 misc: Fix clippy issues
Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-11-02 09:41:43 +01:00
Sebastien Boeuf
2b150ac2ea pci, virtio-devices: Restore proper BAR type
When restoring a VM, the BAR type can be found directly from the
snapshot resources. It is more reliable than the previous method which
was using self.use_64bit_bar from VirtioPciDevice because at the time
the BARs are allocated, the VirtioDevice hasn't been restored yet,
meaning the way to determine the value of use_64bit_bar is wrong for a
device like vDPA. At this time, the device type is not known and relying
on the stored resources is the only reliable way.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-10-13 10:03:23 +02:00
Rob Bradford
89a2562c5a pci: Optimise MSI-X table programming
With an updated Linux kernel the kernel reprograms the same table entry
in the MSI-X table multiple times. Fortunately it does so with entry
masked (the default.)

The costly part of MSI-X table reprogramming is going out to the host
kernel to update the GSI routing entries. This change makes three
optimisations.

1. If the table entry is unchanged: skip handking updates
2. Only update the GSI routing table if the table entry is unmasked
   (this skips extra calls to the ioctl() for reprogramming the
    entries.)
3. Only generate a message on entry unmasking if the global MSI-X enable
   bit is set.

Fixes: #4273

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-10-07 09:57:27 -07:00
Sebastien Boeuf
e45e3df64d pci: vfio: Filter out some PCI extended capabilities
There are PCI extended capabilities that can't be passed through the VM
as they would be unusable from a guest perspective. That's why we
introduce a way to patch what is returned to the guest when the PCI
configuration space is accessed. The list of patches is created from the
parsing of the extended capabilities in that case, and particularly
based on the presence of the SRIOV and Resizable BAR capabilities.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Signed-off-by: Steven Dake <sdake@lambdal.com>
2022-08-08 18:29:16 +02:00
Steven Dake
868d1f6902 pci: vfio: Preserve prefetchable BAR flag
If the BAR for the VFIO device is marked as prefetchable on the
underlying device ensure that the BAR exposed through PciConfiguration
is also marked as prefetchable.

Fixes problem where NVIDIA devices are not usable with PCI VFIO
passthrough. See related NVIDIA kernel driver bug:
 https://github.com/NVIDIA/open-gpu-kernel-modules/issues/344.

Fixes: #4451

Signed-off-by: Steven Dake <sdake@lambdal.com>
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-08-08 14:12:09 +01:00
Rob Bradford
b57d7b258d build: Fix beta clippy issue (needless_return)
warning: unneeded `return` statement
   --> pci/src/vfio_user.rs:627:13
    |
627 | /             return Err(std::io::Error::new(
628 | |                 std::io::ErrorKind::Other,
629 | |                 format!("Region not found for 0x{:x}", gpa),
630 | |             ));
    | |_______________^
    |
    = note: `#[warn(clippy::needless_return)]` on by default
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_return
help: remove `return`
    |
627 ~             Err(std::io::Error::new(
628 +                 std::io::ErrorKind::Other,
629 +                 format!("Region not found for 0x{:x}", gpa),
630 +             ))
    |

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-06-30 20:50:45 +01:00
Rob Bradford
2716bc3311 build: Fix beta clippy issue (derive_partial_eq_without_eq)
warning: you are deriving `PartialEq` and can implement `Eq`
  --> vmm/src/serial_manager.rs:59:30
   |
59 | #[derive(Debug, Clone, Copy, PartialEq)]
   |                              ^^^^^^^^^ help: consider deriving `Eq` as well: `PartialEq, Eq`
   |
   = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#derive_partial_eq_without_eq

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-06-30 20:50:45 +01:00
Sebastien Boeuf
81ba70a497 pci, vmm: Defer mapping VFIO MMIO regions on restore
When restoring a VM, the restore codepath will take care of mapping the
MMIO regions based on the information from the snapshot, rather than
having the mapping being performed during device creation.

When the device is created, information such as which BARs contain the
MSI-X tables are missing, preventing to perform the mapping of the MMIO
regions.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-06-09 09:19:58 +02:00
Sebastien Boeuf
7df7061610 pci, vmm: Add migratable support to vfio-user devices
Based on recent changes to VfioUserPciDevice, the vfio-user devices can
now be migrated.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-06-09 09:19:58 +02:00
Sebastien Boeuf
c021dda267 pci, vmm: Add migratable support to VFIO devices
Based on recent changes to VfioPciDevice, the VFIO devices can now be
migrated.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-06-09 09:19:58 +02:00
Sebastien Boeuf
f48b05eee6 pci: vfio_user: Implement Migratable for VfioUserPciDevice
Based on the VfioCommon implementation, the VfioUserPciDevice now
implements the Migratable trait.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-06-09 09:19:58 +02:00