While non-Intel CPU architectures don't have a special concept of IO
address space, support for PCI I/O regions is still needed to be able
to handle PCI devices that use them.
With this change, I'm able to pass through an e1000e device from QEMU
to a cloud-hypervisor VM on aarch64 and use it in the cloud-hypervisor
guest. Previously, it would hit the unimplemented!().
Signed-off-by: Alyssa Ross <hi@alyssa.is>
As clippy of rust-toolchain version 1.83.0-beta.1 suggests, remove
manual implementation of `is_power_of_two` to improve readability.
Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
By introducing `imports_granularity="Module"` format strategy,
effectively groups imports from the same module into one line or block,
improving maintainability and readability.
Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
Historically the Cloud Hypervisor coding style has been to ensure that
all imports are ordered and placed in a single group. Unfortunately
cargo fmt has no support for ensuring that all imports are in a single
group so if whitespace lines were added as part of the import statements
then they would only be odered correctly in the group.
By adopting "group_imports="StdExternalCrate" we can enforce a style
where imports are placed in at most three groups for std, external
crates and the crate itself. Choosing a style enforceable by the tooling
reduces the reviewer burden.
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
Port I/O is only supported on x86_64 - use inverted conditional logic to
match the other architectures rather than calling them out specifically.
This will be relevant when RISC-V support is added.
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
An example warning output is:
error: first doc comment paragraph is too long
--> virtio-devices/src/lib.rs:158:1
|
158 | / /// Convert an absolute address into an address space (GuestMemory)
159 | | /// to a host pointer and verify that the provided size define a valid
160 | | /// range within a single memory region.
161 | | /// Return None if it is out of bounds or if addr+size overlaps a single region.
| |_
|
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#too_long_first_doc_paragraph
= note: `-D clippy::too-long-first-doc-paragraph` implied by `-D warnings`
= help: to override `-D warnings` add `#[allow(clippy::too_long_first_doc_paragraph)]`
Signed-off-by: Bo Chen <chen.bo@intel.com>
The index is derived from the access offset, so it is controlled by the
guest. Check it before accessing internal data structures.
Since Rust enforces strict bound check even in release builds, the VMM
process will crash if the guest misbehaves. There is no security issue
since the guest can only DoS itself.
Signed-off-by: Wei Liu <liuwe@microsoft.com>
BusDevice trait functions currently holds a mutable reference to self,
and exclusive access is guaranteed by taking a Mutex when dispatched by
the Bus object. However, this prevents individual devices from serving
accesses that do not require an mutable reference or is better served
with different synchronization primitives. We switch Bus to dispatch via
BusDeviceSync, which holds a shared reference, and delegate locking to
the BusDeviceSync trait implementation for Mutex<BusDevice>.
Other changes are made to make use of the dyn BusDeviceSync
trait object.
Signed-off-by: Yuanchu Xie <yuanchu@google.com>
Misspellings were identified by:
https://github.com/marketplace/actions/check-spelling
* Initial corrections based on forbidden patterns from the action
* Additional corrections by Google Chrome auto-suggest
* Some manual corrections
* Adding markdown bullets to readme credits section
Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
In accordance with reuse requirements:
- Place each license file in the LICENSES/ directory
- Add missing SPDX-License-Identifier to files.
- Add .reuse/dep5 to bulk-license files
Fixes: #5887
Signed-off-by: Ruslan Mstoi <ruslan.mstoi@intel.com>
Add infrastructure to lookup the host address for mmio regions on
external dma mapping requests. This specifically resolves vfio
passthrough for virtio-iommu, allowing for nested virtualization to pass
external devices through.
Fixes#6110
Signed-off-by: Andrew Carp <acarp@crusoeenergy.com>
VfioUserDmaMapping is already in the pci crate, this moves
VfioDmaMapping to match the behavior. This is a necessary change to
allow the VfioDmaMapping trait to have access to MmioRegion memory
without creating a circular dependency. The VfioDmaMapping trait
needs to have access to mmio regions to map external devices over
mmio (a follow-up commit).
Signed-off-by: Andrew Carp <acarp@crusoeenergy.com>
On platforms where PCIe P2P is supported, inject a PCI capability into
NVIDIA GPU to indicate support.
Signed-off-by: Thomas Barrett <tbarrett@crusoeenergy.com>
With the nightly toolchain (2024-02-18) cargo check will flag up
redundant imports either because they are pulled in by the prelude on
earlier match.
Remove those redundant imports.
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
According to PCIe specification, a 64-bit MMIO BAR should be
naturally aligned. In addition to being more compliant with
the specification, natural aligned BARs are mapped with
the largest possible page size by the host iommu driver, which
should speed up boot time and reduce IOTLB thrashing for virtual
machines with VFIO devices.
Signed-off-by: Thomas Barrett <tbarrett@crusoeenergy.com>
For all VFIO devices, map all non-emulated MMIO regions to
the vfio container to allow PCIe P2P between all VFIO devices
on the virtual machine. This is required for a wide variety of
advanced GPU workloads such as GPUDirect P2P (DMA between two
GPUs), GPUDirect RDMA (DMA between a GPU and an IB device).
Signed-off-by: Thomas Barrett <tbarrett@crusoeenergy.com>
This fixes all typos found by the typos utility with respect to the config file.
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
The fixup_msix_region() function added in
a718716831
made the assumption that MSI-X was always available. This is the case
with many VFIO devices and all our virtio devices but created regression
with MSI devices.
Simply return the existing region size if MSI-X is not supported by the
device.
Fixes: #5649
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
Split interrupt source group restore into two steps, first restore
the irqfd for each interrupt source entry, and second restore the
GSI routing of the entire interrupt source group.
This patch will reduce restore latency of interrupt source group,
and in a 200-concurrent restore test, the patch reduced the
average IOAPIC restore time from 15ms to 1ms.
Signed-off-by: Yong He <alexyonghe@tencent.com>
According the std docs implementing From<..> is preferred since it
gives you Into<..> for free where the reverse isn’t true.
Signed-off-by: Ravi kumar Veeramally <ravikumar.veeramally@intel.com>
Currently, vfio device fails to initialize as the msix-cap region in BAR
is mapped as RW region.
To resolve the initialization issue, this commit avoids mapping the
msix-cap region in the BAR. However, this solution introduces another
problem where aligning the msix table offset in the BAR to the page
size may cause overlap with the MMIO RW region, leading to reduced
performance. By enlarging the entire region in the BAR and relocating
the msix table to achieve page size alignment, this problem can be
overcomed effectively.
Fixes: #5292
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
In current implementation, memory region used in vfio is assumed to
align to 4k which may cause error when the PAGE_SIZE is not 4k, like on
Arm, it can be 16k and 64k.
Remove this assumption and align memory resource used by vfio to
PAGE_SIZE then vfio can run on host with 64k PAGE_SIZE.
Fixes: #5292
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
The information about the identifier related to a Snapshot is only
relevant from the BTreeMap perspective, which is why we can get rid of
the duplicated identifier in every Snapshot structure.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
There's no reason to carry a HashMap of SnapshotDataSection per
Snapshot. And given we now provide at most one SnapshotDataSection per
Snapshot, there's no need to keep the id part of the SnapshotDataSection
structure.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Now that VirtioPciDevice, VfioPciDevice and VfioUserPciDevice have all
been moved to the new restore design, there's no need to keep the old
way around, therefore the restore() implementations for MsiConfig,
MsixConfig and PciConfiguration can be removed.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This is some preliminatory work for moving both VfioUser and Vfio to the
new restore design.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The code for restoring a VirtioPciDevice has been updated, including the
dependencies VirtioPciCommonConfig, MsixConfig and PciConfiguration.
It's important to note that both PciConfiguration and MsixConfig still
have restore() implementations because Vfio and VfioUser devices still
rely on the old way for restore.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
When restoring a VM, the BAR type can be found directly from the
snapshot resources. It is more reliable than the previous method which
was using self.use_64bit_bar from VirtioPciDevice because at the time
the BARs are allocated, the VirtioDevice hasn't been restored yet,
meaning the way to determine the value of use_64bit_bar is wrong for a
device like vDPA. At this time, the device type is not known and relying
on the stored resources is the only reliable way.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
With an updated Linux kernel the kernel reprograms the same table entry
in the MSI-X table multiple times. Fortunately it does so with entry
masked (the default.)
The costly part of MSI-X table reprogramming is going out to the host
kernel to update the GSI routing entries. This change makes three
optimisations.
1. If the table entry is unchanged: skip handking updates
2. Only update the GSI routing table if the table entry is unmasked
(this skips extra calls to the ioctl() for reprogramming the
entries.)
3. Only generate a message on entry unmasking if the global MSI-X enable
bit is set.
Fixes: #4273
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
There are PCI extended capabilities that can't be passed through the VM
as they would be unusable from a guest perspective. That's why we
introduce a way to patch what is returned to the guest when the PCI
configuration space is accessed. The list of patches is created from the
parsing of the extended capabilities in that case, and particularly
based on the presence of the SRIOV and Resizable BAR capabilities.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Signed-off-by: Steven Dake <sdake@lambdal.com>
If the BAR for the VFIO device is marked as prefetchable on the
underlying device ensure that the BAR exposed through PciConfiguration
is also marked as prefetchable.
Fixes problem where NVIDIA devices are not usable with PCI VFIO
passthrough. See related NVIDIA kernel driver bug:
https://github.com/NVIDIA/open-gpu-kernel-modules/issues/344.
Fixes: #4451
Signed-off-by: Steven Dake <sdake@lambdal.com>
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
warning: you are deriving `PartialEq` and can implement `Eq`
--> vmm/src/serial_manager.rs:59:30
|
59 | #[derive(Debug, Clone, Copy, PartialEq)]
| ^^^^^^^^^ help: consider deriving `Eq` as well: `PartialEq, Eq`
|
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#derive_partial_eq_without_eq
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
When restoring a VM, the restore codepath will take care of mapping the
MMIO regions based on the information from the snapshot, rather than
having the mapping being performed during device creation.
When the device is created, information such as which BARs contain the
MSI-X tables are missing, preventing to perform the mapping of the MMIO
regions.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Based on the VfioCommon implementation, the VfioUserPciDevice now
implements the Migratable trait.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>