The sendmsg() syscall is limited in the number of fds it can handle.
This number matches that used by the vfio-user library and is
conservative (since we've seen it work with 64 fds.)
Fixes: #3401
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
(cherry picked from commit 7444c3a0c5)
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Because anyhow version 1.0.46 has been yanked, let's move back to the
previous version 1.0.45.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Now that vfio-ioctls correctly exposes the list of capabilities related
to each region, Cloud Hypervisor can decide to mmap a region based on
the presence or absence of MSIX_MAPPABLE. Instead of blindly mmap'ing
the region, we check if the MSI-X table or PBA is present on the BAR,
and if that's the case, we look for MSIX_MAPPABLE.
If MSIX_MAPPABLE is present, we can go ahead and mmap the entire region.
If MSIX_MAPPABLE is not present, we simply ignore the mmap'ing of this
region as it wouldn't be supported.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
In particular use the accessor for getting the device id from the bdf.
As a side effect the VIOT table is now segment aware.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This will allow making the code that handles bdf parsing simpler and
remove the need for manual shifting.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Since each segment must have a non-overlapping memory range associated
with it the device memory must be equally divided amongst all segments.
A new allocator is used for each segment to ensure that BARs are
allocated from the correct address ranges. This requires changes to
PciDevice::allocate/free_bars to take that allocator and when
reallocating BARs the correct allocator must be identified from the
ranges.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
In order to get proper performance out of hardware that might be passed
through the VM with VFIO, we decide to try mapping any region marked as
mappable, ignoring the failure and moving on to the next region. When
the mapping succeeds, we establish a list of user memory regions so that
we enable only subparts of the global mapping through the hypervisor.
This allows MSI-X table and PBA to keep trapping accesses from the guest
so that the VMM can properly emulate MSI-X.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This was causing some issues because of the use of 2 different versions
for the vm-memmory crate. We'll wait for all dependencies to be properly
resolved before we move to 0.7.0.
This reverts commit 76b6c62d07.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Move the PCI related state from the DeviceManager struct to a PciSegment
struct inside the DeviceManager. This is in preparation for multiple
segment support. Currently this state is just the bus itself, the MMIO
and PIO config devices and hotplug related state.
The main change that this required is using the Arc<Mutex<PciBus>> in
the device addition logic in order to ensure that
the bus could be created earlier.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Implement the infrastructure that lets a virtio-mem device map the guest
memory into the device. This is necessary since with virtio-mem zones
memory can be added or removed and the vfio-user device must be
informed.
Fixes: #3025
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
By moving this from the VfioUserPciDevice to DeviceManager the client
can be reused for handling DMA mapping behind an IOMMU.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Looking up devices on the port I/O bus is time consuming during the
boot at there is an O(lg n) tree lookup and the overhead from taking a
lock on the bus contents.
Avoid this by adding a fast path uses the hardcoded port address and
size and directs PCI config requests directly to the device.
Command line:
target/release/cloud-hypervisor --kernel ~/src/linux/vmlinux --cmdline "root=/dev/vda1 console=ttyS0" --serial tty --console off --disk path=~/workloads/focal-server-cloudimg-amd64-custom-20210609-0.raw --api-socket /tmp/api
PIO exit: 17913
PCI fast path: 17871
Percentage on fast path: 99.8%
perf before:
marvin:~/src/cloud-hypervisor (main *)$ perf report -g | grep resolve
6.20% 6.20% vcpu0 cloud-hypervisor [.] vm_device:🚌:Bus::resolve
perf after:
marvin:~/src/cloud-hypervisor (2021-09-17-ioapic-fast-path *)$ perf report -g | grep resolve
0.08% 0.08% vcpu0 cloud-hypervisor [.] vm_device:🚌:Bus::resolve
The compromise required to implement this fast path is bringing the
creation of the PciConfigIo device into the DeviceManager::new() so that
it can be used in the VmmOps struct which is created before
DeviceManager::create_devices() is called.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This message only occurs sporadically and so it should be included at
info!() level. Enhance the output to also include the BAR number.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The rust-vmm crates we're pulling from git have renamed their main
branches. We need to update the branch names we're giving to Cargo,
or people who don't have these dependencies cached will get errors
like this when trying to build:
error: failed to get `vm-fdt` as a dependency of package `arch v0.1.0 (/home/src/cloud-hypervisor/arch)`
Caused by:
failed to load source for dependency `vm-fdt`
Caused by:
Unable to update https://github.com/rust-vmm/vm-fdt?branch=master#031572a6
Caused by:
object not found - no match for id (031572a6edc2f566a7278f1e17088fc5308d27ab); class=Odb (9); code=NotFound (-3)
Signed-off-by: Alyssa Ross <hi@alyssa.is>
This resolves an issue with hotplug -> removal -> hotplug of a vfio-user
device as the allocator was not updated with the now unused entries.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
When mapping the region into the guest ensure that all the fields are
updated correctly as the unmap code path checks that they are set.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Taking advantage of the refactored VFIO code implement a new
VfioUserPciDevice that wraps the client for vfio-user and exposes the
BusDevice and PciDevice into the VMM.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This doesn't really affect the build as we ship a Cargo.lock with fixed
versions in. However for clarity it makes sense to use fixed versions
throughout and let dependabot update them.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
After the refactoring to split the common VFIO code out for vfio-user
there were some inconsistencies in the error handling. Correct this so
that the error is independent of the transport (hardware vs user) VFIO
and migrate to anyhow/thiserror in the process. Some unused errors from
earlier refactoring have also been removed.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Returning a reference is not possible for the vfio-user code as it is
constructed for the function.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>