Commit Graph

32 Commits

Author SHA1 Message Date
Rob Bradford
0eb78ab177 vmm: Extract PCI related state from DeviceManager
Move the PCI related state from the DeviceManager struct to a PciSegment
struct inside the DeviceManager. This is in preparation for multiple
segment support. Currently this state is just the bus itself, the MMIO
and PIO config devices and hotplug related state.

The main change that this required is using the Arc<Mutex<PciBus>> in
the device addition logic in order to ensure that
the bus could be created earlier.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-10-05 10:54:07 +01:00
Rob Bradford
0faa7afac2 vmm: Add fast path for PCI config IO port
Looking up devices on the port I/O bus is time consuming during the
boot at there is an O(lg n) tree lookup and the overhead from taking a
lock on the bus contents.

Avoid this by adding a fast path uses the hardcoded port address and
size and directs PCI config requests directly to the device.

Command line:
target/release/cloud-hypervisor --kernel ~/src/linux/vmlinux --cmdline "root=/dev/vda1 console=ttyS0" --serial tty --console off --disk path=~/workloads/focal-server-cloudimg-amd64-custom-20210609-0.raw --api-socket /tmp/api

PIO exit: 17913
PCI fast path: 17871
Percentage on fast path: 99.8%

perf before:

marvin:~/src/cloud-hypervisor (main *)$ perf report -g | grep resolve
     6.20%     6.20%  vcpu0            cloud-hypervisor    [.] vm_device:🚌:Bus::resolve

perf after:

marvin:~/src/cloud-hypervisor (2021-09-17-ioapic-fast-path *)$ perf report -g | grep resolve
     0.08%     0.08%  vcpu0            cloud-hypervisor    [.] vm_device:🚌:Bus::resolve

The compromise required to implement this fast path is bringing the
creation of the PciConfigIo device into the DeviceManager::new() so that
it can be used in the VmmOps struct which is created before
DeviceManager::create_devices() is called.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-17 17:09:45 +01:00
Rob Bradford
827229d8e4 pci: Address Rust 1.51.0 clippy issue (upper_case_acroynms)
warning: name `IORegion` contains a capitalized acronym
   --> pci/src/configuration.rs:320:5
    |
320 |     IORegion = 0x01,
    |     ^^^^^^^^ help: consider making the acronym lowercase, except the initial letter (notice the capitalization): `IoRegion`
    |
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#upper_case_acronyms

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-03-26 11:32:09 +00:00
Rob Bradford
7cc729c7d9 pci, virtio-devices: Extend barrier returning through PCI code
We need to be able to return the barrier from the code that prepares to
activate the virtio device. This triggered by a write to the
configuration fields stored in the PCI BAR. Since bars can be accessed
by both memory mapping and through PCI config I/O several prototypes
must be changed.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-12-17 11:23:53 +00:00
Rob Bradford
1fc6d50f3e misc: Make Bus::write() return an Option<Arc<Barrier>>
This can be uses to indicate to the caller that it should wait on the
barrier before returning as there is some asynchronous activity
triggered by the write which requires the KVM exit to block until it's
completed.

This is useful for having vCPU thread wait for the VMM thread to proceed
to activate the virtio devices.

See #1863

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-12-17 11:23:53 +00:00
Rob Bradford
15025d71b1 devices, vm-device: Move BusDevice and Bus into vm-device
This removes the dependency of the pci crate on the devices crate which
now only contains the device implementations themselves.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-09-10 09:35:38 +01:00
Michael Zhao
17057a0dd9 vmm: Fix build errors with "pci" feature on AArch64
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2020-07-14 14:34:54 +01:00
Rob Bradford
c31ad72ee9 build: Address issues found by 1.43.0 clippy
These are mostly due to use of "bare use" statements and unnecessary vector
creation.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-05-27 19:32:12 +02:00
Sebastien Boeuf
1e0ebb760f pci: Allow specific PCI b/d/f to be reserved
In order to let the PciBus user choose where a device should be placed
on the bus, a new function get_device_id() is introduced. This will be
helpful in the context of snapshot/restore as the caller will be able to
place the PCI devices on the same slot they were placed before the
snapshot was taken.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-05-12 17:37:31 +01:00
Rob Bradford
9bd5ec8967 pci, vfio, vm-virtio: Specify a PCI revision ID of 1 for virtio-pci
Add support for specifying the PCI revision in the PCI configuration and
populate this with the value of 1 for virtio-pci devices.

The virtio-pci specification is slightly ambiguous only saying that
transitional (i.e. devices that support legacy and virtio 1.0) should
set this to 0. In practice it seems that software expects the revision
to be set to 1 for modern only devices.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-17 13:46:48 +02:00
Rob Bradford
56207a0328 pci: Print out details of the BAR moving upon error
In particular include the old and new bases as well as the length.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-15 12:02:19 +02:00
Sebastien Boeuf
8d785bbd5f pci: Fix the PciBus using HashMap instead of Vec
By using a Vec to hold the list of devices on the PciBus, there's a
problem when we use unplug. Indeed, the vector of devices gets reduced
and if the unplugged device was not the last one from the list, every
other device after this one is shifted on the bus.

To solve this problem, a HashMap is used. This allows to keep track of
the exact place where each device stands on the bus.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-13 10:54:34 +01:00
Sebastien Boeuf
b50cbe5064 pci: Give PCI device ID back when removing a device
Upon removal of a PCI device, make sure we don't hold onto the device ID
as it could be reused for another device later.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-10 17:05:06 +00:00
Sebastien Boeuf
df71aaee3f pci: Make the device ID allocation smarter
In order to handle the case where devices are very often plugged and
unplugged from a VM, we need to handle the PCI device ID allocation
better.

Any PCI device could be removed, which means we cannot simply rely on
the vector size to give the next available PCI device ID.

That's why this patch stores in memory the information about the 32
slots availability. Based on this information, whenever a new slot is
needed, the code can correctly provide an available ID, or simply return
an error because all slots are taken.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-10 17:05:06 +00:00
Sebastien Boeuf
f8e2008e0e pci: Add a function to remove a PciDevice from the bus
Simple function relying on the retain() method from std::Vec, allowing
to remove every occurence of the same device.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-10 17:05:06 +00:00
Sebastien Boeuf
49268bff3b pci: Remove all Weak references from PciBus
Now that the BusDevice devices are stored as Weak references by the IO
and MMIO buses, there's no need to use Weak references from the PciBus
anymore.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-04 18:46:44 +01:00
Qiu Wenbo
f6b9445be7 pci: fix pci MMCONFIG address parsing
We should not assume the offset produced by ECAM is identical to the
CONFIG_ADDRESS register of legacy PCI port io enumeration.

Signed-off-by: Qiu Wenbo <qiuwenbo@phytium.com.cn>
2020-02-24 17:05:09 +01:00
Sebastien Boeuf
db9f9b7820 pci: Make self mutable when reading from PCI config space
In order to anticipate the need to support more features related to the
access of a device's PCI config space, this commits changes the self
reference in the function read_config_register() to be mutable.

This also brings some more flexibility for any implementation of the
PciDevice trait.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-30 09:25:52 +01:00
Sebastien Boeuf
c7cabc88b4 vmm: Conditionally update ioeventfds for virtio PCI device
The specific part of PCI BAR reprogramming that happens for a virtio PCI
device is the update of the ioeventfds addresses KVM should listen to.
This should not be triggered for every BAR reprogramming associated with
the virtio device since a virtio PCI device might have multiple BARs.

The update of the ioeventfds addresses should only happen when the BAR
related to those addresses is being moved.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-31 09:30:59 +01:00
Sebastien Boeuf
de21c9ba4f pci: Remove ioeventfds() from PciDevice trait
The PciDevice trait is supposed to describe only functions related to
PCI. The specific method ioeventfds() has nothing to do with PCI, but
instead would be more specific to virtio transport devices.

This commit removes the ioeventfds() method from the PciDevice trait,
adding some convenient helper as_any() to retrieve the Any trait from
the structure behing the PciDevice trait. This is the only way to keep
calling into ioeventfds() function from VirtioPciDevice, so that we can
still properly reprogram the PCI BAR.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-31 09:30:59 +01:00
Sebastien Boeuf
d6c68e4738 pci: Add error propagation to PCI BAR reprogramming
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-29 16:48:02 +01:00
Sebastien Boeuf
3e819ac797 pci: Use a weak reference to the AddressManager
Storing a strong reference to the AddressManager behind the
DeviceRelocation trait results in a cyclic reference count.
Use a weak reference to break that dependency.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-29 16:48:02 +01:00
Sebastien Boeuf
149b61b213 pci: Detect BAR reprogramming
Based on the value being written to the BAR, the implementation can
now detect if the BAR is being moved to another address. If that is the
case, it invokes move_bar() function from the DeviceRelocation trait.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-29 16:48:02 +01:00
Sebastien Boeuf
04a449d3f3 pci: Pass DeviceRelocation to PciBus
In order to trigger the PCI BAR reprogramming from PciConfigIo and
PciConfigMmmio, we need the PciBus to have a hold onto the trait
implementation of DeviceRelocation.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-29 16:48:02 +01:00
Sebastien Boeuf
1870eb4295 devices: Lock the BtreeMap inside to avoid deadlocks
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-29 16:48:02 +01:00
Sebastien Boeuf
b918220b49 vmm: Support virtio-pci devices attached to a virtual IOMMU
This commit is the glue between the virtio-pci devices attached to the
vIOMMU, and the IORT ACPI table exposing them to the guest as sitting
behind this vIOMMU.

An important thing is the trait implementation provided to the virtio
vrings for each device attached to the vIOMMU, as they need to perform
proper address translation before they can access the buffers.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-07 10:12:07 +02:00
Rob Bradford
833a3d456c pci, vmm: Expose the PCI bus for configuration via MMIO
Refactor the PCI datastructures to move the device ownership to a PciBus
struct. This PciBus struct can then be used by both a PciConfigIo and
PciConfigMmio in order to expose the configuration space via both IO
port and also via MMIO for PCI MMCONFIG.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-30 18:00:31 +01:00
Sebastien Boeuf
658c076eb2 linters: Fix clippy issues
Latest clippy version complains about our existing code for the
following reasons:

- trait objects without an explicit `dyn` are deprecated
- `...` range patterns are deprecated
- lint `clippy::const_static_lifetime` has been renamed to
  `clippy::redundant_static_lifetimes`
- unnecessary `unsafe` block
- unneeded return statement

All these issues have been fixed through this patch, and rustfmt has
been run to cleanup potential formatting errors due to those changes.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-15 09:10:04 -07:00
Sebastien Boeuf
b6ae2ccda4 pci: Disable multiple functions
Every PCI device is exposed as a separate device, on a specific PCI
slot, and we explicitely don't support to expose a device as a multi
function device. For this reason, this patch makes sure the enumeration
of the PCI bus will not find some multi function device.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-07-31 09:28:29 +02:00
Sebastien Boeuf
1268165040 pci: Allow for registering IO and Memory BAR
This patch adds the support for both IO and Memory BARs by expecting
the function allocate_bars() to identify the type of each BAR.
Based on the type, register_mapping() insert the address range on the
appropriate bus (PIO or MMIO).

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-07-22 09:50:10 -07:00
Samuel Ortiz
2b2c31d206 pci: Use device PCI header type for our root bridge
The Bridge header type is really meant to be for PCI-to-PCI bridges.

Fixes: #85

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-07-10 09:53:02 +02:00
Jing Liu
2bb0b22cc1 pci: Refine pci topology
PciConfigIo is a legacy pci bus dispatcher, which manages all pci
devices including a pci root bridge. However, it is unnecessary to
design a complex hierarchy which redirects every access by PciRoot.

Since pci root bridge is also a pci device instance, and only contains
easy config space read/write, and PciConfigIo actually acts as a pci bus
to dispatch resource based resolving when VMExit, we re-arrange to make
the pci hierarchy clean.

Signed-off-by: Jing Liu <jing2.liu@linux.intel.com>
2019-07-09 10:01:18 +02:00