Commit Graph

311 Commits

Author SHA1 Message Date
Muminul Islam
e4dee57e81 arch, pci, vmm: Initial switch to the hypervisor crate
Start moving the vmm, arch and pci crates to being hypervisor agnostic
by using the hypervisor trait and abstractions. This is not a complete
switch and there are still some remaining KVM dependencies.

Signed-off-by: Muminul Islam <muislam@microsoft.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-06-22 15:03:15 +02:00
Henry Wang
99e72be169 unit tests: Fix unit tests and docs for AArch64
Currently, not every feature of the cloud-hypervisor is enabled
on AArch64, which means that on AArch64 machines, the
`run_unit_tests.sh` needs to be tailored and some unit test cases
should be run on x86_64 only.

Also this commit fixes the typo and unifies `Arm64` and `AArch64`
in the AArch64 document.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-06-15 17:28:05 +01:00
LiYa'nan
acc234088f vfio: fix for bug as below:
cloud-hypervisor: 763.978581807s: ERROR:pci/src/vfio.rs:651 -- failed to remove all guest memory regions from iommu table

when poweroff a vm with vfio device, clh will finally remove all guest memory region from iommu table
with the method unset_dma_map, not method setup_dma_map.

Signed-off-by: LiYa'nan <oliverliyn@gmail.com>
2020-06-15 08:51:13 +02:00
Anatol Belski
abd6204d27 source: Fix file permissions
Rust sources and some data files should not be executable. The perms are
set to 644.

Signed-off-by: Anatol Belski <ab@php.net>
2020-06-10 18:47:27 +01:00
Samuel Ortiz
3336e80192 vfio: Switch to the vfio-ioctls crate ch branch
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-06-04 08:48:55 +02:00
Samuel Ortiz
d24aa72d3e vfio: Rename to vfio-ioctls
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-06-04 08:48:55 +02:00
Samuel Ortiz
53ce529875 vfio: Move the PCI implementation to the PCI crate
There is a much stronger PCI dependency from vfio_pci.rs than a VFIO one
from pci/src/vfio.rs. It seems more natural to have the PCI specific
VFIO implementation in the PCI crate rather than the other way around.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-06-04 08:48:55 +02:00
dependabot-preview[bot]
aac87196d6 build(deps): bump vm-memory from 0.2.0 to 0.2.1
Bumps [vm-memory](https://github.com/rust-vmm/vm-memory) from 0.2.0 to 0.2.1.
- [Release notes](https://github.com/rust-vmm/vm-memory/releases)
- [Changelog](https://github.com/rust-vmm/vm-memory/blob/v0.2.1/CHANGELOG.md)
- [Commits](https://github.com/rust-vmm/vm-memory/compare/v0.2.0...v0.2.1)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-05-28 17:06:48 +01:00
Rob Bradford
c31ad72ee9 build: Address issues found by 1.43.0 clippy
These are mostly due to use of "bare use" statements and unnecessary vector
creation.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-05-27 19:32:12 +02:00
dependabot-preview[bot]
a4bb96d45c build(deps): bump libc from 0.2.70 to 0.2.71
Bumps [libc](https://github.com/rust-lang/libc) from 0.2.70 to 0.2.71.
- [Release notes](https://github.com/rust-lang/libc/releases)
- [Commits](https://github.com/rust-lang/libc/compare/0.2.70...0.2.71)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-05-27 09:02:13 +02:00
dependabot-preview[bot]
2991fd2a48 build(deps): bump libc from 0.2.69 to 0.2.70
Bumps [libc](https://github.com/rust-lang/libc) from 0.2.69 to 0.2.70.
- [Release notes](https://github.com/rust-lang/libc/releases)
- [Commits](https://github.com/rust-lang/libc/compare/0.2.69...0.2.70)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-05-12 20:26:43 +02:00
Sebastien Boeuf
1e0ebb760f pci: Allow specific PCI b/d/f to be reserved
In order to let the PciBus user choose where a device should be placed
on the bus, a new function get_device_id() is introduced. This will be
helpful in the context of snapshot/restore as the caller will be able to
place the PCI devices on the same slot they were placed before the
snapshot was taken.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-05-12 17:37:31 +01:00
Sebastien Boeuf
e1701f11b1 pci: Implement Snapshottable trait for PciConfiguration
The PCI configuration from each PCI device is modified at runtime as we
can expect the guest OS to write to some PCI capability structure, or
move the BAR to a different location in the guest address space.

For all the reasons why such configuration might differ from the initial
configuration, we must store the registers values to be able to restore
them with the right values whenever a PCI device is being restored.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-05-11 11:38:16 +01:00
Sebastien Boeuf
376db31107 pci: Implement Snapshottable trait for MsixConfig
In order to restore devices relying on MSI-X, the MsixConfig structure
must be restored with the correct values. Additionally, the KVM routes
must be restored so that interrupts can be delivered through KVM the way
they were configured before the snapshot was taken.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-05-11 11:38:16 +01:00
Rob Bradford
b8cfdab8b6 pci: configuration: Use correct algorithm for BAR size reporting
When reporting the BAR size it is necessary to return a value that is
encoded such that all the bits are set that represent the mask of the
natural alignment of the field.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-17 15:20:50 +02:00
Rob Bradford
9bd5ec8967 pci, vfio, vm-virtio: Specify a PCI revision ID of 1 for virtio-pci
Add support for specifying the PCI revision in the PCI configuration and
populate this with the value of 1 for virtio-pci devices.

The virtio-pci specification is slightly ambiguous only saying that
transitional (i.e. devices that support legacy and virtio 1.0) should
set this to 0. In practice it seems that software expects the revision
to be set to 1 for modern only devices.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-17 13:46:48 +02:00
Rob Bradford
56207a0328 pci: Print out details of the BAR moving upon error
In particular include the old and new bases as well as the length.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-15 12:02:19 +02:00
dependabot-preview[bot]
886c0f9093 build(deps): bump libc from 0.2.68 to 0.2.69
Bumps [libc](https://github.com/rust-lang/libc) from 0.2.68 to 0.2.69.
- [Release notes](https://github.com/rust-lang/libc/releases)
- [Commits](https://github.com/rust-lang/libc/compare/0.2.68...0.2.69)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-04-14 09:27:04 +01:00
Rob Bradford
8acc15a63c build: Bump vm-memory and linux-loader dependencies
linux-loader depends on vm-memory so must be updated at the same time.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-23 14:27:41 +00:00
dependabot-preview[bot]
51f51ea17d build(deps): bump libc from 0.2.67 to 0.2.68
Bumps [libc](https://github.com/rust-lang/libc) from 0.2.67 to 0.2.68.
- [Release notes](https://github.com/rust-lang/libc/releases)
- [Commits](https://github.com/rust-lang/libc/compare/0.2.67...0.2.68)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-03-17 21:36:38 +00:00
Sebastien Boeuf
8d785bbd5f pci: Fix the PciBus using HashMap instead of Vec
By using a Vec to hold the list of devices on the PciBus, there's a
problem when we use unplug. Indeed, the vector of devices gets reduced
and if the unplugged device was not the last one from the list, every
other device after this one is shifted on the bus.

To solve this problem, a HashMap is used. This allows to keep track of
the exact place where each device stands on the bus.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-13 10:54:34 +01:00
Sebastien Boeuf
f3dc245c4f pci: Extend PciDevice trait with new free_bars() method
The point of this new method is to let the caller decide when the
implementation of the PciDevice should free the BARs previously
allocated through the other method allocate_bars().

This provides a way to perform proper cleanup for any PCI device.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-11 13:10:30 +00:00
Sebastien Boeuf
b50cbe5064 pci: Give PCI device ID back when removing a device
Upon removal of a PCI device, make sure we don't hold onto the device ID
as it could be reused for another device later.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-10 17:05:06 +00:00
Sebastien Boeuf
df71aaee3f pci: Make the device ID allocation smarter
In order to handle the case where devices are very often plugged and
unplugged from a VM, we need to handle the PCI device ID allocation
better.

Any PCI device could be removed, which means we cannot simply rely on
the vector size to give the next available PCI device ID.

That's why this patch stores in memory the information about the 32
slots availability. Based on this information, whenever a new slot is
needed, the code can correctly provide an available ID, or simply return
an error because all slots are taken.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-10 17:05:06 +00:00
Sebastien Boeuf
f8e2008e0e pci: Add a function to remove a PciDevice from the bus
Simple function relying on the retain() method from std::Vec, allowing
to remove every occurence of the same device.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-10 17:05:06 +00:00
Rob Bradford
f0a3e7c4a1 build: Bump linux-loader and vm-memory dependencies
linux-loader now uses the released vm-memory so we must move to that
version at the same time.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-05 11:01:30 +01:00
Sebastien Boeuf
49268bff3b pci: Remove all Weak references from PciBus
Now that the BusDevice devices are stored as Weak references by the IO
and MMIO buses, there's no need to use Weak references from the PciBus
anymore.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-04 18:46:44 +01:00
Sebastien Boeuf
b77fdeba2d msi/msi-x: Prevent from losing masked interrupts
We want to prevent from losing interrupts while they are masked. The
way they can be lost is due to the internals of how they are connected
through KVM. An eventfd is registered to a specific GSI, and then a
route is associated with this same GSI.

The current code adds/removes a route whenever a mask/unmask action
happens. Problem with this approach, KVM will consume the eventfd but
it won't be able to find an associated route and eventually it won't
be able to deliver the interrupt.

That's why this patch introduces a different way of masking/unmasking
the interrupts, simply by registering/unregistering the eventfd with the
GSI. This way, when the vector is masked, the eventfd is going to be
written but nothing will happen because KVM won't consume the event.
Whenever the unmask happens, the eventfd will be registered with a
specific GSI, and if there's some pending events, KVM will trigger them,
based on the route associated with the GSI.

Suggested-by: Liu Jiang <gerry@linux.alibaba.com>
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-02-25 08:31:14 +00:00
Qiu Wenbo
4cf89d373d pci: handle extended configuration space properly
This is critical to support extended capabilitiy list.

Signed-off-by: Qiu Wenbo <qiuwenbo@phytium.com.cn>
2020-02-24 17:05:09 +01:00
Qiu Wenbo
f6b9445be7 pci: fix pci MMCONFIG address parsing
We should not assume the offset produced by ECAM is identical to the
CONFIG_ADDRESS register of legacy PCI port io enumeration.

Signed-off-by: Qiu Wenbo <qiuwenbo@phytium.com.cn>
2020-02-24 17:05:09 +01:00
dependabot-preview[bot]
f190cb05b5 build(deps): bump libc from 0.2.66 to 0.2.67
Bumps [libc](https://github.com/rust-lang/libc) from 0.2.66 to 0.2.67.
- [Release notes](https://github.com/rust-lang/libc/releases)
- [Commits](https://github.com/rust-lang/libc/compare/0.2.66...0.2.67)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-02-21 08:03:30 +00:00
dependabot-preview[bot]
d46c61c5d4 build(deps): bump byteorder from 1.3.2 to 1.3.4
Bumps [byteorder](https://github.com/BurntSushi/byteorder) from 1.3.2 to 1.3.4.
- [Release notes](https://github.com/BurntSushi/byteorder/releases)
- [Changelog](https://github.com/BurntSushi/byteorder/blob/master/CHANGELOG.md)
- [Commits](https://github.com/BurntSushi/byteorder/compare/1.3.2...1.3.4)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-02-07 14:18:07 +00:00
Sebastien Boeuf
db9f9b7820 pci: Make self mutable when reading from PCI config space
In order to anticipate the need to support more features related to the
access of a device's PCI config space, this commits changes the self
reference in the function read_config_register() to be mutable.

This also brings some more flexibility for any implementation of the
PciDevice trait.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-30 09:25:52 +01:00
Sebastien Boeuf
e91638e6c5 pci: Cleanup the crate from unneeded types
Both InterruptDelivery and InterruptParameters can be removed from the
pci crate as they are not used anymore.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-21 10:44:48 +01:00
Sebastien Boeuf
99f39291fd pci: Simplify PciDevice trait
There's no need for assign_irq() or assign_msix() functions from the
PciDevice trait, as we can see it's never used anywhere in the codebase.
That's why it's better to remove these methods from the trait, and
slightly adapt the existing code.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-21 10:44:48 +01:00
Sebastien Boeuf
50a4c16d34 pci: Cleanup the crate from kvm_iotcls and kvm_bindings dependencies
Now that KVM specific interrupts are handled through InterruptManager
trait implementation, the pci crate does not need to rely on kvm_ioctls
and kvm_bindings crates.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-17 23:43:45 +01:00
Sebastien Boeuf
4bb12a2d8d interrupt: Reorganize all interrupt management with InterruptManager
Based on all the previous changes, we can at this point replace the
entire interrupt management with the implementation of InterruptManager
and InterruptSourceGroup traits.

By using KvmInterruptManager from the DeviceManager, we can provide both
VirtioPciDevice and VfioPciDevice a way to pick the kind of
InterruptSourceGroup they want to create. Because they choose the type
of interrupt to be MSI/MSI-X, they will be given a MsiInterruptGroup.

Both MsixConfig and MsiConfig are responsible for the update of the GSI
routes, which is why, by passing the MsiInterruptGroup to them, they can
still perform the GSI route management without knowing implementation
details. That's where the InterruptSourceGroup is powerful, as it
provides a generic way to manage interrupt, no matter the type of
interrupt and no matter which hypervisor might be in use.

Once the full replacement has been achieved, both SystemAllocator and
KVM specific dependencies can be removed.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-17 23:43:45 +01:00
Sebastien Boeuf
c396baca46 vm-virtio: Modify VirtioInterrupt callback into a trait
Callbacks are not the most Rust idiomatic way of programming. The right
way is to use a Trait to provide multiple implementation of the same
interface.

Additionally, a Trait will allow for multiple functions to be defined
while using callbacks means that a new callback must be introduced for
each new function we want to add.

For these two reasons, the current commit modifies the existing
VirtioInterrupt callback into a Trait of the same name.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-17 23:43:45 +01:00
Sebastien Boeuf
ef7d889a79 vfio: Remove unused GSI routing functions
At this point, both MSI and MSI-X handle the KVM GSI routing update,
which means the vfio crate does not have to deal with it anymore.
Therefore, several functions can be removed from the vfio-pci code, as
they are not needed anymore.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-17 23:43:45 +01:00
Sebastien Boeuf
1a4b5ecc75 msi: Set KVM routes from MsiConfig instead of VFIO
Now that MsiConfig has access to both KVM VmFd and the list of GSI
routes, the update of the KVM GSI routes can be directly done from
MsiConfig instead of specifically from the vfio-pci implementation.

By moving the KVM GSI routes update at the MsiConfig level, any PCI
device such as vfio-pci, virtio-pci, or any other emulated PCI device
can benefit from it, without having to implement it on their own.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-17 23:43:45 +01:00
Sebastien Boeuf
f3c3870159 msi: Create MsiConfig to embed MsiCap
The same way we have MsixConfig in charge of managing whatever relates
to MSI-X vectors, we need a MsiConfig structure to manage MSI vectors.
The MsiCap structure is still needed as a low level API, but it is now
part of the MsiConfig which oversees anything related to MSI.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-17 23:43:45 +01:00
Sebastien Boeuf
1e5e02801f msix: Perform interrupt enabling/disabling
In order to factorize one step further, we let MsixConfig perform the
interrupt enabling/disabling. This is done by registering/unregistering
the KVM irq_fds of all GSI routes related to this device.

And now that MsixConfig is in charge of the irq_fds, vfio-pci must rely
on it to retrieve them and provide them to the vfio driver.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-17 23:43:45 +01:00
Sebastien Boeuf
19aeac40c9 msix: Remove the need for interrupt callback
Now that MsixConfig has access to the irq_fd descriptors associated with
each vector, it can directly write to it anytime it needs to trigger an
interrupt.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-17 23:43:45 +01:00
Sebastien Boeuf
3fe362e3bd msix: Set KVM routes from MsixConfig instead of VFIO
Now that MsixConfig has access to both KVM VmFd and the list of GSI
routes, the update of the KVM GSI routes can be directly done from
MsixConfig instead of specifically from the vfio-pci implementation.

By moving the KVM GSI routes update at the MsixConfig level, both
vfio-pci and virtio-pci (or any other emulated PCI device) can benefit
from it, without having to implement it on their own.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-17 23:43:45 +01:00
Sebastien Boeuf
2381f32ae0 msix: Add gsi_msi_routes to MsixConfig
Because MsixConfig will be responsible for updating KVM GSI routes at
some point, it is necessary that it can access the list of routes
contained by gsi_msi_routes.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-17 23:43:45 +01:00
Sebastien Boeuf
9b60fcdc39 msix: Add VmFd to MsixConfig
Because MsixConfig will be responsible for updating the KVM GSI routes
at some point, it must have access to the VmFd to invoke the KVM ioctl
KVM_SET_GSI_ROUTING.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-17 23:43:45 +01:00
Sebastien Boeuf
86c760a0d9 msix: Add SystemAllocator to MsixConfig
The point here is to let MsixConfig take care of the GSI allocation,
which means the SystemAllocator must be passed from the vmm crate all
the way down to the pci crate.

Once this is done, the GSI allocation and irq_fd creation is performed
by MsixConfig directly.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-17 23:43:45 +01:00
Sebastien Boeuf
f77d2c2d16 pci: Add some KVM and interrupt utilities to the crate
In order to anticipate the need for both msi.rs and msix.rs to rely on
some KVM utils and InterruptRoute structure to handle the update of the
KVM GSI routes, this commit adds these utilities directly to the pci
crate. So far, these were exclusively used by the vfio crate, which is
why there were located there.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-17 23:43:45 +01:00
Sebastien Boeuf
c2ae380503 pci: Refine detection of BAR reprogramming
The current code was always considering 0xffffffff being written to the
register as a sign it was expecting to get the size, hence the BAR
reprogramming detection was stating this case was not a reprogramming
case.

Problem is, if the value 0xffffffff is directed at a 64bits BAR, this
might be the high or low part of a 64bits address which is meant to be
the new address of the BAR, which means we would miss the detection of
the BAR being reprogrammed here.

This commit improves the code using finer granularity checks in order to
detect this corner case correctly.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-09 08:05:35 +01:00
Sebastien Boeuf
07bad79fd7 pci: Fix detection of expansion ROM BAR reprogramming
The expansion ROM BAR reprogramming was being triggered for the wrong
reason and was causing the following error to be reported:

ERROR:pci/src/bus.rs:207 -- Failed moving device BAR: failed allocating
new 32 bits MMIO range

Following the PCI specification, here is what is defined:

Device independent configuration software can determine how much address
space the device requires by writing a value of all 1's to the address
portion of the register and then reading the value back.

This means we cannot expect 0xffffffff to be written, as the address
portion corresponds to the bits 31-11. That's why whenever the size of
this special BAR is being asked for, the value being written is
0xfffff800.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-09 08:05:35 +01:00
Rob Bradford
c61104df47 vmm: Port to latest vmm-sys-util
The signal handling for vCPU signals has changed in the latest release
so switch to the new API.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-12-11 14:11:11 +00:00
Sebastien Boeuf
1379abb94b pci: msi: Fix MSG_CTL update through 32 bits write
If the MSG_CTL is being written from a 32 bits write access, the offset
won't be 0x2, but 0x0 instead. That's simply because 32 bits access have
to be aligned on each double word.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-12-04 08:48:17 +01:00
Samuel Ortiz
0f21781fbe cargo: Bump the kvm and vmm-sys-util crates
Since the kvm crates now depend on vmm-sys-util, the bump must be
atomic.
The kvm-bindings and ioctls 0.2.0 and 0.4.0 crates come with a few API
changes, one of them being the use of a kvm_ioctls specific error type.
Porting our code to that type makes for a fairly large diff stat.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-11-29 17:48:02 +00:00
Sebastien Boeuf
587a420429 cargo: Update to the latest kvm-ioctls version
We need to rely on the latest kvm-ioctls version to benefit from the
recent addition of unregister_ioevent(), allowing us to detach a
previously registered eventfd to a PIO or MMIO guest address.

Because of this update, we had to modify the current constraint we had
on the vmm-sys-util crate, using ">= 0.1.1" instead of being strictly
tied to "0.2.0".

Once the dependency conflict resolved, this commit took care of fixing
build issues caused by recent modification of kvm-ioctls relying on
EventFd reference instead of RawFd.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-31 09:30:59 +01:00
Sebastien Boeuf
c7cabc88b4 vmm: Conditionally update ioeventfds for virtio PCI device
The specific part of PCI BAR reprogramming that happens for a virtio PCI
device is the update of the ioeventfds addresses KVM should listen to.
This should not be triggered for every BAR reprogramming associated with
the virtio device since a virtio PCI device might have multiple BARs.

The update of the ioeventfds addresses should only happen when the BAR
related to those addresses is being moved.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-31 09:30:59 +01:00
Sebastien Boeuf
de21c9ba4f pci: Remove ioeventfds() from PciDevice trait
The PciDevice trait is supposed to describe only functions related to
PCI. The specific method ioeventfds() has nothing to do with PCI, but
instead would be more specific to virtio transport devices.

This commit removes the ioeventfds() method from the PciDevice trait,
adding some convenient helper as_any() to retrieve the Any trait from
the structure behing the PciDevice trait. This is the only way to keep
calling into ioeventfds() function from VirtioPciDevice, so that we can
still properly reprogram the PCI BAR.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-31 09:30:59 +01:00
Samuel Ortiz
3be95dbf93 pci: Remove KVM dependency
The PCI crate should not depend on the KVM crates.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-10-29 20:09:04 -07:00
Sebastien Boeuf
d6c68e4738 pci: Add error propagation to PCI BAR reprogramming
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-29 16:48:02 +01:00
Sebastien Boeuf
3e819ac797 pci: Use a weak reference to the AddressManager
Storing a strong reference to the AddressManager behind the
DeviceRelocation trait results in a cyclic reference count.
Use a weak reference to break that dependency.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-29 16:48:02 +01:00
Sebastien Boeuf
149b61b213 pci: Detect BAR reprogramming
Based on the value being written to the BAR, the implementation can
now detect if the BAR is being moved to another address. If that is the
case, it invokes move_bar() function from the DeviceRelocation trait.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-29 16:48:02 +01:00
Sebastien Boeuf
04a449d3f3 pci: Pass DeviceRelocation to PciBus
In order to trigger the PCI BAR reprogramming from PciConfigIo and
PciConfigMmmio, we need the PciBus to have a hold onto the trait
implementation of DeviceRelocation.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-29 16:48:02 +01:00
Sebastien Boeuf
4f8054fa82 pci: Store the type of BAR to return correct address
Based on the type of BAR, we can now provide the correct address related
to a BAR index provided by the caller.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-29 16:48:02 +01:00
Sebastien Boeuf
b51a9e1ef1 pci: Make PciBarRegionType implement PartialEq
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-29 16:48:02 +01:00
Sebastien Boeuf
1870eb4295 devices: Lock the BtreeMap inside to avoid deadlocks
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-29 16:48:02 +01:00
Sebastien Boeuf
c865f93c9b pci: Extend PciDevice trait with move_bar() function
In order to support PCI BAR reprogramming, the PciDevice trait defines a
new method move_bar() dedicated for taking appropriate action when a BAR
is moved to a different guest address.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-29 16:48:02 +01:00
Sebastien Boeuf
3e37f59933 pci: Add new DeviceRelocation trait
This new trait will allow the VMM to implement the mechanism following a
BAR modification regarding its location in the guest address space.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-29 16:48:02 +01:00
Samuel Ortiz
de9eb3e0fa Bump vmm-sys-utils to 0.2.0
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-10-23 11:35:11 +03:00
Samuel Ortiz
14eb071b29 Cargo: Move to crates.io vmm-sys-util
Use the newly published 0.1.1 version.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-10-08 07:28:53 -07:00
Sebastien Boeuf
b918220b49 vmm: Support virtio-pci devices attached to a virtual IOMMU
This commit is the glue between the virtio-pci devices attached to the
vIOMMU, and the IORT ACPI table exposing them to the guest as sitting
behind this vIOMMU.

An important thing is the trait implementation provided to the virtio
vrings for each device attached to the vIOMMU, as they need to perform
proper address translation before they can access the buffers.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-07 10:12:07 +02:00
Rob Bradford
833a3d456c pci, vmm: Expose the PCI bus for configuration via MMIO
Refactor the PCI datastructures to move the device ownership to a PciBus
struct. This PciBus struct can then be used by both a PciConfigIo and
PciConfigMmio in order to expose the configuration space via both IO
port and also via MMIO for PCI MMCONFIG.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-30 18:00:31 +01:00
Sebastien Boeuf
658c076eb2 linters: Fix clippy issues
Latest clippy version complains about our existing code for the
following reasons:

- trait objects without an explicit `dyn` are deprecated
- `...` range patterns are deprecated
- lint `clippy::const_static_lifetime` has been renamed to
  `clippy::redundant_static_lifetimes`
- unnecessary `unsafe` block
- unneeded return statement

All these issues have been fixed through this patch, and rustfmt has
been run to cleanup potential formatting errors due to those changes.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-15 09:10:04 -07:00
Samuel Ortiz
76e3a30c31 pci: Simplify PciDevice trait
We do not use the on_device_sandboxed() and
register_device_capabilities() methods.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-08-14 18:13:44 +02:00
Sebastien Boeuf
c0e2bbb23f pci: Add MSI-X helper to check if interrupts are enabled
In order to check if device's interrupts are enabled, this patch adds
a helper function to the MsixConfig structure so that at any point in
time we can check if an interrupt should be delivered or not.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-08 17:38:47 +01:00
Sebastien Boeuf
87195c9ccc pci: Fix vector control read/write from/to MSI-X table
The vector control offset is at the 4th byte of each MSI-X table entry.
For that reason, it is located at 0xc, and not 0x10 as implemented.

This commit fixes the current MSI-X code, allowing proper reading and
writing of each vector control register in the MSI-X table.

Fixes #156

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-08 08:10:18 +01:00
Sebastien Boeuf
846505d360 pci: Fix add_capability unit test
The structure provided to the add_capability() function should only
contain what comes after the capability ID and the next capability
pointer, which are located on the first WORD.

Because the structure TestCap included _vndr and _next fields, they
were directly set after the first WORD, while the assertion was
expecting to find the values of len and foo fields.

Fixes #105

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-03 08:43:44 +01:00
Rob Bradford
9caad7394d build, misc: Bump vmm-sys-util dependency
The structure of the vmm-sys-util crate has changed with lots of code
moving to submodules.

This change adjusts the use of the imported structs to reference the
submodules.

Fixes: #145

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-08-02 07:42:20 -07:00
Rob Bradford
ac950d9a97 build: Bulk update dependencies
Update all dependencies with "cargo upgrade" with the exception of
vmm-sys-utils which needs some extra porting work.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-08-02 15:22:37 +02:00
Sebastien Boeuf
a548a01423 pci: Fix MSI-X table and PBA offsets
The offsets returned by the table_offset() and pba_offset() function
were wrong as they were shifting the value by 3 bits. The MSI-X spec
defines the MSI-X table and PBA offsets as being defined on 3-31 bits,
but this does not mean it has to be shifted. Instead, the address is
still on 32 bits and assumes the LSB bits 0-2 are set to 0.

VFIO was working fine with devices were the MSI-X offset were 0x0, but
the bug was found on a device where the offset was non-null.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-02 09:45:20 +02:00
Sebastien Boeuf
d217089b54 pci: Add support for expansion ROM BAR
The expansion ROM BAR can be considered like a 32-bit memory BAR with a
slight difference regarding the amount of reserved bits at the beginning
of its 32-bit value. Bit 0 indicates if the BAR is enabled or disabled,
while bits 1-10 are reserved. The remaining upper 21 bits hold the BAR
address.

This commit extends the pci crate in order to support expansion ROM BAR.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-07-31 09:28:29 +02:00
Sebastien Boeuf
b6ae2ccda4 pci: Disable multiple functions
Every PCI device is exposed as a separate device, on a specific PCI
slot, and we explicitely don't support to expose a device as a multi
function device. For this reason, this patch makes sure the enumeration
of the PCI bus will not find some multi function device.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-07-31 09:28:29 +02:00
Sebastien Boeuf
927861ced2 pci: Fix end of address space check
The check performed on the end address was wrong since the end address
was actually the address right after the end. To get the right end
address, we need to add (region size - 1) to the start address.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-07-25 11:45:38 +01:00
Sebastien Boeuf
1268165040 pci: Allow for registering IO and Memory BAR
This patch adds the support for both IO and Memory BARs by expecting
the function allocate_bars() to identify the type of each BAR.
Based on the type, register_mapping() insert the address range on the
appropriate bus (PIO or MMIO).

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-07-22 09:50:10 -07:00
Sebastien Boeuf
b157181656 pci: Fix the way PCI configuration registers are being written
The way the function write_reg() was implemented, it was not keeping
the bits supposed to be read-only whenever the guest was writing to one
of those. That's why this commit takes care of protecting those bits,
preventing them from being updated.

The tricky part is about the BARs since we also need to handle the very
specific case where the BAR is being written with all 1's. In that case
we want to return the size of the BAR instead of its address.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-07-22 09:50:10 -07:00
Sebastien Boeuf
185b1082fb pci: Add a helper to set the BAR type
A BAR can be three different types: IO, 32 bits Memory, or 64 bits
Memory. The VMM needs a way to set the right type depending on its
needs.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-07-22 09:50:10 -07:00
Sebastien Boeuf
ee39e46568 pci: Add MSI capability structure
In order to support use cases that require MSI, the pci crate is
being expanded with the description of an MSI PCI capability
structure through the new MsiCap Rust structure.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-07-22 09:50:10 -07:00
Sebastien Boeuf
72007f016a pci: Improve MSI-X code to let VFIO rely on it
This commit enhances the current msi-x code hosted in the pci crate
in order to be reused by the vfio crate. Specifically, it creates
several useful methods for the MsixCap structure that can simplify
the caller's code.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-07-22 09:50:10 -07:00
Samuel Ortiz
29878956bd pci: Implement the From trait for the PciCapabilityID structure
This will be needed by the VFIO crate for managing MSI capabilities.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-07-22 09:50:10 -07:00
Rob Bradford
9a17871630 pci: Make unit tests compile
Another member was added to the configuration struct.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-07-16 17:09:05 +02:00
Rob Bradford
74d079f7da pci: Mark add_capability test as #[ignore] as it is currently failing
See #105

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-07-16 17:09:05 +02:00
Samuel Ortiz
2b2c31d206 pci: Use device PCI header type for our root bridge
The Bridge header type is really meant to be for PCI-to-PCI bridges.

Fixes: #85

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-07-10 09:53:02 +02:00
Jing Liu
2bb0b22cc1 pci: Refine pci topology
PciConfigIo is a legacy pci bus dispatcher, which manages all pci
devices including a pci root bridge. However, it is unnecessary to
design a complex hierarchy which redirects every access by PciRoot.

Since pci root bridge is also a pci device instance, and only contains
easy config space read/write, and PciConfigIo actually acts as a pci bus
to dispatch resource based resolving when VMExit, we re-arrange to make
the pci hierarchy clean.

Signed-off-by: Jing Liu <jing2.liu@linux.intel.com>
2019-07-09 10:01:18 +02:00
Samuel Ortiz
4605ecf1a8 pci: Extend the Device trait to carry the device BARs
When reading from or writing to a PCI BAR to handle a VM exit, we need
to have the BAR address itself to be able to support multiple BARs PCI
devices.

Fixes: #87

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-07-08 07:39:21 +02:00
Samuel Ortiz
8173e1ccd7 devices: Extend the Bus trait to carry the device range base
With the range base for the IO/MMIO vm exit address, a device with
multiple ranges has all the needed information for resolving which of
its range the exit is coming from

Fixes: #87

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-07-08 07:39:21 +02:00
Samuel Ortiz
0b7fb42a6c pci: Export network and mass storage sub classes
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-07-02 17:37:12 +02:00
Jing Liu
9da2343cb7 device: Improvement for BusDevice trait and PciDevice trait
BusDevice includes two methods which are only for PCI devices, which should
be as members of PciDevice trait for a better clean high level APIs.

Signed-off-by: Jing Liu <jing2.liu@linux.intel.com>
2019-06-25 06:17:30 -07:00
Sebastien Boeuf
4d98dcb077 msix: Handle MSI-X device masking
As mentioned in the PCI specification, the Function Mask from the
Message Control Register can be set to prevent a device from injecting
MSI-X messages. This supersedes the vector masking as it interacts at
the device level.

Here quoted from the specification:
For MSI and MSI-X, while a vector is masked, the function is prohibited
from sending the associated message, and the function must set the
associated Pending bit whenever the function would otherwise send the
message. When software unmasks a vector whose associated Pending bit is
set, the function must schedule sending the associated message, and
clear the Pending bit as soon as the message has been sent. Note that
clearing the MSI-X Function Mask bit may result in many messages
needing to be sent.

This commit implements the behavior described above by reorganizing
the way the PCI configuration space is being written. It is indeed
important to be able to catch a change in the Message Control
Register without having to implement it for every PciDevice
implementation. Instead, the PciConfiguration has been modified to
take care of handling any update made to this register.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-06-07 13:33:53 +01:00
Sebastien Boeuf
d810c7712d msix: Handle MSI-X vector masking
The current MSI-X implementation completely ignores the values found
in the Vector Control register related to a specific vector, and never
updates the Pending Bit Array.

According to the PCI specification, MSI-X vectors can be masked
through the Vector Control register on bit 0. If this bit is set,
the device should not inject any MSI message. When the device
runs into such situation, it must not inject the interrupt, but
instead it must update the bit corresponding to the vector number
in the Pending Bit Array.

Later on, if/when the Vector Control register is updated, and if
the bit 0 is flipped from 0 to 1, the device must look into the PBA
to find out if there was a pending interrupt for this specific
vector. If that's the case, an MSI message is injected and the
bit from the PBA is cleared.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-06-07 13:33:53 +01:00
Sebastien Boeuf
edd1279609 pci: Allow QWORD read and write to MSI-X table
As mentioned in the PCI specification, MSI-X table supports both
DWORD and QWORD accesses:

For all accesses to MSI-X Table and MSI-X PBA fields, software must
use aligned full DWORD or aligned full QWORD transactions; otherwise,
the result is undefined.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-06-07 13:33:53 +01:00
Sebastien Boeuf
00cdbbc673 pci: Make MSI-X PBA read only
Relying on the PCI specification, the Pending Bit Array is read only.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-06-07 13:33:53 +01:00
Sebastien Boeuf
47a4065aaf interrupt: Use a single closure to describe pin based and MSI-X
In order to factorize the complexity brought by closures, this commit
merges IrqClosure and MsixClosure into a generic InterruptDelivery one.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-06-06 15:27:35 +01:00
Sebastien Boeuf
8df05b72dc vmm: Add MSI-X support to virtio-pci devices
In order to allow virtio-pci devices to use MSI-X messages instead
of legacy pin based interrupts, this patch implements the MSI-X
support for cloud-hypervisor. The VMM code and virtio-pci bits have
been modified based on the "msix" module previously added to the pci
crate.

Fixes #12

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-06-06 15:27:35 +01:00
Sebastien Boeuf
13a065d2cd dep: Rely on latest kvm-ioctls crate
In order to have access to the newly added signal_msi() function
from the kvm-ioctls crate, this commit updates the version of the
kvm-ioctls to the latest one.

Because set_user_memory_region() has been swtiched to "unsafe", we
also need to handle this small change in our cloud-hypervisor code
directly.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-06-06 15:27:35 +01:00
Sebastien Boeuf
4b53dc4921 pci: Add MSI-X implementation
In order to support MSI-X, this commit adds to the pci crate a new
module called "msix". This module brings all the necessary pieces
to let any PCI device implement MSI-X support.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-06-06 15:27:35 +01:00
Sebastien Boeuf
d3c7b45542 interrupt: Make IRQ delivery generic
Because we cannot always assume the irq fd will be the way to send
an IRQ to the guest, this means we cannot make the assumption that
every virtio device implementation should expect an EventFd to
trigger an IRQ.

This commit organizes the code related to virtio devices so that it
now expects a Rust closure instead of a known EventFd. This lets the
caller decide what should be done whenever a device needs to trigger
an interrupt to the guest.

The closure will allow for other type of interrupt mechanism such as
MSI to be implemented. From the device perspective, it could be a
pin based interrupt or an MSI, it does not matter since the device
will simply call into the provided callback, passing the appropriate
Queue as a reference. This design keeps the device model generic.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-06-06 15:27:35 +01:00
Samuel Ortiz
a6b7715f4b vendor: Move to the rust-vmm vmm-sys-util package
Locked to 60fe35be but no longer dependent on liujing2 repo.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-06-04 17:51:52 +02:00
Samuel Ortiz
9299502955 cloud-hypervisor: Switch to crates.io kvm-ioctls
Fixes: #15

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-05-15 05:59:08 +01:00
Samuel Ortiz
ac328df87c cloud-hypervisor: Switch to the vmm-sys-util pending PR branch
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-05-10 16:32:39 +02:00
Rob Bradford
4b58eb4867 pci: configuration: Fix rustfmt issue
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-05-10 16:32:39 +02:00
Samuel Ortiz
040ea5432d cloud-hypervisor: Add proper licensing
Add the BSD and Apache license.
Make all crosvm references point to the BSD license.
Add the right copyrights and identifier to our VMM code.
Add Intel copyright to the vm-virtio and pci crates.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-05-09 15:44:17 +02:00
Sebastien Boeuf
b67e0b3dad vmm: Use virtio-blk to support booting from disk image
After the virtio-blk device support has been introduced in the
previous commit, the vmm need to rely on this new device to boot
from disk images instead of initrd built into the kernel.

In order to achieve the proper support of virtio-blk, this commit
had to handle a few things:

  - Register an ioevent fd for each virtqueue. This important to be
    notified from the virtio driver that something has been written
    on the queue.

  - Fix the retrieval of 64bits BAR address. This is needed to provide
    the right address which need to be registered as the notification
    address from the virtio driver.

  - Fix the write_bar and read_bar functions. They were both assuming
    to be provided with an address, from which they were trying to
    find the associated offset. But the reality is that the offset is
    directly provided by the Bus layer.

  - Register a new virtio-blk device as a virtio-pci device from the
    vm.rs code. When the VM is started, it expects a block device to
    be created, using this block device as the VM rootfs.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-05-08 08:55:09 +02:00
Samuel Ortiz
e8308dd13b pci: Add minimal PCI host emulation crate
This crate is based on the crosvm devices/src/pci implementation from 107edb3e
We introduced a few changes:

- This one is a standalone crate. The device crate does not carry any
  PCI specific bits.
- Simplified PCI root configuration. We only carry a pointer to a
  PciConfiguration, not a wrapper around it.
- Simplified BAR allocation API. All BARs from the PciDevice instance
  must be generated at once through the PciDevice.allocate_bars()
  method.
- The PCI BARs are added to the MMIO bus from the PciRoot add_device()
  method.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-05-08 08:55:06 +02:00