Commit Graph

37 Commits

Author SHA1 Message Date
Sebastien Boeuf
c7cabc88b4 vmm: Conditionally update ioeventfds for virtio PCI device
The specific part of PCI BAR reprogramming that happens for a virtio PCI
device is the update of the ioeventfds addresses KVM should listen to.
This should not be triggered for every BAR reprogramming associated with
the virtio device since a virtio PCI device might have multiple BARs.

The update of the ioeventfds addresses should only happen when the BAR
related to those addresses is being moved.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-31 09:30:59 +01:00
Sebastien Boeuf
de21c9ba4f pci: Remove ioeventfds() from PciDevice trait
The PciDevice trait is supposed to describe only functions related to
PCI. The specific method ioeventfds() has nothing to do with PCI, but
instead would be more specific to virtio transport devices.

This commit removes the ioeventfds() method from the PciDevice trait,
adding some convenient helper as_any() to retrieve the Any trait from
the structure behing the PciDevice trait. This is the only way to keep
calling into ioeventfds() function from VirtioPciDevice, so that we can
still properly reprogram the PCI BAR.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-31 09:30:59 +01:00
Sebastien Boeuf
149b61b213 pci: Detect BAR reprogramming
Based on the value being written to the BAR, the implementation can
now detect if the BAR is being moved to another address. If that is the
case, it invokes move_bar() function from the DeviceRelocation trait.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-29 16:48:02 +01:00
Sebastien Boeuf
4f8054fa82 pci: Store the type of BAR to return correct address
Based on the type of BAR, we can now provide the correct address related
to a BAR index provided by the caller.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-29 16:48:02 +01:00
Sebastien Boeuf
0acb1e329d vm-virtio: Translate addresses for devices attached to IOMMU
In case some virtio devices are attached to the virtual IOMMU, their
vring addresses need to be translated from IOVA into GPA. Otherwise it
makes no sense to try to access them, and they would cause out of range
errors.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-07 10:12:07 +02:00
Sebastien Boeuf
2cd406ba50 vm-virtio: Fix virtio-pci BAR type
The 32 or 64 bits type for the memory BAR was not set correctly. This
patch ensure the right type is applied to the BAR.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-09-21 09:11:30 +01:00
Rob Bradford
180e6d1e78 vm-virtio: Allocate BARs for virtio-block devices in 32-bit hole
Currently all devices and guest memory share the same 64GiB
allocation. With guest memory working upwards and devices working
downwards. This creates issues if you want to either have a VM with a
large amount of memory or want to have devices with a large allocation
(e.g. virtio-pmem.)

As it is possible for the hypervisor to place devices anywhere in its
address range it is required for simplistic users like the firmware to
set up an identity page table mapping across the full range. Currently
the hypervisor sets up an identify mapping of 1GiB which the firmware
extends to 64GiB to match the current address space size of the
hypervisor.

A simpler solution is to place the device needed for booting with the
firmware (virtio-block) inside the 32-bit memory hole. This allows the
firmware to easily access the block device and paves the way for
increasing the address space beyond the current 64GiB limit.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-19 10:43:55 +01:00
Rob Bradford
c042483953 build: make PCI (virtio and vfio) disableable at build time
Although included by default it is now possible to build without PCI
support.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-13 12:30:13 +01:00
Sebastien Boeuf
0b8856d148 vmm: Add RwLock to the GuestMemoryMmap
Following the refactoring of the code allowing multiple threads to
access the same instance of the guest memory, this patch goes one step
further by adding RwLock to it. This anticipates the future need for
being able to modify the content of the guest memory at runtime.

The reasons for adding regions to an existing guest memory could be:
- Add virtio-pmem and virtio-fs regions after the guest memory was
  created.
- Support future hotplug of devices, memory, or anything that would
  require more memory at runtime.

Because most of the time, the lock will be taken as read only, using
RwLock instead of Mutex is the right approach.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-22 08:24:15 +01:00
Sebastien Boeuf
ec0b5567c8 vmm: Share the guest memory instead of cloning it
The VMM guest memory was cloned (copied) everywhere the code needed to
have ownership of it. In order to clean the code, and in anticipation
for future support of modifying this guest memory instance at runtime,
it is important that every part of the code share the same instance.

Because VirtioDevice implementations need to have access to it from
different threads, that's why Arc must be used in this case.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-22 08:24:15 +01:00
Sebastien Boeuf
658c076eb2 linters: Fix clippy issues
Latest clippy version complains about our existing code for the
following reasons:

- trait objects without an explicit `dyn` are deprecated
- `...` range patterns are deprecated
- lint `clippy::const_static_lifetime` has been renamed to
  `clippy::redundant_static_lifetimes`
- unnecessary `unsafe` block
- unneeded return statement

All these issues have been fixed through this patch, and rustfmt has
been run to cleanup potential formatting errors due to those changes.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-15 09:10:04 -07:00
Sebastien Boeuf
f30ba069b7 vm-virtio: Allocate shared memory regions on dedicated BAR
In the context of shared memory regions, they could not be present for
most of the virtio devices. For this reason, we prefer dedicate a BAR
for the shared memory regions.

Another reason is that memory regions, if there are several, can be
allocated all at once as a contiguous region, which then can be used as
its own BAR. It would be more complicated to try to allocate the BAR 0
holding the regular information about the virtio-pci device along with
the shared memory regions.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-13 13:57:53 +02:00
Sebastien Boeuf
e0fda0611c vm-virtio: Remove virtio-pci dependency from VirtioDevice
This patch cleans up the VirtioDevice trait. Since some function are PCI
specific and since they are not even used, it makes sense to remove them
from the trait definition.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-13 13:57:53 +02:00
Sebastien Boeuf
d97079d793 vm-virtio: Update VirtioPciCap and introduce VirtioPciCap64
Based on the latest version of the virtio specification, the structure
virtio_pci_cap has been updated and a new structure virtio_pci_cap64 has
been introduced.

virtio_pci_cap now includes a field "id" that does not modify the
existing structure size since there was a 3 bytes reserved field
already there. The id is used in the context of shared memory regions
which need to be identified since there could be more than one of this
kind of capability.

virtio_pci_cap64 is a new structure that includes virtio_pci_cap and
extends it to allow 64 bits offsets and 64 bits region length. This is
used in the context of shared memory regions capability, as we might need
to describe regions of 4G or more, that could be placed at a 4G offset
or more in the associated BAR.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-13 13:57:53 +02:00
Sebastien Boeuf
d180deb679 vm-virtio: pci: Fix PCI capability length
The length of the PCI capability as it is being calculated by the guest
was not accurate since it was not including the implicit 2 bytes offset.

The reason for this offset is that the structure itself does not contain
the capability ID (1 byte) and the next capability pointer (1 byte), but
the structure exposed through PCI config space does include those bytes.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-13 13:57:53 +02:00
Sebastien Boeuf
aa44726658 vm-virtio: Don't trigger an MSI-X interrupt if not enabled
Relying on the newly added MSI-X helper, the interrupt callback checks
the interrupts are enabled on the device before to try triggering the
interrupt.

Fixes #156

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-08 17:38:47 +01:00
Rob Bradford
9caad7394d build, misc: Bump vmm-sys-util dependency
The structure of the vmm-sys-util crate has changed with lots of code
moving to submodules.

This change adjusts the use of the imported structs to reference the
submodules.

Fixes: #145

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-08-02 07:42:20 -07:00
Sebastien Boeuf
98d7955e34 vm-virtio: Add support for notifying about virtio config update
As per the VIRTIO specification, every virtio device configuration can
be updated while the guest is running. The guest needs to be notified
when this happens, and it can be done in two different ways, depending
on the type of interrupt being used for those devices.

In case the device uses INTx, the allocated IRQ pin is shared between
queues and configuration updates. The way for the guest to differentiate
between an interrupt meant for a virtqueue or meant for a configuration
update is tied to the value of the ISR status field. This field is a
simple 32 bits bitmask where only bit 0 and 1 can be changed, the rest
is reserved.

In case the device uses MSI/MSI-X, the driver should allocate a
dedicated vector for configuration updates. This case is much simpler as
it only requires the device to send the appropriate MSI vector.

The cloud-hypervisor codebase was not supporting the update of a virtio
device configuration. This patch extends the existing VirtioInterrupt
closure to accept a type that can be Config or Queue, so that based on
this type, the closure implementation can make the right choice about
which interrupt pin or vector to trigger.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-07-29 15:34:37 +01:00
Chao Peng
96fb38a5aa vm-allocator: Align address at allocation time
There is alignment support for AddressAllocator but there are occations
that the alignment is known only when we call allocate(). One example
is PCI BAR which is natually aligned, means for which we have to align
the base address to its size.

Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
2019-07-22 09:51:16 -07:00
Sebastien Boeuf
1268165040 pci: Allow for registering IO and Memory BAR
This patch adds the support for both IO and Memory BARs by expecting
the function allocate_bars() to identify the type of each BAR.
Based on the type, register_mapping() insert the address range on the
appropriate bus (PIO or MMIO).

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-07-22 09:50:10 -07:00
Sebastien Boeuf
72007f016a pci: Improve MSI-X code to let VFIO rely on it
This commit enhances the current msi-x code hosted in the pci crate
in order to be reused by the vfio crate. Specifically, it creates
several useful methods for the MsixCap structure that can simplify
the caller's code.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-07-22 09:50:10 -07:00
Samuel Ortiz
4605ecf1a8 pci: Extend the Device trait to carry the device BARs
When reading from or writing to a PCI BAR to handle a VM exit, we need
to have the BAR address itself to be able to support multiple BARs PCI
devices.

Fixes: #87

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-07-08 07:39:21 +02:00
Samuel Ortiz
8173e1ccd7 devices: Extend the Bus trait to carry the device range base
With the range base for the IO/MMIO vm exit address, a device with
multiple ranges has all the needed information for resolving which of
its range the exit is coming from

Fixes: #87

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-07-08 07:39:21 +02:00
Samuel Ortiz
4a15316101 vm-virtio: Fix the network and storage PCI class and sub-class
Use the virtio device type to generate the righ class and subclass.

Fixes: #83

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-07-02 17:37:12 +02:00
Jing Liu
9da2343cb7 device: Improvement for BusDevice trait and PciDevice trait
BusDevice includes two methods which are only for PCI devices, which should
be as members of PciDevice trait for a better clean high level APIs.

Signed-off-by: Jing Liu <jing2.liu@linux.intel.com>
2019-06-25 06:17:30 -07:00
Sebastien Boeuf
24dbe7003a irq: Fix pin based interrupt for virtio-pci
When the KVM capability KVM_CAP_SIGNAL_MSI is not present, the VMM
falls back from MSI-X onto pin based interrupts. Unfortunately, this
was not working as expected because the VirtioPciDevice object was
always creating an MSI-X capability structure in the PCI configuration
space. This was causing the guest drivers to expect MSI-X interrupts
instead of the pin based generated ones.

This patch takes care of avoiding the creation of a dedicated MSI-X
capability structure when MSI is not supported by KVM.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-06-07 18:19:52 +01:00
Sebastien Boeuf
4d98dcb077 msix: Handle MSI-X device masking
As mentioned in the PCI specification, the Function Mask from the
Message Control Register can be set to prevent a device from injecting
MSI-X messages. This supersedes the vector masking as it interacts at
the device level.

Here quoted from the specification:
For MSI and MSI-X, while a vector is masked, the function is prohibited
from sending the associated message, and the function must set the
associated Pending bit whenever the function would otherwise send the
message. When software unmasks a vector whose associated Pending bit is
set, the function must schedule sending the associated message, and
clear the Pending bit as soon as the message has been sent. Note that
clearing the MSI-X Function Mask bit may result in many messages
needing to be sent.

This commit implements the behavior described above by reorganizing
the way the PCI configuration space is being written. It is indeed
important to be able to catch a change in the Message Control
Register without having to implement it for every PciDevice
implementation. Instead, the PciConfiguration has been modified to
take care of handling any update made to this register.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-06-07 13:33:53 +01:00
Sebastien Boeuf
d810c7712d msix: Handle MSI-X vector masking
The current MSI-X implementation completely ignores the values found
in the Vector Control register related to a specific vector, and never
updates the Pending Bit Array.

According to the PCI specification, MSI-X vectors can be masked
through the Vector Control register on bit 0. If this bit is set,
the device should not inject any MSI message. When the device
runs into such situation, it must not inject the interrupt, but
instead it must update the bit corresponding to the vector number
in the Pending Bit Array.

Later on, if/when the Vector Control register is updated, and if
the bit 0 is flipped from 0 to 1, the device must look into the PBA
to find out if there was a pending interrupt for this specific
vector. If that's the case, an MSI message is injected and the
bit from the PBA is cleared.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-06-07 13:33:53 +01:00
Sebastien Boeuf
42378caa8b vm-virtio: Fix alignment and MSI-X table size on the BAR
As mentioned in the PCI specification:

If a dedicated Base Address register is not feasible, it is
recommended that a function isolate the MSI-X structures from
the non-MSI-X structures with aligned 8 KB ranges rather than
the mandatory aligned 4 KB ranges.

That's why this patch ensures that each structure present on the
BAR is 8KiB aligned.

It also fixes the MSI-X table and PBA sizes so that they can support
up to 2048 vectors, as specified for MSI-X.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-06-07 13:33:53 +01:00
Sebastien Boeuf
47a4065aaf interrupt: Use a single closure to describe pin based and MSI-X
In order to factorize the complexity brought by closures, this commit
merges IrqClosure and MsixClosure into a generic InterruptDelivery one.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-06-06 15:27:35 +01:00
Sebastien Boeuf
8df05b72dc vmm: Add MSI-X support to virtio-pci devices
In order to allow virtio-pci devices to use MSI-X messages instead
of legacy pin based interrupts, this patch implements the MSI-X
support for cloud-hypervisor. The VMM code and virtio-pci bits have
been modified based on the "msix" module previously added to the pci
crate.

Fixes #12

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-06-06 15:27:35 +01:00
Sebastien Boeuf
d3c7b45542 interrupt: Make IRQ delivery generic
Because we cannot always assume the irq fd will be the way to send
an IRQ to the guest, this means we cannot make the assumption that
every virtio device implementation should expect an EventFd to
trigger an IRQ.

This commit organizes the code related to virtio devices so that it
now expects a Rust closure instead of a known EventFd. This lets the
caller decide what should be done whenever a device needs to trigger
an interrupt to the guest.

The closure will allow for other type of interrupt mechanism such as
MSI to be implemented. From the device perspective, it could be a
pin based interrupt or an MSI, it does not matter since the device
will simply call into the provided callback, passing the appropriate
Queue as a reference. This design keeps the device model generic.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-06-06 15:27:35 +01:00
Samuel Ortiz
fe99c29743 vm-virtio: Remove useless PCI BAR debug log
We should not unconditionally display our virtio PCI BAR setting.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-05-10 16:32:39 +02:00
Rob Bradford
3b2faa9f11 vm-virtio: Reset underlying device on driver request
If the driver triggers a reset by writing zero into the status register
then reset the underlying device if supported. A device reset also
requires resetting various aspects of the queue.

In order to be able to do a subsequent reactivate it is required to
reclaim certain resources (interrupt and queue EventFDs.) If a device
reset is requested by the driver but the underlying device does not
support it then generate an error as the driver would not be able to
configure it anyway.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-05-09 15:44:18 +02:00
Samuel Ortiz
040ea5432d cloud-hypervisor: Add proper licensing
Add the BSD and Apache license.
Make all crosvm references point to the BSD license.
Add the right copyrights and identifier to our VMM code.
Add Intel copyright to the vm-virtio and pci crates.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-05-09 15:44:17 +02:00
Sebastien Boeuf
b67e0b3dad vmm: Use virtio-blk to support booting from disk image
After the virtio-blk device support has been introduced in the
previous commit, the vmm need to rely on this new device to boot
from disk images instead of initrd built into the kernel.

In order to achieve the proper support of virtio-blk, this commit
had to handle a few things:

  - Register an ioevent fd for each virtqueue. This important to be
    notified from the virtio driver that something has been written
    on the queue.

  - Fix the retrieval of 64bits BAR address. This is needed to provide
    the right address which need to be registered as the notification
    address from the virtio driver.

  - Fix the write_bar and read_bar functions. They were both assuming
    to be provided with an address, from which they were trying to
    find the associated offset. But the reality is that the offset is
    directly provided by the Bus layer.

  - Register a new virtio-blk device as a virtio-pci device from the
    vm.rs code. When the VM is started, it expects a block device to
    be created, using this block device as the VM rootfs.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-05-08 08:55:09 +02:00
Samuel Ortiz
c2c51dc9d1 vm-virtio: Add PCI transport support
Copied from crosvm 107edb3e with one main modification: VirtioPciDevice
implements BusDevice.

We need this modification because it is the only way for us to be able
to add a VirtioPciDevice to the MMIO bus. Bus insertion takes a
BusDevice. The fact that VirtioPciDevice implements PciDevice which
itself implements BusDevice does not mean that Rust will automatically
downcast a VirtioPciDevice into a BusDevice.

crosvm works around that issue by having the PCI, virtio and BusDevice
implementations in the same crate.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-05-08 08:55:06 +02:00