Commit Graph

79 Commits

Author SHA1 Message Date
Samuel Ortiz
35dd1523c9 vmm: device_manager: Implement the Pausable trait
Since the Snapshotable placeholder and Migratable traits are provided as
well, the DeviceManager object and all its objects are now Migratable.

All Migratable devices are tracked as Arc<Mutex<dyn Migratable>>
references.

Keeping track of all migratable devices allows for implementing the
Migratable trait for the DeviceManager structure, making the whole
device model potentially migratable.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-12-12 08:50:36 +01:00
Samuel Ortiz
35d7721683 vmm: Convert virtio devices to Arc<Mutex<T>>
Migratable devices can be virtio or legacy devices.
In any case, they can potentially be tracked through one of the IO bus
as an Arc<Mutex<dyn BusDevice>>. In order for the DeviceManager to also
keep track of such devices as Migratable trait objects, they must be
shared as mutable atomic references, i.e. Arc<Mutex<T>>. That forces all
Migratable objects to be tracked as Arc<Mutex<dyn Migratable>>.

Virtio devices are typically migratable, and thus for them to be
referenced by the DeviceManager, they now should be built as
Arc<Mutex<VirtioDevice>>.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-12-12 08:50:36 +01:00
Sebastien Boeuf
64c5e3d8cb vmm: api: Adjust FsConfig for OpenAPI
The FsConfig structure has been recently adjusted so that the default
value matches between OpenAPI and CLI. Unfortunately, with the current
description, there is no way from the OpenAPI to describe a cache_size
value "None", so that DAX does not get enabled. Usually, using a Rust
"Option" works because the default value is None. But in this case, the
default value is Some(8G), which means we cannot describe a None.

This commit tackles the problem, introducing an explicit parameter
"dax", and leaving "cache_size" as a simple u64 integer.

This way, the default value is dax=true and cache_size=8G, but it lets
the opportunity to disable DAX entirely with dax=false, which will
simply ignore the cache_size value.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-12-11 15:50:24 +00:00
Sebastien Boeuf
5e0bbf9c3b vmm: Don't factorize vhost-user configurations
We want to set different default configurations for vhost-user-net and
vhost-user-blk, which is the reason why the common part corresponding to
the number of queues and the queue size cannot be embedded.

This prepares for the following commit, matching API and CLI behaviors.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-12-11 15:50:24 +00:00
Rob Bradford
ba59c62044 vmm, devices: Remove hardcoded IRQ number for GED device
Remove the previously hardcoded IRQ number used for the GED device.
Instead allocate the IRQ using the allocator and use that value in the
definition in the ACPI device.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-12-09 16:58:00 +00:00
Sebastien Boeuf
aa94e9b8f3 Revert "vmm: api: Modify FsConfig to be OpenAPI friendly"
This reverts commit defc5dcd9c.
2019-12-06 18:08:10 +00:00
Rob Bradford
9b1ba14f2d vmm: Delegate device related ACPI DSDT table work to DeviceManager
Move the code for handling the creation of the DSDT entries for devices
into the DeviceManager.

This will make it easier to handle device hotplug and also in the future
remove some hardcoded ACPI constants.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-12-06 17:44:00 +00:00
Sebastien Boeuf
defc5dcd9c vmm: api: Modify FsConfig to be OpenAPI friendly
When consumer of the HTTP API try to interact with cloud-hypervisor,
they have to provide the equivalent of the config structure related to
each component they need. Problem is, the Rust enum type "Option" cannot
be obtained from the OpenAPI YAML definition.

This patch intends to fix this inconsistency between what is possible
through the CLI and what's possible through the HTTP API by using simple
types bool and int64 instead of Option<u64>.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-12-06 06:38:48 -08:00
Rob Bradford
59d01712ad vmm: Remove kernel based IOAPIC handling from the device manager
Previously the device setup code assumed that if no IOAPIC was passed in
then the device should be added to the kernel irqchip. As an earlier
change meant that there was always a userspace IOAPIC this kernel based
code can be removed.

The accessor still returns an Option type to leave scope for
implementing a situation without an IOAPIC (no serial or GED device).
This change does not add support no-IOAPIC mode as the original code did
not either.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-12-06 12:34:06 +01:00
Rob Bradford
9b1cb9621f vmm: Remove pin based interrupt setup for virtio devices
With MSI now required remove pin based interrupt support from all the
virtio PCI device setup.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-12-06 12:34:06 +01:00
Rob Bradford
1722708612 vmm: Switch to storing VmConfig inside an Arc<Mutex<>>
This permits the runtime reconfiguration of the VM.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-12-05 16:39:19 +00:00
Sebastien Boeuf
08258d5dad vfio: pci: Allow multiple devices to be passed through
The KVM_SET_GSI_ROUTING ioctl is very simple, it overrides the previous
routes configuration with the new ones being applied. This means the
caller, in this case cloud-hypervisor, needs to maintain the list of all
interrupts which needs to be active at all times. This allows to
correctly support multiple devices to be passed through the VM and being
functional at the same time.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-12-04 08:48:17 +01:00
Rob Bradford
791ca3388f vmm: device_manager: Add ability to notify via GED device
Add ability to notify via the GED device that there is some new hotplug
activity. This will be used by the CpuManager (and later DeviceManager
itself) to notify of new hotplug activity.

Currently it has a hardcoded IRQ of 5 as the ACPI tables also need to
refer to this IRQ and the IRQ allocation does not permit the allocation
of specific IRQs.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-12-02 13:49:04 +00:00
Rob Bradford
7ad68d499a vmm: device_manager: Allocate I/O port for ACPI shutdown device
The refactoring in ce1765c8af dropped the
code to allocate the I/O port.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-12-02 13:49:04 +00:00
Samuel Ortiz
0f21781fbe cargo: Bump the kvm and vmm-sys-util crates
Since the kvm crates now depend on vmm-sys-util, the bump must be
atomic.
The kvm-bindings and ioctls 0.2.0 and 0.4.0 crates come with a few API
changes, one of them being the use of a kvm_ioctls specific error type.
Porting our code to that type makes for a fairly large diff stat.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-11-29 17:48:02 +00:00
Sebastien Boeuf
f979380620 vmm: Mark guest persistent memory pages as mergeable
In case the VM is started with the flag "--pmem mergeable=on", it means
the user expects the guest persistent memory pages to be marked as
mergeable. This commit relies on the madvise(MADV_MERGEABLE) system call
to inform the host kernel about these pages.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-11-22 15:28:10 +00:00
Rob Bradford
50c8335d3d vmm: device_manager: Expose the SystemAllocator
This allows other code to allocate I/O ports for use on the (already)
exposed IO bus.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-11-21 09:17:15 -08:00
Samuel Ortiz
f0e618431d vmm: device_manager: Use consistent naming when adding devices
When adding devices to the guest, and populating the device model, we
should prefix the routines with add_. When we're just creating the
device objects but not yet adding them we use make_.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-11-19 13:36:21 -08:00
Samuel Ortiz
a2ee681665 vmm: device_manager: Add an MMIO devices creation routine
In order to reduce the DeviceManager's new() complexity, we can move the
MMIO devices creation code into its own routine.

Fixes: #441

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-11-19 13:36:21 -08:00
Samuel Ortiz
79b8f8e477 vmm: device_manager: Add a PCI devices creation routine
In order to reduce the DeviceManager's new() complexity, we can move the
PCI devices creation code into its own routine.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-11-19 13:36:21 -08:00
Samuel Ortiz
5087f633f6 vmm: device_manager: Add an IOAPIC creation routine
In order to reduce the DeviceManager's new() complexity, we can move the
ACPI device creation code into its own routine.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-11-19 13:36:21 -08:00
Samuel Ortiz
ce1765c8af vmm: device_manager: Add an ACPI device creation routine
In order to reduce the DeviceManager's new() complexity, we can move the
ACPI device creation code into its own routine.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-11-19 13:36:21 -08:00
Samuel Ortiz
cfca2759fc vmm: device_manager: Add a legacy devices creation routine
In order to reduce the DeviceManager's new() complexity, we can move the
legacy devices creation code into its own routine.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-11-19 13:36:21 -08:00
Samuel Ortiz
4b469b98cf vmm: device_manager: Add a console creation routine
In order to reduce the DeviceManager's new() complexity, we can move the
console creation code into its own routine.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-11-19 13:36:21 -08:00
Rob Bradford
b3388c343d vmm: device_manager: Ensure I/O ports are allocated
Ensure that we tell the allocator about all the I/O ports that we are
using for I/O bus attached devices (serial, i8042, ACPI device.)

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-11-05 10:13:01 +00:00
Sebastien Boeuf
5694ac2b1e vm-virtio: Create new VirtioTransport trait to abstract ioeventfds
In order to group together some functions that can be shared across
virtio transport layers, this commit introduces a new trait called
VirtioTransport.

The first function of this trait being ioeventfds() as it is needed from
both virtio-mmio and virtio-pci devices, represented by MmioDevice and
VirtioPciDevice structures respectively.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-31 09:30:59 +01:00
Sebastien Boeuf
3fa5df4161 vmm: Unregister old ioeventfds when reprogramming PCI BAR
Now that kvm-ioctls has been updated, the function unregister_ioevent()
can be used to remove eventfd previously associated with some specific
PIO or MMIO guest address. Particularly, it is useful for the PCI BAR
reprogramming case, as we want to ensure the eventfd will only get
triggered by the new BAR address, and not the old one.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-31 09:30:59 +01:00
Sebastien Boeuf
587a420429 cargo: Update to the latest kvm-ioctls version
We need to rely on the latest kvm-ioctls version to benefit from the
recent addition of unregister_ioevent(), allowing us to detach a
previously registered eventfd to a PIO or MMIO guest address.

Because of this update, we had to modify the current constraint we had
on the vmm-sys-util crate, using ">= 0.1.1" instead of being strictly
tied to "0.2.0".

Once the dependency conflict resolved, this commit took care of fixing
build issues caused by recent modification of kvm-ioctls relying on
EventFd reference instead of RawFd.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-31 09:30:59 +01:00
Sebastien Boeuf
c7cabc88b4 vmm: Conditionally update ioeventfds for virtio PCI device
The specific part of PCI BAR reprogramming that happens for a virtio PCI
device is the update of the ioeventfds addresses KVM should listen to.
This should not be triggered for every BAR reprogramming associated with
the virtio device since a virtio PCI device might have multiple BARs.

The update of the ioeventfds addresses should only happen when the BAR
related to those addresses is being moved.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-31 09:30:59 +01:00
Sebastien Boeuf
de21c9ba4f pci: Remove ioeventfds() from PciDevice trait
The PciDevice trait is supposed to describe only functions related to
PCI. The specific method ioeventfds() has nothing to do with PCI, but
instead would be more specific to virtio transport devices.

This commit removes the ioeventfds() method from the PciDevice trait,
adding some convenient helper as_any() to retrieve the Any trait from
the structure behing the PciDevice trait. This is the only way to keep
calling into ioeventfds() function from VirtioPciDevice, so that we can
still properly reprogram the PCI BAR.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-31 09:30:59 +01:00
Sebastien Boeuf
d6c68e4738 pci: Add error propagation to PCI BAR reprogramming
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-29 16:48:02 +01:00
Sebastien Boeuf
3e819ac797 pci: Use a weak reference to the AddressManager
Storing a strong reference to the AddressManager behind the
DeviceRelocation trait results in a cyclic reference count.
Use a weak reference to break that dependency.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-29 16:48:02 +01:00
Sebastien Boeuf
149b61b213 pci: Detect BAR reprogramming
Based on the value being written to the BAR, the implementation can
now detect if the BAR is being moved to another address. If that is the
case, it invokes move_bar() function from the DeviceRelocation trait.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-29 16:48:02 +01:00
Sebastien Boeuf
04a449d3f3 pci: Pass DeviceRelocation to PciBus
In order to trigger the PCI BAR reprogramming from PciConfigIo and
PciConfigMmmio, we need the PciBus to have a hold onto the trait
implementation of DeviceRelocation.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-29 16:48:02 +01:00
Sebastien Boeuf
e93467a96c vmm: Implement DeviceRelocation trait
By implementing the DeviceRelocation trait for the AddressManager
structure, we now have a way to let the PCI BAR reprogramming happen.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-29 16:48:02 +01:00
Sebastien Boeuf
8746c16593 vmm: Create AddressManager to own SystemAllocator
In order to reuse the SystemAllocator later at runtime, it is moved into
the new structure AddressManager. The goal is to have a hold onto the
SystemAllocator and both IO and MMIO buses so that we can use them
later.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-29 16:48:02 +01:00
Sebastien Boeuf
1870eb4295 devices: Lock the BtreeMap inside to avoid deadlocks
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-29 16:48:02 +01:00
Sebastien Boeuf
3acf9dfcf3 vfio: Don't map guest memory for VFIO devices attached to vIOMMU
In case a VFIO devices is being attached behind a virtual IOMMU, we
should not automatically map the entire guest memory for the specific
device.

A VFIO device attached to the virtual IOMMU will be driven with IOVAs,
hence we should simply wait for the requests coming from the virtual
IOMMU.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-16 07:27:06 +02:00
Sebastien Boeuf
63c30a6e79 vmm: Build and set the list of external mappings for VFIO
When VFIO devices are created and if the device is attached to the
virtual IOMMU, the ExternalDmaMapping trait implementation is created
and associated with the device. The idea is to build a hash map of
device IDs with their associated trait implementation.

This hash map is provided to the virtual IOMMU device so that it knows
how to properly trigger external mappings associated with VFIO devices.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-16 07:27:06 +02:00
Sebastien Boeuf
837bcbc6ba vfio: Create VFIO implementation of ExternalDmaMapping
With this implementation of the trait ExternalDmaMapping, we now have
the tool to provide to the virtual IOMMU to trigger the map/unmap on
behalf of the guest.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-16 07:27:06 +02:00
Sebastien Boeuf
3598e603d5 vfio: Add a public function to retrive VFIO container
The VFIO container is the object needed to update the VFIO mapping
associated with a VFIO device. This patch allows the device manager
to have access to the VFIO container.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-16 07:27:06 +02:00
Sebastien Boeuf
9085a39c7d vmm: Attach VFIO devices to IORT table
This patch attaches VFIO devices to the virtual IOMMU if they are
identified as they should be, based on the option "iommu=on". This
simply takes care of adding the PCI device ID to the ACPI IORT table.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-16 07:27:06 +02:00
Sebastien Boeuf
b918220b49 vmm: Support virtio-pci devices attached to a virtual IOMMU
This commit is the glue between the virtio-pci devices attached to the
vIOMMU, and the IORT ACPI table exposing them to the guest as sitting
behind this vIOMMU.

An important thing is the trait implementation provided to the virtio
vrings for each device attached to the vIOMMU, as they need to perform
proper address translation before they can access the buffers.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-07 10:12:07 +02:00
Sebastien Boeuf
0acb1e329d vm-virtio: Translate addresses for devices attached to IOMMU
In case some virtio devices are attached to the virtual IOMMU, their
vring addresses need to be translated from IOVA into GPA. Otherwise it
makes no sense to try to access them, and they would cause out of range
errors.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-07 10:12:07 +02:00
Sebastien Boeuf
6566c739e1 vm-virtio: Add IOMMU support to virtio-vsock
Adding virtio feature VIRTIO_F_IOMMU_PLATFORM when explicitly asked by
the user. The need for this feature is to be able to attach the virtio
device to a virtual IOMMU.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-07 10:12:07 +02:00
Sebastien Boeuf
9ab00dcb75 vm-virtio: Add IOMMU support to virtio-rng
Adding virtio feature VIRTIO_F_IOMMU_PLATFORM when explicitly asked by
the user. The need for this feature is to be able to attach the virtio
device to a virtual IOMMU.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-07 10:12:07 +02:00
Sebastien Boeuf
ee1899c6f6 vm-virtio: Add IOMMU support to virtio-pmem
Adding virtio feature VIRTIO_F_IOMMU_PLATFORM when explicitly asked by
the user. The need for this feature is to be able to attach the virtio
device to a virtual IOMMU.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-07 10:12:07 +02:00
Sebastien Boeuf
392f1ec155 vm-virtio: Add IOMMU support to virtio-console
Adding virtio feature VIRTIO_F_IOMMU_PLATFORM when explicitly asked by
the user. The need for this feature is to be able to attach the virtio
device to a virtual IOMMU.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-07 10:12:07 +02:00
Sebastien Boeuf
9fad680db1 vm-virtio: Add IOMMU support to virtio-net
Adding virtio feature VIRTIO_F_IOMMU_PLATFORM when explicitly asked by
the user. The need for this feature is to be able to attach the virtio
device to a virtual IOMMU.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-07 10:12:07 +02:00
Sebastien Boeuf
9ebb1a55bc vm-virtio: Add IOMMU support to virtio-blk
Adding virtio feature VIRTIO_F_IOMMU_PLATFORM when explicitly asked by
the user. The need for this feature is to be able to attach the virtio
device to a virtual IOMMU.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-07 10:12:07 +02:00