In case a VFIO devices is being attached behind a virtual IOMMU, we
should not automatically map the entire guest memory for the specific
device.
A VFIO device attached to the virtual IOMMU will be driven with IOVAs,
hence we should simply wait for the requests coming from the virtual
IOMMU.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
When VFIO devices are created and if the device is attached to the
virtual IOMMU, the ExternalDmaMapping trait implementation is created
and associated with the device. The idea is to build a hash map of
device IDs with their associated trait implementation.
This hash map is provided to the virtual IOMMU device so that it knows
how to properly trigger external mappings associated with VFIO devices.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This patch relies on the trait implementation provided for each device
which requires some sort of external update based on a map or unmap.
Whenever a MAP or UNMAP request comes through the virtqueues, it
triggers a call to the external mapping trait with map()/unmap()
functions being invoked.
Those external mappings are meant to be used from VFIO and vhost-user
devices as they need to update their own mappings. In case of VFIO, the
goal is to update the DMAR table in the physical IOMMU, while vhost-user
devices needs to update their internal representation of the virtqueues.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
With this implementation of the trait ExternalDmaMapping, we now have
the tool to provide to the virtual IOMMU to trigger the map/unmap on
behalf of the guest.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The VFIO container is the object needed to update the VFIO mapping
associated with a VFIO device. This patch allows the device manager
to have access to the VFIO container.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The new crate vm-device is created here to host the definitions of
traits not meant to be tied to virtio of VFIO specifically. We need to
add a new trait to update external DMA mappings for devices, which is
why the vm-device crate is the right fit for this.
We can expect this crate to be extended later once the design gets
approved from a rust-vmm perspective.
In this specific use case, we can have some devices like VFIO or
vhost-user ones requiring to be notified about mapping updates. This
new trait ExternalDmaMapping will allow such devices to implement their
own way to handle such event.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This patch attaches VFIO devices to the virtual IOMMU if they are
identified as they should be, based on the option "iommu=on". This
simply takes care of adding the PCI device ID to the ACPI IORT table.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Having the virtual IOMMU created with --iommu is one thing, but we also
need a way to decide if a VFIO device should be attached to the virtual
IOMMU or not. That's why we introduce an extra option "iommu" with the
value "on" or "off". By default, the device is not attached, which means
"iommu=off".
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Clean up the error handling and ensure that where possible errors are
propagated. Make use of std::convert::From in order to translate error
types.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Simplify the check for the unusual situation where the memory is not
configured by using .ok_or() on the option to convert it to a result.
This cleans up a bunch of extra indentation.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Remove messages that are left over from the development of the project
that represent normal operation for the backend. This cleans up the
console output and improves performance.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Fix invalid type for version:
- VmInfo.version.type string
Change Null value from enum as it has problems to build clients with
openapi tools.
- ConsoleConfig.mode.enum Null -> Nil
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
We should return an explicit error when the transition from on VM state
to another is invalid.
The valid_transition() routine for the VmState enum essentially
describes the VM state machine.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
We pause a VM from the API, then SSH'ing into it should fail.
After resuming, SSH'ing should work again.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
In order to pause a VM, we signal all the vCPU threads to get them out
of vmx non-root. Once out, the vCPU thread will check for a an atomic
pause boolean. If it's set to true, then the thread will park until
being resumed.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Because the L2 VM running in the VFIO integration test is actually
running as L3 (since the CI runs in a VM), it can take quite some
time for this VM to boot.
The way to solve this issue is to extend the sleep time before to try
communicating with the L2 VM, but also to speed up the boot time by
using virtio-console instead of serial. We suspect the use of serial,
implying PIO VM exits for each character on the serial port is quite
expensive compared to the paravirtualized console.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Azure virtual machines can have private IPs in the 172.16.x.x range,
causing some issues with the VFIO test. By using 172.17.x.x for this
test, we avoid IP conflicts.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Now that our custom kernel includes all the patches for the full support
of virtio-iommu, we can go one step further by attaching the virtio-net
device to the virtual IOMMU and use it to SSH some commands validating
both disks and the network card are isolated into their own IOMMU group.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Now that cloud-hypervisor can expose a virtual IOMMU to its guest VM,
the integration test validating the VFIO support with virtio-net can be
updated to use cloud-hypervisor exclusively.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Because we want both early support for virtio-fs and virtio-iommu, our
custom kernel is now based on the kernel branch virtio-fs-virtio-iommu.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
In order to support nested virtualization and nested device passthrough
from our CI tests, we need some extra kernel configuration options to be
enabled.
CONFIG_KVM and CONFIG_VIRTUALIZATION for nested virtualization.
CONFIG_VFIO for nested device passthrough.
CONFIG_VIRTIO_IOMMU and CONFIG_ACPI_IORT for virtio-iommu support.
With all these new options applied, we can leverage virtio-iommu to
attach some VFIO devices to it and pass them through a second layer of
virtualization.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
So that we don't need to forward an ExitBehaviour up to the VMM thread.
This simplifies the control loop and the VMM thread even further.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
The firmware has been recently updated to find the EFI partition, no
matter if the disk is not the first one of the list.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This patch introduces a specific documentation for the virtual IOMMU
device. This is important to understand what the use cases are for this
new device and how to properly use it with virtio devices.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This commit is the glue between the virtio-pci devices attached to the
vIOMMU, and the IORT ACPI table exposing them to the guest as sitting
behind this vIOMMU.
An important thing is the trait implementation provided to the virtio
vrings for each device attached to the vIOMMU, as they need to perform
proper address translation before they can access the buffers.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Having the virtual IOMMU created with --iommu is one thing, but we also
need a way to decide if a virtio-vsock device should be attached to this
virtual IOMMU or not. That's why we introduce an extra option "iommu"
with the value "on" or "off". By default, the device is not attached,
which means "iommu=off".
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Having the virtual IOMMU created with --iommu is one thing, but we also
need a way to decide if a virtio-console device should be attached to
this virtual IOMMU or not. That's why we introduce an extra option
"iommu" with the value "on" or "off". By default, the device is not
attached, which means "iommu=off".
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Having the virtual IOMMU created with --iommu is one thing, but we also
need a way to decide if a virtio-pmem device should be attached to this
virtual IOMMU or not. That's why we introduce an extra option "iommu"
with the value "on" or "off". By default, the device is not attached,
which means "iommu=off".
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Having the virtual IOMMU created with --iommu is one thing, but we also
need a way to decide if a virtio-rng device should be attached to this
virtual IOMMU or not. That's why we introduce an extra option "iommu"
with the value "on" or "off". By default, the device is not attached,
which means "iommu=off".
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>