cloud-hypervisor

mirror of https://github.com/cloud-hypervisor/cloud-hypervisor.git synced 2024-09-19 21:41:07 +00:00

Author	SHA1	Message	Date
Sebastien Boeuf	c7cabc88b4	vmm: Conditionally update ioeventfds for virtio PCI device The specific part of PCI BAR reprogramming that happens for a virtio PCI device is the update of the ioeventfds addresses KVM should listen to. This should not be triggered for every BAR reprogramming associated with the virtio device since a virtio PCI device might have multiple BARs. The update of the ioeventfds addresses should only happen when the BAR related to those addresses is being moved. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-10-31 09:30:59 +01:00
Sebastien Boeuf	de21c9ba4f	pci: Remove ioeventfds() from PciDevice trait The PciDevice trait is supposed to describe only functions related to PCI. The specific method ioeventfds() has nothing to do with PCI, but instead would be more specific to virtio transport devices. This commit removes the ioeventfds() method from the PciDevice trait, adding a convenient helper as_any() to retrieve the Any trait from the structure behind the PciDevice trait. This is the only way to keep calling into ioeventfds() function from VirtioPciDevice, so that we can still properly reprogram the PCI BAR. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-10-31 09:30:59 +01:00
Sebastien Boeuf	149b61b213	pci: Detect BAR reprogramming Based on the value being written to the BAR, the implementation can now detect if the BAR is being moved to another address. If that is the case, it invokes move_bar() function from the DeviceRelocation trait. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-10-29 16:48:02 +01:00
Sebastien Boeuf	4f8054fa82	pci: Store the type of BAR to return correct address Based on the type of BAR, we can now provide the correct address related to a BAR index provided by the caller. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-10-29 16:48:02 +01:00
Sebastien Boeuf	0acb1e329d	vm-virtio: Translate addresses for devices attached to IOMMU In case some virtio devices are attached to the virtual IOMMU, their vring addresses need to be translated from IOVA into GPA. Otherwise it makes no sense to try to access them, and they would cause out of range errors. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-10-07 10:12:07 +02:00
Sebastien Boeuf	2cd406ba50	vm-virtio: Fix virtio-pci BAR type The 32 or 64 bits type for the memory BAR was not set correctly. This patch ensures the right type is applied to the BAR. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-09-21 09:11:30 +01:00
Rob Bradford	180e6d1e78	vm-virtio: Allocate BARs for virtio-block devices in 32-bit hole Currently all devices and guest memory share the same 64GiB allocation, with guest memory working upwards and devices working downwards. This creates issues if you want to either have a VM with a large amount of memory or want to have devices with a large allocation (e.g. virtio-pmem). As it is possible for the hypervisor to place devices anywhere in its address range it is required for simplistic users like the firmware to set up an identity page table mapping across the full range. Currently the hypervisor sets up an identity mapping of 1GiB which the firmware extends to 64GiB to match the current address space size of the hypervisor. A simpler solution is to place the device needed for booting with the firmware (virtio-block) inside the 32-bit memory hole. This allows the firmware to easily access the block device and paves the way for increasing the address space beyond the current 64GiB limit. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-09-19 10:43:55 +01:00
Rob Bradford	c042483953	build: make PCI (virtio and vfio) disableable at build time Although included by default it is now possible to build without PCI support. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-09-13 12:30:13 +01:00
Sebastien Boeuf	0b8856d148	vmm: Add RwLock to the GuestMemoryMmap Following the refactoring of the code allowing multiple threads to access the same instance of the guest memory, this patch goes one step further by adding RwLock to it. This anticipates the future need for being able to modify the content of the guest memory at runtime. The reasons for adding regions to an existing guest memory could be: - Add virtio-pmem and virtio-fs regions after the guest memory was created. - Support future hotplug of devices, memory, or anything that would require more memory at runtime. Because most of the time, the lock will be taken as read only, using RwLock instead of Mutex is the right approach. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-08-22 08:24:15 +01:00
Sebastien Boeuf	ec0b5567c8	vmm: Share the guest memory instead of cloning it The VMM guest memory was cloned (copied) everywhere the code needed to have ownership of it. In order to clean the code, and in anticipation for future support of modifying this guest memory instance at runtime, it is important that every part of the code share the same instance. Because VirtioDevice implementations need to have access to it from different threads, Arc must be used in this case. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-08-22 08:24:15 +01:00
Sebastien Boeuf	658c076eb2	linters: Fix clippy issues Latest clippy version complains about our existing code for the following reasons: - trait objects without an explicit `dyn` are deprecated - `...` range patterns are deprecated - lint `clippy::const_static_lifetime` has been renamed to `clippy::redundant_static_lifetimes` - unnecessary `unsafe` block - unneeded return statement All these issues have been fixed through this patch, and rustfmt has been run to cleanup potential formatting errors due to those changes. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-08-15 09:10:04 -07:00
Sebastien Boeuf	f30ba069b7	vm-virtio: Allocate shared memory regions on dedicated BAR Shared memory regions might not be present for most of the virtio devices. For this reason, we prefer to dedicate a BAR to the shared memory regions. Another reason is that memory regions, if there are several, can be allocated all at once as a contiguous region, which then can be used as its own BAR. It would be more complicated to try to allocate the BAR 0 holding the regular information about the virtio-pci device along with the shared memory regions. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-08-13 13:57:53 +02:00
Sebastien Boeuf	e0fda0611c	vm-virtio: Remove virtio-pci dependency from VirtioDevice This patch cleans up the VirtioDevice trait. Since some functions are PCI specific and since they are not even used, it makes sense to remove them from the trait definition. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-08-13 13:57:53 +02:00
Sebastien Boeuf	d97079d793	vm-virtio: Update VirtioPciCap and introduce VirtioPciCap64 Based on the latest version of the virtio specification, the structure virtio_pci_cap has been updated and a new structure virtio_pci_cap64 has been introduced. virtio_pci_cap now includes a field "id" that does not modify the existing structure size since there was a 3 bytes reserved field already there. The id is used in the context of shared memory regions which need to be identified since there could be more than one of this kind of capability. virtio_pci_cap64 is a new structure that includes virtio_pci_cap and extends it to allow 64 bits offsets and 64 bits region length. This is used in the context of shared memory regions capability, as we might need to describe regions of 4G or more, that could be placed at a 4G offset or more in the associated BAR. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-08-13 13:57:53 +02:00
Sebastien Boeuf	d180deb679	vm-virtio: pci: Fix PCI capability length The length of the PCI capability as it is being calculated by the guest was not accurate since it was not including the implicit 2 bytes offset. The reason for this offset is that the structure itself does not contain the capability ID (1 byte) and the next capability pointer (1 byte), but the structure exposed through PCI config space does include those bytes. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-08-13 13:57:53 +02:00
Sebastien Boeuf	aa44726658	vm-virtio: Don't trigger an MSI-X interrupt if not enabled Relying on the newly added MSI-X helper, the interrupt callback checks that interrupts are enabled on the device before trying to trigger the interrupt. Fixes #156 Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-08-08 17:38:47 +01:00
Rob Bradford	9caad7394d	build, misc: Bump vmm-sys-util dependency The structure of the vmm-sys-util crate has changed with lots of code moving to submodules. This change adjusts the use of the imported structs to reference the submodules. Fixes: #145 Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-08-02 07:42:20 -07:00
Sebastien Boeuf	98d7955e34	vm-virtio: Add support for notifying about virtio config update As per the VIRTIO specification, every virtio device configuration can be updated while the guest is running. The guest needs to be notified when this happens, and it can be done in two different ways, depending on the type of interrupt being used for those devices. In case the device uses INTx, the allocated IRQ pin is shared between queues and configuration updates. The way for the guest to differentiate between an interrupt meant for a virtqueue or meant for a configuration update is tied to the value of the ISR status field. This field is a simple 32 bits bitmask where only bit 0 and 1 can be changed, the rest is reserved. In case the device uses MSI/MSI-X, the driver should allocate a dedicated vector for configuration updates. This case is much simpler as it only requires the device to send the appropriate MSI vector. The cloud-hypervisor codebase was not supporting the update of a virtio device configuration. This patch extends the existing VirtioInterrupt closure to accept a type that can be Config or Queue, so that based on this type, the closure implementation can make the right choice about which interrupt pin or vector to trigger. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-07-29 15:34:37 +01:00
Chao Peng	96fb38a5aa	vm-allocator: Align address at allocation time There is alignment support for AddressAllocator, but there are occasions where the alignment is known only when we call allocate(). One example is a PCI BAR, which is naturally aligned, meaning we have to align the base address to its size. Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>	2019-07-22 09:51:16 -07:00
Sebastien Boeuf	1268165040	pci: Allow for registering IO and Memory BAR This patch adds the support for both IO and Memory BARs by expecting the function allocate_bars() to identify the type of each BAR. Based on the type, register_mapping() inserts the address range on the appropriate bus (PIO or MMIO). Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-07-22 09:50:10 -07:00
Sebastien Boeuf	72007f016a	pci: Improve MSI-X code to let VFIO rely on it This commit enhances the current msi-x code hosted in the pci crate in order to be reused by the vfio crate. Specifically, it creates several useful methods for the MsixCap structure that can simplify the caller's code. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-07-22 09:50:10 -07:00
Samuel Ortiz	4605ecf1a8	pci: Extend the Device trait to carry the device BARs When reading from or writing to a PCI BAR to handle a VM exit, we need to have the BAR address itself to be able to support multiple BARs PCI devices. Fixes: #87 Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-07-08 07:39:21 +02:00
Samuel Ortiz	8173e1ccd7	devices: Extend the Bus trait to carry the device range base With the range base for the IO/MMIO vm exit address, a device with multiple ranges has all the needed information for resolving which of its ranges the exit is coming from. Fixes: #87 Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-07-08 07:39:21 +02:00
Samuel Ortiz	4a15316101	vm-virtio: Fix the network and storage PCI class and sub-class Use the virtio device type to generate the right class and subclass. Fixes: #83 Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-07-02 17:37:12 +02:00
Jing Liu	9da2343cb7	device: Improvement for BusDevice trait and PciDevice trait BusDevice includes two methods which are only for PCI devices; they should be members of the PciDevice trait for cleaner high-level APIs. Signed-off-by: Jing Liu <jing2.liu@linux.intel.com>	2019-06-25 06:17:30 -07:00
Sebastien Boeuf	24dbe7003a	irq: Fix pin based interrupt for virtio-pci When the KVM capability KVM_CAP_SIGNAL_MSI is not present, the VMM falls back from MSI-X onto pin based interrupts. Unfortunately, this was not working as expected because the VirtioPciDevice object was always creating an MSI-X capability structure in the PCI configuration space. This was causing the guest drivers to expect MSI-X interrupts instead of the pin based generated ones. This patch takes care of avoiding the creation of a dedicated MSI-X capability structure when MSI is not supported by KVM. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-06-07 18:19:52 +01:00
Sebastien Boeuf	4d98dcb077	msix: Handle MSI-X device masking As mentioned in the PCI specification, the Function Mask from the Message Control Register can be set to prevent a device from injecting MSI-X messages. This supersedes the vector masking as it interacts at the device level. Here quoted from the specification: For MSI and MSI-X, while a vector is masked, the function is prohibited from sending the associated message, and the function must set the associated Pending bit whenever the function would otherwise send the message. When software unmasks a vector whose associated Pending bit is set, the function must schedule sending the associated message, and clear the Pending bit as soon as the message has been sent. Note that clearing the MSI-X Function Mask bit may result in many messages needing to be sent. This commit implements the behavior described above by reorganizing the way the PCI configuration space is being written. It is indeed important to be able to catch a change in the Message Control Register without having to implement it for every PciDevice implementation. Instead, the PciConfiguration has been modified to take care of handling any update made to this register. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-06-07 13:33:53 +01:00
Sebastien Boeuf	d810c7712d	msix: Handle MSI-X vector masking The current MSI-X implementation completely ignores the values found in the Vector Control register related to a specific vector, and never updates the Pending Bit Array. According to the PCI specification, MSI-X vectors can be masked through the Vector Control register on bit 0. If this bit is set, the device should not inject any MSI message. When the device runs into such a situation, it must not inject the interrupt, but instead it must update the bit corresponding to the vector number in the Pending Bit Array. Later on, if/when the Vector Control register is updated, and if the bit 0 is flipped from 1 to 0, the device must look into the PBA to find out if there was a pending interrupt for this specific vector. If that's the case, an MSI message is injected and the bit from the PBA is cleared. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-06-07 13:33:53 +01:00
Sebastien Boeuf	42378caa8b	vm-virtio: Fix alignment and MSI-X table size on the BAR As mentioned in the PCI specification: If a dedicated Base Address register is not feasible, it is recommended that a function isolate the MSI-X structures from the non-MSI-X structures with aligned 8 KB ranges rather than the mandatory aligned 4 KB ranges. That's why this patch ensures that each structure present on the BAR is 8KiB aligned. It also fixes the MSI-X table and PBA sizes so that they can support up to 2048 vectors, as specified for MSI-X. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-06-07 13:33:53 +01:00
Sebastien Boeuf	47a4065aaf	interrupt: Use a single closure to describe pin based and MSI-X In order to factorize the complexity brought by closures, this commit merges IrqClosure and MsixClosure into a generic InterruptDelivery one. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-06-06 15:27:35 +01:00
Sebastien Boeuf	8df05b72dc	vmm: Add MSI-X support to virtio-pci devices In order to allow virtio-pci devices to use MSI-X messages instead of legacy pin based interrupts, this patch implements the MSI-X support for cloud-hypervisor. The VMM code and virtio-pci bits have been modified based on the "msix" module previously added to the pci crate. Fixes #12 Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-06-06 15:27:35 +01:00
Sebastien Boeuf	d3c7b45542	interrupt: Make IRQ delivery generic Because we cannot always assume the irq fd will be the way to send an IRQ to the guest, we cannot make the assumption that every virtio device implementation should expect an EventFd to trigger an IRQ. This commit organizes the code related to virtio devices so that it now expects a Rust closure instead of a known EventFd. This lets the caller decide what should be done whenever a device needs to trigger an interrupt to the guest. The closure will allow for other types of interrupt mechanisms such as MSI to be implemented. From the device perspective, it could be a pin based interrupt or an MSI, it does not matter since the device will simply call into the provided callback, passing the appropriate Queue as a reference. This design keeps the device model generic. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-06-06 15:27:35 +01:00
Samuel Ortiz	fe99c29743	vm-virtio: Remove useless PCI BAR debug log We should not unconditionally display our virtio PCI BAR setting. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-05-10 16:32:39 +02:00
Rob Bradford	3b2faa9f11	vm-virtio: Reset underlying device on driver request If the driver triggers a reset by writing zero into the status register then reset the underlying device if supported. A device reset also requires resetting various aspects of the queue. In order to be able to do a subsequent reactivate it is required to reclaim certain resources (interrupt and queue EventFDs.) If a device reset is requested by the driver but the underlying device does not support it then generate an error as the driver would not be able to configure it anyway. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-05-09 15:44:18 +02:00
Samuel Ortiz	040ea5432d	cloud-hypervisor: Add proper licensing Add the BSD and Apache license. Make all crosvm references point to the BSD license. Add the right copyrights and identifier to our VMM code. Add Intel copyright to the vm-virtio and pci crates. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-05-09 15:44:17 +02:00
Sebastien Boeuf	b67e0b3dad	vmm: Use virtio-blk to support booting from disk image After the virtio-blk device support has been introduced in the previous commit, the vmm needs to rely on this new device to boot from disk images instead of initrd built into the kernel. In order to achieve the proper support of virtio-blk, this commit had to handle a few things: - Register an ioevent fd for each virtqueue. This is important to be notified from the virtio driver that something has been written on the queue. - Fix the retrieval of 64bits BAR address. This is needed to provide the right address which needs to be registered as the notification address from the virtio driver. - Fix the write_bar and read_bar functions. They were both assuming to be provided with an address, from which they were trying to find the associated offset. But the reality is that the offset is directly provided by the Bus layer. - Register a new virtio-blk device as a virtio-pci device from the vm.rs code. When the VM is started, it expects a block device to be created, using this block device as the VM rootfs. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-05-08 08:55:09 +02:00
Samuel Ortiz	c2c51dc9d1	vm-virtio: Add PCI transport support Copied from crosvm 107edb3e with one main modification: VirtioPciDevice implements BusDevice. We need this modification because it is the only way for us to be able to add a VirtioPciDevice to the MMIO bus. Bus insertion takes a BusDevice. The fact that VirtioPciDevice implements PciDevice which itself implements BusDevice does not mean that Rust will automatically downcast a VirtioPciDevice into a BusDevice. crosvm works around that issue by having the PCI, virtio and BusDevice implementations in the same crate. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-05-08 08:55:06 +02:00

37 Commits
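
The masking rules described in commits d810c7712d (per-vector mask) and 4d98dcb077 (Function Mask) can be illustrated with a small standalone model. This is only a sketch of the behaviour the commit messages describe, not the code from the pci crate: the `MsixModel` type, its fields, and its method names are all hypothetical, and the `delivered` list stands in for actually injecting a message into the guest.

```rust
// Hypothetical model of MSI-X masking; names are illustrative only and do
// not come from the cloud-hypervisor pci crate.
struct MsixModel {
    function_masked: bool,    // Function Mask bit of the Message Control register
    vector_masked: Vec<bool>, // bit 0 of each vector's Vector Control register
    pending: Vec<bool>,       // Pending Bit Array, one bit per vector
    delivered: Vec<usize>,    // stand-in for messages injected into the guest
}

impl MsixModel {
    fn new(vectors: usize) -> Self {
        MsixModel {
            function_masked: false,
            vector_masked: vec![false; vectors],
            pending: vec![false; vectors],
            delivered: Vec::new(),
        }
    }

    // A vector is masked if either the device-wide or the per-vector mask is set.
    fn masked(&self, vector: usize) -> bool {
        self.function_masked || self.vector_masked[vector]
    }

    // The device wants to raise `vector`: record it as pending when masked,
    // otherwise deliver it right away.
    fn trigger(&mut self, vector: usize) {
        if self.masked(vector) {
            self.pending[vector] = true;
        } else {
            self.delivered.push(vector);
        }
    }

    // Deliver a pending message once the vector is no longer masked,
    // clearing its Pending bit.
    fn flush(&mut self, vector: usize) {
        if self.pending[vector] && !self.masked(vector) {
            self.pending[vector] = false;
            self.delivered.push(vector);
        }
    }

    // Guest writes bit 0 of a Vector Control register.
    fn set_vector_mask(&mut self, vector: usize, masked: bool) {
        self.vector_masked[vector] = masked;
        self.flush(vector);
    }

    // Guest writes the Function Mask bit; clearing it may release many
    // pending messages at once.
    fn set_function_mask(&mut self, masked: bool) {
        self.function_masked = masked;
        for vector in 0..self.pending.len() {
            self.flush(vector);
        }
    }
}
```

In this model, triggering a masked vector only sets its Pending bit, and the message is delivered when the relevant mask is cleared, which is the ordering both commit messages describe.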