cloud-hypervisor

mirror of https://github.com/cloud-hypervisor/cloud-hypervisor.git synced 2024-12-22 05:35:20 +00:00

Author	SHA1	Message	Date
Samuel Ortiz	299d887856	arch: Add SubRegion memory type We want to be able to differentiate between memory regions that must be managed separately from the main address space (e.g. the 32-bit memory hole) and ones that are reserved (i.e. from which we don't want to allow the VMM to allocate address ranges). We are going to use a reserved memory region for restricting the 32-bit memory hole from expanding beyond the IOAPIC and TSS addresses. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-07-25 11:45:38 +01:00
Samuel Ortiz	792cc27435	vfio: Propagate the KVM routes setting error This will trigger a logged error once we have an actual logger. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-07-25 11:45:38 +01:00
Sebastien Boeuf	421b896ab7	vfio: Don't expose an Interrupt Pin Since our VFIO code does not support pin based interrupt, but only MSI and MSI-X, it is cleaner to not expose any Interrupt Pin to the guest by setting its value to 0. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-07-25 11:45:38 +01:00
Sebastien Boeuf	2f802880c0	vfio: Disable the ROM expansion BAR Until the codebase can properly expose the ROM BAR into the guest, it is better to disable it for now, returning always 0 when the register is being read. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-07-25 11:45:38 +01:00
Sebastien Boeuf	e18052120a	vfio: Fix Memory BAR alignment The IO BAR alignment was already set to 4 bytes, this patch simply added a comment for it. The Memory BAR alignment was also set to the right value, but it was not explained why 0x1000 was needed, and also why 0x10 could sometimes be used as correct alignment. A Memory BAR must be aligned at least on 16 bytes since the first 4 bits are dedicated to some specific information about the BAR itself. But in case a BAR is identified as mappable from VFIO, this means our VMM might memory map it into the VMM address space, and set KVM accordingly using the ioctl KVM_SET_USER_MEMORY_REGION. In case of KVM, we have to take into account that it expects addresses to be page aligned, which means 4K in this case. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-07-25 11:45:38 +01:00
Sebastien Boeuf	d92d797896	vfio: Update memory slot index to support multiple VFIO devices In order to correctly support multiple VFIO devices, we need to increment the memory slot index every time it is being used to set some user memory region through KVM. That's why the mem_slot parameter is made mutable. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-07-25 11:45:38 +01:00
Sebastien Boeuf	b9f677c46c	vmm: Fix the memory slot index The memory slot index provided to the DeviceManager was wrong since only the RAM memory regions are set as user memory regions to KVM. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-07-25 11:45:38 +01:00
Sebastien Boeuf	b5eab43aa5	vfio: Create a global KVM VFIO device for all VFIO devices KVM does not support multiple KVM VFIO devices to be created when trying to support multiple VFIO devices. This commit creates one global KVM VFIO device being shared with every VFIO device, which makes possible the support for passing several devices through the VM. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-07-25 11:45:38 +01:00
Sebastien Boeuf	0ff074d2b8	vm-allocator: Fix potential allocation errors There is one corner case which was not properly handled by the current code from our AddressAllocator. If both the address start (from the next range) and the requested region size are already aligned on the same value as "alignment", when doing the subtraction of the requested size + alignment, the returned address is already aligned. The problem is that we might end up overlapping with an existing range since the check between the available delta and the requested size does not take into account a full extra alignment. By subtracting the requested size + alignment - 1 from the address start of the next range, we ensure this kind of corner case would not happen since the address won't be naturally aligned and after some adjustment from the function align_address(), the correct start address will be returned. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-07-25 11:45:38 +01:00
Sebastien Boeuf	927861ced2	pci: Fix end of address space check The check performed on the end address was wrong since the end address was actually the address right after the end. To get the right end address, we need to add (region size - 1) to the start address. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-07-25 11:45:38 +01:00
Rob Bradford	1971c94e4e	tests: Adjust down entropy expectation The newer kernel is resulting in entropy being slightly lower than previously. Adjust the expected entropy downwards. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-07-24 16:26:33 +02:00
Rob Bradford	ebe04f6db9	tests: Use custom kernel for all tests This should reduce the integration testing time considerably. When a custom kernel is no longer required we can pull kernel from tarball again. Fixes: #100 Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-07-24 16:26:33 +02:00
Samuel Ortiz	3cc6f48c31	docs: Add VFIO usage example Fixes: #117 Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-07-24 07:17:03 -07:00
Samuel Ortiz	46eaea1627	README: Fix kernel command line console argument We use the virtio console device now. Fixes: #116 Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-07-24 07:16:48 -07:00
Rob Bradford	1f6f52249e	build: Upload release binary on tag Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-07-24 12:49:35 +01:00
Samuel Ortiz	5ae3144f5b	tests: Add VFIO integration test The VFIO integration test first boots a QEMU guest and then assigns the QEMU virtio-pci networking device into a nested cloud-hypervisor guest. We then check that we can ssh into the nested guest and verify that it's running with the right kernel command line. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-07-24 11:55:08 +02:00
Samuel Ortiz	4d16ca8ae7	vmm: Support direct device assignment With the VFIO crate, we can now support directly assigned PCI devices into cloud-hypervisor guests. We support assigning multiple host devices, through the --device command line parameter. This parameter takes the host device sysfs path. Fixes: #60 Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-07-24 11:55:08 +02:00
Chao Peng	b746dd7116	vfio: Map MMIO regions into the guest VFIO explicitly tells us if an MMIO region can be mapped into the guest address space or not. Except for MSI-X table BARs, we try to map them into the guest whenever VFIO allows us to do so. This avoids unnecessary VM exits when the guest tries to access those regions. Signed-off-by: Zhang, Xiong Y <xiong.y.zhang@intel.com> Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-07-24 11:55:08 +02:00
Sebastien Boeuf	c93d5361b8	vfio: pci: Build the KVM routes We track all MSI and MSI-X capabilities changes, which allows us to also track all MSI and MSI-X table changes. With both pieces of information we can build kvm irq routing tables and map the physical device MSI/X vectors to the guest ones. Once that mapping is in place we can toggle the VFIO IRQ API accordingly and enable or disable MSI or MSI-X interrupts, from the physical device up to the guest. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-07-24 11:55:08 +02:00
Sebastien Boeuf	20f0116111	vfio: pci: Track MSI and MSI-X capabilities In order to properly manage the VFIO device interrupt settings, we need to keep track of both MSI and MSI-X PCI config capabilities changes. When the guest programs the device for interrupt delivery, it writes to the MSI and MSI-X capabilities. This information must be trapped and cached in order to map the physical device interrupt delivery path to the guest one. In other words, tracking MSI and MSI-X capabilities will allow us to accurately build the KVM interrupt routes. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-07-24 11:55:08 +02:00
Samuel Ortiz	db5b4763c2	vfio: Initial PCI support This brings the initial PCI support to the VFIO crate. The VfioPciDevice is the main structure and holds an inner VfioDevice. VfioPciDevice implements the PCI trait, leaving the IRQ assignments empty as this will be driven by both the guest and the VFIO PCI device, not by the VMM. As we must trap BAR programming from the guest (We don't want to program the actual device with guest addresses), we use our local PCI configuration cache to read and write BARs. Signed-off-by: Zhang, Xiong Y <xiong.y.zhang@intel.com> Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-07-24 11:55:08 +02:00
Samuel Ortiz	2cec3aad7f	vfio: VFIO API wrappers and helpers The Virtual Function I/O (VFIO) kernel subsystem exposes a vast and relatively complex userspace API. This commit abstracts and simplifies this API into both an internal and external API. The external API is to be consumed by VFIO device implementation through the VfioDevice structure. A VfioDevice instance can: - Enable and disable all interrupts (INTX, MSI and MSI-X) on the underlying VFIO device. - Read and write all of the VFIO device memory regions. - Set the system's IOMMU tables for the underlying device. Signed-off-by: Zhang, Xiong Y <xiong.y.zhang@intel.com> Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-07-24 11:55:08 +02:00
Samuel Ortiz	5372554ed4	vfio-bindings: Initial commit The default bindings are generated from the 5.0.0 Linux userspace API. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-07-24 11:55:08 +02:00
Samuel Ortiz	4e48309660	vm: Factorize all virtio devices creation routines Our DeviceManager::new() routine is reaching north of 250 lines. For simplicity and readability's sake, extract all virtio devices creation code into their own routines. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-07-23 08:41:37 +01:00
fazlamehrab	8ba54af71d	vm-virtio: Add integration test for virtio console device Two integration tests are added for testing the implemented virtio console device for single port operation. One checks the presence and the simple stdout operation. The other test checks the stdout on file (option: file) using virtio console. Signed-off-by: A K M Fazla Mehrab <fazla.mehrab.akm@intel.com>	2019-07-22 23:08:56 +01:00
fazlamehrab	24438e0390	vm-virtio: Enable the vmm support for virtio-console To use the implemented virtio console device, the users can select one of the three options ("off", "tty" or "file=/path/to/the/file") with the command line argument "--console". By default, the console is enabled as a device named "hvc0" (option: tty). When "off" option is used, the console device is not added to the VM configuration at all. Signed-off-by: A K M Fazla Mehrab <fazla.mehrab.akm@intel.com>	2019-07-22 23:08:56 +01:00
fazlamehrab	577d44c8eb	vm-virtio: Add virtio console device for single port operation The virtio console device is a console for the communication between the host and guest userspace. It has two parts: the device and the driver. The console device is implemented here as a virtio-pci device to the guest. On the other side, the guest OS is expected to have a character device driver which provides an interface to the userspace applications. The console device can have multiple ports where each port has one transmit queue and one receive queue. The current implementation only supports one port. For data IO communication, one or more empty buffers are placed in the receive queue for incoming data, and outgoing characters are placed in the transmit queue. The detailed spec can be found at the following link. https://docs.oasis-open.org/virtio/virtio/v1.1/csprd01/virtio-v1.1-csprd01.pdf#e7 Apart from the console, for the communication between guest and host, the Cloud Hypervisor has a legacy serial device implemented. However, the implementation of a console device lets us be independent of legacy pin-based interrupts without losing the logs and access to the VM. Signed-off-by: A K M Fazla Mehrab <fazla.mehrab.akm@intel.com>	2019-07-22 23:08:56 +01:00
Sebastien Boeuf	f98a69f42e	vm-allocator: Introduce an MMIO hole address allocator With this new AddressAllocator as part of the SystemAllocator, the VMM can now decide with finer granularity where to place memory. By allocating the RAM and the hole into the MMIO address space, we ensure that no memory will be allocated by accident where the RAM or where the hole is. And by creating the new MMIO hole address space, we create a subset of the entire MMIO address space where we can place 32 bits BARs for example. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-07-22 09:51:16 -07:00
Sebastien Boeuf	a761b820c7	vm-allocator: Fix the aligned address check The requested address for a range can be the base of the entire address space, this is a valid use case. In particular, when creating an MMIO address space of 0-64GiB, we might want to create a range of 0-1GiB if the RAM of our VM is 1G. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-07-22 09:51:16 -07:00
Sebastien Boeuf	709148803e	vm-allocator: Fix free range allocation This patch fixes the function first_available_range() responsible for finding the first range that could fit the requested size. The algorithm was working, that is allocating ranges from the end of the address space because we created an empty region right at the end. But the problem is, the VMM might request for some specific allocations at fixed address to allocate the RAM for example. In this case, the RAM range could be 0-1GiB, which means with the previous algorithm, the new available range would have been found right after 1GiB. This is not the intended behavior, and that's why the algorithm has been fixed by this patch, making sure to walk down existing ranges starting from the end. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-07-22 09:51:16 -07:00
Samuel Ortiz	0a04a950a1	vm-allocator: Expand the IRQ allocation API to support GSI GSI (Global System Interrupt) is an extension of just a linear array of IRQs. It takes IOAPICs into account for example. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-07-22 09:51:16 -07:00
Chao Peng	96fb38a5aa	vm-allocator: Align address at allocation time There is alignment support for AddressAllocator but there are occasions when the alignment is known only when we call allocate(). One example is a PCI BAR, which is naturally aligned, meaning we have to align the base address to its size. Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>	2019-07-22 09:51:16 -07:00
Chao Peng	af7cd74e04	vm-allocator: Make port IO non optional This is only for allocating the port IO address range. If a platform does not have PIO devices at all, the address range will simply be unused. So, simplify the vm-allocator data structure by making both MMIO and PIO mandatory. Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>	2019-07-22 09:51:16 -07:00
Sebastien Boeuf	1268165040	pci: Allow for registering IO and Memory BAR This patch adds the support for both IO and Memory BARs by expecting the function allocate_bars() to identify the type of each BAR. Based on the type, register_mapping() inserts the address range on the appropriate bus (PIO or MMIO). Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-07-22 09:50:10 -07:00
Sebastien Boeuf	b157181656	pci: Fix the way PCI configuration registers are being written The way the function write_reg() was implemented, it was not keeping the bits supposed to be read-only whenever the guest was writing to one of those. That's why this commit takes care of protecting those bits, preventing them from being updated. The tricky part is about the BARs since we also need to handle the very specific case where the BAR is being written with all 1's. In that case we want to return the size of the BAR instead of its address. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-07-22 09:50:10 -07:00
Sebastien Boeuf	185b1082fb	pci: Add a helper to set the BAR type A BAR can be three different types: IO, 32 bits Memory, or 64 bits Memory. The VMM needs a way to set the right type depending on its needs. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-07-22 09:50:10 -07:00
Sebastien Boeuf	ee39e46568	pci: Add MSI capability structure In order to support use cases that require MSI, the pci crate is being expanded with the description of an MSI PCI capability structure through the new MsiCap Rust structure. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-07-22 09:50:10 -07:00
Sebastien Boeuf	72007f016a	pci: Improve MSI-X code to let VFIO rely on it This commit enhances the current msi-x code hosted in the pci crate in order to be reused by the vfio crate. Specifically, it creates several useful methods for the MsixCap structure that can simplify the caller's code. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-07-22 09:50:10 -07:00
Samuel Ortiz	29878956bd	pci: Implement the From trait for the PciCapabilityID structure This will be needed by the VFIO crate for managing MSI capabilities. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-07-22 09:50:10 -07:00
Rob Bradford	3f02ccaa8c	qcow: Add support for QCOW v2 header The QCOW2 format is documented here: https://git.qemu.org/?p=qemu.git;a=blob;f=docs/interop/qcow2.txt;hb=HEAD The only difference between v2 and v3 is the addition of some extra fields into the header in v3 for which there are default values in v2. This introduces a new unit test for the behaviour but it has been manually verified by converting the image from v3 to v2 with a command like: qemu-img convert -O qcow2 -o compat=0.10 clear-29620-cloud.img clear-29620-cloud.img.v2 Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-07-19 17:21:54 +02:00
Rob Bradford	6f65f3406e	build: Ensure caps needed for unit test are set In some situations it is possible for the setting of the capabilities to fail due to the variable naming of the build artifacts resulting in the first parameter to setcap being rejected and thus the whole command failing. Use xargs -n 1 to ensure that every potential target independently has its caps set. Further it was observed that in some situations the binary produced by cargo test --all --no-run would not be used and instead a new binary would be produced when the test was run using the second method. This again would result in test failures as that binary did not have the desired capabilities set. Therefore build the test binaries with the same methodology used to run them. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-07-19 12:00:10 +02:00
Rob Bradford	998140f1b0	tests: Remove single test limit Run the tests with default parallelisation. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-07-18 18:01:18 +02:00
Rob Bradford	492ab7a1a8	build: Use tmpfs for /tmp On the Jenkins build slaves disk I/O is a bottleneck so make /tmp a tmpfs which removes I/O issues when running lots of VMs at the same time. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-07-18 18:01:18 +02:00
Rob Bradford	80f33113cb	tests: Use incrementing IP and mac address for VMs This allows us to test multiple VMs at once. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-07-18 18:01:18 +02:00
Rob Bradford	93c2099ab6	tests: Abstract guest management under a struct Create a struct to handle all the details for the guest under test including details of network and disks. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-07-18 18:01:18 +02:00
Rob Bradford	eab639efe3	tests: Support customising the cloud-init network details Allow replacement of the network details used for the VM. By replacing those from the file checked into the source tree we can continue to use the file in the tree for manual testing but adjust the network per-VM to allow parallel testing. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-07-18 18:01:18 +02:00
Rob Bradford	e9f01740e8	tests: Create cloud-init image from source files in tests In the future this will provide the basis for the ability to customise the cloud-init file per VM. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-07-18 18:01:18 +02:00
Rob Bradford	0776d9d7ae	tests: Sleep more in order to speed up tests By sleeping more earlier this will speed up the tests as the SSH connection will complete on the first attempt and thus alleviate timeout and backoff delays. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-07-18 18:01:18 +02:00
Rob Bradford	7ebfe90985	tests: Use a temporary directory for the temporary test files Use the tempdir crate to create a temporary directory that is deleted when the structure goes out of scope. Use this temporary directory for all temporary test files created by the tests. The cloud init file is still in /tmp as that is created by the test wrapper code. This is the first stage towards being able to run the integration tests in parallel. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-07-18 18:01:18 +02:00
Rob Bradford	78fe807284	build: Run unit tests on the Jenkins server The addition of [workspace] to the top level Cargo.toml is necessary to have the binaries colocated together. The Cargo.lock files have also been refreshed by the change to the Cargo.toml. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-07-16 17:09:05 +02:00
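
Commit 927861ced2 above ("pci: Fix end of address space check") notes that the last address of a region is the start address plus (region size - 1), not start plus size. The following is a minimal Rust sketch of that arithmetic only. The function names `region_end` and `fits_in_space` are hypothetical and are not taken from the cloud-hypervisor source; the overflow handling via `checked_add` is likewise an assumption about how one might write such a check, not the project's actual code.

```rust
/// Inclusive end address of a region (hypothetical helper).
/// Returns `None` for an empty region or if the region would run past
/// the top of the 64-bit address space.
fn region_end(start: u64, size: u64) -> Option<u64> {
    if size == 0 {
        return None;
    }
    // The last byte lives at start + (size - 1); start + size is the
    // address right after the end, which is what the commit describes
    // as the wrong value to check.
    start.checked_add(size - 1)
}

/// True if the region [start, start + size - 1] ends at or below
/// `space_end`, the inclusive last address of the address space
/// (hypothetical helper).
fn fits_in_space(start: u64, size: u64, space_end: u64) -> bool {
    match region_end(start, size) {
        Some(end) => end <= space_end,
        None => false,
    }
}
```

With these definitions, a 4 KiB region starting at 0xffff_f000 fits in a 32-bit address space whose last address is 0xffff_ffff, whereas comparing `start + size` against that limit would wrongly reject it.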

8413 Commits