cloud-hypervisor

mirror of https://github.com/cloud-hypervisor/cloud-hypervisor.git synced 2024-12-29 00:55:18 +00:00

Author	SHA1	Message	Date
Michael Zhao	c445513976	hypervisor: Refactor `core_registers` on AArch64 On AArch64, the functions `core_registers` and `set_core_registers` are the equivalent of `get/set_regs` on x86_64. Now the names are aligned. This will benefit supporting `gdb`. Signed-off-by: Michael Zhao <michael.zhao@arm.com>	2022-07-14 22:55:19 +08:00
Wei Liu	0e8769d76a	device_manager: assert passthrough_device has the correct type There is a lot of unsafe code in such a small function. Add an assert to help detect issues earlier. Signed-off-by: Wei Liu <liuwe@microsoft.com>	2022-07-14 08:09:50 +01:00
Wei Liu	84bbaf06d1	hypervisor: turn boot_msr_entries into a trait method This allows dispatching to either KVM or MSHV automatically. No functional change. Signed-off-by: Wei Liu <liuwe@microsoft.com>	2022-07-08 16:49:58 +01:00
Rob Bradford	121729a3b0	vmm: Split signal handling for VM and VMM signals The VM specific signal (currently only SIGWINCH) should only be handled when the VM is running. The generic VMM signals (SIGINT and SIGTERM) need handling at all times. Split the signal handling into two separate threads which have differing lifetimes. Tested by: 1.) Boot full VM and check resize handling (SIGWINCH) works & sending SIGTERM leads to cleanup (tested that API socket is removed.) 2.) Start without a VM and send SIGTERM/SIGINT and observe cleanup (API socket removed) 3.) Boot full VM, delete VM and observe 2.) holds. 4.) Boot full VM, delete VM, recreate VM and observe 1.) holds. Fixes: #4269 Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-07-08 15:15:46 +01:00
Rob Bradford	93237f0106	vmm: Set MADT "Online Capable" flag The Linux kernel now checks for this before marking CPUs as hotpluggable: commit aa06e20f1be628186f0c2dcec09ea0009eb69778 Author: Mario Limonciello <mario.limonciello@amd.com> Date: Wed Sep 8 16:41:46 2021 -0500 x86/ACPI: Don't add CPUs that are not online capable A number of systems are showing "hotplug capable" CPUs when they are not really hotpluggable. This is because the MADT has extra CPU entries to support different CPUs that may be inserted into the socket with different numbers of cores. Starting with ACPI 6.3 the spec has an Online Capable bit in the MADT used to determine whether or not a CPU is hotplug capable when the enabled bit is not set. Link: https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/05_ACPI_Software_Programming_Model/ACPI_Software_Programming_Model.html?#local-apic-flags Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-07-01 18:45:05 +01:00
Rob Bradford	adf5881757	build: #[allow(clippy::significant_drop_in_scrutinee)] in some crates This check is new in the beta version of clippy and exists to avoid potential deadlocks by highlighting when the test in an if or for loop is something that holds a lock. In many cases we would need to make significant refactorings to be able to pass this check so disable in the affected crates. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-06-30 20:50:45 +01:00
Rob Bradford	b57d7b258d	build: Fix beta clippy issue (needless_return) warning: unneeded `return` statement --> pci/src/vfio_user.rs:627:13 \| 627 \| / return Err(std::io::Error::new( 628 \| \| std::io::ErrorKind::Other, 629 \| \| format!("Region not found for 0x{:x}", gpa), 630 \| \| )); \| \|_______________^ \| = note: `#[warn(clippy::needless_return)]` on by default = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_return help: remove `return` \| 627 ~ Err(std::io::Error::new( 628 + std::io::ErrorKind::Other, 629 + format!("Region not found for 0x{:x}", gpa), 630 + )) \| Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-06-30 20:50:45 +01:00
Rob Bradford	2716bc3311	build: Fix beta clippy issue (derive_partial_eq_without_eq) warning: you are deriving `PartialEq` and can implement `Eq` --> vmm/src/serial_manager.rs:59:30 \| 59 \| #[derive(Debug, Clone, Copy, PartialEq)] \| ^^^^^^^^^ help: consider deriving `Eq` as well: `PartialEq, Eq` \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#derive_partial_eq_without_eq Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-06-30 20:50:45 +01:00
Rob Bradford	2e664dca64	vmm: Always reset the console mode on VMM exit Tested: 1. SIGTERM based 2. VM shutdown/poweroff 3. Injected VM boot failure after calling Vm::setup_tty() Fixes: #4248 Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-06-28 16:45:27 +01:00
Rob Bradford	65ec6631fb	vmm: cpu: Store the vCPU snapshots in ascending order The snapshots are stored in a BTree which is ordered however as the ids are strings lexical ordering places "11" ahead of "2". So encode the vCPU id with zero padding so it is lexically sorted. This fixes issues with CPU restore on aarch64. See: #4239 Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-06-27 16:20:57 +01:00
Wei Liu	bccd7c7e48	vmm: drop Sync+Send bounds for EndpointHandler Signed-off-by: Wei Liu <liuwe@microsoft.com>	2022-06-20 23:28:57 +01:00
Wei Liu	8fa1098629	vmm: switch from lazy_static to once_cell Once_cell does not require using macro and is slated to become part of Rust std at some point. Signed-off-by: Wei Liu <liuwe@microsoft.com>	2022-06-20 16:03:07 +01:00
Sebastien Boeuf	335a4e1cc0	vmm: api: Expose kvm_hyperv parameter in OpenAPI description Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-06-17 15:11:53 +01:00
Sebastien Boeuf	81ba70a497	pci, vmm: Defer mapping VFIO MMIO regions on restore When restoring a VM, the restore codepath will take care of mapping the MMIO regions based on the information from the snapshot, rather than having the mapping performed during device creation. When the device is created, information such as which BARs contain the MSI-X tables is missing, which prevents the MMIO regions from being mapped. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-06-09 09:19:58 +02:00
Sebastien Boeuf	7df7061610	pci, vmm: Add migratable support to vfio-user devices Based on recent changes to VfioUserPciDevice, the vfio-user devices can now be migrated. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-06-09 09:19:58 +02:00
Sebastien Boeuf	c021dda267	pci, vmm: Add migratable support to VFIO devices Based on recent changes to VfioPciDevice, the VFIO devices can now be migrated. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-06-09 09:19:58 +02:00
Rob Bradford	94fb9f817d	vmm: Fix clippy issues under "guest_debug" feature Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-06-08 11:40:56 +01:00
Michael Zhao	a7a15d56dd	aarch64: Move `setup_regs` to `hypervisor` `setup_regs` of AArch64 calls KVM specific code. Now move it to `hypervisor` crate. Signed-off-by: Michael Zhao <michael.zhao@arm.com>	2022-06-06 11:07:46 +01:00
Sebastien Boeuf	65dc1c83a9	vmm: cpu: Save and restore CPU states during snapshot/restore Based on recent KVM host patches (merged in Linux 5.16), it's forbidden to call into KVM_SET_CPUID2 after the first successful KVM_RUN returned. That means saving CPU states during the pause sequence, and restoring these states during the resume sequence will not work with the current design starting with kernel version 5.16. In order to solve this problem, let's simply move the save/restore logic to the snapshot/restore sequences rather than the pause/resume ones. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-06-06 11:07:29 +01:00
Sebastien Boeuf	3edaa8adb6	vmm: Ensure restore matches boot sequence The vCPU is created and set after all the devices on a VM's boot. There's no reason to follow a different order on the restore codepath as this could cause some unexpected behaviors. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-06-06 11:07:17 +01:00
Michael Zhao	9260c3816e	vmm: Update unit test for GIC refactoring Signed-off-by: Michael Zhao <michael.zhao@arm.com>	2022-06-06 10:17:26 +08:00
Michael Zhao	5d45d6d0fb	vmm: Move GIC unit test to `hypervisor` crate Signed-off-by: Michael Zhao <michael.zhao@arm.com>	2022-06-06 10:17:26 +08:00
Michael Zhao	957d3a7443	aarch64: Simplify GIC related structs definition Combined the `GicDevice` struct in `arch` crate and the `Gic` struct in `devices` crate. After moving the KVM specific code for GIC in `arch`, a very thin wrapper layer `GicDevice` was left in `arch` crate. It is easy to combine it with the `Gic` in `devices` crate. Signed-off-by: Michael Zhao <michael.zhao@arm.com>	2022-06-06 10:17:26 +08:00
Michael Zhao	04949755c0	arch: Switch to new GIC interface Signed-off-by: Michael Zhao <michael.zhao@arm.com>	2022-06-06 10:17:26 +08:00
Rob Bradford	ade3a9c8f6	virtio-devices, vmm: Optimised async virtio device activation In order to ensure that the virtio device thread is spawned from the vmm thread we use an asynchronous activation mechanism for the virtio devices. This change optimises that code so that we do not need to iterate through all virtio devices on the platform in order to find the one that requires activation. We solve this by creating a separate short lived VirtioPciDeviceActivator that holds the required state for the activation (e.g. the clones of the queues) this can then be stored onto the device manager ready for asynchronous activation. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-06-01 09:42:02 +02:00
Yi Wang	dbeb922882	doc: add vm coredump support Signed-off-by: Yi Wang <wang.yi59@zte.com.cn> Co-authored-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-05-30 13:41:40 +02:00
Yi Wang	8b585b96c1	vmm: enable coredump Based on the newly added guest_debug feature, this patch adds http endpoint support. Signed-off-by: Yi Wang <wang.yi59@zte.com.cn> Co-authored-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-05-30 13:41:40 +02:00
Yi Wang	ccb604e1e1	vmm: add cpu segment note for coredump The crash tool uses a special note segment named 'QEMU' to analyze kaslr info and so on. If we don't add the 'QEMU' note segment, the crash tool can't find the Linux version to move on. For now, the most convenient way is to add the 'QEMU' note segment to make the crash tool happy. Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>	2022-05-30 13:41:40 +02:00
Yi Wang	0e65ca4a6c	vmm: save guest memory for coredump Guest memory is needed for analysis in crash tool, so save it for coredump. Signed-off-by: Yi Wang <wang.yi59@zte.com.cn> Co-authored-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-05-30 13:41:40 +02:00
Yi Wang	7e280b6f70	vmm: save elf header for coredump The vmcore file of guest is an elf format, so the first step of coredump is to save the elf header. Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>	2022-05-30 13:41:40 +02:00
Yi Wang	90034fd6ba	vmm: add GuestDebuggable trait It's useful to dump the guest, which is called a coredump, so that the crash tool can be used to analyze it when the guest hangs. Let's first add the GuestDebuggable trait and Coredumpxxx error to support coredump. Signed-off-by: Yi Wang <wang.yi59@zte.com.cn> Co-authored-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-05-30 13:41:40 +02:00
Rob Bradford	465db7f08c	vmm: config: Remove mergeable option from PmemConfig Fixes: #3968 Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-05-27 09:48:49 +02:00
Rob Bradford	55c5961f43	vmm: config: Remove dax & cache_size options from FsConfig Fixes: #3889 Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-05-27 09:47:13 +02:00
Rob Bradford	7c3582b4a8	vmm: config: Fix error message regarding use of cache size without dax The error message incorrectly said that the user was trying to combine cache_size without dax whereas it is only usable with dax. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-05-27 09:47:13 +02:00
Rob Bradford	979797786d	vmm: Remove DAX cache setup for virtio-fs devices Remove the code from the DeviceManager that prepares the DAX cache since the functionality has now been removed. Fixes: #3889 Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-05-27 09:47:13 +02:00
Michael Zhao	0fd6521759	aarch64: Avoid depending on `layout` in GIC code Removing the dependency on `layout` helps moving GIC code into `hypervisor` crate. Signed-off-by: Michael Zhao <michael.zhao@arm.com>	2022-05-27 10:57:50 +08:00
Michael Zhao	3fe20cc09a	aarch64: Remove `GicDevice` trait `GicDevice` trait was defined for the common part of GicV3 and ITS. Now that the standalone GicV3 does not exist, `GicDevice` is not needed. Signed-off-by: Michael Zhao <michael.zhao@arm.com>	2022-05-27 10:57:50 +08:00
Rob Bradford	fa07d83565	Revert "virtio-devices, vmm: Optimised async virtio device activation" This reverts commit `f160572f9d`. There has been increased flakiness around the live migration tests since this was merged. Speculatively reverting to see if there is increased stability. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-05-21 21:27:33 +01:00
Rob Bradford	f160572f9d	virtio-devices, vmm: Optimised async virtio device activation In order to ensure that the virtio device thread is spawned from the vmm thread we use an asynchronous activation mechanism for the virtio devices. This change optimises that code so that we do not need to iterate through all virtio devices on the platform in order to find the one that requires activation. We solve this by creating a separate short lived VirtioPciDeviceActivator that holds the required state for the activation (e.g. the clones of the queues) this can then be stored onto the device manager ready for asynchronous activation. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-05-20 17:07:13 +01:00
Sebastien Boeuf	49db713124	virtio-devices, vmm: Remove unused macro rules Latest cargo beta version raises warnings about unused macro rules. Simply remove them to fix the beta build. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-05-20 09:59:43 +01:00
Maksym Pavlenko	3a0429c998	cargo: Clean up serde dependencies There is no need to include serde_derive separately, as it can be specified as serde feature instead. Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>	2022-05-18 08:21:19 +02:00
Rob Bradford	16a9882153	vmm: cpu: tdx: Don't use fd suffix for something not an FD The hypervisor::Vcpu is the abstraction over the fd. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-05-13 15:39:22 +02:00
Rob Bradford	218be2642e	hypervisor: Explicitly `pub use` at the hypervisor crate top-level Explicitly re-export types from the hypervisor specific modules. This makes it much clearer what the common functionality that is exposed is. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-05-13 15:39:22 +02:00
Rob Bradford	cd0df05808	vmm, arch: CpuId is x86_64 specific so import from the x86_64 module It will be removed as a top-level export from the hypervisor crate. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-05-13 15:39:22 +02:00
Rob Bradford	d3f66f8702	hypervisor: Make vm module private And thus only export what is necessary through a `pub use`. This is consistent with some of the other modules and makes it easier to understand what the external interface of the hypervisor crate is. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-05-13 15:39:22 +02:00
Rob Bradford	b1bd87df19	vmm: Simplify MsiInterruptManager generics By taking advantage of the fact that IrqRoutingEntry is exported by the hypervisor crate (that is typedef'ed to the hypervisor specific version) then the code for handling the MsiInterruptManager can be simplified. This is particularly useful if in the future it is not a typedef but rather a wrapper type. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-05-11 11:19:14 +01:00
Rob Bradford	3f9e8d676a	hypervisor: Move creation of irq routing struct to hypervisor crate This removes the requirement to leak as many datastructures from the hypervisor crate into the vmm crate. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-05-11 11:19:14 +01:00
Rob Bradford	c2c813599d	vmm: Don't use kvm_ioctls directly The IoEventAddress is re-exported through the crate at the top-level. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-05-10 15:57:43 +01:00
Rob Bradford	387d56879b	vmm, hypervisor: Clean up nomenclature around offloading VM operations The trait and functionality is about operations on the VM rather than the VMM so should be named appropriately. This clashed with the existing struct for the concrete implementation, which was renamed appropriately. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-05-10 13:10:01 +01:00
Sebastien Boeuf	5f722d0d3f	vmm: Fix loading RAW firmware Whenever going through the codepath of loading a RAW firmware, we always add an extra RAM region to the guest memory through the memory manager. But we must be careful to use the updated guest memory rather than a previous reference that wasn't containing the new region, as this can lead to the following error: VmBoot(FirmwareLoad(InvalidGuestAddress(GuestAddress(4290772992)))) This is fixed by the current patch, getting the latest reference onto the guest memory from the memory manager right after the new region has been added. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-05-06 18:13:28 +02:00
Bo Chen	42c19e14c5	vmm: Add 'shutdown()' to vCPU seccomp filter This is required when hot-removing a vfio-user device. Details code path below: Thread 6 "vcpu0" received signal SIGSYS, Bad system call. [Switching to Thread 0x7f8196889700 (LWP 2358305)] 0x00007f8196dae7ab in shutdown () at ../sysdeps/unix/syscall-template.S:78 78 T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS) (gdb) bt 0x00007f8196dae7ab in shutdown () at ../sysdeps/unix/syscall-template.S:78 0x000056189240737d in std::sys::unix::net::Socket::shutdown () at library/std/src/sys/unix/net.rs:383 std::os::unix::net::stream::UnixStream::shutdown () at library/std/src/os/unix/net/stream.rs:479 0x000056189210e23d in vfio_user::Client::shutdown (self=0x7f8190014300) at vfio_user/src/lib.rs:787 0x00005618920b9d02 in <pci::vfio_user::VfioUserPciDevice as core::ops::drop::Drop>::drop ( self=0x7f819002d7c0) at pci/src/vfio_user.rs:551 0x00005618920b8787 in core::ptr::drop_in_place<pci::vfio_user::VfioUserPciDevice> () at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/core/src/ptr/mod.rs:188 0x00005618920b92e3 in core::ptr::drop_in_place<core::cell::UnsafeCell<dyn pci::device::PciDevice>> () at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/core/src/ptr/mod.rs:188 0x00005618920b9362 in core::ptr::drop_in_place<std::sync::mutex::Mutex<dyn pci::device::PciDevice>> () at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/core/src/ptr/mod.rs:188 0x00005618920d8a3e in alloc::sync::Arc<T>::drop_slow (self=0x7f81968852b8) at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/alloc/src/sync.rs:1092 0x00005618920ba273 in <alloc::sync::Arc<T> as core::ops::drop::Drop>::drop (self=0x7f81968852b8) at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/alloc/src/sync.rs:1688 0x00005618920b76fb in core::ptr::drop_in_place<alloc::sync::Arc<std::sync::mutex::Mutex<dyn pci::device::PciDevice>>> () at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/core/src/ptr/mod.rs:188 
0x0000561891b5e47d in vmm::device_manager::DeviceManager::eject_device (self=0x7f8190009600, pci_segment_id=0, device_id=3) at vmm/src/device_manager.rs:4000 0x0000561891b674bc in <vmm::device_manager::DeviceManager as vm_device::bus::BusDevice>::write ( self=0x7f8190009600, base=70368744108032, offset=8, data=&[u8](size=4) = {...}) at vmm/src/device_manager.rs:4625 0x00005618921927d5 in vm_device::bus::Bus::write (self=0x7f8190006e00, addr=70368744108040, data=&[u8](size=4) = {...}) at vm-device/src/bus.rs:235 0x0000561891b72e10 in <vmm::vm::VmOps as hypervisor::vm::VmmOps>::mmio_write ( self=0x7f81900097b0, gpa=70368744108040, data=&[u8](size=4) = {...}) at vmm/src/vm.rs:378 0x0000561892133ae2 in <hypervisor::kvm::KvmVcpu as hypervisor::cpu::Vcpu>::run ( self=0x7f8190013c90) at hypervisor/src/kvm/mod.rs:1114 0x0000561891914e85 in vmm::cpu::Vcpu::run (self=0x7f819001b230) at vmm/src/cpu.rs:348 0x000056189189f2cb in vmm::cpu::CpuManager::start_vcpu::{{closure}}::{{closure}} () at vmm/src/cpu.rs:953 Signed-off-by: Bo Chen <chen.bo@intel.com>	2022-05-05 15:33:26 -07:00
Sebastien Boeuf	058a61148c	vmm: Factorize net creation Since both Net and vhost_user::Net implement the Migratable trait, we can factorize the common part to simplify the code related to the net creation. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-05-05 13:08:41 +02:00
Sebastien Boeuf	425902b296	vmm: Factorize disk creation Since both Block and vhost_user::Blk implement the Migratable trait, we can factorize the common part to simplify the code related to the disk creation. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-05-05 13:08:41 +02:00
Sebastien Boeuf	54f39aa8cb	vmm: Validate vhost-user-block/net are not configured with iommu=on Extend the validate() function for both DiskConfig and NetConfig so that we return an error if a vhost-user-block or vhost-user-net device is expected to be placed behind the virtual IOMMU. Since these devices don't support this feature, we can't allow iommu to be set to true in these cases. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-05-05 13:08:41 +02:00
Rob Bradford	707cea2182	vmm, devices: Move logging of 0x80 timestamp to its own device This is a cleaner approach to handling the I/O port write to 0x80. Whilst doing this also generate the timestamp at the start of the VM creation. For consistency use the same timestamp for the ARM equivalent. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-05-04 23:02:53 +01:00
Rob Bradford	c47e3b8689	gdb: Do not use VmmOps for memory manipulation We don't use the VmmOps trait directly for manipulating memory in the core of the VMM as it's really designed for the MSHV crate to handle instruction decoding. As I plan to make this trait MSHV specific to allow reduced locking for MMIO and PIO handling when running on KVM this use should be removed. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-05-04 11:33:02 -07:00
Bo Chen	7fe399598d	vmm: device_manager: Map MMIO regions to the guest correctly To correctly map MMIO regions to the guest, we will need to wait for valid MMIO region information which is generated from 'PciDevice::allocate_bars()' (as a part of 'DeviceManager::add_pci_device()'). Signed-off-by: Bo Chen <chen.bo@intel.com>	2022-05-04 13:53:47 +02:00
Rob Bradford	1dfe4eda5c	vmm: Prevent "internal" identifiers being used by user For devices that cannot be named by the user use the "__" prefix to identify them as internal devices. Check that any identifiers provided in the config do not clash with those internal names. This prevents the user from creating a disk such as "__serial" which would then cause a failure in unpredictable manner. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-05-04 12:34:11 +02:00
Sebastien Boeuf	6e101f479c	vmm: Ensure hotplugged device identifier is unique Whenever a device (virtio, vfio, vfio-user or vdpa) is hotplugged, we must verify the provided identifier is unique, otherwise we must return an error. Particularly, this will prevent issues with identifiers for serial, console, IOAPIC, balloon, rng, watchdog, iommu and gpio since all of these are hardcoded by the VMM. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-05-03 18:34:24 +01:00
Rob Bradford	6d4862245d	vmm: Generate event when device is removed The new event contains the BDF and the device id: { "timestamp": { "secs": 2, "nanos": 731073396 }, "source": "vm", "event": "device-removed", "properties": { "bdf": "0000:00:02.0", "id": "test-disk" } } Fixes: #4038 Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-05-03 17:10:36 +02:00
Sebastien Boeuf	a5a2e591c9	vmm: Remove FsConfig from VmConfig when unplugging fs device All hotpluggable devices were properly removed from the VmConfig when a remove-device command was issued, except for the "fs" type. Fix this lack of support as it is causing the integration tests to fail with the recent addition of verifying that identifiers are unique. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-05-02 13:26:15 +02:00
Sebastien Boeuf	677c8831af	vmm: Ensure uniqueness of generated identifiers The device identifiers generated from the DeviceManager were not guaranteed to be unique since they were not taking the list of identifiers provided through the configuration. By returning the list of unique identifiers from the configuration, and by providing it to the DeviceManager, the generation of new identifiers can rely both on the DeviceTree and the list of IDs from the configuration. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-05-02 13:26:15 +02:00
Sebastien Boeuf	634c53ea50	vmm: config: Validate provided identifiers are unique A valid configuration means we can only accept unique identifiers from the user. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-05-02 13:26:15 +02:00
LiHui	ec0c1b01c4	vmm: api: Do not delete the API socket on API server creation The socket will be safely deleted on shutdown and so it is not necessary to delete the API socket when starting the HTTP server. Fixes: #4026 Signed-off-by: LiHui <andrewli@kubesphere.io> Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-04-29 18:40:49 +01:00
Rob Bradford	f17aa3755f	vmm: Add clarifying comment about Vm::entry_point() Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-04-29 11:03:38 +01:00
Rob Bradford	744a049007	vmm: Parallelise functionality with kernel loading Move functionality earlier in the boot so as to run in parallel with the loading of the kernel. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-04-29 11:03:38 +01:00
Rob Bradford	e70bd069b3	vmm: Load kernel asynchronously Start loading the kernel as early as possible in the VM in a separate thread. Whilst it is loading other work can be carried out such as initialising the devices. The biggest performance improvement is seen with a more complex set of devices. If using e.g. four virtio-net devices then the time to start the kernel improves by 20-30ms. With the simplest configuration the improvement was of the order of 2-3ms. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-04-29 11:03:38 +01:00
Rob Bradford	bfeb3120f5	vmm: Refactor kernel loading to decouple from Vm struct This will allow the kernel to be loaded from another thread. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-04-29 11:03:38 +01:00
Rob Bradford	ce6d88d187	vmm: Merge aarch64 use statements These were in their own block and not organised lexically. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-04-29 11:03:38 +01:00
Rob Bradford	56fe4c61af	vmm: Duplicate Vm::entry_point() across architectures These will have very different implementations when asynchronously loading the kernel. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-04-29 11:03:38 +01:00
Rob Bradford	1d1a087fc5	vmm: Refactor kernel command line generation This allows the same code for generating the kernel command line to be used on both aarch64 and x86_64 when the latter starts loading the kernel asynchronously. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-04-29 11:03:38 +01:00
Rob Bradford	f1276c58d2	vmm: Commandline inject from devices is aarch64 specific This is not required for x86_64 and maintains a tight coupling between kernel loading and the DeviceManager. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-04-29 11:03:38 +01:00
Rob Bradford	da33eb5e8c	vmm: device_manager: Remove extra whitespace lines These originated from the removal of the acpi feature gate. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-04-29 11:03:38 +01:00
Fabiano Fidêncio	fdeb4f7c46	Revert "vmm, openapi: Token Bucket fields should be uint64" This reverts commit `87eed369cd`. The reason we're reverting this is that OpenAPI Specification[0] doesn't know how to deal with unsigned types. :-/ Right now the best thing to do is keep it as it is, as an int64, and try to fix OpenAPI, or even switch to swagger, as the latter knows how to properly deal with those. However, switching to swagger is far from being a 1:1 transition and will require time to experiment, thus reverting this for now seems the best approach. [0]: https://github.com/OAI/OpenAPI-Specification/blob/main/versions/3.1.0.md#data-types Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-04-28 09:26:38 +02:00
Fabiano Fidêncio	87eed369cd	vmm, openapi: Token Bucket fields should be uint64 The Token Bucket fields are, on the Cloud Hypervisor side, u64. However, we expose those as int64 in the OpenAPI YAML file. With that in mind, let's adjust the yaml file to expose those as uint64. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2022-04-27 13:16:02 +02:00
Rob Bradford	79f4c2db01	vmm: Enable virtio-iommu in VmConfig::validate() This means that the automatic enabling of the virtio-iommu will also be applied to VMs created via the API as well as the CLI. Fixes: #4016 Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-04-26 12:27:00 +01:00
Rob Bradford	bf9f79081a	vmm: Only create ACPI memory manager DSDT when resizable If using the ACPI based hotplug only memory can be added so if the hotplug RAM size is the same as the boot RAM size then do not include the memory manager DSDT entries. Also: this change simplifies the code marginally by making the HotplugMethod enum Copyable. This was identified from the following perf output: 1.78% 0.00% vmm cloud-hypervisor [.] <vmm::memory_manager::MemorySlots as acpi_tables::aml::Aml>::append_aml_bytes \| ---<vmm::memory_manager::MemorySlots as acpi_tables::aml::Aml>::append_aml_bytes <vmm::memory_manager::MemorySlot as acpi_tables::aml::Aml>::append_aml_bytes acpi_tables::aml::Name::new <acpi_tables::aml::Path as acpi_tables::aml::Aml>::append_aml_bytes __libc_malloc Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-04-26 13:07:19 +02:00
Rob Bradford	62f17ccf8c	vmm: Improve error handling for vmm::vm::Error In particular implement thiserror::Error, cleanup wording and remove unused errors. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-04-22 17:46:41 +01:00
Rob Bradford	cb03540ffd	vmm: config: Derive thiserror::Error No further changes are necessary than adding a #[derive(Error)] as there is a manual implementation of Display. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-04-22 17:46:41 +01:00
Rob Bradford	0270d697ab	vmm: cpu: Improve Error reporting Remove unused enum members, improve error messages and implement thiserror::Error. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-04-22 17:46:41 +01:00
Rob Bradford	47529796d0	arch: Improve arch::Error Remove unused error enum entries, improve wording and derive thiserror::Error. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-04-22 17:46:41 +01:00
Rob Bradford	1c786610b7	vmm: api: Don't use clashing struct name for Error Import vmm::Error as VmmError to allow the use of thiserror::Error to avoid clashing names. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-04-22 17:46:41 +01:00
Sebastien Boeuf	eb6daa2fc3	pci: Store MSI interrupt manager in VfioCommon Extend VfioCommon structure to own the MSI interrupt manager. This will be useful for implementing the restore code path. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-04-22 16:16:48 +02:00
Rob Bradford	adb3dcdc13	vmm: openapi: Add serial_number to PlatformConfig Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-04-21 17:17:08 +02:00
Rob Bradford	e972eb7c74	arch, vmm: Expose platform serial_number via SMBIOS Fixes: #4002 Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-04-21 17:17:08 +02:00
Rob Bradford	203dfdc156	vmm: config: Add "serial_number" option to "--platform" This carries a string that is exposed via DMI/SMBIOS and is particularly useful for cloud-init initialisation. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-04-21 17:17:08 +02:00
Rob Bradford	4a04d1f8f2	vmm: seccomp: Allow SYS_rseq as required by newer glibc glibc 2.35 as shipped by Fedora 36 now uses the rseq syscall. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-04-21 13:02:51 +01:00
Rob Bradford	4ca066f077	vmm: api: Simplify error reporting from HTTP to internal API calls Use a single enum member for representing errors from the internal API. This avoids the ugly duplication of the API call name in the error message: e.g. $ target/debug/ch-remote --api-socket /tmp/api resize --cpus 2 Error running command: Server responded with an error: InternalServerError: VmResize(VmResize(CpuManager(DesiredVCpuCountExceedsMax))) Becomes: $ target/debug/ch-remote --api-socket /tmp/api resize --cpus 2 Error running command: Server responded with an error: InternalServerError: ApiError(VmResize(CpuManager(DesiredVCpuCountExceedsMax))) Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-04-20 19:39:05 +01:00
Sebastien Boeuf	11e9f43305	vmm: Use new Resource type PciBar Instead of defining some very generic resources as PioAddressRange or MmioAddressRange for each PCI BAR, let's move to the new Resource type PciBar in order to make things clearer. This makes the code more readable and also removes the need for hard assumptions about the MMIO and PIO ranges. PioAddressRange and MmioAddressRange types can be used to describe everything except PCI BARs. BARs are very special as they can be relocated and have special information we want to carry along with them. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-04-19 12:54:09 -07:00
Sebastien Boeuf	89218b6d1e	pci: Replace BAR tuple with PciBarConfiguration In order to make the code more consistent and easier to read, we remove the former tuple that was used to describe a BAR, replacing it with the existing structure PciBarConfiguration. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-04-19 12:54:09 -07:00
Sebastien Boeuf	1795afadb8	vmm: Factorize algorithm finding HOB memory resources By factorizing the algorithm untangling TDVF sections from guest RAM into a dedicated function, we can write some unit tests to validate it properly achieves what we expect. Adding the "tdx" feature to the unit tests, otherwise it wouldn't get tested. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-04-19 15:23:12 +02:00
Sebastien Boeuf	5264d545dd	pci, vmm: Extend PciDevice trait to support BAR relocation By adding a new method id() to the PciDevice trait, we allow the caller to retrieve a unique identifier. This is used in the context of BAR relocation to identify the device being relocated, so that we can update the DeviceTree resources for all PCI devices (and not only VirtioPciDevice). Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-04-14 12:11:37 +02:00
Sebastien Boeuf	0c34846ef6	vmm: Return new PCI resources from add_pci_device() By returning the new PCI resources from add_pci_device(), we allow the factorization of the code translating the BARs into resources. This allows VIRTIO, VFIO and vfio-user to add the resources to the DeviceTree node. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-04-14 12:11:37 +02:00
Sebastien Boeuf	4f172ae4b6	vmm: Retrieve PCI resources for VFIO and vfio-user devices Relying on the function introduced recently to get the PCI resources and handle the restore case, both VFIO and vfio-user device creation paths now have access to PCI resources, which can be provided to the function add_pci_device(). Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-04-14 12:11:37 +02:00
Sebastien Boeuf	0f12fe9b3b	vmm: Factorize retrieval of PCI resources Create a dedicated function for getting the PCI segment, b/d/f and optional resources. This is meant for handling the potential case of a restore. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-04-14 12:11:37 +02:00
Sebastien Boeuf	6e084572d4	pci, virtio: Make virtio-pci BAR restoration more generic Update the way BAR addresses are restored for virtio-pci by providing a more generic approach that will be reused for other PciDevice implementations (i.e. VfioPcidevice and VfioUserPciDevice). Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-04-14 12:11:37 +02:00
Rob Bradford	b212f2823d	vmm: Deprecate mergeable option from virtio-pmem KSM would never merge the file backed pages so this option has no effect. See: #3968 Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-04-12 07:12:25 -07:00
Rob Bradford	ed87e42e6f	vm-device, pci, devices: Remove InterruptSourceGroup::{un}mask The calls to these functions are always preceded by a call to InterruptSourceGroup::update(). By adding a masked boolean to that function call it is possible to remove 50% of the calls to the KVM_SET_GSI_ROUTING ioctl, as the update will correctly handle the masked or unmasked case. This causes the ioctl to disappear from the perf report for a boot of the VM. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-04-11 22:56:48 +01:00
Michael Zhao	d1b2a3fca9	aarch64: Add a memory-simulated flash for UEFI EDK2 execution requires a flash device at address 0. The newly added device is not a fully functional flash. It doesn't implement any flash device spec. Instead, a piece of memory is simply used to simulate the flash. Signed-off-by: Michael Zhao <michael.zhao@arm.com>	2022-04-11 09:51:34 +01:00
Michael Zhao	298a5580a9	aarch64: Remove unnecessary function definitions This is a refactoring commit to simplify source code. Removed some functions that only return a layout const. Signed-off-by: Michael Zhao <michael.zhao@arm.com>	2022-04-08 11:08:43 -07:00


1720 Commits