cloud-hypervisor

mirror of https://github.com/cloud-hypervisor/cloud-hypervisor.git synced 2024-12-27 16:15:19 +00:00

Author	SHA1	Message	Date
lizhaoxin1	bc3a276b43	arch, vmm: Expose platform uuid via SMBIOS Parse and set uuid. Signed-off-by: lizhaoxin1 <Lxiaoyouling@163.com> Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-08-04 09:20:06 +02:00
lizhaoxin1	3abc1e1e51	vmm: config: Add "uuid" option to "--platform" The uuid indicates the unique ID of a virtual machine. cloud-hypervisor takes the uuid passed by libvirt and uses it to initialize cloud-init. Signed-off-by: lizhaoxin1 <Lxiaoyouling@163.com>	2022-08-04 09:20:06 +02:00
Bo Chen	1125fd2667	vmm: api: Use 'BTreeMap' for 'HttpRoutes' In this way, we get the values sorted by key by default, which is useful for the 'http_api' fuzzer. Signed-off-by: Bo Chen <chen.bo@intel.com>	2022-08-03 10:18:24 +01:00
Bo Chen	eb056d374a	vmm: Make 'EpollContext::add_event()' public So that it can be reused by other crates, e.g. from fuzz targets. Signed-off-by: Bo Chen <chen.bo@intel.com>	2022-08-03 10:18:24 +01:00
Sebastien Boeuf	4d74525bdc	vmm: Remove unused "poll_queue" from DiskConfig The parameter "poll_queue" was useful at the time Cloud Hypervisor was responsible for spawning vhost-user backends, as it carried the information about whether the vhost-user-block backend should have this option enabled. We moved away from this design quite some time ago, as we now expect a management layer to be responsible for running vhost-user backends. That's the reason why we can remove "poll_queue" from the DiskConfig structure. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-08-02 15:10:11 +02:00
Michael Zhao	7199119bb2	hypervisor: Remove `Vcpu::read_mpidr()` on AArch64 Replaced `read_mpidr()` with `get_sys_reg()`. Signed-off-by: Michael Zhao <michael.zhao@arm.com>	2022-07-29 11:45:12 +01:00
Michael Zhao	cd7f36a713	hypervisor: Remove `get/set_reg()` on AArch64 `Vcpu::get/set_reg()` were only invoked in Vcpu itself. Signed-off-by: Michael Zhao <michael.zhao@arm.com>	2022-07-29 11:45:12 +01:00
Michael Zhao	f7b6d99c2d	hypervisor: Remove `get/set_sys_regs()` on AArch64 `hypervisor::Vcpu::get/set_sys_regs()` are only used in Vcpu internally. Signed-off-by: Michael Zhao <michael.zhao@arm.com>	2022-07-29 11:45:12 +01:00
Rob Bradford	857edc71a9	vmm: cpu: Remove now unused CpuManager::vcpus_paused() Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-07-26 09:22:25 +02:00
Rob Bradford	0e29379bcf	vmm: Make gdb break/resuming more resilient When starting the VM such that it is already on a breakpoint (via stop_on_boot) when attached to gdb, start the vCPUs in a paused state rather than starting the vCPUs later (upon resume). Further, make the resumption/break of the VM more resilient by only attempting to resume the vCPUs if we are already in a breakpoint and only attempting to pause/break if we were already running. Fixes: #4354 Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-07-26 09:22:25 +02:00
Rob Bradford	a749182777	vmm: acpi: Use ACPI platform device addresses from DeviceManager Remove the hardcoded addresses. Also remove PM_TMR_BLK as spec compliant implementation will use X_PM_TMR_BLK over this field. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-07-25 16:16:06 +01:00
Rob Bradford	2e8eb96ef6	vmm: device_manager: Store ACPI platform addresses for later use These are ready for inclusion in the FACP table. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-07-25 16:16:06 +01:00
Wei Liu	ad33f7c5e6	vmm: return seccomp rules according to hypervisors That requires stashing the hypervisor type into various places. Signed-off-by: Wei Liu <liuwe@microsoft.com>	2022-07-22 12:50:12 +01:00
Wei Liu	a96a5d7816	hypervisor, vmm: use new vfio-ioctls Use the new vfio-ioctls APIs. Drop Cloud Hypervisor's Device trait since it is no longer needed. Signed-off-by: Wei Liu <liuwe@microsoft.com>	2022-07-21 23:37:53 +01:00
Wei Liu	f84ddedb1a	hypervisor, vmm: introduce trait functions for aarch64 PMU The original code uses kvm_device_attr directly outside of the hypervisor crate. That leaks hypervisor details. No functional change intended. Signed-off-by: Wei Liu <liuwe@microsoft.com>	2022-07-21 23:37:53 +01:00
Wei Liu	f21fc1dcb6	hypervisor: x86: provide a generic MsrEntry structure Signed-off-by: Wei Liu <liuwe@microsoft.com>	2022-07-20 10:13:41 +01:00
Wei Liu	4d2cc3778f	hypervisor: move away from MsrEntries type It is a flexible array. Switch to vector and slice instead. No functional change intended. Signed-off-by: Wei Liu <liuwe@microsoft.com>	2022-07-20 10:13:41 +01:00
Wei Liu	05e5106b9b	hypervisor x86: provide a generic LapicState structure This requires making get/set_lapic_reg part of the type. For the moment we cannot provide a default variant for the new type, because picking one will be wrong for the other hypervisor, so I just drop the test cases that requires LapicState::default(). Signed-off-by: Wei Liu <liuwe@microsoft.com>	2022-07-19 09:38:38 +01:00
Wei Liu	6a8c0fc887	hypervisor: provide a generic FpuState structure Signed-off-by: Wei Liu <liuwe@microsoft.com>	2022-07-18 22:15:30 +01:00
Wei Liu	08135fa085	hypervisor: provide a generic CpuIdEntry structure Signed-off-by: Wei Liu <liuwe@microsoft.com>	2022-07-18 22:15:30 +01:00
Wei Liu	45fbf840db	hypervisor, vmm: move away from CpuId type CpuId is an alias type for the flexible array structure type over CpuIdEntry. The type itself and the type of the element in the array portion are tied to the underlying hypervisor. Switch to using CpuIdEntry slice or vector directly. The construction of CpuId type is left to hypervisors. This allows us to decouple CpuIdEntry from hypervisors more easily. No functional change intended. Signed-off-by: Wei Liu <liuwe@microsoft.com>	2022-07-18 22:15:30 +01:00
Wei Liu	f1ab86fecb	hypervisor: x86: provide a generic SpecialRegisters structure Signed-off-by: Wei Liu <liuwe@microsoft.com>	2022-07-15 10:21:43 +01:00
Wei Liu	75797827d5	hypervisor: x86: provide a generic SegmentRegister structure And drop SegmentRegisterOps since it is no longer required. Signed-off-by: Wei Liu <liuwe@microsoft.com>	2022-07-15 10:21:43 +01:00
Wei Liu	8b7781e267	hypervisor: x86: provide a generic StandardRegisters structure We only need to do this for x86 since MSHV does not have aarch64 support yet. This reduces unnecessary code churn. Signed-off-by: Wei Liu <liuwe@microsoft.com>	2022-07-15 10:21:43 +01:00
Wei Liu	4201bf4011	hypervisor: provide a generic ClockData structure Signed-off-by: Wei Liu <liuwe@microsoft.com>	2022-07-14 22:09:04 +01:00
Wei Liu	beb4f86b82	hypervisor, vmm: drop VmState and code VmState was introduced to hold hypervisor specific VM state. KVM does not need it and MSHV does not really use it yet. Just drop the code. It can be easily revived once there is a need. Signed-off-by: Wei Liu <liuwe@microsoft.com>	2022-07-14 22:09:04 +01:00
Alyssa Ross	a455917db5	vmm: fix missed API or debug events Previously, we were assuming that every time an eventfd notified us, there was only a single event waiting for us. This meant that if, while one API request was being processed, two more arrived, the second one would not be processed (until the next one arrived, when it would be processed instead of that event, and so on). To fix this, make sure we're processing the number of API and debug requests we've been told have arrived, rather than just one. This is easy to demonstrate by sending lots of API events and adding some sleeps to make sure multiple events can arrive while each is being processed. For other uses of eventfd, like the exit event, this doesn't matter — even if we've received multiple exit events in quick succession, we only need to exit once. So I've only made this change where receiving an event is non-idempotent, i.e. where it matters that we process the event the right number of times. Technically, reset requests are also non-idempotent — there's an observable difference between a VM resetting once, and a VM resetting once and then immediately resetting again. But I've left that alone for now because two resets in immediate succession doesn't sound like something anyone would ever want to me. Signed-off-by: Alyssa Ross <hi@alyssa.is>	2022-07-14 17:44:11 +01:00
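The fix described in a455917db5 rests on the fact that reading an eventfd returns the accumulated counter, not a single notification. A minimal sketch of the idea follows, assuming a hypothetical `drain_requests` helper and a plain in-memory queue standing in for the API channel; it is an illustration, not the code from the commit.

```rust
use std::collections::VecDeque;

/// Hypothetical helper: handle as many queued requests as the eventfd
/// counter reported (`notified`), instead of assuming exactly one.
/// `pending` stands in for the API request channel.
fn drain_requests(pending: &mut VecDeque<String>, notified: u64) -> Vec<String> {
    let mut handled = Vec::new();
    for _ in 0..notified {
        match pending.pop_front() {
            // One request per counted notification.
            Some(request) => handled.push(request),
            // Nothing left to process; stop early.
            None => break,
        }
    }
    handled
}
```

Calling the helper with `notified = 1` while three requests are queued leaves two behind, which is the behaviour the commit message describes as the bug.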
Michael Zhao	2d8635f04a	hypervisor: Refactor `system_registers` on AArch64 Function `system_registers` took mutable vector reference and modified the vector content. Now change the definition to `get/set` style. And rename to `get/set_sys_regs` to align with other functions. Signed-off-by: Michael Zhao <michael.zhao@arm.com>	2022-07-14 22:55:19 +08:00
Michael Zhao	c445513976	hypervisor: Refactor `core_registers` on AArch64 On AArch64, the functions `core_registers` and `set_core_registers` are equivalent to `get/set_regs` on x86_64. Now the names are aligned. This will benefit supporting `gdb`. Signed-off-by: Michael Zhao <michael.zhao@arm.com>	2022-07-14 22:55:19 +08:00
Wei Liu	0e8769d76a	device_manager: assert passthrough_device has the correct type There is a lot of unsafe code in such a small function. Add an assert to help detect issues earlier. Signed-off-by: Wei Liu <liuwe@microsoft.com>	2022-07-14 08:09:50 +01:00
Wei Liu	84bbaf06d1	hypervisor: turn boot_msr_entries into a trait method This allows dispatching to either KVM or MSHV automatically. No functional change. Signed-off-by: Wei Liu <liuwe@microsoft.com>	2022-07-08 16:49:58 +01:00
Rob Bradford	121729a3b0	vmm: Split signal handling for VM and VMM signals The VM specific signal (currently only SIGWINCH) should only be handled when the VM is running. The generic VMM signals (SIGINT and SIGTERM) need handling at all times. Split the signal handling into two separate threads which have differing lifetimes. Tested by: 1.) Boot full VM and check resize handling (SIGWINCH) works & sending SIGTERM leads to cleanup (tested that API socket is removed.) 2.) Start without a VM and send SIGTERM/SIGINT and observe cleanup (API socket removed) 3.) Boot full VM, delete VM and observe 2.) holds. 4.) Boot full VM, delete VM, recreate VM and observe 1.) holds. Fixes: #4269 Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-07-08 15:15:46 +01:00
Rob Bradford	93237f0106	vmm: Set MADT "Online Capable" flag The Linux kernel now checks for this before marking CPUs as hotpluggable: commit aa06e20f1be628186f0c2dcec09ea0009eb69778 Author: Mario Limonciello <mario.limonciello@amd.com> Date: Wed Sep 8 16:41:46 2021 -0500 x86/ACPI: Don't add CPUs that are not online capable A number of systems are showing "hotplug capable" CPUs when they are not really hotpluggable. This is because the MADT has extra CPU entries to support different CPUs that may be inserted into the socket with different numbers of cores. Starting with ACPI 6.3 the spec has an Online Capable bit in the MADT used to determine whether or not a CPU is hotplug capable when the enabled bit is not set. Link: https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/05_ACPI_Software_Programming_Model/ACPI_Software_Programming_Model.html?#local-apic-flags Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-07-01 18:45:05 +01:00
Rob Bradford	adf5881757	build: #[allow(clippy::significant_drop_in_scrutinee)] in some crates This check is new in the beta version of clippy and exists to avoid potential deadlocks by highlighting when the test in an if or for loop is something that holds a lock. In many cases we would need to make significant refactorings to be able to pass this check so disable in the affected crates. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-06-30 20:50:45 +01:00
Rob Bradford	b57d7b258d	build: Fix beta clippy issue (needless_return) warning: unneeded `return` statement --> pci/src/vfio_user.rs:627:13 \| 627 \| / return Err(std::io::Error::new( 628 \| \| std::io::ErrorKind::Other, 629 \| \| format!("Region not found for 0x{:x}", gpa), 630 \| \| )); \| \|_______________^ \| = note: `#[warn(clippy::needless_return)]` on by default = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_return help: remove `return` \| 627 ~ Err(std::io::Error::new( 628 + std::io::ErrorKind::Other, 629 + format!("Region not found for 0x{:x}", gpa), 630 + )) \| Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-06-30 20:50:45 +01:00
Rob Bradford	2716bc3311	build: Fix beta clippy issue (derive_partial_eq_without_eq) warning: you are deriving `PartialEq` and can implement `Eq` --> vmm/src/serial_manager.rs:59:30 \| 59 \| #[derive(Debug, Clone, Copy, PartialEq)] \| ^^^^^^^^^ help: consider deriving `Eq` as well: `PartialEq, Eq` \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#derive_partial_eq_without_eq Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-06-30 20:50:45 +01:00
Rob Bradford	2e664dca64	vmm: Always reset the console mode on VMM exit Tested: 1. SIGTERM based 2. VM shutdown/poweroff 3. Injected VM boot failure after calling Vm::setup_tty() Fixes: #4248 Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-06-28 16:45:27 +01:00
Rob Bradford	65ec6631fb	vmm: cpu: Store the vCPU snapshots in ascending order The snapshots are stored in a BTree which is ordered however as the ids are strings lexical ordering places "11" ahead of "2". So encode the vCPU id with zero padding so it is lexically sorted. This fixes issues with CPU restore on aarch64. See: #4239 Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-06-27 16:20:57 +01:00
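The ordering problem in 65ec6631fb can be reproduced with a plain `BTreeMap<String, _>`. The sketch below is illustrative only: the helper names and the three-digit padding width are assumptions, not taken from the commit.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper: encode a vCPU id as a zero-padded key so that
/// string (lexical) order matches numeric order. The width of 3 is an
/// assumption chosen to cover every u8 value.
fn vcpu_snapshot_key(id: u8) -> String {
    format!("{:03}", id)
}

/// Insert the ids into a BTreeMap keyed by string and return the keys
/// in the map's iteration order.
fn ordered_keys(ids: &[u8], padded: bool) -> Vec<String> {
    let mut snapshots: BTreeMap<String, u8> = BTreeMap::new();
    for &id in ids {
        let key = if padded {
            vcpu_snapshot_key(id)
        } else {
            id.to_string()
        };
        snapshots.insert(key, id);
    }
    snapshots.keys().cloned().collect()
}
```

With unpadded keys, ids 1, 2 and 11 iterate as "1", "11", "2"; with padded keys they iterate as "001", "002", "011".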
Wei Liu	bccd7c7e48	vmm: drop Sync+Send bounds for EndpointHandler Signed-off-by: Wei Liu <liuwe@microsoft.com>	2022-06-20 23:28:57 +01:00
Wei Liu	8fa1098629	vmm: switch from lazy_static to once_cell Once_cell does not require using macro and is slated to become part of Rust std at some point. Signed-off-by: Wei Liu <liuwe@microsoft.com>	2022-06-20 16:03:07 +01:00
Sebastien Boeuf	335a4e1cc0	vmm: api: Expose kvm_hyperv parameter in OpenAPI description Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-06-17 15:11:53 +01:00
Sebastien Boeuf	81ba70a497	pci, vmm: Defer mapping VFIO MMIO regions on restore When restoring a VM, the restore codepath will take care of mapping the MMIO regions based on the information from the snapshot, rather than having the mapping performed during device creation. When the device is created, information such as which BARs contain the MSI-X tables is missing, which prevents the MMIO regions from being mapped. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-06-09 09:19:58 +02:00
Sebastien Boeuf	7df7061610	pci, vmm: Add migratable support to vfio-user devices Based on recent changes to VfioUserPciDevice, the vfio-user devices can now be migrated. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-06-09 09:19:58 +02:00
Sebastien Boeuf	c021dda267	pci, vmm: Add migratable support to VFIO devices Based on recent changes to VfioPciDevice, the VFIO devices can now be migrated. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-06-09 09:19:58 +02:00
Rob Bradford	94fb9f817d	vmm: Fix clippy issues under "guest_debug" feature Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-06-08 11:40:56 +01:00
Michael Zhao	a7a15d56dd	aarch64: Move `setup_regs` to `hypervisor` `setup_regs` of AArch64 calls KVM specific code. Now move it to `hypervisor` crate. Signed-off-by: Michael Zhao <michael.zhao@arm.com>	2022-06-06 11:07:46 +01:00
Sebastien Boeuf	65dc1c83a9	vmm: cpu: Save and restore CPU states during snapshot/restore Based on recent KVM host patches (merged in Linux 5.16), it's forbidden to call into KVM_SET_CPUID2 after the first successful KVM_RUN returned. That means saving CPU states during the pause sequence, and restoring these states during the resume sequence will not work with the current design starting with kernel version 5.16. In order to solve this problem, let's simply move the save/restore logic to the snapshot/restore sequences rather than the pause/resume ones. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-06-06 11:07:29 +01:00
Sebastien Boeuf	3edaa8adb6	vmm: Ensure restore matches boot sequence The vCPU is created and set after all the devices on a VM's boot. There's no reason to follow a different order on the restore codepath as this could cause some unexpected behaviors. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-06-06 11:07:17 +01:00
Michael Zhao	9260c3816e	vmm: Update unit test for GIC refactoring Signed-off-by: Michael Zhao <michael.zhao@arm.com>	2022-06-06 10:17:26 +08:00
Michael Zhao	5d45d6d0fb	vmm: Move GIC unit test to `hypervisor` crate Signed-off-by: Michael Zhao <michael.zhao@arm.com>	2022-06-06 10:17:26 +08:00
Michael Zhao	957d3a7443	aarch64: Simplify GIC related structs definition Combined the `GicDevice` struct in `arch` crate and the `Gic` struct in `devices` crate. After moving the KVM specific code for GIC in `arch`, a very thin wrapper layer `GicDevice` was left in `arch` crate. It is easy to combine it with the `Gic` in `devices` crate. Signed-off-by: Michael Zhao <michael.zhao@arm.com>	2022-06-06 10:17:26 +08:00
Michael Zhao	04949755c0	arch: Switch to new GIC interface Signed-off-by: Michael Zhao <michael.zhao@arm.com>	2022-06-06 10:17:26 +08:00
Rob Bradford	ade3a9c8f6	virtio-devices, vmm: Optimised async virtio device activation In order to ensure that the virtio device thread is spawned from the vmm thread we use an asynchronous activation mechanism for the virtio devices. This change optimises that code so that we do not need to iterate through all virtio devices on the platform in order to find the one that requires activation. We solve this by creating a separate short lived VirtioPciDeviceActivator that holds the required state for the activation (e.g. the clones of the queues) this can then be stored onto the device manager ready for asynchronous activation. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-06-01 09:42:02 +02:00
Yi Wang	dbeb922882	doc: add vm coredump support Signed-off-by: Yi Wang <wang.yi59@zte.com.cn> Co-authored-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-05-30 13:41:40 +02:00
Yi Wang	8b585b96c1	vmm: enable coredump Based on the newly added guest_debug feature, this patch adds http endpoint support. Signed-off-by: Yi Wang <wang.yi59@zte.com.cn> Co-authored-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-05-30 13:41:40 +02:00
Yi Wang	ccb604e1e1	vmm: add cpu segment note for coredump The crash tool uses a special note segment named 'QEMU' to analyze KASLR info and so on. Without the 'QEMU' note segment, the crash tool cannot find the Linux version and cannot proceed. For now, the most convenient approach is to add the 'QEMU' note segment to keep the crash tool happy. Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>	2022-05-30 13:41:40 +02:00
Yi Wang	0e65ca4a6c	vmm: save guest memory for coredump Guest memory is needed for analysis in crash tool, so save it for coredump. Signed-off-by: Yi Wang <wang.yi59@zte.com.cn> Co-authored-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-05-30 13:41:40 +02:00
Yi Wang	7e280b6f70	vmm: save elf header for coredump The vmcore file of guest is an elf format, so the first step of coredump is to save the elf header. Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>	2022-05-30 13:41:40 +02:00
Yi Wang	90034fd6ba	vmm: add GuestDebuggable trait It is useful to dump the guest (a coredump) so that the crash tool can be used to analyze it when the guest hangs. As a first step, add the GuestDebuggable trait and Coredumpxxx errors to support coredump. Signed-off-by: Yi Wang <wang.yi59@zte.com.cn> Co-authored-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-05-30 13:41:40 +02:00
Rob Bradford	465db7f08c	vmm: config: Remove mergeable option from PmemConfig Fixes: #3968 Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-05-27 09:48:49 +02:00
Rob Bradford	55c5961f43	vmm: config: Remove dax & cache_size options from FsConfig Fixes: #3889 Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-05-27 09:47:13 +02:00
Rob Bradford	7c3582b4a8	vmm: config: Fix error message regarding use of cache size without dax The error message incorrectly said that the user was trying to combine cache_size without dax whereas it is only usable with dax. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-05-27 09:47:13 +02:00
Rob Bradford	979797786d	vmm: Remove DAX cache setup for virtio-fs devices Remove the code from the DeviceManager that prepares the DAX cache since the functionality has now been removed. Fixes: #3889 Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-05-27 09:47:13 +02:00
Michael Zhao	0fd6521759	aarch64: Avoid depending on `layout` in GIC code Removing the dependency on `layout` helps moving GIC code into `hypervisor` crate. Signed-off-by: Michael Zhao <michael.zhao@arm.com>	2022-05-27 10:57:50 +08:00
Michael Zhao	3fe20cc09a	aarch64: Remove `GicDevice` trait `GicDevice` trait was defined for the common part of GicV3 and ITS. Now that the standalone GicV3 no longer exists, `GicDevice` is not needed. Signed-off-by: Michael Zhao <michael.zhao@arm.com>	2022-05-27 10:57:50 +08:00
Rob Bradford	fa07d83565	Revert "virtio-devices, vmm: Optimised async virtio device activation" This reverts commit `f160572f9d`. There has been increased flakiness around the live migration tests since this was merged. Speculatively reverting to see if there is increased stability. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-05-21 21:27:33 +01:00
Rob Bradford	f160572f9d	virtio-devices, vmm: Optimised async virtio device activation In order to ensure that the virtio device thread is spawned from the vmm thread we use an asynchronous activation mechanism for the virtio devices. This change optimises that code so that we do not need to iterate through all virtio devices on the platform in order to find the one that requires activation. We solve this by creating a separate short lived VirtioPciDeviceActivator that holds the required state for the activation (e.g. the clones of the queues) this can then be stored onto the device manager ready for asynchronous activation. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-05-20 17:07:13 +01:00
Sebastien Boeuf	49db713124	virtio-devices, vmm: Remove unused macro rules Latest cargo beta version raises warnings about unused macro rules. Simply remove them to fix the beta build. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-05-20 09:59:43 +01:00
Maksym Pavlenko	3a0429c998	cargo: Clean up serde dependencies There is no need to include serde_derive separately, as it can be specified as serde feature instead. Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>	2022-05-18 08:21:19 +02:00
Rob Bradford	16a9882153	vmm: cpu: tdx: Don't use fd suffix for something not an FD The hypervisor::Vcpu is the abstraction over the fd. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-05-13 15:39:22 +02:00
Rob Bradford	218be2642e	hypervisor: Explicitly `pub use` at the hypervisor crate top-level Explicitly re-export types from the hypervisor specific modules. This makes it much clearer what the common functionality that is exposed is. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-05-13 15:39:22 +02:00
Rob Bradford	cd0df05808	vmm, arch: CpuId is x86_64 specific so import from the x86_64 module It will be removed as a top-level export from the hypervisor crate. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-05-13 15:39:22 +02:00
Rob Bradford	d3f66f8702	hypervisor: Make vm module private And thus only export what is necessary through a `pub use`. This is consistent with some of the other modules and makes it easier to understand what the external interface of the hypervisor crate is. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-05-13 15:39:22 +02:00
Rob Bradford	b1bd87df19	vmm: Simplify MsiInterruptManager generics By taking advantage of the fact that IrqRoutingEntry is exported by the hypervisor crate (that is typedef'ed to the hypervisor specific version), the code for handling the MsiInterruptManager can be simplified. This is particularly useful if in the future it is not a typedef but rather a wrapper type. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-05-11 11:19:14 +01:00
Rob Bradford	3f9e8d676a	hypervisor: Move creation of irq routing struct to hypervisor crate This removes the requirement to leak as many datastructures from the hypervisor crate into the vmm crate. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-05-11 11:19:14 +01:00
Rob Bradford	c2c813599d	vmm: Don't use kvm_ioctls directly The IoEventAddress is re-exported through the crate at the top-level. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-05-10 15:57:43 +01:00
Rob Bradford	387d56879b	vmm, hypervisor: Clean up nomenclature around offloading VM operations The trait and functionality is about operations on the VM rather than the VMM so should be named appropriately. This clashed with the existing struct for the concrete implementation, which was renamed appropriately. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-05-10 13:10:01 +01:00
Sebastien Boeuf	5f722d0d3f	vmm: Fix loading RAW firmware Whenever going through the codepath of loading a RAW firmware, we always add an extra RAM region to the guest memory through the memory manager. But we must be careful to use the updated guest memory rather than a previous reference that wasn't containing the new region, as this can lead to the following error: VmBoot(FirmwareLoad(InvalidGuestAddress(GuestAddress(4290772992)))) This is fixed by the current patch, getting the latest reference onto the guest memory from the memory manager right after the new region has been added. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-05-06 18:13:28 +02:00
Bo Chen	42c19e14c5	vmm: Add 'shutdown()' to vCPU seccomp filter This is required when hot-removing a vfio-user device. Details code path below: Thread 6 "vcpu0" received signal SIGSYS, Bad system call. [Switching to Thread 0x7f8196889700 (LWP 2358305)] 0x00007f8196dae7ab in shutdown () at ../sysdeps/unix/syscall-template.S:78 78 T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS) (gdb) bt 0x00007f8196dae7ab in shutdown () at ../sysdeps/unix/syscall-template.S:78 0x000056189240737d in std::sys::unix::net::Socket::shutdown () at library/std/src/sys/unix/net.rs:383 std::os::unix::net::stream::UnixStream::shutdown () at library/std/src/os/unix/net/stream.rs:479 0x000056189210e23d in vfio_user::Client::shutdown (self=0x7f8190014300) at vfio_user/src/lib.rs:787 0x00005618920b9d02 in <pci::vfio_user::VfioUserPciDevice as core::ops::drop::Drop>::drop ( self=0x7f819002d7c0) at pci/src/vfio_user.rs:551 0x00005618920b8787 in core::ptr::drop_in_place<pci::vfio_user::VfioUserPciDevice> () at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/core/src/ptr/mod.rs:188 0x00005618920b92e3 in core::ptr::drop_in_place<core::cell::UnsafeCell<dyn pci::device::PciDevice>> () at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/core/src/ptr/mod.rs:188 0x00005618920b9362 in core::ptr::drop_in_place<std::sync::mutex::Mutex<dyn pci::device::PciDevice>> () at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/core/src/ptr/mod.rs:188 0x00005618920d8a3e in alloc::sync::Arc<T>::drop_slow (self=0x7f81968852b8) at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/alloc/src/sync.rs:1092 0x00005618920ba273 in <alloc::sync::Arc<T> as core::ops::drop::Drop>::drop (self=0x7f81968852b8) at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/alloc/src/sync.rs:1688 0x00005618920b76fb in core::ptr::drop_in_place<alloc::sync::Arc<std::sync::mutex::Mutex<dyn pci::device::PciDevice>>> () at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/core/src/ptr/mod.rs:188 
0x0000561891b5e47d in vmm::device_manager::DeviceManager::eject_device (self=0x7f8190009600, pci_segment_id=0, device_id=3) at vmm/src/device_manager.rs:4000 0x0000561891b674bc in <vmm::device_manager::DeviceManager as vm_device::bus::BusDevice>::write ( self=0x7f8190009600, base=70368744108032, offset=8, data=&[u8](size=4) = {...}) at vmm/src/device_manager.rs:4625 0x00005618921927d5 in vm_device::bus::Bus::write (self=0x7f8190006e00, addr=70368744108040, data=&[u8](size=4) = {...}) at vm-device/src/bus.rs:235 0x0000561891b72e10 in <vmm::vm::VmOps as hypervisor::vm::VmmOps>::mmio_write ( self=0x7f81900097b0, gpa=70368744108040, data=&[u8](size=4) = {...}) at vmm/src/vm.rs:378 0x0000561892133ae2 in <hypervisor::kvm::KvmVcpu as hypervisor::cpu::Vcpu>::run ( self=0x7f8190013c90) at hypervisor/src/kvm/mod.rs:1114 0x0000561891914e85 in vmm::cpu::Vcpu::run (self=0x7f819001b230) at vmm/src/cpu.rs:348 0x000056189189f2cb in vmm::cpu::CpuManager::start_vcpu::{{closure}}::{{closure}} () at vmm/src/cpu.rs:953 Signed-off-by: Bo Chen <chen.bo@intel.com>	2022-05-05 15:33:26 -07:00
Sebastien Boeuf	058a61148c	vmm: Factorize net creation Since both Net and vhost_user::Net implement the Migratable trait, we can factorize the common part to simplify the code related to the net creation. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-05-05 13:08:41 +02:00
Sebastien Boeuf	425902b296	vmm: Factorize disk creation Since both Block and vhost_user::Blk implement the Migratable trait, we can factorize the common part to simplify the code related to the disk creation. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-05-05 13:08:41 +02:00
Sebastien Boeuf	54f39aa8cb	vmm: Validate vhost-user-block/net are not configured with iommu=on Extend the validate() function for both DiskConfig and NetConfig so that we return an error if a vhost-user-block or vhost-user-net device is expected to be placed behind the virtual IOMMU. Since these devices don't support this feature, we can't allow iommu to be set to true in these cases. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-05-05 13:08:41 +02:00
Rob Bradford	707cea2182	vmm, devices: Move logging of 0x80 timestamp to its own device This is a cleaner approach to handling the I/O port write to 0x80. Whilst doing this also generate the timestamp at the start of the VM creation. For consistency use the same timestamp for the ARM equivalent. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-05-04 23:02:53 +01:00
Rob Bradford	c47e3b8689	gdb: Do not use VmmOps for memory manipulation We don't use the VmmOps trait directly for manipulating memory in the core of the VMM as it's really designed for the MSHV crate to handle instruction decoding. As I plan to make this trait MSHV specific to allow reduced locking for MMIO and PIO handling when running on KVM this use should be removed. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-05-04 11:33:02 -07:00
Bo Chen	7fe399598d	vmm: device_manager: Map MMIO regions to the guest correctly To correctly map MMIO regions to the guest, we will need to wait for valid MMIO region information which is generated from 'PciDevice::allocate_bars()' (as a part of 'DeviceManager::add_pci_device()'). Signed-off-by: Bo Chen <chen.bo@intel.com>	2022-05-04 13:53:47 +02:00
Rob Bradford	1dfe4eda5c	vmm: Prevent "internal" identifiers being used by user For devices that cannot be named by the user use the "__" prefix to identify them as internal devices. Check that any identifiers provided in the config do not clash with those internal names. This prevents the user from creating a disk such as "__serial" which would then cause a failure in an unpredictable manner. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-05-04 12:34:11 +02:00
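The check described in 1dfe4eda5c amounts to rejecting user-supplied identifiers that begin with the reserved "__" prefix. A minimal sketch, assuming a hypothetical `validate_user_id` function and a plain `String` error (the real error type is not shown in this log):

```rust
/// Prefix reserved for internal devices, as stated in the commit message.
const INTERNAL_PREFIX: &str = "__";

/// Hypothetical validator: reject identifiers that could clash with
/// internal device names such as "__serial".
fn validate_user_id(id: &str) -> Result<(), String> {
    if id.starts_with(INTERNAL_PREFIX) {
        Err(format!(
            "identifier '{}' uses the reserved '{}' prefix",
            id, INTERNAL_PREFIX
        ))
    } else {
        Ok(())
    }
}
```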
Sebastien Boeuf	6e101f479c	vmm: Ensure hotplugged device identifier is unique Whenever a device (virtio, vfio, vfio-user or vdpa) is hotplugged, we must verify the provided identifier is unique, otherwise we must return an error. Particularly, this will prevent issues with identifiers for serial, console, IOAPIC, balloon, rng, watchdog, iommu and gpio since all of these are hardcoded by the VMM. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-05-03 18:34:24 +01:00
Rob Bradford	6d4862245d	vmm: Generate event when device is removed The new event contains the BDF and the device id: { "timestamp": { "secs": 2, "nanos": 731073396 }, "source": "vm", "event": "device-removed", "properties": { "bdf": "0000:00:02.0", "id": "test-disk" } } Fixes: #4038 Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-05-03 17:10:36 +02:00
Sebastien Boeuf	a5a2e591c9	vmm: Remove FsConfig from VmConfig when unplugging fs device All hotpluggable devices were properly removed from the VmConfig when a remove-device command was issued, except for the "fs" type. Fix this lack of support as it is causing the integration tests to fail with the recent addition of verifying that identifiers are unique. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-05-02 13:26:15 +02:00
Sebastien Boeuf	677c8831af	vmm: Ensure uniqueness of generated identifiers The device identifiers generated from the DeviceManager were not guaranteed to be unique since they were not taking into account the list of identifiers provided through the configuration. By returning the list of unique identifiers from the configuration, and by providing it to the DeviceManager, the generation of new identifiers can rely both on the DeviceTree and the list of IDs from the configuration. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-05-02 13:26:15 +02:00
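One way to generate an identifier that avoids both sources of names mentioned in the commit above is sketched here. It assumes the identifiers already present in the device tree and those from the configuration have been merged into a single set. The function `next_unique_id` and the counter-suffix naming scheme are hypothetical, not the DeviceManager's actual implementation.

```rust
use std::collections::BTreeSet;

/// Hypothetical helper: returns the first "<prefix><n>" name that is not in
/// `in_use`. `in_use` is assumed to hold every identifier known so far, from
/// both the device tree and the user configuration.
fn next_unique_id(prefix: &str, in_use: &BTreeSet<String>) -> String {
    let mut counter = 0usize;
    loop {
        let candidate = format!("{}{}", prefix, counter);
        if !in_use.contains(&candidate) {
            return candidate;
        }
        counter += 1;
    }
}
```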
Sebastien Boeuf	634c53ea50	vmm: config: Validate provided identifiers are unique A valid configuration means we can only accept unique identifiers from the user. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-05-02 13:26:15 +02:00
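The uniqueness validation named in the commit above might look roughly like this. It is a sketch under assumptions: `validate_unique_ids` is a hypothetical function, and the real configuration types are replaced by plain string slices.

```rust
use std::collections::BTreeSet;

/// Hypothetical helper: collects the user-provided identifiers into a set and
/// fails on the first duplicate. On success it returns the set so a caller
/// could pass it on to later identifier generation.
fn validate_unique_ids(ids: &[&str]) -> Result<BTreeSet<String>, String> {
    let mut seen = BTreeSet::new();
    for id in ids {
        // `insert` returns false when the value was already present.
        if !seen.insert(id.to_string()) {
            return Err(format!("identifier '{}' is not unique", id));
        }
    }
    Ok(seen)
}
```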
LiHui	ec0c1b01c4	vmm: api: Do not delete the API socket on API server creation The socket will be safely deleted on shutdown and so it is not necessary to delete the API socket when starting the HTTP server. Fixes: #4026 Signed-off-by: LiHui <andrewli@kubesphere.io> Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-04-29 18:40:49 +01:00
Rob Bradford	f17aa3755f	vmm: Add clarifying comment about Vm::entry_point() Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-04-29 11:03:38 +01:00
Rob Bradford	744a049007	vmm: Parallelise functionality with kernel loading Move functionality earlier in the boot so as to run in parallel with the loading of the kernel. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-04-29 11:03:38 +01:00
Rob Bradford	e70bd069b3	vmm: Load kernel asynchronously Start loading the kernel as early as possible in the VM in a separate thread. Whilst it is loading other work can be carried out such as initialising the devices. The biggest performance improvement is seen with a more complex set of devices. If using e.g. four virtio-net devices then the time to start the kernel improves by 20-30ms. With the simplest configuration the improvement was of the order of 2-3ms. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-04-29 11:03:38 +01:00
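The pattern described in the commit above, loading the kernel on a separate thread while other setup proceeds, can be illustrated with the standard library alone. Everything here is a placeholder: `load_kernel`, `init_devices`, `boot`, and the returned values are hypothetical stand-ins and do not reflect the VMM's real types or its actual entry point.

```rust
use std::thread;

/// Hypothetical stand-in for kernel loading; returns a placeholder entry address.
fn load_kernel() -> u64 {
    0x10_0000
}

/// Hypothetical stand-in for device initialisation; returns a device count.
fn init_devices() -> usize {
    4
}

/// Starts the kernel load on its own thread, performs device setup on the
/// calling thread, then waits for the load to finish before continuing.
fn boot() -> (u64, usize) {
    let loader = thread::spawn(load_kernel);
    let devices = init_devices();
    let entry = loader.join().expect("kernel loading thread panicked");
    (entry, devices)
}
```

The join is deferred until the entry point is actually needed, so the more device setup there is to overlap with the load, the larger the saving, which is consistent with the larger improvement the commit reports for more complex device sets.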
Rob Bradford	bfeb3120f5	vmm: Refactor kernel loading to decouple from Vm struct This will allow the kernel to be loaded from another thread. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-04-29 11:03:38 +01:00
Rob Bradford	ce6d88d187	vmm: Merge aarch64 use statements These were in their own block and not organised lexically. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-04-29 11:03:38 +01:00
Rob Bradford	56fe4c61af	vmm: Duplicate Vm::entry_point() across architectures These will have very different implementations when asynchronously loading the kernel. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-04-29 11:03:38 +01:00
Rob Bradford	1d1a087fc5	vmm: Refactor kernel command line generation This allows the same code for generating the kernel command line to be used on both aarch64 and x86_64 when the latter starts loading the kernel asynchronously. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-04-29 11:03:38 +01:00
Rob Bradford	f1276c58d2	vmm: Commandline inject from devices is aarch64 specific This is not required for x86_64 and maintains a tight coupling between kernel loading and the DeviceManager. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-04-29 11:03:38 +01:00

1748 Commits