cloud-hypervisor

mirror of https://github.com/cloud-hypervisor/cloud-hypervisor.git synced 2024-11-05 11:31:14 +00:00

Author	SHA1	Message	Date
Sebastien Boeuf	d0c53a5357	vmm: Move Vm to the new restore design Now the entire codebase has been moved to the new restore design, we can complete the work by creating a dedicated restore() function for the Vm object and get rid of the method restore() from the Snapshottable trait. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-12-01 10:16:44 -08:00
Rob Bradford	3888f57600	aarch64: Remove unnecessary casts (beta clippy check) Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-12-01 17:02:30 +00:00
Michael Zhao	b173f6f654	vmm,devices: Change Gic snapshot and restore path The snapshot and restore of the AArch64 Gic was done in Vm. Now it is moved to DeviceManager. The benefit is that the restore can be done while the Gic is created in DeviceManager. However, moving the state data from the Vm snapshot to the DeviceManager snapshot breaks the compatibility of migration from older versions. Signed-off-by: Michael Zhao <michael.zhao@arm.com>	2022-12-01 17:07:25 +01:00
Michael Zhao	def1d7cf86	vmm: Remove GICR typers in snapshot on AArch64 The GICR typers are also set in restoring the GIC. Saving them in snapshot is not needed. Signed-off-by: Michael Zhao <michael.zhao@arm.com>	2022-12-01 17:07:25 +01:00
Sebastien Boeuf	e8c6d83f3f	vmm: Merge Vm::new_from_snapshot with Vm::new Given the recent factorization that happened in vm.rs, we're now able to merge Vm::new_from_snapshot with Vm::new. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-12-01 13:46:31 +01:00
Sebastien Boeuf	1c36065754	vmm: Move devices creation to Vm creation This moves the devices creation out of the dedicated restore function which will be eventually removed. This factorizes the creation of all devices into a single location. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-12-01 13:46:31 +01:00
Sebastien Boeuf	bccfa81368	vmm: Restore clock from Vm creation (x86_64 only) This allows the clock restoration to be moved out of the dedicated restore function, which will eventually be removed. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-12-01 13:46:31 +01:00
Sebastien Boeuf	a6959a7469	vmm: Move DeviceManager to new restore design Based on all the work that has already been merged, it is now possible to fully move DeviceManager out of the previous restore model, meaning there's no need for a dedicated restore() function to be implemented there. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-12-01 13:46:31 +01:00
Sebastien Boeuf	4487c8376b	vmm: Move CpuManager and Vcpu to the new restore design Every Vcpu is now created with the right state if there's an available snapshot associated with it. This simplifies the restore logic. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-12-01 09:27:00 +01:00
Sebastien Boeuf	b62a40efae	virtio-devices, vmm: Always restore virtio devices in paused state Following the new restore design, it is not appropriate to put every virtio device thread into a paused state after it has been started. This is why we remove the line of code pausing the devices only after they've been restored, and replace it with a small patch in every virtio device implementation. When a virtio device is created as part of a restored VM, the associated "paused" boolean is set to true. This ensures the corresponding thread will be directly parked when being started, preventing the thread from being in a different state than the one it was in on the source VM during the snapshot. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-12-01 09:27:00 +01:00
Bo Chen	ec94ae31ee	vmm: EpollContext: Allow to add custom epoll events for fuzzing Signed-off-by: Bo Chen <chen.bo@intel.com>	2022-11-30 12:13:14 +00:00
Sebastien Boeuf	90b5014a50	vmm: device_manager: Remove 'restoring' attribute Given 'restoring' isn't needed anymore from the DeviceManager structure, let's simplify it. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-11-29 13:46:30 +01:00
Sebastien Boeuf	cc3706afe1	pci, vmm: Move VfioPciDevice and VfioUserPciDevice to new restore design Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-11-29 13:46:30 +01:00
Rob Bradford	6f8bd27cf7	build: Bulk update dependencies Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-11-28 16:57:49 +00:00
Sebastien Boeuf	81862e8ed3	devices, vmm: Move Gpio to new restore design Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-11-25 17:37:29 +00:00
Sebastien Boeuf	9fbf52b998	devices, vmm: Move Pl011 to new restore design Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-11-25 17:37:29 +00:00
Sebastien Boeuf	0bd910e8b0	devices, vmm: Move Serial to new restore design Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-11-25 17:37:29 +00:00
Sebastien Boeuf	ef92e55998	devices, vmm: Move Ioapic to new restore design Moving the Ioapic object to the new restore design, meaning the Ioapic is created directly with the right state, and it shares the same codepath as when it's created from scratch. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-11-25 17:18:21 +01:00
Sebastien Boeuf	a50b3784fe	virtio-devices: Create a proper result type for VirtioPciDevice Creating a dedicated Result type for VirtioPciDevice, associated with the new VirtioPciDeviceError enum. This allows for a clearer handling of the errors generated through VirtioPciDevice::new(). Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-11-23 18:37:40 +00:00
Sebastien Boeuf	eae8043890	pci, virtio-devices: Move VirtioPciDevice to the new restore design The code for restoring a VirtioPciDevice has been updated, including the dependencies VirtioPciCommonConfig, MsixConfig and PciConfiguration. It's important to note that both PciConfiguration and MsixConfig still have restore() implementations because Vfio and VfioUser devices still rely on the old way for restore. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-11-23 18:37:40 +00:00
Michael Zhao	7d16c74020	vmm: Refactor AArch64 GIC initialization process In the new process, `device::Gic::new()` covers additional actions: 1. Creating `hypervisor::vGic` 2. Initializing interrupt routings The change makes the vGic device ready in the beginning of `DeviceManager::create_devices()`. This can unblock the GIC related devices initialization in the `DeviceManager`. Signed-off-by: Michael Zhao <michael.zhao@arm.com>	2022-11-23 11:49:57 +01:00
Sebastien Boeuf	86e7f07485	vmm: cpu: Create vCPUs before the DeviceManager Moving the creation of the vCPUs before the DeviceManager gets created will allow for the aarch64 vGIC to be created before the DeviceManager as well in a follow up patch. The end goal being to adopt the same creation sequence for both x86_64 and aarch64, and keeping in mind that the vGIC requires every vCPU to be created. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-11-23 11:49:57 +01:00
Sebastien Boeuf	578780ed0c	vmm: cpu: Split vCPU creation Split the vCPU creation into two distinct parts. On the one hand we create the actual Vcpu object with the creation of the hypervisor::Vcpu. And on the other hand, we configure the existing Vcpu, setting registers to proper values (such as setting the entry point). This will allow for further work to move the creation earlier in the boot, so that the hypervisor::Vcpu will be already created when the DeviceManager gets created. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com> Signed-off-by: Michael Zhao <michael.zhao@arm.com>	2022-11-23 11:49:57 +01:00
Sebastien Boeuf	ec01062ada	vmm: Switch order between DeviceManager and CpuManager creation The CpuManager is now created before the DeviceManager. This is required as preliminary work for creating the vCPUs before the DeviceManager, which is required to ensure both x86_64 and aarch64 follow the same sequence. It's important to note the optimization for faster PIO accesses on the PCI config space had to be removed given the VmOps was required by the CpuManager and by the Vcpu by extension. But given the PciConfigIo is created as part of the DeviceManager, there was no proper way of moving things around so that we could provide PciConfigIo early enough. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-11-23 11:49:57 +01:00
Rob Bradford	e37ec26ccf	vmm: Remove PCI PIO optimisation This optimisation provided some performance improvement when measured by perf; however, when considered in terms of boot time performance, this optimisation doesn't have any impact measurable using our performance-metrics tooling. Removing this optimisation helps simplify the VMM internals as it allows the reordering of the VM creation process, permitting refactoring of the restore code path. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-11-22 19:47:53 +00:00
Wei Liu	d05586f520	vmm: modify or provide safety comments Signed-off-by: Wei Liu <liuwe@microsoft.com>	2022-11-18 12:50:01 +00:00
Wei Liu	d274fe9cb8	vmm: fix tdx check The field has been moved in `3793ffe888`. Signed-off-by: Wei Liu <liuwe@microsoft.com>	2022-11-18 12:50:01 +00:00
Praveen K Paladugu	09e79a5e9b	vmm: Add tpm device to mmio bus Add tpm device to mmio bus if appropriate cmdline arguments were passed. Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>	2022-11-15 16:42:21 +00:00
Praveen K Paladugu	af261f231c	vmm: Add required acpi entries for vtpm device Add a TPM2 entry to the DSDT ACPI table. Add a TPM2 table to the guest's ACPI. Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com> Co-authored-by: Sean Yoo <t-seanyoo@microsoft.com>	2022-11-15 16:42:21 +00:00
Praveen K Paladugu	7122e2989c	vmm: Add tpm parameter Add an optional --tpm parameter that takes UNIX Domain Socket from swtpm. Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>	2022-11-15 16:42:21 +00:00
Rob Bradford	6230929d51	openapi: Add thp option to MemoryConfig Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-11-09 16:51:21 +00:00
Rob Bradford	f603afc46e	vmm: Make Transparent Huge Pages controllable (default on) Add MemoryConfig::thp and `--memory thp=on\|off` to allow control of Transparent Huge Pages. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-11-09 16:51:21 +00:00
Rob Bradford	b68add2d0d	vmm: Enable THP when using anonymous memory If the memory is not backed by a file then it is possible to enable Transparent Huge Pages on the memory and take advantage of the benefits of huge pages without requiring the specific allocation of an appropriate number of huge pages. TEST=Boot and see that in /proc/`pidof cloud-hypervisor`/smaps that the region is now THPeligible (and that also pages are being used.) Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-11-09 16:51:21 +00:00
Rob Bradford	6e0bd73c90	build: Bump linux-loader from 0.6.0 to 0.7.0 Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-11-02 11:02:00 +00:00
Bo Chen	a9ec0f33c0	misc: Fix clippy issues Signed-off-by: Bo Chen <chen.bo@intel.com>	2022-11-02 09:41:43 +01:00
Rob Bradford	f4495de143	vmm: Improve handling of shared memory backing As huge pages are always MAP_SHARED then where the shared memory would be checked (for vhost-user and local migration) we can also check instead for huge pages. The checking is also extended to cover the memory zones based configuration as well. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-10-31 22:28:29 +00:00
Rob Bradford	99d9a3d299	vmm: memory_manager: Avoid MAP_PRIVATE CoW with VFIO for hugepages too We can't use MAP_ANONYMOUS and still have huge pages so MAP_SHARED is effectively required when using huge pages. Unfortunately it is not as simple as always forcing MAP_SHARED if hugepages is on as this might be inappropriate in the backing file case hence why there is additional complexity of assigning to mmap_flags on each case and the MAP_SHARED is only turned on for the anonymous file huge page case as well as anonymous shared file case. See: #4805 Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-10-31 22:28:29 +00:00
Rob Bradford	df7c728399	vmm: memory_manager: Only file back memory when required If we do not need an anonymous file backing the memory then do not create one. As a side effect this addresses an issue with CoW (mmap with MAP_PRIVATE but no MAP_ANONYMOUS) when the memory is pinned for VFIO. Fixes: #4805 Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-10-31 22:28:29 +00:00
Rob Bradford	1e5a4e8d77	vmm: memory_manager: Split filesystem backed and anonymous RAM creation This simplifies the code somewhat making the code paths more readable. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-10-31 22:28:29 +00:00
Rob Bradford	ff3fb91ba6	vmm: Refactor creation of the FileOffset for GuestRegionMmap::new() Create this earlier so that it is possible to pass a None in for anonymous mappings. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-10-31 22:28:29 +00:00
Jinrong Liang	cb171d4a23	device_manager: Avoid checking io_uring support when it's not needed After testing, io_uring_is_supported() causes about 38ms of overhead when creating virtio-blk. By modifying the position of io_uring_is_supported(), the overhead of creating virtio-blk is reduced to less than 1ms when we close io_uring. Signed-off-by: Jinrong Liang <cloudliang@tencent.com>	2022-10-27 22:21:51 -07:00
Wei Liu	b99b2bc990	memory_manager: use MFD_CLOEXEC flag when creating memory fd Until there is a need for sharing the memory fd with a child process, we should err on the safe side to close it on exec. Signed-off-by: Wei Liu <liuwe@microsoft.com>	2022-10-27 09:20:08 +02:00
Sebastien Boeuf	1f0e5eb66a	vmm: virtio-devices: Restore every VirtioDevice upon creation Following the new design proposal to improve the restore codepath when migrating a VM, all virtio devices are supplied with an optional state they can use to restore from. The restore() implementation every device was providing has been removed in order to avoid going through the restoration twice. Here is the list of devices now following the new restore design: - Block (virtio-block) - Net (virtio-net) - Rng (virtio-rng) - Fs (vhost-user-fs) - Blk (vhost-user-block) - Net (vhost-user-net) - Pmem (virtio-pmem) - Vsock (virtio-vsock) - Mem (virtio-mem) - Balloon (virtio-balloon) - Watchdog (virtio-watchdog) - Vdpa (vDPA) - Console (virtio-console) - Iommu (virtio-iommu) Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-10-24 14:17:08 +02:00
Sebastien Boeuf	157db33d65	vmm: Refactor hypervisor::Vm creation on restore This avoids leaking implementation details to lib.rs and keeps them in vm.rs instead. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-10-24 14:17:08 +02:00
Fabiano Fidêncio	b4e3942708	api: Fix vm.add-device argument type The add_device() function, from the device manager code, takes a DeviceConfig as a parameter, instead of a VmAddDevice. The change was originally done as part of `34412c9b41` and it didn't break Kata Containers because the VmAddDevice and DeviceConfig structs share most of their fields, besides the optional for serialization `pci_segment`, which is not used by the client yet. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-10-21 11:09:55 -07:00
Sebastien Boeuf	c52ccf3992	vmm: migration: Create destination VM right before restoring it This is preliminary work to ensure a migrated VM is created right before it is restored. This will be useful when moving to a design where the VM is both created and restored simultaneously from the Snapshot. In detail, that means the MemoryManager is the object that must be created upon receiving the config from the source VM, so that memory content can be later received and filled into the GuestMemory. Only after these steps have happened is the snapshot received from the source VM, and the actual Vm object can then be created from both the snapshot and the previously created MemoryManager. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-10-18 17:14:29 +02:00
Rob Bradford	a75d71f2c8	vmm: Reduce logging severity for unknown MMIO/PIO device accesses These look alarming if you are booting with a distro kernel, which is now a recommended approach. See: #4786 Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-10-17 10:08:36 -07:00
Bo Chen	96209e7a16	vmm: Remove the explicit call to 'Snapshottable::restore()' The restore path of MemoryManager is handled specially without implementing a `Snapshottable::restore()`. Remove the explicit call to it along the migration code path to avoid confusion. See: #4783 Signed-off-by: Bo Chen <chen.bo@intel.com>	2022-10-17 10:07:44 -07:00
Sebastien Boeuf	099cdd2af8	virtio-devices, vmm: vdpa: Implement live migration support Vdpa now implements the Migratable trait, which allows the device to be added to the DeviceTree and therefore allows live migrating any vDPA device that supports being suspended. Given a vDPA device can't be resumed from a suspended state without having to reset everything, we don't support pause/resume for a vDPA device, as well as snapshot/restore (which requires resume to be supported). In order for the migration to work locally, reusing the same device on the same host machine, the vhost-vdpa handler is dropped after the snapshot has been performed, which allows the destination VM to open the device without any conflict about the device being busy. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-10-13 10:03:23 +02:00
Sebastien Boeuf	22be5f9d0f	vmm: Extend list of authorized ioctls for vDPA Adding VHOST_VDPA_GET_CONFIG_SIZE and VHOST_VDPA_SUSPEND to the list of authorized ioctls for the vmm thread. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-10-13 10:03:23 +02:00
Bo Chen	37c3b0429a	vmm: Make MemoryManager::create_ram_region() public So that it can be reused externally, such as for fuzzing. Signed-off-by: Bo Chen <chen.bo@intel.com>	2022-10-12 16:09:27 +01:00
Anatol Belski	a18b08c682	seccomp: mshv: Allow create partition ioctl Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>	2022-10-11 09:05:24 +01:00
Bo Chen	29cf637f3f	vmm: Move 'default_serial/console()' to vm_config.rs In this way, we have all functions that generate default values of vm-config structs in the same location. Signed-off-by: Bo Chen <chen.bo@intel.com>	2022-10-07 09:13:15 -07:00
Rob Bradford	83cc554f90	vmm: Remove deprecated VmConfig::{kernel, initramfs, cmdline} members These have been replaced by members of PayloadConfig and should be removed in v28.0 (mentioned in v26.0 release notes.) Fixes: #4737 Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-10-06 14:25:29 +01:00
Rob Bradford	7d8d27c1b4	vmm: Rename queue size / number of queues constants These constants still referenced the long-removed separate vhost-user structs. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-10-06 14:25:29 +01:00
Rob Bradford	d692dfb8e3	vmm: Move `impl Default for ...` to vm_config.rs This is consistent when considering that some structs have a `#[derive(Default)]` so it makes sense for the default implementations to be in the same location. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-10-06 14:25:29 +01:00
Rob Bradford	7ad58457b0	vmm: Split structs from logic that make up VmConfig Place the data structures that are required for constructing a VmConfig into their own module, separate from the logic that exists to support them. This is useful as a consumer of the API can now clearly see what data structures make up the API for creating VMs. This has no functional change and I made no attempt to clean up the ordering (it's as in the original file) nor any other clean up. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-10-06 14:25:29 +01:00
Sebastien Boeuf	89677c3181	build: Bump clap from 3.2.22 to 4.0.9 Bumps [clap](https://github.com/clap-rs/clap) from 3.2.22 to 4.0.9. - [Release notes](https://github.com/clap-rs/clap/releases) - [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md) - [Commits](clap-rs/clap@v3.2.22...v4.0.9) --- updated-dependencies: - dependency-name: clap dependency-type: direct:production update-type: version-update:semver-major ... Moving to the major version 4 introduced some breaking changes which had to be handled manually. Fixes #4709 Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-10-05 12:59:14 +01:00
Rob Bradford	2daab89987	vmm: Remove legacy firmware loading This functionality was deprecated and is for removal in the upcoming release. Fixes: #4511 Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-10-03 17:09:02 +01:00
Rob Bradford	1bc63e7848	vmm: Remove legacy I/O ports for ACPI These addresses have been superseded and replaced with other I/O ports. Fixes: #4483 Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-10-03 17:08:57 +01:00
Bo Chen	2115a41568	openapi: Add 'firmware' to 'PayloadConfig' This option is needed for the openapi consumer (e.g. Kata Containers) to load firmware (e.g. td-shim) for booting. Signed-off-by: Bo Chen <chen.bo@intel.com>	2022-10-01 08:45:21 +01:00
Rob Bradford	06eb82d239	build: Consolidate "gdb" build feature into "guest_debug" This simplifies the CI process and is also a logical fit with the existing functionality under "guest_debug" (dumping guest memory). Fixes: #4679 Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-09-27 14:30:57 +01:00
Sebastien Boeuf	3bf3cca70a	vhost_user_net: Allow user to set MTU Adding the support for the user to set the MTU for the vhost-user-net backend, which allows the integration test to be extended with the test of the MTU parameter. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-09-27 10:37:35 +01:00
Sebastien Boeuf	903c08f8a1	net: Don't override default TAP interface MTU Adjust MTU logic such that: 1. Apply an MTU to the TAP interface if the user supplies it 2. Always query the TAP interface for the MTU and expose that. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-09-27 10:37:35 +01:00
Rob Bradford	b2d1dd65f3	build: Remove "fwdebug" and "common" feature flags This simplifies the build and checks with very little overhead, and the fwdebug device is an I/O port device on 0x402 that can be used by edk2 as a very simple character device. See: #4679 Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-09-26 10:16:33 -07:00
Rob Bradford	66c092e69b	build: Bump linux-loader from 0.5.0 to 0.6.0 Bumps [linux-loader](https://github.com/rust-vmm/linux-loader) from 0.5.0 to 0.6.0. - [Release notes](https://github.com/rust-vmm/linux-loader/releases) - [Changelog](https://github.com/rust-vmm/linux-loader/blob/main/CHANGELOG.md) - [Commits](https://github.com/rust-vmm/linux-loader/compare/v0.5.0...v0.6.0) --- updated-dependencies: - dependency-name: linux-loader dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-09-24 09:54:18 +00:00
Rob Bradford	1202b9a07a	vmm: Add some tracing of boot sequence Add tracing of the VM boot sequence from the point at which the request to create a VM is received to the hand-off to the vCPU threads running. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-09-22 18:09:31 +01:00
Sebastien Boeuf	76dbf85b79	net: Give the user the ability to set MTU Add a new "mtu" parameter to the NetConfig structure and therefore to the --net option. This allows Cloud Hypervisor's users to define the Maximum Transmission Unit (MTU) they want to use for the network interface that they create. In detail, there are two main aspects. On the one hand, the TAP interface is created with the proper MTU if it is provided. And on the other hand the guest is made aware of the MTU through the VIRTIO configuration. That means the MTU is properly set on both the TAP on the host and the network interface in the guest. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-09-21 16:20:57 +02:00
Sebastien Boeuf	f38056fc9e	virtio-devices, vmm: Simplify virtio-mem resize operation There's no need to delegate the resize operation to the virtio-mem thread. This can come directly from the vmm thread which will use the Mem object to update the VIRTIO configuration and trigger the interrupt for the guest to be notified. In order to achieve what's described above, the VirtioMemZone structure now has a handle onto the Mem object directly. This avoids the need for intermediate Resize and ResizeSender structures. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-09-20 13:43:40 +02:00
Rob Bradford	f32487f8e8	misc: Automatic beta clippy fixes e.g. cargo clippy --all --tests --all-targets --fix --features=.. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-09-20 10:59:48 +01:00
Sebastien Boeuf	1849ffff31	vmm: Remove "amx" feature gate Given the AMX x86 feature has been made available since kernel v5.17, and given we don't have any test validating this feature, there's no need to keep it behind a Rust feature gate. Fixes #3996 Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-09-16 15:03:31 +01:00
Rob Bradford	0e52be0909	vmm: Ensure default deserialisation for "amx" feature bit This allows a migration from a binary compiled without the struct member to be completed. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-09-16 15:03:31 +01:00
Sebastien Boeuf	3793ffe888	vmm: config: Move TDX to rely on PayloadConfig Removing the option --tdx to specify that we want to run a TD VM. Rely on --platform option by adding the "tdx" boolean parameter. This is the new way for enabling TDX with Cloud Hypervisor. Along with this change, the way to retrieve the firmware path has been updated to rely on the recently introduced PayloadConfig structure. Fixes #4556 Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-09-05 12:14:59 +01:00
Sebastien Boeuf	b3bef3adda	vmm: acpi: Don't declare MMIO config space through PCI buses The PCI buses should not declare the address space related to the MMIO config space given it's already declared in the MCFG table and through the motherboard device PNP0C02 in the DSDT table. The PCI MMIO config region for the segment was being wrongly exposed as part of the _CRS for the ACPI bus device (using Memory32Fixed). Exposing it via this object was ineffectual as the equivalent entry in the PNP0C02 (_SB_.MBRD) marked those ranges as not usable via the kernel. Either way, with both devices used by the kernel, the kernel will not try and use those memory ranges for the device BARs. However under td-shim on TDX the PNP0C02 device is not on the permitted list of devices so the memory ranges were not marked as unusable, resulting in the kernel attempting to allocate BARs that collided with the PCI MMIO configuration space. This is based on the kernel documentation PCI/acpi-info.rst which relies on ACPI and PCI Firmware specifications. And here are the interesting quotes from this document: """ Prior to the addition of Extended Address Space descriptors, the failure of Consumer/Producer meant there was no way to describe bridge registers in the PNP0A03/PNP0A08 device itself. The workaround was to describe the bridge registers (including ECAM space) in PNP0C02 catch-all devices. With the exception of ECAM, the bridge register space is device-specific anyway, so the generic PNP0A03/PNP0A08 driver (pci_root.c) has no need to know about it. PNP0C02 “motherboard” devices are basically a catch-all. There’s no programming model for them other than “don’t use these resources for anything else.” So a PNP0C02 _CRS should claim any address space that is (1) not claimed by _CRS under any other device object in the ACPI namespace and (2) should not be assigned by the OS to something else.
The address range reported in the MCFG table or by _CBA method (see Section 4.1.3) must be reserved by declaring a motherboard resource. For most systems, the motherboard resource would appear at the root of the ACPI namespace (under _SB) in a node with a _HID of EISAID (PNP0C02), and the resources in this case should not be claimed in the root PCI bus’s _CRS. The resources can optionally be returned in Int15 E820 or EFIGetMemoryMap as reserved memory but must always be reported through ACPI as a motherboard resource. """ This change has been manually tested by running a VM with multiple segments (4 segments), and by hotplugging an additional disk to the segment number 2 (third segment). From one shell: """ cloud-hypervisor \ --cpus boot=1 \ --memory size=1G \ --kernel vmlinux \ --cmdline "root=/dev/vda1 rw console=hvc0" \ --disk path=jammy-server-cloudimg.raw \ --api-socket /tmp/ch.sock \ --platform num_pci_segments=4 """ From another shell (after the VM is booted): """ ch-remote \ --api-socket=/tmp/ch.sock \ add-disk \ path=test-disk.raw,id=disk2,pci_segment=2 """ Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-09-02 14:14:23 +02:00
Nuno Das Neves	784a3aaf3c	devices: gic: use VgicConfig everywhere Use VgicConfig to initialize Vgic. Use Gic::create_default_config everywhere so we don't always recompute redist/msi registers. Add a helper create_test_vgic_config for tests in hypervisor crate. Signed-off-by: Nuno Das Neves <nudasnev@microsoft.com>	2022-08-31 08:33:05 +01:00
Jianyong Wu	3a19573c69	vmm: unify payload load chain on AArch64 with x86_64 AArch64 can share the same way of loading payload with x86_64. It makes the payload loading more consistent between different arches. Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2022-08-31 08:32:08 +01:00
Michael Zhao	b65639fad3	vmm:AArch64: move uefi_flash to memory manager uefi_flash is used when loading firmware, which means that loading the payload depends on the device manager. Moving uefi_flash to the memory manager eliminates that dependency. Signed-off-by: Jianyong Wu <jianyong.wu@arm.com> Signed-off-by: Michael Zhao <michael.zhao@arm.com>	2022-08-31 08:32:08 +01:00
Jianyong Wu	9b1452f258	vmm:AArch64: load standalone firmware A new firmware item has been added into payload config, so we need to extend the ability to load standalone firmware on AArch64. The "load_kernel" method will be the entry point of image loading work including kernel and firmware. This change is backward compatible. So, we can either load firmware through "-kernel" like before or "-firmware". Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2022-08-31 08:32:08 +01:00
Jianyong Wu	2054c8699a	vmm:AArch64: add load_firmware method Later, we will load standalone firmware. So, refactor load_kernel by abstracting load_firmware method. Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2022-08-31 08:32:08 +01:00
Sebastien Boeuf	8c02648ac9	vmm: device_manager: Update virtio-console for proper PTY support Given the virtio-console is now able to buffer its output when no PTY is connected on the other end, the device manager code is updated to enable this. Moving the endpoint type from FilePair to PtyPair enables the proper codepath in the virtio-console implementation, as well as updating the PTY resize code, and forcing the PTY to always be non-blocking. The non-blocking behavior is required to avoid blocking the guest that would be waiting on the virtio-console driver. When receiving an EWOULDBLOCK error, the output will simply be redirected to the temporary buffer so that it can be later flushed. The PTY resize logic has been slightly modified to ensure the PTY file descriptors are closed. This prevents the child process from keeping hold of the PTY device, which would have caused the PTY to believe something is connected on the other end, which would have prevented the detection of any new connection on the PTY. Fixes #4521 Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-08-30 13:47:51 +02:00
Sebastien Boeuf	a940f525a8	vmm: Move SerialBuffer to its own crate We want to be able to reuse the SerialBuffer from the virtio-devices crate, particularly from the virtio-console implementation. That's why we move the SerialBuffer definition to its own crate so that it can be accessed from both vmm and virtio-devices crates, without creating any cyclic dependency. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-08-30 13:47:51 +02:00
Sebastien Boeuf	63462fd8ab	vmm: serial_manager: Iterate again on EINTR If the epoll_wait() call returns EINTR, this only means a signal has been delivered before any of the file descriptors registered triggered an event or before the end of the timeout (if timeout isn't -1). For that reason, we should simply try to listen on the epoll loop again. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-08-30 13:47:51 +02:00
Sebastien Boeuf	0bcb6ff061	vmm: Limit the size of the SerialBuffer We must limit how much the buffer can grow, otherwise this could lead the process to consume all the memory on the machine. This could happen if the output from the guest was very large and nothing connected to the PTY for a long time. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-08-24 12:14:59 +02:00
Michael Zhao	d66d64c325	vmm: Restrict the maximum number of HW breakpoints Set the maximum number of HW breakpoints according to the value returned from `Hypervisor::get_guest_debug_hw_bps()`. Signed-off-by: Michael Zhao <michael.zhao@arm.com>	2022-08-23 16:57:12 +02:00
Wei Liu	3e6b0a5eab	vmm: unify TranslateVirtualAddress error for both x86_64 and aarch64 Using anyhow::Error should cover both architectures. Signed-off-by: Wei Liu <liuwe@microsoft.com>	2022-08-22 09:37:21 -07:00
Michael Zhao	c798b958f3	vmm: Extend seccomp rules for GDB Add 'KVM_SET_GUEST_DEBUG' ioctl to seccomp filter rules. Signed-off-by: Michael Zhao <michael.zhao@arm.com>	2022-08-21 17:07:26 +08:00
Michael Zhao	0522e40933	vmm: Implement `translate_gva` on AArch64 On AArch64, the `translate_gva` API is not provided by KVM. We implemented it in the VMM by walking through the translation tables. Address translation is a big topic; here we only focus on the scenario that happens in the VMM while debugging the kernel. This `translate_gva` implementation is restricted to: - Exception Level 1 - Translate high address range only (kernel space) This implementation supports the following Arm-v8a features related to address translation: - FEAT_LPA - FEAT_LVA - FEAT_LPA2 The implementation supports page sizes of 4KiB, 16KiB and 64KiB. Signed-off-by: Michael Zhao <michael.zhao@arm.com>	2022-08-21 17:07:26 +08:00
Michael Zhao	5febdec81a	vmm: Enable `gdbstub` on AArch64 The `gva_translate` function is still missing, it will be added with a separate commit. Signed-off-by: Michael Zhao <michael.zhao@arm.com>	2022-08-21 17:07:26 +08:00
Nuno Das Neves	fdc8546eef	vmm: aarch64: Use GIC_V3_* consts instead of magic numbers in create_madt() Signed-off-by: Nuno Das Neves <nudasnev@microsoft.com>	2022-08-21 17:06:48 +08:00
Sebastien Boeuf	cdcd4d259e	vmm: serial: Wait for PTY to be available before writing to it The goal of this patch is to provide a reliable way to detect when the other end of the PTY is connected, and therefore be able to identify when we can write to the PTY device. This is needed because writing to the PTY device when the other end isn't connected causes the loss of the written bytes. The way to detect the connection on the other end of the PTY is by knowing the other end is disconnected at first with the presence of the EPOLLHUP event. Later on, when the connection happens, EPOLLHUP is not triggered anymore, and that's when we can assume it's okay to write to the PTY main device. It's important to note we had to ensure the file descriptor for the other end was closed, otherwise we would have never seen the EPOLLHUP event. And we did so by removing the "sub" field from the PtyPair structure as it was keeping the associated File opened. Fixes #3170 Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-08-19 14:39:06 +01:00
Rob Bradford	396f9ce2c6	vmm: Deprecate non-PVH firmware loading Currently all the firmware blobs we support can use PVH loading. See: #4511 Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-08-18 17:29:44 +01:00
Rob Bradford	282a1001ef	vmm: x86_64: Rename load_firmware() to reflect its purpose This function only supports loading legacy, non-PVH firmware binaries. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-08-17 09:50:42 +01:00
Rob Bradford	0d682e185f	vmm: x86_64: Add support for firmware loading Since our firmware files are still designed to be used via PVH use the load_kernel() function to load the firmware falling back to legacy firmware loading if necessary. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-08-17 09:50:42 +01:00
Rob Bradford	8ec5a248cd	main, vmm: Add option to pass firmware parameter in payload Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-08-17 09:50:42 +01:00
Rob Bradford	763ea7da42	vmm: x86_64: Split payload loading into its own function Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-08-17 09:50:42 +01:00
Rob Bradford	2856074d12	vmm: x86_64: Make kernel loading use PayloadConfig Minor refactoring to start supporting loading a firmware payload Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-08-17 09:50:42 +01:00
Rob Bradford	485900eeb4	vmm: x86_64: Use more general name for payload handling Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-08-17 09:50:42 +01:00
Rob Bradford	6988da79d2	vmm: x86_64: Split legacy firmware loading into own function Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-08-17 09:50:42 +01:00
Sebastien Boeuf	98f949d35d	vmm: Add new I/O ports for ACPI shutdown and PM timer devices Adding new I/O ports for both the ACPI shutdown and the ACPI PM timer devices so they can be triggered from both addresses. The reason for this change is that TDX expects only certain I/O ports to be enabled based on what QEMU exposes. We follow this to avoid new ports from being opened exclusively for Cloud Hypervisor. We have to keep the former I/O ports available given that not all firmwares have been updated yet. Once we reach a point where we know Rust Hypervisor Firmware, OVMF, TDVF and TDSHIM have all been updated with the new port values, we'll be able to remove the former ports. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-08-11 11:46:09 +01:00
Rob Bradford	8c22c03e1e	vmm: openapi: Switch to describing new payload API The old API remains usable, and will remain usable for two releases but we should only advertise the new API. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-08-10 22:20:07 +01:00
Rob Bradford	51fdc48817	vmm: openapi: Fix OpenAPI YAML file formatting Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-08-10 22:20:07 +01:00
Rob Bradford	cef51a9de0	vmm: Encompass guest payload configuration in PayloadConfig Introduce a new top level member of VmConfig called PayloadConfig that (currently) encompasses the kernel, commandline and initramfs for the guest to use. In future this can be extended for firmware use. The existing "--kernel", "--cmdline" and "--initramfs" CLI parameters now fill the PayloadConfig. Any supplied config that uses the now deprecated config members has those members mapped to the new version with a warning. See: #4445 Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-08-10 15:12:34 +01:00
Rob Bradford	6bc46ba9c1	vmm: config: Reject VFIO devices with the same path By checking in the validation logic we get checking for both devices specified in the initial config but also hotplug too. Fixes: #4453 Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-08-09 14:32:35 +02:00
Rob Bradford	ea58d2f68a	vmm: config: Enhance test_cpu_parsing to add "affinity" parameter Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-08-08 16:23:00 +01:00
Rob Bradford	d295de4cd5	option_parser: Move test_option_parser to option_parser crate Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-08-08 16:23:00 +01:00
Wei Liu	53aecf9341	vmm: add oem_strings to openapi Signed-off-by: Wei Liu <liuwe@microsoft.com>	2022-08-08 08:59:19 +01:00
Wei Liu	57e9b80123	vmm: provide oem_strings option Signed-off-by: Wei Liu <liuwe@microsoft.com>	2022-08-08 08:59:19 +01:00
lizhaoxin1	65f42c1f62	vmm: openapi: Add uuid to PlatformConfig Signed-off-by: lizhaoxin1 <Lxiaoyouling@163.com>	2022-08-04 09:20:06 +02:00
lizhaoxin1	bc3a276b43	arch, vmm: Expose platform uuid via SMBIOS Parse and set uuid. Signed-off-by: lizhaoxin1 <Lxiaoyouling@163.com> Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-08-04 09:20:06 +02:00
lizhaoxin1	3abc1e1e51	vmm: config: Add "uuid" option to "--platform" The uuid indicates the unique ID of a virtual machine. cloud-hypervisor takes the uuid passed by libvirt and uses it to initialize cloud-init. Signed-off-by: lizhaoxin1 <Lxiaoyouling@163.com>	2022-08-04 09:20:06 +02:00
Bo Chen	1125fd2667	vmm: api: Use 'BTreeMap' for 'HttpRoutes' In this way, we get the values sorted by their keys by default, which is useful for the 'http_api' fuzzer. Signed-off-by: Bo Chen <chen.bo@intel.com>	2022-08-03 10:18:24 +01:00
Bo Chen	eb056d374a	vmm: Make 'EpollContext::add_event()' public So that it can be reused by other crates, e.g. from fuzz targets. Signed-off-by: Bo Chen <chen.bo@intel.com>	2022-08-03 10:18:24 +01:00
Sebastien Boeuf	4d74525bdc	vmm: Remove unused "poll_queue" from DiskConfig The parameter "poll_queue" was useful at the time Cloud Hypervisor was responsible for spawning vhost-user backends, as it was carrying the information the vhost-user-block backend should have this option enabled or not. It's been quite some time that we walked away from this design, as we now expect a management layer to be responsible for running vhost-user backends. That's the reason why we can remove "poll_queue" from the DiskConfig structure. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-08-02 15:10:11 +02:00
Michael Zhao	7199119bb2	hypervisor: Remove `Vcpu::read_mpidr()` on AArch64 Replaced `read_mpidr()` with `get_sys_reg()`. Signed-off-by: Michael Zhao <michael.zhao@arm.com>	2022-07-29 11:45:12 +01:00
Michael Zhao	cd7f36a713	hypervisor: Remove `get/set_reg()` on AArch64 `Vcpu::get/set_reg()` were only invoked in Vcpu itself. Signed-off-by: Michael Zhao <michael.zhao@arm.com>	2022-07-29 11:45:12 +01:00
Michael Zhao	f7b6d99c2d	hypervisor: Remove `get/set_sys_regs()` on AArch64 `hypervisor::Vcpu::get/set_sys_regs()` are only used in Vcpu internally. Signed-off-by: Michael Zhao <michael.zhao@arm.com>	2022-07-29 11:45:12 +01:00
Rob Bradford	857edc71a9	vmm: cpu: Remove now unused CpuManager::vcpus_paused() Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-07-26 09:22:25 +02:00
Rob Bradford	0e29379bcf	vmm: Make gdb break/resuming more resilient When starting the VM such that it is already on a breakpoint (via stop_on_boot) when attached to gdb then start the vCPUs in a paused state rather than starting the vCPUs later (upon resume). Further, make the resumption/break of the VM more resilient by only attempting to resume the vCPUs if we are already in a break point and only attempting to pause/break if we were already running. Fixes: #4354 Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-07-26 09:22:25 +02:00
Rob Bradford	a749182777	vmm: acpi: Use ACPI platform device addresses from DeviceManager Remove the hardcoded addresses. Also remove PM_TMR_BLK as spec compliant implementation will use X_PM_TMR_BLK over this field. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-07-25 16:16:06 +01:00
Rob Bradford	2e8eb96ef6	vmm: device_manager: Store ACPI platform addresses for later use These are ready for inclusion in the FACP table. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-07-25 16:16:06 +01:00
Wei Liu	ad33f7c5e6	vmm: return seccomp rules according to hypervisors That requires stashing the hypervisor type into various places. Signed-off-by: Wei Liu <liuwe@microsoft.com>	2022-07-22 12:50:12 +01:00
Wei Liu	a96a5d7816	hypervisor, vmm: use new vfio-ioctls Use the new vfio-ioctls APIs. Drop Cloud Hypervisor's Device trait since it is no longer needed. Signed-off-by: Wei Liu <liuwe@microsoft.com>	2022-07-21 23:37:53 +01:00
Wei Liu	f84ddedb1a	hypervisor, vmm: introduce trait functions for aarch64 PMU The original code uses kvm_device_attr directly outside of the hypervisor crate. That leaks hypervisor details. No functional change intended. Signed-off-by: Wei Liu <liuwe@microsoft.com>	2022-07-21 23:37:53 +01:00
Wei Liu	f21fc1dcb6	hypervisor: x86: provide a generic MsrEntry structure Signed-off-by: Wei Liu <liuwe@microsoft.com>	2022-07-20 10:13:41 +01:00
Wei Liu	4d2cc3778f	hypervisor: move away from MsrEntries type It is a flexible array. Switch to vector and slice instead. No functional change intended. Signed-off-by: Wei Liu <liuwe@microsoft.com>	2022-07-20 10:13:41 +01:00
Wei Liu	05e5106b9b	hypervisor x86: provide a generic LapicState structure This requires making get/set_lapic_reg part of the type. For the moment we cannot provide a default variant for the new type, because picking one will be wrong for the other hypervisor, so I just drop the test cases that require LapicState::default(). Signed-off-by: Wei Liu <liuwe@microsoft.com>	2022-07-19 09:38:38 +01:00
Wei Liu	6a8c0fc887	hypervisor: provide a generic FpuState structure Signed-off-by: Wei Liu <liuwe@microsoft.com>	2022-07-18 22:15:30 +01:00
Wei Liu	08135fa085	hypervisor: provide a generic CpuIdEntry structure Signed-off-by: Wei Liu <liuwe@microsoft.com>	2022-07-18 22:15:30 +01:00
Wei Liu	45fbf840db	hypervisor, vmm: move away from CpuId type CpuId is an alias type for the flexible array structure type over CpuIdEntry. The type itself and the type of the element in the array portion are tied to the underlying hypervisor. Switch to using CpuIdEntry slice or vector directly. The construction of CpuId type is left to hypervisors. This allows us to decouple CpuIdEntry from hypervisors more easily. No functional change intended. Signed-off-by: Wei Liu <liuwe@microsoft.com>	2022-07-18 22:15:30 +01:00
Wei Liu	f1ab86fecb	hypervisor: x86: provide a generic SpecialRegisters structure Signed-off-by: Wei Liu <liuwe@microsoft.com>	2022-07-15 10:21:43 +01:00
Wei Liu	75797827d5	hypervisor: x86: provide a generic SegmentRegister structure And drop SegmentRegisterOps since it is no longer required. Signed-off-by: Wei Liu <liuwe@microsoft.com>	2022-07-15 10:21:43 +01:00
Wei Liu	8b7781e267	hypervisor: x86: provide a generic StandardRegisters structure We only need to do this for x86 since MSHV does not have aarch64 support yet. This reduces unnecessary code churn. Signed-off-by: Wei Liu <liuwe@microsoft.com>	2022-07-15 10:21:43 +01:00
Wei Liu	4201bf4011	hypervisor: provide a generic ClockData structure Signed-off-by: Wei Liu <liuwe@microsoft.com>	2022-07-14 22:09:04 +01:00
Wei Liu	beb4f86b82	hypervisor, vmm: drop VmState and code VmState was introduced to hold hypervisor specific VM state. KVM does not need it and MSHV does not really use it yet. Just drop the code. It can be easily revived once there is a need. Signed-off-by: Wei Liu <liuwe@microsoft.com>	2022-07-14 22:09:04 +01:00
Alyssa Ross	a455917db5	vmm: fix missed API or debug events Previously, we were assuming that every time an eventfd notified us, there was only a single event waiting for us. This meant that if, while one API request was being processed, two more arrived, the second one would not be processed (until the next one arrived, when it would be processed instead of that event, and so on). To fix this, make sure we're processing the number of API and debug requests we've been told have arrived, rather than just one. This is easy to demonstrate by sending lots of API events and adding some sleeps to make sure multiple events can arrive while each is being processed. For other uses of eventfd, like the exit event, this doesn't matter — even if we've received multiple exit events in quick succession, we only need to exit once. So I've only made this change where receiving an event is non-idempotent, i.e. where it matters that we process the event the right number of times. Technically, reset requests are also non-idempotent — there's an observable difference between a VM resetting once, and a VM resetting once and then immediately resetting again. But I've left that alone for now because two resets in immediate succession doesn't sound like something anyone would ever want to me. Signed-off-by: Alyssa Ross <hi@alyssa.is>	2022-07-14 17:44:11 +01:00
Michael Zhao	2d8635f04a	hypervisor: Refactor `system_registers` on AArch64 Function `system_registers` took mutable vector reference and modified the vector content. Now change the definition to `get/set` style. And rename to `get/set_sys_regs` to align with other functions. Signed-off-by: Michael Zhao <michael.zhao@arm.com>	2022-07-14 22:55:19 +08:00
Michael Zhao	c445513976	hypervisor: Refactor `core_registers` on AArch64 On AArch64, the functions `core_registers` and `set_core_registers` are the equivalent of `get/set_regs` on x86_64. Now the names are aligned. This will benefit supporting `gdb`. Signed-off-by: Michael Zhao <michael.zhao@arm.com>	2022-07-14 22:55:19 +08:00
Wei Liu	0e8769d76a	device_manager: assert passthrough_device has the correct type There is a lot of unsafe code in such a small function. Add an assert to help detect issues earlier. Signed-off-by: Wei Liu <liuwe@microsoft.com>	2022-07-14 08:09:50 +01:00
Wei Liu	84bbaf06d1	hypervisor: turn boot_msr_entries into a trait method This allows dispatching to either KVM or MSHV automatically. No functional change. Signed-off-by: Wei Liu <liuwe@microsoft.com>	2022-07-08 16:49:58 +01:00
Rob Bradford	121729a3b0	vmm: Split signal handling for VM and VMM signals The VM specific signal (currently only SIGWINCH) should only be handled when the VM is running. The generic VMM signals (SIGINT and SIGTERM) need handling at all times. Split the signal handling into two separate threads which have differing lifetimes. Tested by: 1.) Boot full VM and check resize handling (SIGWINCH) works & sending SIGTERM leads to cleanup (tested that API socket is removed.) 2.) Start without a VM and send SIGTERM/SIGINT and observe cleanup (API socket removed) 3.) Boot full VM, delete VM and observe 2.) holds. 4.) Boot full VM, delete VM, recreate VM and observe 1.) holds. Fixes: #4269 Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-07-08 15:15:46 +01:00
Rob Bradford	93237f0106	vmm: Set MADT "Online Capable" flag The Linux kernel now checks for this before marking CPUs as hotpluggable: commit aa06e20f1be628186f0c2dcec09ea0009eb69778 Author: Mario Limonciello <mario.limonciello@amd.com> Date: Wed Sep 8 16:41:46 2021 -0500 x86/ACPI: Don't add CPUs that are not online capable A number of systems are showing "hotplug capable" CPUs when they are not really hotpluggable. This is because the MADT has extra CPU entries to support different CPUs that may be inserted into the socket with different numbers of cores. Starting with ACPI 6.3 the spec has an Online Capable bit in the MADT used to determine whether or not a CPU is hotplug capable when the enabled bit is not set. Link: https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/05_ACPI_Software_Programming_Model/ACPI_Software_Programming_Model.html?#local-apic-flags Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-07-01 18:45:05 +01:00
Rob Bradford	adf5881757	build: #[allow(clippy::significant_drop_in_scrutinee)] in some crates This check is new in the beta version of clippy and exists to avoid potential deadlocks by highlighting when the test in an if or for loop is something that holds a lock. In many cases we would need to make significant refactorings to be able to pass this check so disable in the affected crates. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-06-30 20:50:45 +01:00
Rob Bradford	b57d7b258d	build: Fix beta clippy issue (needless_return) warning: unneeded `return` statement --> pci/src/vfio_user.rs:627:13 \| 627 \| / return Err(std::io::Error::new( 628 \| \| std::io::ErrorKind::Other, 629 \| \| format!("Region not found for 0x{:x}", gpa), 630 \| \| )); \| \|_______________^ \| = note: `#[warn(clippy::needless_return)]` on by default = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_return help: remove `return` \| 627 ~ Err(std::io::Error::new( 628 + std::io::ErrorKind::Other, 629 + format!("Region not found for 0x{:x}", gpa), 630 + )) \| Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-06-30 20:50:45 +01:00
Rob Bradford	2716bc3311	build: Fix beta clippy issue (derive_partial_eq_without_eq) warning: you are deriving `PartialEq` and can implement `Eq` --> vmm/src/serial_manager.rs:59:30 \| 59 \| #[derive(Debug, Clone, Copy, PartialEq)] \| ^^^^^^^^^ help: consider deriving `Eq` as well: `PartialEq, Eq` \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#derive_partial_eq_without_eq Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-06-30 20:50:45 +01:00
Rob Bradford	2e664dca64	vmm: Always reset the console mode on VMM exit Tested: 1. SIGTERM based 2. VM shutdown/poweroff 3. Injected VM boot failure after calling Vm::setup_tty() Fixes: #4248 Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-06-28 16:45:27 +01:00
Rob Bradford	65ec6631fb	vmm: cpu: Store the vCPU snapshots in ascending order The snapshots are stored in a BTree, which is ordered; however, as the ids are strings, lexical ordering places "11" ahead of "2". So encode the vCPU id with zero padding so it is lexically sorted. This fixes issues with CPU restore on aarch64. See: #4239 Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-06-27 16:20:57 +01:00
Wei Liu	bccd7c7e48	vmm: drop Sync+Send bounds for EndpointHandler Signed-off-by: Wei Liu <liuwe@microsoft.com>	2022-06-20 23:28:57 +01:00
Wei Liu	8fa1098629	vmm: switch from lazy_static to once_cell Once_cell does not require using macro and is slated to become part of Rust std at some point. Signed-off-by: Wei Liu <liuwe@microsoft.com>	2022-06-20 16:03:07 +01:00
Sebastien Boeuf	335a4e1cc0	vmm: api: Expose kvm_hyperv parameter in OpenAPI description Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-06-17 15:11:53 +01:00
Sebastien Boeuf	81ba70a497	pci, vmm: Defer mapping VFIO MMIO regions on restore When restoring a VM, the restore codepath will take care of mapping the MMIO regions based on the information from the snapshot, rather than having the mapping performed during device creation. When the device is created, information such as which BARs contain the MSI-X tables is missing, which prevents mapping the MMIO regions. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-06-09 09:19:58 +02:00

1906 Commits