cloud-hypervisor

mirror of https://github.com/cloud-hypervisor/cloud-hypervisor.git synced 2024-11-05 11:31:14 +00:00

Author	SHA1	Message	Date
Sebastien Boeuf	adf297066d	vmm: Create devices in different path if restoring the VM In case the VM is created from scratch, the devices should be created after the DeviceManager has been created. But this should not affect the restore codepath, as in this case the devices should be created as part of the restore() function. It's necessary to perform this differentiation as the restore must go through the following steps: - Create the DeviceManager - Restore the DeviceManager with the right state - Create the devices based on the restored DeviceManager's device tree - Restore each device based on the restored DeviceManager's device tree That's why this patch leverages the recent split of the DeviceManager's creation to achieve what's needed. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2020-05-05 16:08:42 +02:00
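The two code paths described in commit adf297066d can be sketched as follows. This is a minimal illustration, not the actual cloud-hypervisor code: the `DeviceManager`, `Device`, `boot` and `restore` items below are hypothetical, simplified stand-ins, and the `steps` field exists only to make the ordering visible.

```rust
use std::collections::BTreeMap;

/// Hypothetical per-device state; real devices carry far more.
#[derive(Clone, Debug, PartialEq)]
pub struct Device {
    pub id: String,
    pub value: u32,
}

/// Hypothetical, heavily simplified stand-in for the DeviceManager.
#[derive(Default)]
pub struct DeviceManager {
    /// Device tree: device id -> saved state.
    pub device_tree: BTreeMap<String, u32>,
    pub devices: Vec<Device>,
    /// Records the order in which the steps ran.
    pub steps: Vec<&'static str>,
}

impl DeviceManager {
    /// Build the bare structure; no devices exist yet.
    pub fn new() -> Self {
        let mut dm = DeviceManager::default();
        dm.steps.push("create manager");
        dm
    }

    /// Restore path only: load the saved device tree into the manager.
    pub fn restore_state(&mut self, saved: &BTreeMap<String, u32>) {
        self.device_tree = saved.clone();
        self.steps.push("restore manager");
    }

    /// Create one device with default state per id.
    pub fn create_devices(&mut self, ids: &[String]) {
        self.devices = ids
            .iter()
            .map(|id| Device { id: id.clone(), value: 0 })
            .collect();
        self.steps.push("create devices");
    }

    /// Restore path only: copy the saved state into each device.
    pub fn restore_devices(&mut self) {
        for dev in &mut self.devices {
            if let Some(saved) = self.device_tree.get(&dev.id) {
                dev.value = *saved;
            }
        }
        self.steps.push("restore devices");
    }
}

/// Fresh boot: devices are created right after the manager.
pub fn boot(ids: &[String]) -> DeviceManager {
    let mut dm = DeviceManager::new();
    dm.create_devices(ids);
    dm
}

/// Restore: the device list comes from the restored device tree.
pub fn restore(saved: &BTreeMap<String, u32>) -> DeviceManager {
    let mut dm = DeviceManager::new();
    dm.restore_state(saved);
    let ids: Vec<String> = dm.device_tree.keys().cloned().collect();
    dm.create_devices(&ids);
    dm.restore_devices();
    dm
}
```

Splitting construction from device creation is what lets `restore` slot the manager's own state in between the two.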
Sebastien Boeuf	d39f91de02	vmm: Reorganize DeviceManager creation This commit performs the split of the DeviceManager's creation into two separate functions by moving anything related to device's creation after the DeviceManager structure has been initialized. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2020-05-05 16:08:42 +02:00
Rob Bradford	a76cf0865f	vmm: vm: Remove vsock device from config When doing device unplug remove the vsock device from the configuration if present. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2020-04-29 12:44:49 +01:00
Rob Bradford	99422324a7	vmm: vm: Add "add_vsock()" Add the vsock device to the device manager and patch the config to add the new vsock device. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2020-04-29 12:44:49 +01:00
Muminul Islam	e1a07ce3c4	vmm: vm: Unpark the threads before shutdown when the current state is paused If the current state is paused that means most of the handles got killed by pthread_kill. We need to unpark those threads to make the shutdown work. Otherwise the shutdown API hangs and the API is not responding afterwards. So before the shutdown call we need to resume the VM to make it succeed. Fixes: #817 Signed-off-by: Muminul Islam <muislam@microsoft.com>	2020-04-27 09:09:12 +02:00
Dean Sheather	c2abadc293	vmm: Add ability to add virtio-fs device post-boot Adds DeviceManager method `make_virtio_fs_device` which creates a single device, and modifies `make_virtio_fs_devices` to use this method. Implements the new `vm.add-fs` route. Signed-off-by: Dean Sheather <dean@coder.com>	2020-04-20 20:36:26 +02:00
Rob Bradford	f9a0445c3d	vmm: vm: Remove device from configuration after unplug This ensures that a device that is removed will not reappear after a reboot. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2020-04-16 17:03:25 +02:00
Rob Bradford	1beb62ed2d	vmm: vm: Don't panic on kernel load error Rather than panic()ing when we get a kernel loading error propagate the error upwards. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2020-04-16 17:03:25 +02:00
Alejandro Jimenez	7134f3129f	vmm: Allow PVH boot with initramfs We can now allow guests that specify an initramfs to boot using the PVH boot protocol. Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>	2020-04-09 17:28:03 +02:00
Rob Bradford	3b0da2d895	vmm: vm: Validate configuration on API boot When performing an API boot validate the configuration. For now only some very basic validation is performed but in subsequent commits the validation will be extended. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2020-04-08 12:06:09 +01:00
Sebastien Boeuf	8d9d22436a	vmm: Add "prefault" option when restoring Now that the restore path uses RestoreConfig structure, we add a new parameter called "prefault" to it. This will give the user the ability to populate the pages corresponding to the mapped regions backed by the snapshotted memory files. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2020-04-08 10:56:14 +02:00
Sebastien Boeuf	a517ca23a0	vmm: Move restore parameters into common RestoreConfig structure The goal here is to move the restore parameters into a dedicated structure that can be reused from the entire codebase, making the addition or removal of a parameter easier. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2020-04-08 10:56:14 +02:00
Sebastien Boeuf	6712958f23	vmm: memory: Add prefault option when creating region When CoW can be used, the VM restoration time is reduced, but the pages are not populated. This can lead to some slowness from the guest when accessing these pages. Depending on the use case, we might prefer a slower boot time for better performances from guest runtime. The way to achieve this is to prefault the pages in this case, using the MAP_POPULATE flag along with CoW. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2020-04-08 10:56:14 +02:00
Sebastien Boeuf	d771223b2f	vmm: memory: Extend new() to support external backing files Whenever a MemoryManager is restored from a snapshot, the memory regions associated with it might need to directly back the mapped memory for increased performances. If that's the case, a list of external regions is provided and the MemoryManager should simply ignore what's coming from the MemoryConfig. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2020-04-08 10:56:14 +02:00
Samuel Ortiz	2cd0bc0a2c	vmm: Create initial VM from its snapshot The MemoryManager is somehow a special case, as its restore() function was not implemented as part of the Snapshottable trait. Instead, and because restoring memory regions relies both on vm.json and every memory region snapshot file, the memory manager is restored at creation time. This makes the restore path slightly different from CpuManager, Vcpu, DeviceManager and Vm, but achieves the correct restoration of the MemoryManager along with its memory regions filled with the correct content. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2020-04-07 12:26:10 +02:00
Samuel Ortiz	b55b83c6e8	vmm: vm: Implement the Transportable trait This is only implementing the send() function in order to store all Vm states into a file. This needs to be extended for live migration, by adding more transport methods, and also the recv() function must be implemented. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2020-04-07 12:26:10 +02:00
Samuel Ortiz	1ed357cf34	vmm: vm: Implement the Snapshottable trait By aggregating snapshots from the CpuManager, the MemoryManager and the DeviceManager, Vm implements the snapshot() function from the Snapshottable trait. And by restoring snapshots from the CpuManager, the MemoryManager and the DeviceManager, Vm implements the restore() function from the Snapshottable trait. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>	2020-04-07 12:26:10 +02:00
Yi Sun	50b3f008d1	vmm: cpu: Implement the Snapshottable trait Implement the Snapshottable trait for Vcpu, and then implements it for CpuManager. Note that CpuManager goes through the Snapshottable implementation of Vcpu for every vCPU in order to implement the Snapshottable trait for itself. Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2020-04-07 12:26:10 +02:00
Samuel Ortiz	447af8e702	vmm: vm: Factorize the device and cpu managers creation routine Into a new_from_memory_manager() routine. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2020-04-03 18:05:18 +01:00
Samuel Ortiz	c73c9b112c	vmm: vm: Open kernel and initramfs once all managers are created Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2020-04-03 18:05:18 +01:00
Samuel Ortiz	0646a90626	vmm: cpu: Pass CpusConfig to simplify the new() prototype Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2020-04-03 18:05:18 +01:00
Samuel Ortiz	b584ec3fb3	vmm: memory_manager: Own the system allocator Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2020-04-03 18:05:18 +01:00
Samuel Ortiz	ef2b11ee6c	vmm: memory_manager: Pass MemoryConfig to simplify the new() prototype Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2020-04-03 18:05:18 +01:00
Samuel Ortiz	622f3f8fb6	vmm: vm: Avoid ioapic variable creation For a more readable VM creation routine. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2020-04-03 18:05:18 +01:00
Samuel Ortiz	164e810069	vmm: cpu: Move CPUID patching to CpuManager Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2020-04-03 18:05:18 +01:00
Samuel Ortiz	1a2c1f9751	vmm: vm: Factorize the KVM setup code Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2020-04-03 18:05:18 +01:00
Samuel Ortiz	92c73c3b78	vmm: Add a VmRestore command Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2020-04-02 13:24:25 +01:00
Samuel Ortiz	cf8f8ce93a	vmm: api: Add a Snapshot command Signed-off-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>	2020-04-02 13:24:25 +01:00
Samuel Ortiz	1b1a2175ca	vm-migration: Define the Snapshottable and Transportable traits A Snapshottable component can snapshot itself and provide a MigrationSnapshot payload as a result. A MigrationSnapshot payload is a map of component IDs to a list of migration sections (MigrationSection). As a component can be made of several Migratable sub-components (e.g. the DeviceManager and its device objects), a migration snapshot can be made of multiple snapshots itself. A snapshot is a list of migration sections, each section being a component state snapshot. Having multiple sections allows for easier and backward compatible migration payload extensions. Once created, a migratable component snapshot may be transported and this is what the Transportable trait defines, through 2 methods: send and recv. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>	2020-04-02 13:24:25 +01:00
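The payload shape described in commit 1b1a2175ca (a map of component IDs to lists of sections, with a parent aggregating its children) might look roughly like the sketch below. Only the snapshot half is illustrated; the transport half (send and recv) is left out. The trait signature, the `Counter` and `Manager` types, and the single-byte section layout are assumptions made for illustration and do not reproduce the vm-migration crate.

```rust
use std::collections::HashMap;

/// One piece of serialized component state (hypothetical layout).
#[derive(Clone, Debug, PartialEq)]
pub struct MigrationSection {
    pub name: String,
    pub data: Vec<u8>,
}

/// Component id -> list of migration sections.
pub type MigrationSnapshot = HashMap<String, Vec<MigrationSection>>;

/// Simplified, assumed form of the trait.
pub trait Snapshottable {
    fn id(&self) -> String;
    fn snapshot(&self) -> MigrationSnapshot;
    fn restore(&mut self, snapshot: &MigrationSnapshot);
}

/// Hypothetical leaf component with a single byte of state.
pub struct Counter {
    pub name: String,
    pub value: u8,
}

impl Snapshottable for Counter {
    fn id(&self) -> String {
        self.name.clone()
    }

    fn snapshot(&self) -> MigrationSnapshot {
        let section = MigrationSection {
            name: "value".to_string(),
            data: vec![self.value],
        };
        let mut snap = MigrationSnapshot::new();
        snap.insert(self.id(), vec![section]);
        snap
    }

    fn restore(&mut self, snapshot: &MigrationSnapshot) {
        // Look up our own id, then the section we know about.
        if let Some(sections) = snapshot.get(&self.id()) {
            if let Some(section) = sections.iter().find(|s| s.name == "value") {
                if let Some(byte) = section.data.first() {
                    self.value = *byte;
                }
            }
        }
    }
}

/// Hypothetical parent component that aggregates its children.
pub struct Manager {
    pub children: Vec<Counter>,
}

impl Snapshottable for Manager {
    fn id(&self) -> String {
        "manager".to_string()
    }

    fn snapshot(&self) -> MigrationSnapshot {
        let mut snap = MigrationSnapshot::new();
        for child in &self.children {
            snap.extend(child.snapshot());
        }
        snap
    }

    fn restore(&mut self, snapshot: &MigrationSnapshot) {
        for child in &mut self.children {
            child.restore(snapshot);
        }
    }
}
```

Because each component looks up sections by name, a newer payload can carry extra sections that an older reader simply ignores.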
Sebastien Boeuf	cc67131ecc	vmm: Retrieve new memory region when memory is extended Whenever the memory is resized, it's important to retrieve the new region to pass it down to the device manager, this way it can decide what to do with it. Also, there's no need to use a boolean as we can instead use an Option to carry the information about the region. In case of virtio-mem, there will be no region since the whole memory has been reserved up front by the VMM at boot. This means only the ACPI hotplug will return a region and is the only method that requires the memory to be updated from the device manager. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2020-03-27 09:35:39 +01:00
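The `Option`-based return value described in commit cc67131ecc can be illustrated with a small sketch. The `MemoryManager`, `Region` and `HotplugMethod` definitions here are hypothetical simplifications, and the address arithmetic is invented for the example.

```rust
/// Hotplug methods named in the log; the enum itself is an assumed shape.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum HotplugMethod {
    Acpi,
    VirtioMem,
}

/// Hypothetical description of a newly added memory region.
#[derive(Debug, PartialEq)]
pub struct Region {
    pub start: u64,
    pub size: u64,
}

/// Hypothetical, simplified memory manager.
pub struct MemoryManager {
    method: HotplugMethod,
    current: u64,
    next_start: u64,
}

impl MemoryManager {
    pub fn new(method: HotplugMethod, boot_size: u64, hotplug_start: u64) -> Self {
        MemoryManager {
            method,
            current: boot_size,
            next_start: hotplug_start,
        }
    }

    pub fn current_size(&self) -> u64 {
        self.current
    }

    /// Returns Some(region) only when a new region was actually created,
    /// so the caller knows whether the device manager must be updated.
    pub fn resize(&mut self, desired: u64) -> Option<Region> {
        if desired <= self.current {
            // Nothing to add (this sketch ignores decreases).
            return None;
        }
        let added = desired - self.current;
        self.current = desired;
        match self.method {
            // virtio-mem: the whole range was reserved at boot, no new region.
            HotplugMethod::VirtioMem => None,
            // ACPI: a new region is created and handed back to the caller.
            HotplugMethod::Acpi => {
                let region = Region {
                    start: self.next_start,
                    size: added,
                };
                self.next_start += added;
                Some(region)
            }
        }
    }
}
```

Compared with a boolean, the `Option` carries the region itself, so the caller does not need a second lookup to find out what changed.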
Samuel Ortiz	8fc7bf2953	vmm: Move to the latest linux-loader Commit 2adddce2 reorganized the crate for a cleaner multi architecture (x86_64 and aarch64) support. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2020-03-27 08:48:20 +01:00
Sebastien Boeuf	785812d976	vmm: Fallback to legacy boot if PVH is enabled along with initramfs For now, the codebase does not support booting from initramfs with PVH boot protocol, therefore we need to fallback to the legacy boot. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2020-03-26 11:59:03 +01:00
Damjan Georgievski	6cce7b9560	arch: load initramfs and populate zero page * load the initramfs File into the guest memory, aligned to page size * finally setup the initramfs address and its size into the boot params (in configure_64bit_boot) Signed-off-by: Damjan Georgievski <gdamjan@gmail.com>	2020-03-26 11:59:03 +01:00
Rob Bradford	f664cddec9	vmm: Add support for adding network devices to the VM The network device will be hotplugged via DeviceManager and saved in the config for later use. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2020-03-25 17:58:06 +01:00
Samuel Ortiz	41d7b3a387	vmm: memory_manager: Only send the GED notification for the ACPI method Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2020-03-25 15:54:16 +01:00
Hui Zhu	e6b934a56a	vmm: Add support for virtio-mem This commit adds new option hotplug_method to memory config. It can set the hotplug method to "acpi" or "virtio-mem". Signed-off-by: Hui Zhu <teawater@antfin.com>	2020-03-25 15:54:16 +01:00
Rob Bradford	15de30f141	vmm: Add support for adding pmem devices to the VM The persistent memory will be hotplugged via DeviceManager and saved in the config for later use. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2020-03-25 13:18:17 +01:00
Rob Bradford	164ec2b8e6	vmm: Add support for adding disks to the VM The disk will be hotplugged via DeviceManager and saved in the config for later use. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2020-03-25 09:35:53 +00:00
Sebastien Boeuf	e54f8ec8a5	vmm: Update memory through DeviceManager Whenever the VM memory is resized, DeviceManager needs to be notified so that it can subsequently notify each virtio device about it. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2020-03-24 19:01:15 +00:00
Rob Bradford	0788600702	build: Remove "pvh_boot" feature flag This feature is stable and there is no need for this to be behind a flag. This will also reduce the time needed to run the integration test as we will not be running them all again under the flag. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2020-03-19 13:05:44 +00:00
Samuel Ortiz	63eeed29cc	vm: Comment on the VM config update from memory hotplug I spent a few minutes trying to understand why we were unconditionally updating the VM config memory size, even if the guest memory resizing did not happen. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2020-03-18 12:48:40 +01:00
Alejandro Jimenez	a22bc3559f	pvh: Write start_info structure to guest memory Fill the hvm_start_info and related memory map structures as specified in the PVH boot protocol. Write the data structures to guest memory at the GPA that will be stored in %rbx when the guest starts. Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>	2020-03-13 18:29:44 +01:00
Alejandro Jimenez	24f0e42e6a	pvh: Introduce EntryPoint struct In order to properly initialize the kvm regs/sregs structs for the guest, the load_kernel() return type must specify which boot protocol to use with the entry point address it returns. Make load_kernel() return an EntryPoint struct containing the required information. This structure will later be used in the vCPU configuration methods to setup the appropriate initial conditions for the guest. Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>	2020-03-13 18:29:44 +01:00
Sebastien Boeuf	34412c9b41	vmm: Add id option to VFIO hotplug Add a new id option to the VFIO hotplug command so that it matches the VFIO coldplug semantic. This is done by refactoring the existing code for VFIO hotplug, where VmAddDeviceData structure is replaced by DeviceConfig. This structure is the one used whenever a VFIO device is coldplugged, which is why it makes sense to reuse it for the hotplug codepath. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2020-03-11 19:50:31 +01:00
Sebastien Boeuf	e514b124ed	vmm: Update VmConfig when removing VFIO device This commit ensures that when a VFIO device is hot-unplugged from the VM, it is also removed from the VmConfig. This prevents a potential reboot from creating the device. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2020-03-10 17:05:06 +00:00
Sebastien Boeuf	6cbdb9aa47	vmm: api: Introduce new "remove-device" HTTP endpoint This commit introduces the new command "remove-device" that will let a user hot-unplug a VFIO PCI device from an already running VM. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2020-03-10 17:05:06 +00:00
Sebastien Boeuf	09829c44b2	vmm: Remove IO bus strong reference from Vm The Vm structure was used to store a strong reference to the IO bus. This is not needed anymore since the AddressManager is logically the one holding this strong reference. This has been made possible by the introduction of Weak references on the Bus structure itself. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2020-03-04 18:46:44 +01:00
Sebastien Boeuf	d0820cc026	vmm: Make add_vfio_device mutable The method add_vfio_device() from the DeviceManager needs to be mutable if we want later to be able to update some internal fields from the DeviceManager from this same function. This commit simply takes care of making the necessary changes to change this function as mutable. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2020-03-04 18:46:44 +01:00
Sebastien Boeuf	948f808da6	vm: Rename DeviceManager field in Vm structure It's more logical to name the field referring to the DeviceManager as "device_manager" instead of "devices". Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2020-03-04 18:46:44 +01:00
Sebastien Boeuf	d47f733e51	vmm: Break the cyclic dependency between DeviceManager and IO bus By inserting the DeviceManager on the IO bus, we introduced some cyclic dependency: DeviceManager ---> AddressManager ---> Bus ---> BusDevice ---> DeviceManager (the BusDevice entry points back to the DeviceManager). This cycle needs to be broken by inserting a Weak reference instead of an Arc (considered as a strong reference). Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2020-03-04 12:06:02 +00:00
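The cycle-breaking idea from commit d47f733e51 can be sketched with the standard library alone. The `Bus`, `BusDevice` and `DeviceManager` definitions below are hypothetical reductions (the AddressManager layer is omitted); what matters is that the bus stores `Weak` references while the device manager keeps its strong `Arc` to the bus.

```rust
use std::sync::{Arc, Mutex, Weak};

/// Simplified, assumed form of a bus device.
pub trait BusDevice {
    fn read(&mut self, offset: u64) -> u8;
}

/// The bus only holds weak references to its devices.
pub struct Bus {
    devices: Mutex<Vec<Weak<Mutex<dyn BusDevice>>>>,
}

impl Bus {
    pub fn new() -> Self {
        Bus {
            devices: Mutex::new(Vec::new()),
        }
    }

    /// Register a device without taking ownership of it.
    pub fn insert(&self, device: &Arc<Mutex<dyn BusDevice>>) {
        self.devices.lock().unwrap().push(Arc::downgrade(device));
    }

    /// Read from every device that is still alive.
    pub fn read(&self, offset: u64) -> Vec<u8> {
        let devices = self.devices.lock().unwrap();
        let mut out = Vec::new();
        for weak in devices.iter() {
            // upgrade() fails once the last strong reference is gone.
            if let Some(device) = weak.upgrade() {
                let value = device.lock().unwrap().read(offset);
                out.push(value);
            }
        }
        out
    }
}

/// Hypothetical device manager holding a strong reference to the bus.
pub struct DeviceManager {
    pub bus: Arc<Bus>,
    pub hotplug_status: u8,
}

impl BusDevice for DeviceManager {
    fn read(&mut self, _offset: u64) -> u8 {
        self.hotplug_status
    }
}
```

If the bus stored an `Arc` instead, the device manager and the bus would keep each other alive and neither would ever be dropped.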
Sebastien Boeuf	c1af13efeb	vmm: Update VmConfig when adding new device Ensures the configuration is updated after a new device has been hotplugged. In the event of a reboot, this means the new VM will be started with the new device that had been previously hotplugged. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2020-03-04 12:06:02 +00:00
Sebastien Boeuf	a86f4369a7	vmm: Add VFIO PCI device hotplug support This commit finalizes the VFIO PCI hotplug support, based on all the previous commits preparing for it. One thing to notice, this does not support vIOMMU yet. This means we can hotplug VFIO PCI devices, but we cannot attach them to an existing or a new virtio-iommu device. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2020-03-04 12:06:02 +00:00
Sebastien Boeuf	d0218e94a3	vmm: Trigger hotplug notification to the guest Whenever the user wants to hotplug a new VFIO PCI device, the VMM will have to trigger a hotplug notification through the GED device. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2020-03-04 12:06:02 +00:00
Sebastien Boeuf	0e58741a09	vmm: api: Introduce new "add-device" HTTP endpoint This commit introduces the new command "add-device" that will let a user hotplug a VFIO PCI device to an already running VM. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2020-03-04 12:06:02 +00:00
Sebastien Boeuf	8142c823ed	vmm: Move DeviceManager into an Arc<Mutex<>> In anticipation of the support for device hotplug, this commit moves the DeviceManager object into an Arc<Mutex<>> when the DeviceManager is being created. The reason is, we need the DeviceManager to implement the BusDevice trait and then provide it to the IO bus, so that IO accesses related to device hotplug can be handled correctly. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2020-02-27 11:12:31 +01:00
Sebastien Boeuf	793d4e7b8d	vmm: Move codebase to GuestMemoryAtomic from vm-memory Relying on the latest vm-memory version, including the freshly introduced structure GuestMemoryAtomic, this patch replaces every occurrence of Arc<ArcSwap<GuestMemoryMmap>> with GuestMemoryAtomic<GuestMemoryMmap>. The point is to rely on the common RCU-like implementation from vm-memory so that we don't have to do it from Cloud-Hypervisor. Fixes #735 Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2020-02-19 13:48:19 +00:00
Rob Bradford	b04eb4770b	vmm: Follow the "exe" symlink from the PID directory in /proc It is necessary to do this at the start of the VMM execution rather than later as it must be done in the main thread in order to satisfy the checks required by PTRACE_MODE_READ_FSCREDS (see proc(5) and ptrace(2)) The alternative is to run as CAP_SYS_PTRACE but that has its disadvantages. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2020-02-14 17:32:49 +00:00
Sebastien Boeuf	3447e226d9	dependencies: bump vm-memory from `4237db3` to `f3d1c27` This commit updates Cloud-Hypervisor to rely on the latest version of the vm-memory crate. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2020-02-06 11:40:45 +01:00
Sebastien Boeuf	62ccccc303	vmm: Make sure to retry creating the VM on EINTR If the ioctl syscall KVM_CREATE_VM gets interrupted while creating the VM, it is expected that we should retry since EINTR should not be considered a standard error. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2020-02-05 12:06:21 +01:00
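The retry behaviour described in commit 62ccccc303 follows a common pattern, sketched here as a generic helper. The `retry_on_eintr` name is hypothetical, and the operation is any closure returning `std::io::Result`, not the real KVM_CREATE_VM ioctl.

```rust
use std::io;

/// Re-run `op` for as long as it fails with EINTR, which the standard
/// library reports as `ErrorKind::Interrupted`. Any other outcome,
/// success or failure, is returned to the caller unchanged.
pub fn retry_on_eintr<T>(mut op: impl FnMut() -> io::Result<T>) -> io::Result<T> {
    loop {
        match op() {
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            other => return other,
        }
    }
}
```

A caller would wrap the interruptible call in a closure, for example `retry_on_eintr(|| create_vm())`, where `create_vm` stands for whatever function issues the system call.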
Rob Bradford	880a57c920	vmm: Remove VmInfo struct After refactoring the VmInfo struct is no longer needed. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2020-02-03 12:28:30 +00:00
Sebastien Boeuf	148a9ed5ce	vmm: Fix map_err losing the inner error Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2020-01-24 12:42:09 +01:00
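Commit 148a9ed5ce gives no detail beyond its subject line, so the following is only a generic illustration of how `map_err` can lose or keep an inner error. The `VmError` enum and both parse functions are hypothetical.

```rust
use std::num::ParseIntError;

/// Hypothetical error type with a lossy and a detailed variant.
#[derive(Debug)]
pub enum VmError {
    /// Carries no information about what went wrong.
    ParseLossy,
    /// Wraps the underlying error so it can be reported later.
    Parse(ParseIntError),
}

/// `|_|` discards the inner error.
pub fn parse_lossy(s: &str) -> Result<u32, VmError> {
    s.parse::<u32>().map_err(|_| VmError::ParseLossy)
}

/// Passing the variant constructor keeps the inner error.
pub fn parse_kept(s: &str) -> Result<u32, VmError> {
    s.parse::<u32>().map_err(VmError::Parse)
}
```

With the second form, a log line or error message can still show why the parse failed.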
Sebastien Boeuf	9ac06bf613	ci: Run clippy for each specific feature The build is run against "--all-features", "pci,acpi", "pci" and "mmio" separately. The clippy validation must be run against the same set of features in order to validate the code is correct. Because of these new checks, this commit includes multiple fixes related to the errors generated when manually running the checks. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2020-01-21 11:44:40 +01:00
Rob Bradford	0ab22fea2c	vmm: Only generate GED event when new DIMM added Avoid the ACPI scan in the guest OS when no new DIMM is hotplugged. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2020-01-17 23:44:21 +01:00
Rob Bradford	211786ab42	vmm: Only generate GED interrupt when the number of vCPUs has changed Avoid activity in the guest OS if the number of vCPUs has not changed. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2020-01-17 23:44:21 +01:00
Rob Bradford	7310ab6fa7	devices, vmm: Use a bit field for ACPI GED interrupt type Use independent bits for storing whether there is a CPU or memory device changed when reporting changes via ACPI GED interrupt. This prevents a later notification squashing an earlier one and ensure that hotplugging both CPU and memory at the same time succeeds. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2020-01-15 20:21:22 +01:00
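The bit-field approach from commit 7310ab6fa7 can be sketched as below. The constant names, bit positions and the `GedDevice` structure are assumptions for illustration and do not reproduce the real device.

```rust
/// Hypothetical bit assignments for the notification types.
pub const CPU_DEVICES_CHANGED: u8 = 0b01;
pub const MEMORY_DEVICES_CHANGED: u8 = 0b10;

/// Hypothetical GED state: one pending bit per notification type.
#[derive(Default)]
pub struct GedDevice {
    pending: u8,
}

impl GedDevice {
    /// OR the new bit in, so an earlier notification is never overwritten.
    pub fn notify(&mut self, kind: u8) {
        self.pending |= kind;
    }

    /// What the guest sees when it reads the status: all pending bits,
    /// which are cleared by the read.
    pub fn read_and_clear(&mut self) -> u8 {
        std::mem::take(&mut self.pending)
    }
}
```

If the status were a plain value assigned on each notification, a memory event arriving before the guest read the register would replace a pending CPU event.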
Rob Bradford	28c6652e57	vmm: Upon VmResize attempt to hotplug the memory If a new amount of RAM is requested in the VmResize command try and hotplug if it is an increase (MemoryManager::Resize() silently ignores decreases.) Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2020-01-15 20:21:22 +01:00
Rob Bradford	284d5e011a	vmm: Add memory hotplug ACPI entries to DSDT Generate and expose the DSDT table entries required to support memory hotplug. The AML methods call into the MemoryManager via I/O ports exposed as fields. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2020-01-15 20:21:22 +01:00
Rob Bradford	82fce5a4e2	vmm: Add support for resizing the memory used by the VM For now the new memory size is only used after a reboot but support for hotplugging memory will be added in a later commit. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2020-01-15 20:21:22 +01:00
Rob Bradford	f5137e84bb	vmm, main: Add optional "hotplug_size" to --mem This specifies how much address space should be reserved for hotplugging of RAM. This space is reserved by moving the start of the device area by the desired amount. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2020-01-15 20:21:22 +01:00
Rob Bradford	f1b6657833	vmm: Make desired vCPUs optional in resize command In order to be able to support resizing either vCPUs or memory or both make the fields in the resize command optional. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2020-01-15 20:21:22 +01:00
Rob Bradford	b2589d4f3f	vm-virtio, vmm, vfio: Store GuestMemoryMmap in an Arc<ArcSwap<T>> This allows us to change the memory map that is being used by the devices via an atomic swap (by replacing the map with another one). The ArcSwap provides the mechanism for atomically swapping from one map to another whilst still giving good read performance. It is inside an Arc so that we can use a single ArcSwap for all users. Not covered by this change is replacing the GuestMemoryMmap itself. This change also removes some vertical whitespace from use blocks in the files that this commit also changed. Vertical whitespace was being used inconsistently and broke rustfmt's behaviour of ordering the imports as it would only do it within the block. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2020-01-02 13:20:11 +00:00
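The shared, swappable memory map from commit b2589d4f3f can be approximated with the standard library. Note that this sketch deliberately swaps in `Arc<RwLock<Arc<T>>>` as a std-only stand-in for `Arc<ArcSwap<T>>`; it shows the same sharing pattern but takes a read lock on each load, which ArcSwap is designed to avoid. `MemoryMap` and `SharedMap` are hypothetical names.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical stand-in for the guest memory map.
#[derive(Debug, PartialEq)]
pub struct MemoryMap {
    /// (start, size) pairs.
    pub regions: Vec<(u64, u64)>,
}

/// One swappable slot shared by all users; clones point at the same slot.
#[derive(Clone)]
pub struct SharedMap {
    inner: Arc<RwLock<Arc<MemoryMap>>>,
}

impl SharedMap {
    pub fn new(map: MemoryMap) -> Self {
        SharedMap {
            inner: Arc::new(RwLock::new(Arc::new(map))),
        }
    }

    /// Take a reference-counted snapshot of the current map.
    pub fn load(&self) -> Arc<MemoryMap> {
        let guard = self.inner.read().unwrap();
        Arc::clone(&*guard)
    }

    /// Replace the map for every holder of this SharedMap.
    pub fn store(&self, map: MemoryMap) {
        *self.inner.write().unwrap() = Arc::new(map);
    }
}
```

A device that called `load` before a swap keeps a valid view of the old map until it drops that `Arc`, while later loads see the new one.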
Rob Bradford	7df88793a0	vmm: device_manager: Get device range from MemoryManager This removes the duplication of these values. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-12-23 10:25:40 +00:00
Rob Bradford	61cfe3e72d	vmm: Obtain sequential KVM memory slot numbers from MemoryManager This removes the need to handle a mutable integer and also centralises the allocation of these slot numbers. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-12-23 10:25:40 +00:00
Rob Bradford	260cebb8cf	vmm: Introduce MemoryManager The memory manager is responsible for setting up the guest memory and in the long term will also handle addition of guest memory. In this commit move code for creating the backing memory and populating the allocator into the new implementation trying to make as minimal changes to other code as possible. Follow on commits will further reduce some of the duplicated code. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-12-23 10:25:40 +00:00
Samuel Ortiz	37557c8b35	vmm: vm: Implement the Pausable trait Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-12-12 08:50:36 +01:00
Samuel Ortiz	9756fc2dd0	vmm: cpu_manager: Implement the Pausable trait Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-12-12 08:50:36 +01:00
Samuel Ortiz	35dd1523c9	vmm: device_manager: Implement the Pausable trait Since the Snapshotable placeholder and Migratable traits are provided as well, the DeviceManager object and all its objects are now Migratable. All Migratable devices are tracked as Arc<Mutex<dyn Migratable>> references. Keeping track of all migratable devices allows for implementing the Migratable trait for the DeviceManager structure, making the whole device model potentially migratable. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-12-12 08:50:36 +01:00
Rob Bradford	f994665610	vmm: Reduce the minimum IRQ constant Now that the GED device does not use a hardcoded IRQ number the starting IRQ number can be restored (needed for the hardcoded serial port IRQ.) Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-12-09 16:58:00 +00:00
Rob Bradford	9b1ba14f2d	vmm: Delegate device related ACPI DSDT table work to DeviceManager Move the code for handling the creation of the DSDT entries for devices into the DeviceManager. This will make it easier to handle device hotplug and also in the future remove some hardcoded ACPI constants. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-12-06 17:44:00 +00:00
Rob Bradford	60e6609011	vmm: Delegate CPU related ACPI tables to CpuManager Move the code for generating the MADT (APIC) table and the DSDT generation for CPU related functionality into the CpuManager. There is no functional change just code rearrangement. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-12-06 17:44:00 +00:00
Rob Bradford	59d01712ad	vmm: Remove kernel based IOAPIC handling from the device manager Previously the device setup code assumed that if no IOAPIC was passed in then the device should be added to the kernel irqchip. As an earlier change meant that there was always a userspace IOAPIC this kernel based code can be removed. The accessor still returns an Option type to leave scope for implementing a situation without an IOAPIC (no serial or GED device). This change does not add support for a no-IOAPIC mode as the original code did not either. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-12-06 12:34:06 +01:00
Rob Bradford	afea6a10a2	vmm: Stop initialising kernel based IOAPIC/PIC Now that we require the modern capabilities we can stop creating a kernel based irqchip. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-12-06 12:34:06 +01:00
Rob Bradford	9b1cb9621f	vmm: Remove pin based interrupt setup for virtio devices With MSI now required remove pin based interrupt support from all the virtio PCI device setup. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-12-06 12:34:06 +01:00
Rob Bradford	72fb687e3f	vmm: Check for required capabilities We now require CAP_SIGNAL_MSI, CAP_TSC_DEADLINE_TIMER and CAP_SPLIT_IRQCHIP. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-12-06 12:34:06 +01:00
Rob Bradford	f98b16f308	vmm: Update the configuration to preserve hot-plug CPUs after reboot Update the configuration after a resize to ensure that after a reboot the added vCPUs are preserved. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-12-05 16:39:19 +00:00
Rob Bradford	1722708612	vmm: Switch to storing VmConfig inside an Arc<Mutex<>> This permits the runtime reconfiguration of the VM. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-12-05 16:39:19 +00:00
Qiu Wenbo	e1af17d93a	vmm: Restore tty to canonical mode when SIGTERM or SIGINT received The tty remains in raw mode when cloud-hypervisor is terminated by SIGTERM or SIGINT. The terminal is then unusable because echoing is disabled, which is really annoying. Signed-off-by: Qiu Wenbo <qiuwenbo@phytium.com.cn>	2019-12-05 01:29:26 -08:00
Qiu Wenbo	5208ff86c8	vmm: Detect and handle AMD SME (Secure Memory Encryption) Some physical address bits may become reserved in the page table when SME is enabled on an AMD platform. The guest will trigger a reserved bit violation page fault in this case because it writes these reserved bits as 1 in the page table. We need to account for the reserved bits to get the right physical address range. Signed-off-by: Qiu Wenbo <qiuwenbo@phytium.com.cn>	2019-12-04 14:46:44 +00:00
Rob Bradford	48bf141364	vmm: Trigger a hotplug device notification when resizing When adjusting the number of vCPUs generate a hotplug notification. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-12-02 13:49:04 +00:00
Rob Bradford	ae9359c859	vmm: acpi: Create the CPU entries in the DSDT for all vCPUs CPU entries need to be created for every potential vCPU in the system. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-12-02 13:49:04 +00:00
Rob Bradford	791ca3388f	vmm: device_manager: Add ability to notify via GED device Add ability to notify via the GED device that there is some new hotplug activity. This will be used by the CpuManager (and later DeviceManager itself) to notify of new hotplug activity. Currently it has a hardcoded IRQ of 5 as the ACPI tables also need to refer to this IRQ and the IRQ allocation does not permit the allocation of specific IRQs. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-12-02 13:49:04 +00:00
Rob Bradford	86339b4cb4	vmm: Add HTTP API to resize the VM Currently only increasing the number of vCPUs is supported but in the future it will be extended. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-12-02 13:49:04 +00:00
Rob Bradford	1bbe48b24c	vmm: acpi: Mark non-boot vCPUs as disabled in the MADT table The MADT table contains the details of all the potential vCPUs and whether they are present at boot (as indicated by the flags field.) Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-12-02 13:49:04 +00:00
Rob Bradford	82bc07cce4	vmm: Add boot and max vCPU handling to command line parser Also retain support (with a warning) for the old behaviour. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-12-02 13:49:04 +00:00
Rob Bradford	7543e00a07	vmm: Use new CpuManager accessor to get boot vCPUs When initialising the ACPI tables and configuring the VM use the new accessor on the CpuManager to get the number of boot vCPUs. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-12-02 13:49:04 +00:00
Rob Bradford	df0907845a	vmm: cpu: Introduce concept of maximum vs boot vCPUs in CpuManager For now the max vCPUs is the same as the boot vCPUs. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-12-02 13:49:04 +00:00
Samuel Ortiz	0f21781fbe	cargo: Bump the kvm and vmm-sys-util crates Since the kvm crates now depend on vmm-sys-util, the bump must be atomic. The kvm-bindings and ioctls 0.2.0 and 0.4.0 crates come with a few API changes, one of them being the use of a kvm_ioctls specific error type. Porting our code to that type makes for a fairly large diff stat. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-11-29 17:48:02 +00:00
Sebastien Boeuf	e4e8062dda	vmm: Mark guest RAM pages as mergeable In case the VM is started with the flag "--memory mergeable=on", it means the user expects the guest RAM pages to be marked as mergeable. This commit relies on the madvise(MADV_MERGEABLE) system call to inform the host kernel about these pages. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-11-22 15:28:10 +00:00
Rob Bradford	1da0ff395d	vmm: cpu: Add the CpuManager onto the IO bus This allows the kernel (via ACPI based controls) to query and control the CPU state. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-11-21 09:17:15 -08:00
Rob Bradford	1ac1231292	vmm: Encase CpuManager within an Arc<Mutex<>> This is necessary to be able to add the CpuManager onto the IO bus. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-11-21 09:17:15 -08:00
Rob Bradford	6958ec4922	vmm: Move CPU management code to its own module Move CpuManager, Vcpu and related functionality to its own module (and file) inside the VMM crate Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-11-11 15:46:24 +00:00
Rob Bradford	a1a5fe0c93	vmm: Split CPU management into its own struct Pull details of vCPU management (booting, pausing, resuming, shutdown) into its own structure. This will ultimately enable this to be moved to its own file and encapsulate all the vCPU handling for the VMM. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-11-08 11:59:21 +01:00
Rob Bradford	0319a4a09a	arch: vmm: Move ACPI tables creation to vmm crate Remove ACPI table creation from arch crate to the vmm crate simplifying arch::configure_system() GuestAddress(0) is used to mean no RSDP table rather than adding complexity with a conditional argument or an Option type as it will evaluate to a zero value which would be the default anyway. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-11-07 14:02:27 +00:00
Sebastien Boeuf	587a420429	cargo: Update to the latest kvm-ioctls version We need to rely on the latest kvm-ioctls version to benefit from the recent addition of unregister_ioevent(), allowing us to detach a previously registered eventfd to a PIO or MMIO guest address. Because of this update, we had to modify the current constraint we had on the vmm-sys-util crate, using ">= 0.1.1" instead of being strictly tied to "0.2.0". Once the dependency conflict resolved, this commit took care of fixing build issues caused by recent modification of kvm-ioctls relying on EventFd reference instead of RawFd. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-10-31 09:30:59 +01:00
Sebastien Boeuf	8746c16593	vmm: Create AddressManager to own SystemAllocator In order to reuse the SystemAllocator later at runtime, it is moved into the new structure AddressManager. The goal is to have a hold onto the SystemAllocator and both IO and MMIO buses so that we can use them later. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-10-29 16:48:02 +01:00
Sebastien Boeuf	1870eb4295	devices: Lock the BtreeMap inside to avoid deadlocks Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-10-29 16:48:02 +01:00
Jose Carlos Venegas Munoz	78e2f7a99a	api: http: handle cpu according to openapi openapi definition defines an object for cpus not an integer Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>	2019-10-17 07:39:56 +02:00
Samuel Ortiz	2a0ba7aef8	vmm: vm: Add state validation unit test Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-10-14 06:35:36 +02:00
Samuel Ortiz	097b30669f	vmm: vm: Verify that state transitions are valid We should return an explicit error when the transition from one VM state to another is invalid. The valid_transition() routine for the VmState enum essentially describes the VM state machine. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-10-14 06:35:36 +02:00
Samuel Ortiz	d2d3abb13c	vmm: Rename Booted vm state to Running Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-10-10 17:13:44 -07:00
Samuel Ortiz	dbbd04a4cf	vmm: Implement VM resume To resume a VM, we unpark all its vCPU threads. Fixes: #333 Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-10-10 17:13:44 -07:00
Samuel Ortiz	4ac0cb9cff	vmm: Implement VM pause In order to pause a VM, we signal all the vCPU threads to get them out of vmx non-root. Once out, the vCPU thread will check for an atomic pause boolean. If it's set to true, then the thread will park until being resumed. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-10-10 17:13:44 -07:00
Samuel Ortiz	1298b508bf	vmm: Manage the exit and reset behaviours from the control loop So that we don't need to forward an ExitBehaviour up to the VMM thread. This simplifies the control loop and the VMM thread even further. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-10-08 18:03:27 -07:00
Sebastien Boeuf	b918220b49	vmm: Support virtio-pci devices attached to a virtual IOMMU This commit is the glue between the virtio-pci devices attached to the vIOMMU, and the IORT ACPI table exposing them to the guest as sitting behind this vIOMMU. An important thing is the trait implementation provided to the virtio vrings for each device attached to the vIOMMU, as they need to perform proper address translation before they can access the buffers. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-10-07 10:12:07 +02:00
Sebastien Boeuf	03352f45f9	arch: Create ACPI IORT table The virtual IOMMU exposed through virtio-iommu device has a dependency on ACPI. It needs to expose the device ID of the virtio-iommu device, and all the other devices attached to this virtual IOMMU. The IDs are expressed from a PCI bus perspective, based on segment, bus, device and function. The guest relies on the topology description provided by the IORT table to attach devices to the virtio-iommu device. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-10-07 10:12:07 +02:00
Samuel Ortiz	f9daf2e247	vmm: Factorize the vm boot and shutdown code So that the API handling state machine is cleaner and easier to read. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-10-04 09:36:33 +02:00
Samuel Ortiz	43b3642955	vmm: Clean Error handling up We used to have errors definitions spread across vmm, vm, api, and http. We now have a cleaner separation: All API routines only return an ApiResult. All VM operations, including the VMM wrappers, return a VmResult. This makes it easier to carry errors up to the HTTP caller. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-10-04 09:36:33 +02:00
Samuel Ortiz	27af983ec9	vmm: Track the VM state We will expose it through the api/v1/vm.info endpoint. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-10-04 09:36:33 +02:00
Samuel Ortiz	46cde1a38e	vmm: Rename the VM start and stop operations to boot and shutdown To match the OpenAPI description. And also to map the real life terminology. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-10-04 09:36:33 +02:00
Samuel Ortiz	f2de4d0315	vmm: config: Make the cmdline config serializable The linux_loader crate Cmdline struct is not serializable. Instead of forcing the upstream crate to carry a serde dependency, we simply use a String for the passed command line and build the actual CmdLine when we need it (in vm::new()). Also, the cmdline offset is not a configuration knob, so we remove it. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-10-04 09:36:33 +02:00
Samuel Ortiz	b14fd37db9	vmm: Make --kernel optional The kernel path was the only mandatory command line option. With the addition of the --api-socket option, we can run without a kernel path and get it later through the API. Since we can end up with VM configurations that are no longer valid by default, we need to provide a validation check for it. For now, if the kernel path is not defined, the VM configuration is invalid. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-10-04 09:36:33 +02:00
Rob Bradford	a0455167d0	vmm: Use layout constant for kernel command line Remove the unnecessary field on CmdlineConfig and switch to using the common offset. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-09-27 11:48:30 -07:00
Rob Bradford	0e7a1fc923	arch, vmm: Start documenting major regions of RAM and reserved memory Using the existing layout module start documenting the major regions of RAM and those areas that are reserved. Some of the constants have also been renamed to be more consistent and some functions that returned constant variables have been replaced. Future commits will move more constants into this file to make it the canonical source of information about the memory layout. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-09-27 08:55:47 -07:00
Samuel Ortiz	8188074300	main: Start the VMM thread We now start the main VMM thread, which will be listening for VM and IPC related events. In order to start the configured VM, we no longer directly call the VM API but we use the IPC instead, to first create and then start a VM. Fixes: #303 Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-09-26 16:21:14 +02:00
Samuel Ortiz	4671a5831f	vmm: Move the EpollContext implementation to lib The VMM thread and control loop will be the sole consumer of the EpollContext and EpollDispatch API, so let's move it to lib.rs. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-09-26 16:21:14 +02:00
Samuel Ortiz	6710a39b5a	vmm: Pass the exit and reset fds to the vm creation method As we're going to move the control loop to the VMM thread, the exit and reset EventFds are no longer going to be owned by the VM. We pass a copy of them when creating the Vm instead. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-09-26 16:21:14 +02:00
Samuel Ortiz	feb1c33084	vmm: Add a VM config getter We will need it from the VMM thread, when trying to reboot a VM. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-09-26 16:21:14 +02:00
Samuel Ortiz	47167a658e	vmm: Add a VM console handling method In order to handle the VM STDIN stream from a separate VMM thread without having to export the DeviceManager, we simply add a console handling method to the Vm structure. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-09-26 16:21:14 +02:00
Samuel Ortiz	ea7abc6c80	vmm: Add a VM stop method In order to transfer the control loop to a separate VMM thread, we want to shrink the VM control loop to a bare minimum. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-09-26 16:21:14 +02:00
Samuel Ortiz	e6ef9ece2c	vmm: Move the tty setting to the VM start routine We want to shrink the control loop to a bare minimum. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-09-26 16:21:14 +02:00
Samuel Ortiz	2e9d815701	vmm: Use a reference counted VmConfig when creating a new VM Once passed to the VM creation routine, a VmConfig structure is immutable. We can simply carry an Arc of it instead of a reference. This also allows us to remove any lifetime bound from our VM. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-09-26 16:21:14 +02:00
Samuel Ortiz	bdfd1a3f38	vmm: Remove the Vmm structure The Vmm structure is just a placeholder for the KVM instance. We can create it directly from the VM creation routine instead. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-09-24 10:12:04 +02:00
Samuel Ortiz	9c5135da7a	vmm: Simplify the VM start flow We can integrate the kernel loading into the VM start method. The VM start flow is then: Vm::new() -> vm.start(), which feels more natural. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-09-24 10:12:04 +02:00
Samuel Ortiz	acc60b0ad5	vmm: Make VsockConfig owned Convert Path to PathBuf and remove the associated lifetime. Now we can remove the VmConfig associated lifetime. Fixes #298 Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-09-24 08:39:39 +01:00
Samuel Ortiz	9c5bfb8e13	vmm: Make MemoryConfig owned Convert Path to PathBuf and remove the associated lifetime. Fixes #298 Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-09-24 08:39:39 +01:00
Rob Bradford	5b3ca78dac	vmm: Use the full host physical address range Probe for the size of the host physical address range and use that to establish the address range for the VM. This removes the limitation on the size of the VM RAM and gives more space for the devices. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-09-19 10:43:55 +01:00
Rob Bradford	f0360c92d9	arch: acpi: Set the upper device range based on RAM levels After the 32-bit gap the memory is shared between the devices and the RAM. Ensure that the ACPI tables correctly indicate where the RAM ends and the device area starts by patching the precompiled tables. We get the following valid output now from the PCI bus probing (8GiB guest) [ 0.317757] pci_bus 0000:00: resource 4 [io 0x0000-0x0cf7 window] [ 0.319035] pci_bus 0000:00: resource 5 [io 0x0d00-0xffff window] [ 0.320215] pci_bus 0000:00: resource 6 [mem 0x000a0000-0x000bffff window] [ 0.321431] pci_bus 0000:00: resource 7 [mem 0xc0000000-0xfebfffff window] [ 0.322613] pci_bus 0000:00: resource 8 [mem 0x240000000-0xfffffffff window] Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-09-19 10:43:55 +01:00
Rob Bradford	c042483953	build: make PCI (virtio and vfio) disableable at build time Although included by default it is now possible to build without PCI support. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-09-13 12:30:13 +01:00
Rob Bradford	6d27ac9dfc	vmm: Allow the DeviceManager to inject extra kernel commandline entries This is useful for virtio-mmio to be able to provide the commandline entries for the devices. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-09-13 12:30:13 +01:00
Rob Bradford	05b5115e67	vmm: Call DeviceManager's register_devices() on creation Rather than calling it at the very start of the VM execution (i.e. when the VCPUs are created) do it as part of the DeviceManager creation. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-09-10 20:04:00 +02:00
Rob Bradford	8f37dec498	vmm: "close" the SIGWINCH signal handler Rather than sending a signal to the signal handler used for handling SIGWINCH calls, instead use the crate provided termination method. This also unregisters the signal handler which also means that there won't be a leaked signal handler remaining. This leaked signal handler is what was causing a failure to clean up the thread on subsequent requests, breaking two reboots in a row. Fixes: #252 Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-09-09 15:42:26 +02:00
Rob Bradford	d2db34edf2	vmm: Hide underlying console setup from VM Refactor the underlying console details into the DeviceManager and abstract away. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-09-06 09:26:37 -07:00
Rob Bradford	d089ee4e25	vmm: Move ownership of the exit/reset EventFd to Vm structure It makes more sense there as it is used by more than just the DeviceManager. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-09-06 09:26:37 -07:00
Rob Bradford	2f4de81175	vmm: Access ioapic/io_bus/mmio_bus from DeviceManager via accessor This paves the way for introducing a trait for the DeviceManager. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-09-06 09:26:37 -07:00
Rob Bradford	9ac967e3d8	vmm: Split DeviceManager into its own file Refactor out DeviceManager into its own file. This is part of a bigger effort to reduce complexity in the vm.rs file but will also allow future separation to allow making PCI support optional. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-09-06 09:26:37 -07:00
Rob Bradford	5dd675710b	vmm: Call munmap() on regions that have been mmap()ed For virtio-fs and virtio-pmem regions of memory are manually mapped into the address space of the VMM. In order to cleanly reboot we need to unmap those regions. Fixes: #223 Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-09-05 10:38:14 +01:00
Rob Bradford	f59cad15a3	vmm: Cleanup signal_handler thread used for console SIGWINCH handling Do this by using the same mechanism as the vCPU threads by sending a signal to the thread. As this is the same mechanism reuse the same code and rename the "vcpus" member to "threads" to indicate this represents both the vCPU threads and also the signal handler thread. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-09-04 09:21:01 -07:00
Rob Bradford	9e764fc091	vmm, arch, devices: Put ACPI support behind a default feature Put the ACPI support behind a feature and ensure that the code compiles without that feature by adding an extra build to Travis. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-09-03 19:18:49 +02:00
Rob Bradford	bb2e7bb942	vmm: Shutdown vCPU threads As part of the cleanup of the VM shutdown all the vCPU threads. This is achieved by toggling a shared atomic boolean variable which is checked in the vCPU loop. To trigger the vCPU code to look at this boolean it is necessary to send a signal to the vCPU which will interrupt the running KVM_RUN ioctl. Fixes: #229 Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-09-03 19:18:49 +02:00
Rob Bradford	ad128bf72d	vmm: Give vCPU and signal handler thread useful names Sadly only the first few characters of the thread name are preserved so use a shorter name for the vCPU thread for now. Also give the signal handling thread a name. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-09-03 19:18:49 +02:00
Rob Bradford	614eb68f16	vm: Make triple-fault and i8042 reset reboot the VM Now we have ACPI shutdown we should reboot on these reset triggers. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-09-03 19:18:49 +02:00
Rob Bradford	5a187ee2c2	x86_64/devices: acpi: Add support for ACPI shutdown & reboot Add an I/O port "device" to handle requests from the kernel to shutdown or trigger a reboot, borrowing an I/O used for ACPI on the Q35 platform. The details of this I/O port are included in the FADT (SLEEP_STATUS_REG/SLEEP_CONTROL_REG/RESET_REG) with the details of the value to write in the FADT for reset (RESET_VALUE) and in the DSDT for shutdown (S5 -> 0x05) Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-09-03 19:18:49 +02:00
Rob Bradford	ae66a44d26	vmm: Support both reset and shutdown Add a 2nd EventFd to the VM to control resetting (rebooting) the VM; this supplements the EventFd used for managing shutdown of the VM. The default behaviour on i8042 or triple-fault based reset is currently unchanged i.e. it will trigger a shutdown. In order to support restarting the VM it was necessary to make the start() function take a reference to the config. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-09-03 19:18:49 +02:00
Rob Bradford	2610f4353d	arch: acpi: Only add ACPI COM1 device if serial is turned on Only add the ACPI PNP device for the COM1 serial port if it is not turned off with "--serial off" Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-09-03 19:18:49 +02:00
Rob Bradford	451502b50b	vm: If a VCPU thread errors out then exit the hypervisor Currently when the VCPU thread exits on an error the VMM continues to run with no way of shutting down the main thread. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-09-03 19:18:49 +02:00
Sebastien Boeuf	b7d3ad9063	vm-virtio: fs: Factorize vhost-user setup This patch factorizes the existing virtio-fs code by relying onto the common code part of the vhost_user module in the vm-virtio crate. In details, it factorizes the vhost-user setup, and reuses the error types defined by the module instead of defining its own types. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-08-31 17:33:17 +01:00
Sebastien Boeuf	56cad00f2e	vm-virtio: Move fs.rs to vhost_user module vhost-user-net introduced a new module vhost_user inside the vm-virtio crate. Because virtio-fs is actually vhost-user-fs, it belongs to this new module and needs to be moved there. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-08-31 17:33:17 +01:00
Cathy Zhang	584a2cccee	vmm: Add vhost-user-net support Update vm configuration and device initial process to add vhost-user-net support. Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>	2019-08-30 15:00:26 +01:00
Cathy Zhang	51306555e7	vmm: Add hugetlbfs handling support The current directory handling process, which opens a tempfile through OpenOptions with custom_flags(O_TMPFILE), works for tmpfs but not for hugetlbfs. Add a new directory handling process which works for both tmpfs and hugetlbfs. Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>	2019-08-30 15:00:26 +01:00
Sebastien Boeuf	0b8856d148	vmm: Add RwLock to the GuestMemoryMmap Following the refactoring of the code allowing multiple threads to access the same instance of the guest memory, this patch goes one step further by adding RwLock to it. This anticipates the future need for being able to modify the content of the guest memory at runtime. The reasons for adding regions to an existing guest memory could be: - Add virtio-pmem and virtio-fs regions after the guest memory was created. - Support future hotplug of devices, memory, or anything that would require more memory at runtime. Because most of the time, the lock will be taken as read only, using RwLock instead of Mutex is the right approach. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-08-22 08:24:15 +01:00
Sebastien Boeuf	ec0b5567c8	vmm: Share the guest memory instead of cloning it The VMM guest memory was cloned (copied) everywhere the code needed to have ownership of it. In order to clean the code, and in anticipation for future support of modifying this guest memory instance at runtime, it is important that every part of the code share the same instance. Because VirtioDevice implementations need to have access to it from different threads, Arc must be used in this case. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-08-22 08:24:15 +01:00
Sebastien Boeuf	658c076eb2	linters: Fix clippy issues Latest clippy version complains about our existing code for the following reasons: - trait objects without an explicit `dyn` are deprecated - `...` range patterns are deprecated - lint `clippy::const_static_lifetime` has been renamed to `clippy::redundant_static_lifetimes` - unnecessary `unsafe` block - unneeded return statement All these issues have been fixed through this patch, and rustfmt has been run to cleanup potential formatting errors due to those changes. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-08-15 09:10:04 -07:00
Samuel Ortiz	c52e276a5c	vmm: Log debug ioport timestamps We timestamp the VM creation time, and log the elapsed time between that instant and the debug ioport events. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-08-15 16:06:54 +02:00
Samuel Ortiz	48a9300667	vmm: Log 0x80 IO port writes The 0x80 IO port is typically used for BIOS debugging and testing on bare metal x86 platforms. We use that port and its dedicated 16 debug codes to time and track the guest boot process. Fixes #63 Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-08-15 16:06:54 +02:00
Sebastien Boeuf	3c29c47783	vmm: Create shared memory region for virtio-fs When the cache_size parameter from virtio-fs device is not empty, the VMM creates a dedicated memory region where the shared files will be memory mapped by the virtio-fs device. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-08-13 13:57:53 +02:00
fazlamehrab	df5058ec0a	vm-virtio: Implement console size config feature One of the features of the virtio console device is that its size can be configured and updated. Our first iteration of the console device implementation lacks this feature. As a result, it had a default fixed size which could not be changed. This commit implements the console config feature and lets us change the console size from the vmm side. During the activation of the device, vmm reads the current terminal size, sets the console configuration accordingly, and lets the driver know about this configuration by sending an interrupt. Later, if someone changes the terminal size, the vmm detects the corresponding event, updates the configuration, and sends an interrupt as before. As a result, the console device driver, in the guest, updates the console size. Signed-off-by: A K M Fazla Mehrab <fazla.mehrab.akm@intel.com>	2019-08-09 13:55:43 -07:00
Rob Bradford	d9a355f85a	vmm: Add new "null" serial/console output mode Poor performance was observed when booting kernels with "console=ttyS0" and the serial port disabled. This change introduces a "null" console output mode and makes it the default for the serial console. In this case the serial port is advertised as per other output modes but there is no input and any output is dropped. Fixes: #163 Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-08-09 09:04:48 -07:00
Rob Bradford	f910476dd7	vmm: Only send stdin input to serial/console if it can handle it Do not send the contents of stdin to the serial or console device if they're not in tty mode. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-08-09 09:04:48 -07:00
Rob Bradford	9caad7394d	build, misc: Bump vmm-sys-util dependency The structure of the vmm-sys-util crate has changed with lots of code moving to submodules. This change adjusts the use of the imported structs to reference the submodules. Fixes: #145 Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-08-02 07:42:20 -07:00
Sebastien Boeuf	1a484a82f9	vmm: Don't break from epoll loop on EINTR The existing code taking care of the epoll loop was too restrictive as it was propagating the error returned from the epoll_wait() syscall, no matter what the error was. This causes the epoll loop to be broken, leading to the VM termination. This patch enforces the parsing of the returned error and prevents the error propagation in case it is EINTR, which stands for Interrupted. In case the epoll loop is interrupted, it is appropriate to retry. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-08-02 08:37:34 +01:00
Sebastien Boeuf	532f6a96f3	vmm: Factorize VM related information into a structure In order to fix the clippy error complaining about the number of arguments passed to a function exceeding the maximum of 7 arguments, this patch factorizes those parameters into a more global one called VmInfo. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-08-02 08:35:16 +01:00
Sebastien Boeuf	c0756c429d	vmm: Increase memory slot from virtio-pmem Since virtio-pmem uses a KVM user memory region, it needs to increment the slot index in use to prevent any conflict with further VFIO allocations (used for mapping mappable memory BARs). Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-08-02 08:35:16 +01:00
Samuel Ortiz	fa41ddd94f	arch: Add a Reserved memory region to the memory hole We add a Reserved region type at the end of the memory hole to prevent 32-bit devices allocations to overlap with architectural address ranges like IOAPIC, TSS or APIC ones. Eventually we should remove that reserved range by allocating all the architectural ranges before letting 32-bit devices use the memory hole. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-07-25 11:45:38 +01:00
Samuel Ortiz	299d887856	arch: Add SubRegion memory type We want to be able to differentiate between memory regions that must be managed separately from the main address space (e.g. the 32-bit memory hole) and ones that are reserved (i.e. from which we don't want to allow the VMM to allocate address ranges). We are going to use a reserved memory region for restricting the 32-bit memory hole from expanding beyond the IOAPIC and TSS addresses. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-07-25 11:45:38 +01:00
Sebastien Boeuf	d92d797896	vfio: Update memory slot index to support multiple VFIO devices In order to correctly support multiple VFIO devices, we need to increment the memory slot index every time it is being used to set some user memory region through KVM. That's why the mem_slot parameter is made mutable. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-07-25 11:45:38 +01:00
Sebastien Boeuf	b9f677c46c	vmm: Fix the memory slot index The memory slot index provided to the DeviceManager was wrong since only the RAM memory regions are set as user memory regions to KVM. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-07-25 11:45:38 +01:00
Sebastien Boeuf	b5eab43aa5	vfio: Create a global KVM VFIO device for all VFIO devices KVM does not support multiple KVM VFIO devices to be created when trying to support multiple VFIO devices. This commit creates one global KVM VFIO device being shared with every VFIO device, which makes possible the support for passing several devices through the VM. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-07-25 11:45:38 +01:00
Samuel Ortiz	4d16ca8ae7	vmm: Support direct device assignment With the VFIO crate, we can now support directly assigned PCI devices into cloud-hypervisor guests. We support assigning multiple host devices, through the --device command line parameter. This parameter takes the host device sysfs path. Fixes: #60 Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-07-24 11:55:08 +02:00
Samuel Ortiz	4e48309660	vm: Factorize all virtio devices creation routines Our DeviceManager::new() routine is reaching north of 250 lines. For simplicity and readability's sake, extract all virtio devices creation code into their own routines. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-07-23 08:41:37 +01:00
fazlamehrab	24438e0390	vm-virtio: Enable the vmm support for virtio-console To use the implemented virtio console device, the users can select one of the three options ("off", "tty" or "file=/path/to/the/file") with the command line argument "--console". By default, the console is enabled as a device named "hvc0" (option: tty). When "off" option is used, the console device is not added to the VM configuration at all. Signed-off-by: A K M Fazla Mehrab <fazla.mehrab.akm@intel.com>	2019-07-22 23:08:56 +01:00
Sebastien Boeuf	f98a69f42e	vm-allocator: Introduce an MMIO hole address allocator With this new AddressAllocator as part of the SystemAllocator, the VMM can now decide with finer granularity where to place memory. By allocating the RAM and the hole into the MMIO address space, we ensure that no memory will be allocated by accident where the RAM or where the hole is. And by creating the new MMIO hole address space, we create a subset of the entire MMIO address space where we can place 32 bits BARs for example. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-07-22 09:51:16 -07:00
Samuel Ortiz	0a04a950a1	vm-allocator: Expand the IRQ allocation API to support GSI GSI (Global System Interrupt) is an extension of just a linear array of IRQs. It takes IOAPICs into account for example. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2019-07-22 09:51:16 -07:00
Chao Peng	96fb38a5aa	vm-allocator: Align address at allocation time There is alignment support for AddressAllocator but there are occasions where the alignment is known only when we call allocate(). One example is a PCI BAR, which is naturally aligned, meaning we have to align the base address to its size. Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>	2019-07-22 09:51:16 -07:00
Chao Peng	af7cd74e04	vm-allocator: Make port IO non optional This is only for allocating the port IO address range. If a platform does not have PIO devices at all, the address range will simply be unused. So, simplify the vm-allocator data structure by making both MMIO and PIO mandatory. Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>	2019-07-22 09:51:16 -07:00
Sebastien Boeuf	1268165040	pci: Allow for registering IO and Memory BAR This patch adds the support for both IO and Memory BARs by expecting the function allocate_bars() to identify the type of each BAR. Based on the type, register_mapping() inserts the address range on the appropriate bus (PIO or MMIO). Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-07-22 09:50:10 -07:00
Rob Bradford	cb81f8be5b	vmm: Make serial port controllable via command line Add a "--serial" command line that takes as input either "off", "tty" (default and current behaviour) and "file=/path/to/file". When "--serial off" is used the serial device is not added to the VM configuration at all. Integration tests added that check for interrupts present (or not) and that when sending to a file the file contains the expected serial output. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-07-11 12:17:58 +01:00
Sebastien Boeuf	d9ce29117e	vmm: Flag --disk should be optional Now that cloud-hypervisor VMM supports virtio-pmem, it can directly boot a VM from an image exposed as a persistent memory block device. That's why there is no need to force the --disk option as being mandatory. Fixes #90 Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-07-09 21:58:02 +02:00
Sebastien Boeuf	f0a76ad424	vmm: Add support for multiple virtio-net devices Until now, the VMM was only accepting a single instance of virtio-net device. This commit extends the virtio-net support by allowing several devices to be created for a single VM. Fixes #71 Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-07-09 18:55:30 +01:00
Sebastien Boeuf	a2947f9a9f	cli: Accept K,M,G suffixes for size parameters For every parameter dealing with a size as option, such as memory or virtio-pmem, the CLI can now parse sizes with the suffixes K, M or G. Fixes #70 Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-07-09 15:22:26 +01:00
Jing Liu	2bb0b22cc1	pci: Refine pci topology PciConfigIo is a legacy pci bus dispatcher, which manages all pci devices including a pci root bridge. However, it is unnecessary to design a complex hierarchy which redirects every access by PciRoot. Since pci root bridge is also a pci device instance, and only contains easy config space read/write, and PciConfigIo actually acts as a pci bus to dispatch resource based resolving when VMExit, we re-arrange to make the pci hierarchy clean. Signed-off-by: Jing Liu <jing2.liu@linux.intel.com>	2019-07-09 10:01:18 +02:00
Rob Bradford	49d6b495d5	vmm: Remove println! from debugging Remove println! left over from virtio-fs development. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2019-07-02 13:50:50 +02:00
Sebastien Boeuf	34e09923a5	vmm: Add support for multiple virtio-pmem devices Until now, the VMM was only accepting a single instance of virtio-pmem device. This commit extends the virtio-pmem support by allowing several devices to be created for a single VM. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-07-01 14:38:55 +01:00
Sebastien Boeuf	294c26bfb7	vmm: Add virtio-pmem support to cloud-hypervisor This patch plumbs the virtio-pmem device to the VMM. By adding a new command line option "--pmem", we can now expose some persistent memory to the guest OS, backed by the provided source. The point of having such support in cloud-hypervisor is to be able to share some memory between the host and the guest as DAXable. One interesting use case is to boot directly from an image passed through virtio-pmem, instead of going through virtio-blk. This can allow good performance while avoiding the guest cache, which would prevent the VM memory footprint from growing too much. Fixes #68 Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-07-01 14:38:55 +01:00
Sebastien Boeuf	1cb2378499	vmm: Add support for multiple virtio-fs devices Until now, the VMM was only accepting a single instance of a virtio-fs device. This commit extends the virtio-fs support by allowing several devices to be created for a single VM. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-06-27 21:46:00 +02:00
Sebastien Boeuf	53085c7ccc	memory: Allow memory to be backed by a file In the context of vhost-user, we need the guest RAM to be backed by a file in order to be accessed by an external process. This patch adds the new flag "file=" to the "--memory" option so that we can specify from the command line if the memory needs to be backed, and by which specific file. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-06-27 21:46:00 +02:00
Sebastien Boeuf	2ede30b6d3	vmm: Add virtio-fs support to the VMM The user can now share some files and directories with the guest by providing the corresponding vhost-user socket. The virtiofsd daemon should be started by the user before starting the VM. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-06-27 21:46:00 +02:00
Jing Liu	30266a41be	vm-memory usage: vm-memory latest codes rename MmapError to Error Signed-off-by: Jing Liu <jing2.liu@linux.intel.com>	2019-06-26 08:33:46 -07:00
Jing Liu	9da2343cb7	device: Improvement for BusDevice trait and PciDevice trait BusDevice includes two methods which are only for PCI devices, and which should instead be members of the PciDevice trait for a cleaner high-level API. Signed-off-by: Jing Liu <jing2.liu@linux.intel.com>	2019-06-25 06:17:30 -07:00
Sebastien Boeuf	5e803ab18f	vmm: Integrate userspace IOAPIC The previous commit introduced a userspace implementation of an IOAPIC and this commit aims to plumb it into the cloud-hypervisor VMM. Here is the list of new things brought by this patch: - Update the rust-vmm/kvm-ioctls dependency to benefit from latest patches including the support for split irqchip, and the vector being returned when a VM exit is caused by an EOI. - Enable the split irqchip (which means no IOAPIC or PIC is emulated in kernel). This is done conditionally based on the support of the TSC_DEADLINE_TIMER from both KVM and the underlying CPU. The dependency on TSC_DEADLINE_TIMER is related to KVM which does not support creating the in kernel PIT if it has a split irqchip. - Rely on callbacks to handle the following use cases: - in kernel IOAPIC + serial IRQ (pin based) - in kernel IOAPIC + virtio-pci MSI-X - in kernel IOAPIC + virtio-pci IRQ (pin based) - userspace IOAPIC + serial IRQ (pin based) - userspace IOAPIC + virtio-pci MSI-X - userspace IOAPIC + virtio-pci IRQ (pin based) Fixes #13 Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-06-21 10:09:34 +02:00
Sebastien Boeuf	c8c4a4d444	devices: Create Interrupt trait to abstract interrupt delivery This commit anticipates the future need to support both in kernel and userspace IOAPIC. The way to signal an interrupt from the serial device will vary depending on the use case, but this should be independent from the serial implementation itself. That's why this patch provides a generic trait for the serial device to call from, so that it can trigger interrupts independently from the IOAPIC type chosen (in kernel vs userspace). Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2019-06-21 10:09:34 +02:00
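
Commit a2947f9a9f above describes size parameters that accept K, M or G suffixes. The sketch below shows one way such parsing could look. It is not taken from the cloud-hypervisor tree: the function name `parse_size` is hypothetical, and it assumes binary multiples (1K = 1024 bytes) and uppercase suffixes only, which the commit message does not state.

```rust
/// Hypothetical helper (not the project's actual code): parse a size such
/// as "512", "64K", "512M" or "1G" into a number of bytes.
/// Assumption: suffixes are uppercase and denote powers of 1024.
fn parse_size(input: &str) -> Result<u64, String> {
    let s = input.trim();

    // Split off a trailing K/M/G suffix, if present, and pick the shift
    // that corresponds to the multiplier (2^10, 2^20 or 2^30).
    let (digits, shift) = match s.chars().last() {
        Some('K') => (&s[..s.len() - 1], 10),
        Some('M') => (&s[..s.len() - 1], 20),
        Some('G') => (&s[..s.len() - 1], 30),
        _ => (s, 0),
    };

    // Reject empty or non-numeric values instead of silently using zero.
    let value: u64 = digits
        .parse()
        .map_err(|e| format!("invalid size '{}': {}", input, e))?;

    // Guard against overflow when applying the multiplier.
    value
        .checked_mul(1u64 << shift)
        .ok_or_else(|| format!("size '{}' overflows u64", input))
}
```

Under these assumptions, `parse_size("512M")` yields `Ok(536870912)`, while a bare suffix such as `"K"` or a non-numeric value is reported as an error rather than treated as zero.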

394 Commits