cloud-hypervisor

mirror of https://github.com/cloud-hypervisor/cloud-hypervisor.git synced 2024-10-18 19:09:15 +00:00

Author	SHA1	Message	Date
Praveen K Paladugu	dc723171a7	vmm: cleanup legacy console device management Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>	2024-06-12 15:47:19 +00:00
Praveen K Paladugu	52eebaf6b2	vmm: refactor DeviceManager to use console_info While adding console devices, DeviceManager will now use the FDs in console_info instead of creating them. To reduce the size of this commit, I marked some variables as unused with '_' prefix. All those variables are cleaned up in next commit. Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>	2024-06-12 15:47:19 +00:00
Praveen K Paladugu	d784bf0c75	vmm: move listen_for_sigwinch_on_tty method Move listen_for_sigwinch_on_tty to sigwinch_listener.rs module. Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>	2024-06-12 15:47:19 +00:00
Praveen K Paladugu	cf6115a73c	vmm: Introduce console_devices module Introduce ConsoleInfo struct. This struct will be used to store FDs of console devices created in pre_create_console_devices and passed to vm_boot. Move set_raw_mode, create_pty methods to console_devices.rs to consolidate console management methods into a single module. Lastly, copy the logic to create and configure console devices into pre_create_console_devices method. Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>	2024-06-12 15:47:19 +00:00
Thomas Barrett	e7e856d8ac	vmm: add pci_segment mmio aperture configs When using multiple PCI segments, the 32-bit and 64-bit mmio aperture is split equally between each segment. Add an option to configure the 'weight'. For example, a PCI segment with a `mmio32_aperture_weight` of 2 will be allocated twice as much 32-bit mmio space as a normal PCI segment. Signed-off-by: Thomas Barrett <tbarrett@crusoeenergy.com>	2024-04-24 09:35:19 +00:00
Rob Bradford	10ab87d6a3	misc: Migrate away from versionize Replace with serde instead. Fixes: #6370 Signed-off-by: Rob Bradford <rbradford@rivosinc.com>	2024-04-22 17:10:55 +00:00
Andrew Carp	045964deee	virtio-devices: Map mmio over virtio-iommu Add infrastructure to lookup the host address for mmio regions on external dma mapping requests. This specifically resolves vfio passthrough for virtio-iommu, allowing for nested virtualization to pass external devices through. Fixes #6110 Signed-off-by: Andrew Carp <acarp@crusoeenergy.com>	2024-04-01 09:16:30 +00:00
Andrew Carp	a5e2460d95	virtio-devices: Move VfioDmaMapping to be in the pci crate VfioUserDmaMapping is already in the pci crate, this moves VfioDmaMapping to match the behavior. This is a necessary change to allow the VfioDmaMapping trait to have access to MmioRegion memory without creating a circular dependency. The VfioDmaMapping trait needs to have access to mmio regions to map external devices over mmio (a follow-up commit). Signed-off-by: Andrew Carp <acarp@crusoeenergy.com>	2024-04-01 09:16:30 +00:00
Rob Bradford	521a0d1ade	vmm: Fix clippy warnings for use of .clone() warning: assigning the result of `Clone::clone()` may be inefficient --> vmm/src/device_manager.rs:4188:17 \| 4188 \| id = child_id.clone(); \| ^^^^^^^^^^^^^^^^^^^^^ help: use `clone_from()`: `id.clone_from(child_id)` \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#assigning_clones = note: `#[warn(clippy::assigning_clones)]` on by default Signed-off-by: Rob Bradford <rbradford@rivosinc.com>	2024-03-19 18:36:22 +00:00
Bo Chen	1363891df6	vmm: Avoid deadlock from waiting on paused device worker threads A deadlock can happen from the destination VM of live upgrade or migration due to waiting on paused device worker threads. For example, when a serialization error happens after the `DeviceManager` struct is restored (where all virtio device worker threads are spawned but in paused/parked state), a deadlock will happen from `DeviceManager::drop()`, as it blocks for waiting worker threads to join. This patch ensures that we wake up all device (mostly virtio) worker threads before we block for them to join. Signed-off-by: Bo Chen <chen.bo@intel.com>	2024-03-14 02:07:52 +00:00
Thomas Barrett	b750c332aa	vmm: add NVIDIA GPUDirect P2P support On platforms where PCIe P2P is supported, inject a PCI capability into NVIDIA GPU to indicate support. Signed-off-by: Thomas Barrett <tbarrett@crusoeenergy.com>	2024-02-29 09:26:29 +00:00
acarp	035c4b20fb	block: Set an option to pin virtio block threads to host cpus Currently the only way to set the affinity for virtio block threads is to boot the VM, search for the tid of each of the virtio block threads, then set the affinity manually. This commit adds an option to pin virtio block queues to specific host cpus (similar to pinning vcpus to host cpus). A queue_affinity option has been added to the disk flag in the cli to specify a mapping of queue indices to host cpus. Signed-off-by: acarp <acarp@crusoeenergy.com>	2024-02-13 09:05:57 +00:00
Rob Bradford	e70bf59809	vmm: Directly clone console resize pipe Beta clippy fix: warning: this call to `as_ref.map(...)` does nothing --> vmm/src/device_manager.rs:1234:9 \| 1234 \| self.console_resize_pipe.as_ref().map(Arc::clone) \| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: try: `self.console_resize_pipe.clone()` \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#useless_asref = note: `#[warn(clippy::useless_asref)]` on by default Signed-off-by: Rob Bradford <rbradford@rivosinc.com>	2024-02-07 09:25:40 +00:00
Philipp Schuster	e50a641126	devices: add debug-console device This commit adds the debug-console (or debugcon) device to CHV. It is a very simple device on I/O port 0xe9 supported by QEMU and BOCHS. It is meant for printing information as easy as possible, without any necessary configuration from the guest at all. It is primarily interesting to OS/kernel and firmware developers as they can produce output as soon as the guest starts without any configuration of a serial device or similar. Furthermore, a kernel hacker might use this device for information of type B whereas information of type A are printed to the serial device. This device is not used by default by Linux, Windows, or any other "real" OS, but only by toy kernels and during firmware development. In the CLI, it can be configured similar to --console or --serial with the --debug-console parameter. Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>	2024-01-25 10:25:14 -08:00
Thomas Barrett	c297d8d796	vmm: use RateLimiterGroup for virtio-blk devices Add a 'rate_limit_groups' field to VmConfig that defines a set of named RateLimiterGroups. When the 'rate_limit_group' field of DiskConfig is defined, all virtio-blk queues will be rate-limited by a shared RateLimiterGroup. The lifecycle of all RateLimiterGroups is tied to the Vm. A RateLimiterGroup may exist even if no Disks are configured to use the RateLimiterGroup. Disks may be hot-added or hot-removed from the RateLimiterGroup. When the 'rate_limiter' field of DiskConfig is defined, we construct an anonymous RateLimiterGroup whose lifecycle is tied to the Disk. This is primarily done for api backwards compatibility. Importantly, the behavior is not the same! This implementation rate_limits the aggregate bandwidth / iops of an individual disk rather than the bandwidth / iops of an individual queue of a disk. When neither the 'rate_limit_group' nor the 'rate_limiter' field of DiskConfig is defined, the Disk is not rate-limited. Signed-off-by: Thomas Barrett <tbarrett@crusoeenergy.com>	2024-01-03 10:21:06 -08:00
Markus Sütter	0e9513f2b7	vmm: Allow IP configuration on named TAP interfaces This commit changes existing behavior of named TAP interfaces. When booting a VM with configuration for a named TAP interface, cloud-hypervisor will create the interface and apply a given IP configuration to that interface. If the named interface already exists on the system, the configuration is NOT overwritten. Setting the ip and netmask fields in a tap interface configuration for a named tap interface now works by handing this configuration to the virtio_devices::Net object when it is created with a name. This commit also touches net_util to make sure that the ip configuration of existing TAP interfaces is not modified with ip or netmask handed to open_tap. Signed-off-by: Markus Sütter <markus.suetter@secunet.com>	2023-12-05 08:59:04 -08:00
Thomas Barrett	45b01d592a	vmm: assign each pci segment 32-bit mmio allocator Signed-off-by: Thomas Barrett <tbarrett@crusoeenergy.com>	2023-11-20 15:33:50 -08:00
Thomas Barrett	bae13c5c56	block: add aio disk backend Signed-off-by: Thomas Barrett <tbarrett@crusoeenergy.com>	2023-10-25 10:19:23 -07:00
Thomas Barrett	3029fbeafd	vmm: Allow assignment of PCI segments to NUMA node Signed-off-by: Thomas Barrett <tbarrett@crusoeenergy.com>	2023-10-18 11:18:15 -07:00
Praveen K Paladugu	6d1077fc3c	vmm: Unix socket backend for serial port Cloud-Hypervisor takes a path for Unix socket, where it will listen on. Users can connect to the other end of the socket and access serial port on the guest. "--serial socket=/path/to/socket" is the cmdline option to pass to cloud-hypervisor. Users can use socat like below to access guest's serial port once the guest starts to boot: socat -,crnl UNIX-CONNECT:/path/to/socket Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>	2023-10-05 15:26:29 +01:00
Thomas Barrett	c4e8e653ac	block: Add support for user specified ID_SERIAL Signed-off-by: Thomas Barrett <tbarrett@crusoeenergy.com>	2023-09-11 12:50:41 +01:00
Philipp Schuster	7bf0cc1ed5	misc: Fix various spelling errors using typos This fixes all typos found by the typos utility with respect to the config file. Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>	2023-09-09 10:46:21 +01:00
Rob Bradford	4548de194d	build: Bump acpi_tables version Fix newly added deprecation for misspelling of cacheable. Signed-off-by: Rob Bradford <rbradford@rivosinc.com>	2023-09-07 13:58:33 +01:00
Rob Bradford	8d072fef15	vmm: device_manager: Remove unnecessary mut from reference warning: this argument is a mutable reference, but not used mutably --> vmm/src/device_manager.rs:1908:35 \| 1908 \| fn set_raw_mode(&mut self, f: &mut dyn AsRawFd) -> vmm_sys_util::errno::Result<()> { \| ^^^^^^^^^^^^^^^^ help: consider changing to: `&dyn AsRawFd` \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_pass_by_ref_mut = note: `#[warn(clippy::needless_pass_by_ref_mut)]` on by default Signed-off-by: Rob Bradford <rbradford@rivosinc.com>	2023-08-22 12:01:54 +01:00
Rob Bradford	a00d29867c	fuzz, vmm: Avoid infinite loop in CMOS fuzzer With the addition of the spinning waiting for the exit event to be received in the CMOS device a regression was introduced into the CMOS fuzzer. Since there is nothing to receive the event in the fuzzer and there is nothing to update the bit that the device is looping on, an infinite loop was introduced. Use an Option<> type so that when running the device in the fuzzer no Arc<AtomicBool> is provided effectively disabling the spinning logic. Fixes: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=61165 Signed-off-by: Rob Bradford <rbradford@rivosinc.com>	2023-08-07 08:04:55 +08:00
Rob Bradford	06dc708515	vmm: Only return from reset driven I/O once event received The reset system is asynchronous with an I/O event (PIO or MMIO) for ACPI/i8042/CMOS triggering a write to the reset_evt event handler. The VMM thread will pick up this event on the VMM main loop and then trigger a shutdown in the CpuManager. However since there is some delay between the CPU threads being marked to be killed (through the CpuManager::cpus_kill_signalled bool) it is possible for the guest vCPU that triggered the exit to be re-entered when the vCPU KVM_RUN is called after the I/O exit is completed. This is undesirable and in particular the Linux kernel will attempt to jump to real mode after a CMOS based exit - this is unsupported in nested KVM on AMD on Azure and will trigger an error in KVM_RUN. Solve this problem by spinning in the device that has triggered the reset until the vcpus_kill_signalled boolean has been updated indicating that the VMM thread has received the event and called CpuManager::shutdown(). In particular if this bool is set then the vCPU threads will not re-enter the guest. Signed-off-by: Rob Bradford <rbradford@rivosinc.com>	2023-08-04 09:57:25 +08:00
Yu Li	447cad3861	block: merge qcow, vhdx and block_util into block crate This commit merges crates `qcow`, `vhdx` and `block_util` into the crate `block`, which can allow `qcow` to use functions from `block_util` without introducing a circular crate dependency. This commit is based on crosvm implementation: `f2eecc4152` Signed-off-by: Yu Li <liyu.yukiteru@bytedance.com>	2023-07-19 13:52:43 +01:00
Manish Goregaokar	6fdba7ca11	build: Allow disabling io_uring This gives users the chance to reduce the number of dependencies included, which is generally good practice and also reduces code size. Furthermore, `io_uring` specifically is a strong contender for something one may wish to disable due to the syscall API's many security issues[1] [1]: https://security.googleblog.com/2023/06/learnings-from-kctf-vrps-42-linux.html Signed-off-by: Manish Goregaokar <manishsmail@gmail.com>	2023-07-11 06:19:30 -07:00
Yi Wang	d99c0c0d1d	devices: pvpanic: add method for DeviceManager Add method for DeviceManager to invoke. Signed-off-by: Yi Wang <foxywang@tencent.com> Signed-off-by: Rob Bradford <rbradford@rivosinc.com>	2023-07-06 11:14:54 +01:00
Rob Bradford	f485922b78	build: Bump acpi_tables from `cb5f06c` to `05a6091` Signed-off-by: Rob Bradford <rbradford@rivosinc.com>	2023-06-08 17:28:02 +00:00
Wei Liu	ba1e89139a	pci: aml: support up to 256 PCI segments Originally the AML only accepted one hex number for PCI segment numbering. Change it to accept two numbers. That makes it possible to add up to 256 PCI segments. Signed-off-by: Wei Liu <liuwe@microsoft.com>	2023-05-02 09:34:05 +01:00
Alyssa Ross	c90a0ffff6	vmm: reset to the original termios Previously, we used two different functions for configuring ttys. vmm_sys_util::terminal::Terminal::set_raw_mode() was used to configure stdio ttys, and cfmakeraw() was used to configure ptys created by cloud-hypervisor. When I centralized the stdio tty cleanup, I also switched to using cfmakeraw() everywhere, to avoid duplication. cfmakeraw sets the OPOST flag, but when we later reset the ttys, we used vmm_sys_util::terminal::Terminal::set_canon_mode(), which does not unset this flag. This meant that the terminal was getting mostly, but not fully, reset. To fix this without depending on the implementation of cfmakeraw(), let's just store the original termios for stdio terminals, and restore them to exactly the state we found them in when cloud-hypervisor exits. Fixes: `b6feae0a` ("vmm: only touch the tty flags if it's being used") Signed-off-by: Alyssa Ross <hi@alyssa.is>	2023-05-02 09:33:53 +01:00
Bo Chen	a9623c7a28	vmm: Add valid FDs for TAP devices to 'VmConfig::preserved_fds' In this way, valid FDs for TAP devices will be closed when the holding VmConfig instance is destroyed. Signed-off-by: Bo Chen <chen.bo@intel.com>	2023-04-17 16:33:29 +01:00
Alyssa Ross	b6feae0ace	vmm: only touch the tty flags if it's being used When neither serial nor console are connected to the tty, cloud-hypervisor shouldn't touch the tty at all. One way in which this is annoying is that if I am running cloud-hypervisor without it using my terminal, I expect to be able to suspend it with ^Z like any other process, but that doesn't work if it's put the terminal into raw mode. Instead of putting the tty into raw mode when a VM is created or restored, do it when a serial or console device is created. Since we now know it can't be put into raw mode until the Vm object is created, we can move setting it back to canon mode into the drop handler for that object, which should always be run in normal operation. We still also put the tty into canon mode in the SIGTERM / SIGINT handler, but check whether the tty was actually used, rather than whether stdin is a tty. This requires passing on_tty around as an atomic boolean. I explored more of an abstraction over the tty — having an object that encapsulated stdout and put the tty into raw mode when initialized and into canon mode when dropped — but it wasn't practical, mostly due to the special requirements of the signal handler. I also investigated whether the SIGWINCH listener process could be used here, which I think would have worked but I'm hesitant to involve it in serial handling as well as console handling. There's no longer a check for whether the file descriptor is a tty before setting it into canon mode — it's redundant, because if it's not a tty it just won't respond to the ioctl. Tested by shutting down through the API, SIGTERM, and an error injected after setting raw mode. Signed-off-by: Alyssa Ross <hi@alyssa.is>	2023-04-17 16:33:17 +01:00
Alyssa Ross	38a1b45783	vmm: use the SIGWINCH listener for TTYs too Previously, we were only using it for PTYs, because for PTYs there's no alternative. But since we have to have it for PTYs anyway, if we also use it for TTYs, we can eliminate all of the code that handled SIGWINCH for TTYs. Signed-off-by: Alyssa Ross <hi@alyssa.is>	2023-04-05 11:23:06 +01:00
Alyssa Ross	e9841db486	vmm: don't ignore errors from SIGWINCH listener Now that the SIGWINCH listener has fallbacks for older kernels, we don't expect it to routinely fail, so if there's an error setting it up, we want to know about it. Signed-off-by: Alyssa Ross <hi@alyssa.is>	2023-04-05 11:23:06 +01:00
Alyssa Ross	505f4dfa53	vmm: close all unused fds in sigwinch listener The PTY main file descriptor had to be introduced as a parameter to start_sigwinch_listener, so that it could be closed in the child. Really the SIGWINCH listener process should not have any file descriptors open, except for the ones it needs to function, so let's make it more robust by having it close all other file descriptors. For recent kernels, we can do this very conveniently with close_range(2), but for older kernels, we have to fall back to closing open file descriptors one at a time. Signed-off-by: Alyssa Ross <hi@alyssa.is>	2023-04-05 11:23:06 +01:00
Rob Bradford	73c4156775	vmm, devices: Update to latest acpi_tables crate API Significant API changes have occurred, most significantly the switch to an approach which does not require vm-memory and can run no_std. Signed-off-by: Rob Bradford <rbradford@rivosinc.com>	2023-03-03 13:08:36 +00:00
Yong He	76d6d28f3e	vmm: do not start signal thread to resize console if no need Now cloud hypervisor will start signal thread to catch SIGWINCH signal, cloud hypervisor then will resize the guest console via vconsole. This patch skip starting signal thread when there is no need to resize guest console, such as console is not configured. Signed-off-by: Yong He <alexyonghe@tencent.com>	2023-02-28 09:40:07 -08:00
Yong He	3494080e2f	vmm: add configuration for network offloading features Add new configuration for offloading features, including Checksum/TSO/UFO, and set these offloading features as enabled by default. Fixes: #4792. Signed-off-by: Yong He <alexyonghe@tencent.com>	2023-01-12 09:05:45 +00:00
Rob Bradford	5e52729453	misc: Automatically fix cargo clippy issues added in 1.65 (stable) Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-12-14 14:27:19 +00:00
Sebastien Boeuf	3931b99d4e	vm-migration: Introduce new constructor for Snapshot This simplifies the Snapshot creation as we expect a SnapshotData to be provided most of the time. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-12-09 10:26:06 +01:00
Sebastien Boeuf	4ae6b595d7	vm-migration: Rename add_data_section() into add_data() Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-12-09 10:26:06 +01:00
Sebastien Boeuf	748018ace3	vm-migration: Don't store the id as part of Snapshot structure The information about the identifier related to a Snapshot is only relevant from the BTreeMap perspective, which is why we can get rid of the duplicated identifier in every Snapshot structure. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-12-09 10:26:06 +01:00
Sebastien Boeuf	4517b76a23	vm-migration: Rename SnapshotDataSection into SnapshotData Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-12-09 10:26:06 +01:00
Sebastien Boeuf	5b3bcfa233	vm-migration: Snapshot should have a unique SnapshotDataSection There's no reason to carry a HashMap of SnapshotDataSection per Snapshot. And given we now provide at most one SnapshotDataSection per Snapshot, there's no need to keep the id part of the SnapshotDataSection structure. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-12-09 10:26:06 +01:00
Rob Bradford	b3e3a5fdd7	vmm: Fix clippy on musl toolchains The datatype used for the ioctl() C library call is different between it and the glibc toolchains. The easiest solution is to have the compiler type cast to type of the parameter. Signed-off-by: Rob Bradford <robert.bradford@intel.com>	2022-12-07 17:50:48 +00:00
Michael Zhao	b173f6f654	vmm,devices: Change Gic snapshot and restore path The snapshot and restore of AArch64 Gic was done in Vm. Now it is moved to DeviceManager. The benefit is that the restore can be done while the Gic is created in DeviceManager. However, moving the state data from the Vm snapshot to the DeviceManager snapshot breaks the compatibility of migration from older versions. Signed-off-by: Michael Zhao <michael.zhao@arm.com>	2022-12-01 17:07:25 +01:00
Sebastien Boeuf	a6959a7469	vmm: Move DeviceManager to new restore design Based on all the work that has already been merged, it is now possible to fully move DeviceManager out of the previous restore model, meaning there's no need for a dedicated restore() function to be implemented there. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-12-01 13:46:31 +01:00
Sebastien Boeuf	b62a40efae	virtio-devices, vmm: Always restore virtio devices in paused state Following the new restore design, it is not appropriate to set every virtio device threads into a paused state after they've been started. This is why we remove the line of code pausing the devices only after they've been restored, and replace it with a small patch in every virtio device implementation. When a virtio device is created as part of a restored VM, the associated "paused" boolean is set to true. This ensures the corresponding thread will be directly parked when being started, avoiding the thread to be in a different state than the one it was on the source VM during the snapshot. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2022-12-01 09:27:00 +01:00
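
Note on e7e856d8ac (pci_segment mmio aperture configs): the commit message describes splitting the MMIO aperture between PCI segments in proportion to a per-segment weight. The following is only a rough sketch of that arithmetic, not code from the cloud-hypervisor tree. The function name `split_aperture` and the sample sizes are hypothetical, and the sketch ignores alignment and any reserved ranges a real allocator would have to respect.

```rust
/// Hypothetical helper: divide `total` bytes of MMIO aperture across PCI
/// segments in proportion to their weights. Any remainder left over from
/// the integer division is simply left unallocated in this sketch.
fn split_aperture(total: u64, weights: &[u32]) -> Vec<u64> {
    let weight_sum: u64 = weights.iter().map(|&w| u64::from(w)).sum();
    if weight_sum == 0 {
        // No weights to share by: give every segment nothing.
        return vec![0; weights.len()];
    }
    // Size of one "share"; a segment with weight N receives N shares.
    let share = total / weight_sum;
    weights.iter().map(|&w| share * u64::from(w)).collect()
}

fn main() {
    // Two segments: the default weight of 1 and a weight of 2.
    let sizes = split_aperture(0x3000_0000, &[1, 2]);
    for (segment, size) in sizes.iter().enumerate() {
        println!("segment {segment}: {size:#x} bytes");
    }
}
```

With equal weights every segment receives the same share, which corresponds to the equal split the commit message gives as the previous behavior.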


687 Commits