cloud-hypervisor

mirror of https://github.com/cloud-hypervisor/cloud-hypervisor.git synced 2025-01-26 14:35:20 +00:00

Author	SHA1	Message	Date
Yu Li	e139cdfd69	arch: create memory mapping by the actual memory info The original code did not consider that the previous memory region might not be full and always set it to the maximum size. This commit fixes this problem by creating memory mappings based on the actual memory details in both E820 on x86_64 and fdt on aarch64. Fixes: #5463 Signed-off-by: Yu Li <liyu.yukiteru@bytedance.com>	2023-07-05 09:36:08 -07:00
Yu Li	541de8b757	logger: use `write` with `\r\n` instead of `writeln` The device manager will set tty or pty to raw mode, all the `\n` will be LF without CR, which makes the output difficult to read. This commit solves it by using `write` with `\r\n` instead of `writeln`, which can print CR and LF explicitly. Signed-off-by: Yu Li <liyu.yukiteru@bytedance.com>	2023-07-05 09:36:08 -07:00
Yu Li	184dac70a0	vmm: use `unwrap_or` instead of `match` for `prefault` Signed-off-by: Yu Li <liyu.yukiteru@bytedance.com>	2023-07-05 09:36:08 -07:00
Jianyong Wu	022b489e7b	arch: x86_64: Populate the APIC Id Program the APIC ID (CPUID leaf 0x1 EBX) with the CPU id. This resolves an issue where the EDKII firmware expects the APIC ID to vary per-CPU. Fixes: #5475 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2023-07-05 09:36:08 -07:00
Alyssa Ross	399e2f9f7d	vmm, virtio-devices: allow mremap for consoles SerialBuffer uses VecDeque::extend, which calls realloc, with a maximum buffer size of 1 MiB. Starting at allocation sizes of 128 KiB, musl's mallocng allocator will use mremap for the allocation. Since this was not permitted by the seccomp rules, heavy write load could crash cloud-hypervisor with a seccomp failure. (Encountered using virtio-console, but I don't see any reason it wouldn't happen for the legacy serial device too.) Signed-off-by: Alyssa Ross <hi@alyssa.is>	2023-07-05 09:36:08 -07:00
Rafael Mendonca	22cc96494f	main: Fix error propagation if starting the VM fails Commit 21d40d7 ("main: reset tty if starting the VM fails") changed start_vmm() to join the vmm thread if an error happens after the vmm thread is started. The implementation put all the error-prone code that is run after the vmm is started in a closure, to be able to always join the vmm thread, regardless of any error happening. However, it missed propagating the error that might happen inside the closure back to the main function, after joining the vmm thread. For some cmd line options, the above issue inhibits proper error reporting when starting a VM with invalid commands, as many parameters are parsed after the vmm is started, thus if such parsing fails, no error will be reported back to the user. See: #5435 Fixes: 21d40d7 ("main: reset tty if starting the VM fails") Signed-off-by: Rafael Mendonca <rafaelmendsr@gmail.com>	2023-07-05 09:36:08 -07:00
Bo Chen	f98402ec15	vmm: Allocate guest memory address space before TDX initialization The refactoring on deferring address space allocation (#5169) broke TDX, as TDX initialization needs to access guest memory for encryption and measurement of guest pages. Signed-off-by: Bo Chen <chen.bo@intel.com>	2023-07-05 09:36:08 -07:00
Jianyong Wu	acc54ade7b	vfio: align memory region size and address to PAGE_SIZE In the current implementation, the memory region used by vfio is assumed to be 4k-aligned, which may cause errors when PAGE_SIZE is not 4k; on Arm, for example, it can be 16k or 64k. Remove this assumption and align the memory resources used by vfio to PAGE_SIZE so that vfio can run on a host with a 64k PAGE_SIZE. Fixes: #5292 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2023-07-05 09:36:08 -07:00
Alyssa Ross	0ebbb3f8a2	vmm: allow getdents64 in seccomp filter This is used on older kernels where close_range() is not available. Signed-off-by: Alyssa Ross <hi@alyssa.is> Fixes: 505f4dfa ("vmm: close all unused fds in sigwinch listener")	2023-07-05 09:36:08 -07:00
Anatol Belski	95511287ec	tests: Enable topology integration tests under mshv Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>	2023-07-05 09:36:08 -07:00
Anatol Belski	5a3af30e6a	seccomp: Add filter entry for MSHV_VP_REGISTER_INTERCEPT_RESULT Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>	2023-07-05 09:36:08 -07:00
Anatol Belski	034b48faf7	mshv: Pass topology explicitly while constructing cpuid Unlike KVM, there's no internal handling for topology under MSHV. Thus, if no topology has been passed during the CH launch, use the boot CPUs count to construct the topology struct. Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>	2023-07-05 09:36:08 -07:00
Anatol Belski	ba3e02ce86	hypervisor: mshv: Implement set_cpuid2 call Passing the CPUID leafs with the topology is integrated into the common mechanism of setting and patching CPUID in Cloud Hypervisor. All the CPUID values will be passed to the hypervisor through the register intercept call. Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>	2023-07-05 09:36:08 -07:00
Alyssa Ross	5492259af9	main: reset tty if starting the VM fails When I refactored this to centralise resetting the tty into DeviceManager::drop, I tested that the tty was reset if an error happened on the vmm thread, but not on the main thread. It turns out that if an error happened on the main thread, the process would just exit, so drop handlers on other threads wouldn't get run. To fix this, I've changed start_vmm() to write to the VMM's exit eventfd and then join the thread if an error happens after the vmm thread is started. Fixes: b6feae0a ("vmm: only touch the tty flags if it's being used") Signed-off-by: Alyssa Ross <hi@alyssa.is>	2023-07-05 09:36:08 -07:00
Alyssa Ross	cc1254d5e1	vmm: reset to the original termios Previously, we used two different functions for configuring ttys. vmm_sys_util::terminal::Terminal::set_raw_mode() was used to configure stdio ttys, and cfmakeraw() was used to configure ptys created by cloud-hypervisor. When I centralized the stdio tty cleanup, I also switched to using cfmakeraw() everywhere, to avoid duplication. cfmakeraw sets the OPOST flag, but when we later reset the ttys, we used vmm_sys_util::terminal::Terminal::set_canon_mode(), which does not unset this flag. This meant that the terminal was getting mostly, but not fully, reset. To fix this without depending on the implementation of cfmakeraw(), let's just store the original termios for stdio terminals, and restore them to exactly the state we found them in when cloud-hypervisor exits. Fixes: b6feae0a ("vmm: only touch the tty flags if it's being used") Signed-off-by: Alyssa Ross <hi@alyssa.is>	2023-07-05 09:36:08 -07:00
Rob Bradford	34bb3319d4	hypervisor, vmm: Limit max number of vCPUs to hypervisor maximum On KVM this is provided by an ioctl, on MSHV this is constant. Although there is a HV_MAXIMUM_PROCESSORS constant the MSHV ioctl API is limited to u8. Signed-off-by: Rob Bradford <rbradford@rivosinc.com>	2023-07-05 09:36:08 -07:00
Bo Chen	d530569ac2	build: Release v31.1 (bug fix release) Signed-off-by: Bo Chen <chen.bo@intel.com> v31.1	2023-04-18 16:41:02 -07:00
Bo Chen	75956e64ec	tests: Extend '_test_macvtap()' with reboot In this way, we can cover the scenario where a VM with hotplugged net device using FDs can work properly with reboot. Signed-off-by: Bo Chen <chen.bo@intel.com>	2023-04-18 10:47:33 -07:00
Bo Chen	ae646c2a00	vmm: Add valid FDs for TAP devices to 'VmConfig::preserved_fds' In this way, valid FDs for TAP devices will be closed when the holding VmConfig instance is destroyed. Signed-off-by: Bo Chen <chen.bo@intel.com>	2023-04-18 10:47:33 -07:00
Bo Chen	04d3e5bbf5	vmm: Add unit test for 'VmConfig::preserved_fds' Signed-off-by: Bo Chen <chen.bo@intel.com>	2023-04-18 10:47:33 -07:00
Bo Chen	cbe972659c	vmm: Implement Clone and Drop for VmConfig The custom 'clone' duplicates 'preserved_fds' so that the validation logic can be safely carried out on the clone of the VmConfig. The custom 'drop' ensures 'preserved_fds' are safely closed when the holding VmConfig instance is destroyed. Signed-off-by: Bo Chen <chen.bo@intel.com>	2023-04-18 10:47:33 -07:00
Bo Chen	1e4e03d110	vmm: config: Extend 'VmConfig' with 'preserved_fds' Preserved FDs are the ones that share the same life-time as its holding VmConfig instance, such as FDs for creating TAP devices. Preserved FDs will stay open as long as the holding VmConfig instance is valid, and will be closed when the holding VmConfig instance is destroyed. Signed-off-by: Bo Chen <chen.bo@intel.com>	2023-04-18 10:47:33 -07:00
Bo Chen	4876f7550d	Revert "vmm: config: Implement Clone for NetConfig" This reverts commit ea4a95c4f65b949e983b335abbb077c832596381. Signed-off-by: Bo Chen <chen.bo@intel.com>	2023-04-18 10:47:33 -07:00
Bo Chen	c0146e3ef1	Revert "vmm: config: Close FDs for TAP devices that are provided to VM" This reverts commit b14427540b1ec159719dae8f48e14f65a877cc50. Signed-off-by: Bo Chen <chen.bo@intel.com>	2023-04-18 10:47:33 -07:00
Bo Chen	77a205881b	Revert "vmm: config: Don't close reserved FDs from `NetConfig::drop()`" This reverts commit 0110fb4edc9ffa13d33f1c41c708f4bd844b0aee. Signed-off-by: Bo Chen <chen.bo@intel.com>	2023-04-18 10:47:33 -07:00
Bo Chen	321421c53e	Revert "vmm: config: Avoid closing invalid FDs from 'test_net_parsing()'" This reverts commit 0567def931b2d7e76be44393b44a1bb9239d7406. Signed-off-by: Bo Chen <chen.bo@intel.com>	2023-04-18 10:47:33 -07:00
Bo Chen	48a87e699d	Revert "vmm: config: Replace use of memfd_create with fd pointing to /dev/null" This reverts commit 46066d6ae1ef5dca1b80a4bf9650a4ffc1ff10f5. Signed-off-by: Bo Chen <chen.bo@intel.com>	2023-04-18 10:47:33 -07:00
Alyssa Ross	147a800d5d	vmm: only touch the tty flags if it's being used When neither serial nor console are connected to the tty, cloud-hypervisor shouldn't touch the tty at all. One way in which this is annoying is that if I am running cloud-hypervisor without it using my terminal, I expect to be able to suspend it with ^Z like any other process, but that doesn't work if it's put the terminal into raw mode. Instead of putting the tty into raw mode when a VM is created or restored, do it when a serial or console device is created. Since we now know it can't be put into raw mode until the Vm object is created, we can move setting it back to canon mode into the drop handler for that object, which should always be run in normal operation. We still also put the tty into canon mode in the SIGTERM / SIGINT handler, but check whether the tty was actually used, rather than whether stdin is a tty. This requires passing on_tty around as an atomic boolean. I explored more of an abstraction over the tty — having an object that encapsulated stdout and put the tty into raw mode when initialized and into canon mode when dropped — but it wasn't practical, mostly due to the special requirements of the signal handler. I also investigated whether the SIGWINCH listener process could be used here, which I think would have worked but I'm hesitant to involve it in serial handling as well as console handling. There's no longer a check for whether the file descriptor is a tty before setting it into canon mode — it's redundant, because if it's not a tty it just won't respond to the ioctl. Tested by shutting down through the API, SIGTERM, and an error injected after setting raw mode. Signed-off-by: Alyssa Ross <hi@alyssa.is>	2023-04-18 10:47:33 -07:00
Alyssa Ross	eaf8cbd47d	vmm: don't redundantly set the TTY to canon mode If the VM is shut down, either it's going to be started again, in which case we still want to be in raw mode, or the process is about to exit, in which case canon mode will be set at the end of main. Signed-off-by: Alyssa Ross <hi@alyssa.is>	2023-04-18 10:47:33 -07:00
Alyssa Ross	9d24e862eb	vmm: only use KVM_ARM_VCPU_PMU_V3 if available Having PMU in guests isn't critical, and not all hardware supports it (e.g. Apple Silicon). CpuManager::init_pmu already has a fallback for if PMU is not supported by the VCPU, but we weren't getting that far, because we would always try to initialise the VCPU with KVM_ARM_VCPU_PMU_V3, and then bail when it returned with EINVAL. Signed-off-by: Alyssa Ross <hi@alyssa.is>	2023-04-18 10:47:33 -07:00
Alyssa Ross	11324ac21c	virtio-devices: seccomp: add vhost-user syscalls Cloud Hypervisor's vhost-user implementation will reconnect if it gets disconnected from the backend. That means connections happen inside the vhost-user seccomp sandbox, so all syscalls used in reconnecting have to be allowed in that sandbox. clock_nanosleep is used by Glibc, and nanosleep is used by musl. Signed-off-by: Alyssa Ross <hi@alyssa.is>	2023-04-18 10:47:33 -07:00
Bo Chen	ce75865e2c	vmm: Ignore and warn TAP FDs sent via the HTTP request body Valid FDs can only be sent from another process via `SCM_RIGHTS`. Signed-off-by: Bo Chen <chen.bo@intel.com>	2023-04-18 10:47:33 -07:00
Michael Zhao	f3522e85fc	build: Release v31.0 Signed-off-by: Michael Zhao <michael.zhao@arm.com> v31.0	2023-04-06 07:05:11 -07:00
dependabot[bot]	d3ac6a85e0	build: Bump serde_with from 2.3.1 to 2.3.2 in /fuzz Bumps [serde_with](https://github.com/jonasbb/serde_with) from 2.3.1 to 2.3.2. - [Release notes](https://github.com/jonasbb/serde_with/releases) - [Commits](https://github.com/jonasbb/serde_with/compare/v2.3.1...v2.3.2) --- updated-dependencies: - dependency-name: serde_with dependency-type: indirect update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com>	2023-04-06 00:11:31 +00:00
Alyssa Ross	38a1b45783	vmm: use the SIGWINCH listener for TTYs too Previously, we were only using it for PTYs, because for PTYs there's no alternative. But since we have to have it for PTYs anyway, if we also use it for TTYs, we can eliminate all of the code that handled SIGWINCH for TTYs. Signed-off-by: Alyssa Ross <hi@alyssa.is>	2023-04-05 11:23:06 +01:00
Alyssa Ross	e9841db486	vmm: don't ignore errors from SIGWINCH listener Now that the SIGWINCH listener has fallbacks for older kernels, we don't expect it to routinely fail, so if there's an error setting it up, we want to know about it. Signed-off-by: Alyssa Ross <hi@alyssa.is>	2023-04-05 11:23:06 +01:00
Alyssa Ross	c1f555cde3	vmm: fall back if CLONE_CLEAR_SIGHAND unsupported This will allow the SIGWINCH listener to run on kernels older than 5.5, although on those kernels it will have to make 64 syscalls to reset all the signal handlers. Signed-off-by: Alyssa Ross <hi@alyssa.is>	2023-04-05 11:23:06 +01:00
Alyssa Ross	505f4dfa53	vmm: close all unused fds in sigwinch listener The PTY main file descriptor had to be introduced as a parameter to start_sigwinch_listener, so that it could be closed in the child. Really the SIGWINCH listener process should not have any file descriptors open, except for the ones it needs to function, so let's make it more robust by having it close all other file descriptors. For recent kernels, we can do this very conveniently with close_range(2), but for older kernels, we have to fall back to closing open file descriptors one at a time. Signed-off-by: Alyssa Ross <hi@alyssa.is>	2023-04-05 11:23:06 +01:00
Alyssa Ross	67ad3ff1ba	scripts: run doc tests Signed-off-by: Alyssa Ross <hi@alyssa.is>	2023-04-05 11:22:47 +01:00
Alyssa Ross	755cabea4c	hypervisor: use proper doc tests for examples It seems like these examples were always intended to be doctests, since there are lines marked with "#" so that they are excluded from the generated documentation, but they were not recognised as doc tests because they were not formatted correctly. The code needed some adjustments so that it would actually compile and run as doctests. Signed-off-by: Alyssa Ross <hi@alyssa.is>	2023-04-05 11:22:47 +01:00
Alyssa Ross	1ed4898d28	hypervisor: fix building doctests When doctests are built, the crate is built with itself as a dependency via --extern. This causes a compiler error if using a module with the name same as the crate, because it's ambiguous whether it's referring to the module, or the extern version of the crate, so it's necessary to disambiguate when using the hypervisor module here. Fixes running cargo test --doc --workspace. Signed-off-by: Alyssa Ross <hi@alyssa.is>	2023-04-05 11:22:47 +01:00
Alyssa Ross	a807b91f86	virtio-devices: fix accidental HTML in doc comments Doc comments are Markdown, and can include HTML tags. Anything in angle brackets will therefore be inserted as an HTML tag into rustdoc's output. If that's not intentional, the left angle bracket needs to be escaped. I haven't fixed the doc comments in src/main.rs, because argh doesn't understand the escaping, so the backslashes would show up in the --help output. I've opened https://github.com/google/argh/issues/159 about that. Signed-off-by: Alyssa Ross <hi@alyssa.is>	2023-04-04 17:38:21 -07:00
Alyssa Ross	f6236087d8	virtio-devices: fix broken vsock doc comments These need to be //! comments, because they apply to the module as a whole, not to whatever directly follows the comment. Using /// comments here resulted in documentation being attached to the wrong thing, or not rendered at all. I've also checked the Markdown formatting of these comments as rendered by rustdoc, and fixed it where appropriate. Signed-off-by: Alyssa Ross <hi@alyssa.is>	2023-04-04 17:38:21 -07:00
Alyssa Ross	95f83320b1	arch: use a non-doc comment for diagram This doesn't need to be rendered in the HTML API documentation, and wouldn't be formatted correctly if it was. Signed-off-by: Alyssa Ross <hi@alyssa.is>	2023-04-04 17:38:21 -07:00
dependabot[bot]	dcc858a403	build: Bump libc from 0.2.139 to 0.2.141 in /fuzz Bumps [libc](https://github.com/rust-lang/libc) from 0.2.139 to 0.2.141. - [Release notes](https://github.com/rust-lang/libc/releases) - [Commits](https://github.com/rust-lang/libc/compare/0.2.139...0.2.141) --- updated-dependencies: - dependency-name: libc dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com>	2023-04-05 00:27:44 +00:00
Alyssa Ross	57ea412c64	hypervisor: make buildable independently It was not possible to build just hypervisor with Cargo's -p flag, because it was not properly specifying the features it requires from vfio-ioctls. Signed-off-by: Alyssa Ross <hi@alyssa.is>	2023-04-04 09:57:19 -07:00
Muminul Islam	e32c9525c0	tests: disable virtio_balloon_free_page_reporting on mshv balloon_free_page_reporting test case should not work as expected. The reason is that MSHV pins all the pages during the memory map for the guest. Those pages can not be altered without unpinning the pages. MSHV does not support modifying the pages during the guest life cycle. This test case can be enabled once we add VA backed VM support. Signed-off-by: Muminul Islam <muislam@microsoft.com>	2023-04-04 07:41:04 -07:00
Bo Chen	2b4f60e57b	docs: Remove directory support from MemoryZoneConfig::file Signed-off-by: Bo Chen <chen.bo@intel.com>	2023-04-04 06:49:18 -07:00
Bo Chen	5736205cf0	tests: Use native support of shared memory This replaces the deprecated way of allocating anonymous shared memory explicitly from '/dev/shm'. Signed-off-by: Bo Chen <chen.bo@intel.com>	2023-04-04 06:49:18 -07:00
Ravi kumar Veeramally	a8d1849485	vmm: Remove directory support from MemoryZoneConfig::file Fixes: #5082 Signed-off-by: Ravi kumar Veeramally <ravikumar.veeramally@intel.com>	2023-04-04 06:49:18 -07:00
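
Commit acc54ade7b above describes aligning vfio memory region addresses and sizes to PAGE_SIZE rather than assuming 4k. The sketch below illustrates that kind of alignment arithmetic only; it is not the code from the commit. The helper names (`align_down`, `align_up`, `align_region`) are hypothetical, the page size is passed in as a parameter instead of being queried from the host, and rounding the start down and the end up is an assumption about how a region would be widened.

```rust
/// Round `value` down to a multiple of `page_size` (must be a power of two).
fn align_down(value: u64, page_size: u64) -> u64 {
    debug_assert!(page_size.is_power_of_two());
    value & !(page_size - 1)
}

/// Round `value` up to a multiple of `page_size` (must be a power of two).
fn align_up(value: u64, page_size: u64) -> u64 {
    debug_assert!(page_size.is_power_of_two());
    (value + page_size - 1) & !(page_size - 1)
}

/// Widen the region `[start, start + size)` so that both ends fall on page
/// boundaries. Returns `(aligned_start, aligned_size)`.
/// Hypothetical helper: the real commit may align differently.
fn align_region(start: u64, size: u64, page_size: u64) -> (u64, u64) {
    let aligned_start = align_down(start, page_size);
    let aligned_end = align_up(start + size, page_size);
    (aligned_start, aligned_end - aligned_start)
}

fn main() {
    // The same 4 KiB region viewed with 4k, 16k and 64k host pages.
    for page_size in [4096u64, 16384, 65536] {
        let (start, size) = align_region(0x1_1000, 0x1000, page_size);
        println!("page_size={page_size:#x} start={start:#x} size={size:#x}");
    }
}
```

With 4k pages the region is already aligned and is returned unchanged; with larger pages it grows to cover whole pages, which is why a hard-coded 4k assumption can fail on hosts with 16k or 64k pages.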


6922 Commits
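
Commit 541de8b757 in the log above replaces `writeln` with `write` plus an explicit `\r\n`, because a tty in raw mode does not turn LF into CR LF. The following is a minimal sketch of that idea, not the project's logger: `write_log_line` is a hypothetical helper, and the output goes to an in-memory buffer so the bytes can be inspected.

```rust
use std::io::{self, Write};

/// Write one log line terminated by an explicit CR LF pair, so the output
/// stays readable even when the terminal is in raw mode.
/// Hypothetical helper, named for illustration only.
fn write_log_line<W: Write>(out: &mut W, msg: &str) -> io::Result<()> {
    write!(out, "{msg}\r\n")
}

fn main() -> io::Result<()> {
    let mut buf: Vec<u8> = Vec::new();
    write_log_line(&mut buf, "vmm: starting")?;
    write_log_line(&mut buf, "vmm: ready")?;
    // Each line in `buf` now ends with the two bytes 0x0D 0x0A.
    io::stdout().write_all(&buf)
}
```

By contrast, `writeln!` appends only `\n`, which a raw-mode terminal renders as a line feed without returning the cursor to column zero.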