Rebooting a VM fails with the following error when debug assertions
are enabled:
fatal runtime error: IO Safety violation: owned file descriptor already closed
This happens because FromRawFd::from_raw_fd is used on RawFds stored
in ConsoleInfo every time a VM begins to boot, so the second
time (after a reboot, or if the first attempt to boot via the API
failed), the fd will be closed. Until this assertion is hit, the code
is operating on either closed file descriptors, or new file
descriptors for something completely different. If debug assertions
are disabled, it will just continue doing this with unpredictable
results.
To fix this, and prevent the problem reocurring, ownership of the
console file descriptors needs to be properly tracked, using Rust's
type system, so this commit refactors the console code to do that.
The file descriptors are now passed around with reference counts, so
they won't be closed prematurely. The obvious way to do this would be
to just have each member of ConsoleInfo be an Arc<File>, but we need
to accomodate that serial console file descriptors can also be
sockets. We can't just store an OwnedFd and convert it when it's
used, because we only get a reference from the Arc, so we need to
store the descriptors as their concrete types in an enum. Since this
basically duplicates the ConsoleOutputMode enum from the config, the
ConsoleOutputMode enum is now not used past constructing the
ConsoleInfo.
So that ownership can be represented consistently, the debug console's
tty mode now uses its own stdout descriptor.
I'm still using .try_clone().unwrap() (i.e. dup()) to clone file
descriptors for Endpoint::FilePair and Endpoint::TtyPair, because I
assume there's a reason for them not just to hold a single file
descriptor.
I've also retained the existing behaviour of having serial manager
ignore the tty file descriptor passed to it (which is stdout), and
instead using stdin. It looks a lot weirder now, because it has to
explicitly indicate it's ignoring the fd with an underscore binding.
Fixes: 52eebaf6 ("vmm: refactor DeviceManager to use console_info")
Signed-off-by: Alyssa Ross <hi@alyssa.is>
Complete the removal of the DAX support by removing the use of
non-standard messages. These messages have since been removed from the
vhost_user crate (rust-vmm/vhost#246) and so need to be removed from our
implementation since that would otherwise block updating to a newer
version of the crate.
The ability to enable DAX support in Cloud Hypervisor has been disabled
some time ago but this code was residual with no way to enable it.
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
And modify to code to use the updated interfaces.
Arguments for map_guest_memory, get_dirty_bitmap, vp.run(),
import_isolated_pages, modify_gpa_host_access have changed.
Update these to use the new interfaces, including new MSHV_*
definitions, and remove some redundant arguments.
Update seccomp IOCTLs to reflect interface changes.
Fix irq-related definitions naming.
Bump vfio-ioctls to support mshv v0.3.0.
Signed-off-by: Nuno Das Neves <nudasnev@microsoft.com>
Bound checks for virtio-mem and ACPI memory hotplug are off by
one and two, respectively. This prevents users to fully use the reserved
memory hotplug size.
For ACPI, if we specific `--memory size=2G,hotplug_size=4G` and run
`ch-remote resize --memory 6G`, cloud-hypervisor will report the
following error because of the incorrect bound check:
`<vmm> ERROR:vmm/src/lib.rs:1631 -- Error when resizing VM:
MemoryManager(InsufficientHotplugRam)`
Similarly, for virtio-mem, cloud-hypervisor will fail the incorrect
bound check and abort the resize. The VM will see the following error
in dmesg:
`virtio_mem virtio3: unknown error, marking device broken: -22`
This patch has fixed both bound checks and ensure that users can
hot add memory up to the reserved hotplug size.
Signed-off-by: Yuhong Zhong <yz@cs.columbia.edu>
With cd0cdac ("virtio-devices: Fix seccomp rules for SevSnp guest"), the
sys_ioctl rule that was applied in virtio_thread_common, would override
previously specified sys_ioctl rules for individual thread type. This
causes the SevSnp guest to crash with seccomp violation.
Fixes: cd0cdac ("virtio-devices: Fix seccomp rules for SevSnp guest")
Signed-off-by: Purna Pavan Chandra <paekkaladevi@microsoft.com>
An example warning output is:
error: first doc comment paragraph is too long
--> virtio-devices/src/lib.rs:158:1
|
158 | / /// Convert an absolute address into an address space (GuestMemory)
159 | | /// to a host pointer and verify that the provided size define a valid
160 | | /// range within a single memory region.
161 | | /// Return None if it is out of bounds or if addr+size overlaps a single region.
| |_
|
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#too_long_first_doc_paragraph
= note: `-D clippy::too-long-first-doc-paragraph` implied by `-D warnings`
= help: to override `-D warnings` add `#[allow(clippy::too_long_first_doc_paragraph)]`
Signed-off-by: Bo Chen <chen.bo@intel.com>
In case of SEV-SNP guests on MSHV, any host component including VMM
trying to access any guest memory region, needs to call gain_access API
to acquire access to that particular memory region. This applies to all
the virtqueue buffers as well which the VMM is using to use to perform
DMA into the guest and in order to facilitate device emulation.
While creating various virtqueues we are already using access platform
hook to translate the addresses but currently we are missing the size
arguments which would be required by the SevSnpAccessPlatformProxy to
gain access to the memory of these queues.
Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
This would be used in case of SEV-SNP guest because we need to accquire
access to these virtqueue segments before accessing them.
Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
When the rate limit was reached it was possible for the notification to
the guest to be lost since the logic to handle the notification was
tightly coupled with processing the queue. The notification would
eventually be triggered when the rate limit pool was refilled but this
could add significant latency.
Address this by refactoring the code to separate processing queue and
signalling - the processing of the queue is suspended when the rate
limit is reached but the signalling will still be attempted if needed
(i.e. VIRTIO_F_EVENT_IDX is still considered.)
Signed-off-by: wuxinyue <wuxinyue.wxy@antgroup.com>
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
With commit 1e967697c ("vmm: pass AccessPlatform implementation for
SEV-SNP guest"), we started performing one additional ioctl to gain
access to the guest memory before accessing those regions inside
virtio-device emulation code path. This additional ioctl is not part of
the current seccomp filter, which is causing the SevSnp guest to crash
in this scenario with seccomp violation.
Fixes: 1e967697c ("vmm: pass AccessPlatform implementation for SEV-SNP
guest")
Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
This fixes a issue of running vm compiled in debug with Rust
1.80.0 or later, where this check was introduced.
Signed-off-by: Wenyu Huang <huangwenyuu@outlook.com>
This fixes the vsock::device::tests::test_virtio_device test with Rust
1.80.0 or later, where this check was introduced.
Signed-off-by: Alyssa Ross <hi@alyssa.is>
Support event idx feature for virtio-blk device.
This feature could improve disk IO performance by suppressing
notifications from guest to host and interrupts from host to
guest, which has been already supported in virtio-net and
vhost-user devices.
To achieve this, virtqueue's event-idx-related API is
leveraged for avail_event field update and needs_notification
check.
Fixes: #6580
Signed-off-by: wuxinyue <wuxinyue.wxy@antgroup.com>
As per VirtIO spec 1.2 section 5.2.6, the `status` field is a byte, not
u32. cloud-hypervisor writes an `u32` to guest memory, which
accidentally zeros out the following 3 bytes, and may corrupt guest OS
internal state.
Signed-off-by: Changyuan Lyu <changyuanl@google.com>
TIOCGWINSZ modifies its argument, so it needs to mutably borrow it.
Unfortunately, ioctl()'s signature is not able to enforce this, and
the write happens in the kernel, so I don't think anything like miri,
valgrind, UBSan, etc. would have been able to catch this.
The UB passing an immutable reference caused resulted, for me, in
get_win_size() returning (0, 0) since LLVM commit
9a09c737a052 ("[BasicAA] Make isNotCapturedBeforeOrAt() check for
calls more precise (#69931)").
I've had a look through the other ioctl() calls in Cloud Hypervisor,
and I don't think any others have the same problem.
Signed-off-by: Alyssa Ross <hi@alyssa.is>
In 42e9632c53 a fix was made to address a
typo in the taplo configuration file. Fixing this typo indicated that
many Cargo.toml files were no longer adhering to the formatting rules.
Fix the formatting by running `taplo fmt`.
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
Since vdpa device does not support pause/resume [1], it does not make
sense to restore on paused state.
[1] 099cdd2af8
Signed-off-by: Bo Chen <chen.bo@intel.com>
Misspellings were identified by:
https://github.com/marketplace/actions/check-spelling
* Initial corrections based on forbidden patterns from the action
* Additional corrections by Google Chrome auto-suggest
* Some manual corrections
* Adding markdown bullets to readme credits section
Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
The compiler is now able to warn if an invalid attribute (e.g like a
feature) is not available.
See https://blog.rust-lang.org/2024/05/06/check-cfg.html for more
details.
Add build.rs files in the crates that use #cfg(fuzzing) to add fuzzing
to the list of valid cfg attributes.
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
Code in this crate is conditional on this feature so it necessary to
expose as a new feature and use that feature as a dependency when the
feature is enabled on the vmm crate.
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
Update the vhost-user-backend crate version used along with related
crates (vhost and virtio-queue.) This requires minor changes to the
types used for the memory in the backends with the use of the
BitmapMmapRegion type for the Bitmap implementation.
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
The mem_size field is not needed in TestContext. Drop it.
Make sure guest_evvq is read once. Clippy cannot figured out that it was
used.
While at it, add an extra assert for the spurious rxvq event test, too.
Signed-off-by: Wei Liu <liuwe@microsoft.com>
In accordance with reuse requirements:
- Place each license file in the LICENSES/ directory
- Add missing SPDX-License-Identifier to files.
- Add .reuse/dep5 to bulk-license files
Fixes: #5887
Signed-off-by: Ruslan Mstoi <ruslan.mstoi@intel.com>
The caller shouldn't pass in an &str that's too long. This is a
precaution if something goes wrong in the caller.
Signed-off-by: Wei Liu <liuwe@microsoft.com>
Properly detach a device from a domain if that device is already
attached to another domain on an attach request (following section
5.13.6.3.2 of the virtio-iommu spec). Resolves nested virtualization
reboot.
Signed-off-by: Andrew Carp <acarp@crusoeenergy.com>
Ensures that any endpoints already attached to the domain are properly
mapped to a new endpoint on said endpoint's attach request. This is done
by search for all previous mappings in the domain and then issuing map
requests for the newly attached endpoint.
Signed-off-by: Andrew Carp <acarp@crusoeenergy.com>