2495 Commits

Author SHA1 Message Date
Praveen K Paladugu
1d89f98edf vmm: Introduce landlock-rules cmdline param
Users can use this parameter to pass extra paths that 'vmm' and its
child threads can use at runtime. Hotplug is the primary use case for
this parameter.

To hotplug devices that use local files (disks, memory zones, pmem
devices, etc.), users can pass the paths that will be used during
hotplug when starting cloud-hypervisor. This allows landlock to add the
rules required to grant access to these paths when the cloud-hypervisor
process starts.

Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2024-07-06 04:42:58 +00:00
Praveen K Paladugu
287dbd4fc9 vmm: Introduce landlock cmdline parameter
Users can use this cmdline option to enable/disable Landlock based
sandboxing while running cloud-hypervisor.

Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
2024-07-06 04:42:58 +00:00
Praveen K Paladugu
c50ea2c708 vmm: Add seccomp rules to allow landlock syscalls
Landlock syscalls are required by the event_monitor, signal_handler,
http-server and vmm threads. The rest of the threads are spawned by the
vmm thread and automatically inherit its ruleset.

Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
2024-07-06 04:42:58 +00:00
Rob Bradford
08cf983d42 build: Fix Cargo.toml formatting
In 42e9632c53 a fix was made to address a
typo in the taplo configuration file. Fixing this typo indicated that
many Cargo.toml files were no longer adhering to the formatting rules.
Fix the formatting by running `taplo fmt`.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-06-18 16:19:12 +00:00
Wei Liu
254db7b96a vmm: fix documentation formatting
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2024-06-12 16:59:20 +00:00
Praveen K Paladugu
9f969ee18d vmm: Use cloned fd to check if dev is a tty
When checking whether the console device is a tty, use the cloned fd
instead of libc::STDOUT_FILENO.

Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
2024-06-12 15:47:19 +00:00
Praveen K Paladugu
c3fcddf830 vmm: Fix console dev handling in live migration
Console devices are created after vm_config is received, and the created
devices are passed to Vm during vm_receive_state.

Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
2024-06-12 15:47:19 +00:00
Praveen K Paladugu
11d98fccac vmm: fix a typo in ioctl name
Rename TIOCGTPEER ioctl to its proper name: TIOCGPTPEER.

Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
2024-06-12 15:47:19 +00:00
Praveen K Paladugu
a8fa2af64b vmm: dup serial fds to preserve them across reboots
During vm_shutdown or vm_snapshot, all the console devices will be
closed. When this happens stdout (FD #1) will also be closed, as the
console device using this FD is closed. If the VM were to be started
later, FD #1 could be assigned to a different file. But
pre_create_console_devices looks for FD #1 while opening the tty device,
which could then point to any file.

To avoid this problem, the STDOUT FD is duplicated when being
assigned to a console device. Even if the console devices were to be
closed, only the duplicated FD will be closed and FD #1 will continue to
point to STDOUT.

Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
2024-06-12 15:47:19 +00:00
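The duplication described in a8fa2af64b can be sketched with the standard library alone. This is a minimal illustration, not the code from the commit: the helper names `dup_stdout` and `console_fd_is_separate` are hypothetical, and the sketch only shows that a console device can own a duplicate descriptor while the process's standard output stays open.

```rust
use std::io;
use std::os::fd::{AsFd, AsRawFd, OwnedFd};

/// Duplicate the process's standard output (hypothetical helper).
/// The returned `OwnedFd` is a new descriptor; dropping it closes only
/// the duplicate, never the original standard output.
fn dup_stdout() -> io::Result<OwnedFd> {
    io::stdout().as_fd().try_clone_to_owned()
}

/// Check that a console descriptor is not the original standard output
/// (hypothetical helper used to illustrate the invariant).
fn console_fd_is_separate(fd: &OwnedFd) -> bool {
    fd.as_raw_fd() != io::stdout().as_raw_fd()
}
```

A console device that takes ownership of the value returned by `dup_stdout()` can be closed and recreated across reboots without the original descriptor being released.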
Praveen K Paladugu
dc723171a7 vmm: cleanup legacy console device management
Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
2024-06-12 15:47:19 +00:00
Praveen K Paladugu
52eebaf6b2 vmm: refactor DeviceManager to use console_info
While adding console devices, DeviceManager will now use the FDs in
console_info instead of creating them.

To reduce the size of this commit, I marked some variables as unused
with '_' prefix. All those variables are cleaned up in the next commit.

Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
2024-06-12 15:47:19 +00:00
Praveen K Paladugu
380ba564f4 vmm: populate console_info during vm actions
Use pre_create_console_devices method to create and populate console
device FDs into console_info in Vmm Object.

Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
2024-06-12 15:47:19 +00:00
Praveen K Paladugu
385f9a9aa9 vmm: save console_resize_pipe info to Vmm
With this change all the information to manage console devices is now
available within Vmm Object.

Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
2024-06-12 15:47:19 +00:00
Praveen K Paladugu
d784bf0c75 vmm: move listen_for_sigwinch_on_tty method
Move listen_for_sigwinch_on_tty to sigwinch_listener.rs module.

Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
2024-06-12 15:47:19 +00:00
Praveen K Paladugu
cf6115a73c vmm: Introduce console_devices module
Introduce ConsoleInfo struct. This struct will be used to store FDs of
console devices created in pre_create_console_devices and passed to
vm_boot.

Move set_raw_mode, create_pty methods to console_devices.rs to
consolidate console management methods into a single module.

Lastly, copy the logic to create and configure console devices into
pre_create_console_devices method.

Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
2024-06-12 15:47:19 +00:00
dependabot[bot]
e048da6f73 build: Bump blocking from 1.5.1 to 1.6.1
Bumps [blocking](https://github.com/smol-rs/blocking) from 1.5.1 to 1.6.1.
- [Release notes](https://github.com/smol-rs/blocking/releases)
- [Changelog](https://github.com/smol-rs/blocking/blob/master/CHANGELOG.md)
- [Commits](https://github.com/smol-rs/blocking/compare/v1.5.1...v1.6.1)

---
updated-dependencies:
- dependency-name: blocking
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-06-12 00:32:18 +00:00
Josh Soref
42e9632c53 misc: Fix spelling issues
Misspellings were identified by:
  https://github.com/marketplace/actions/check-spelling

* Initial corrections based on forbidden patterns from the action
* Additional corrections by Google Chrome auto-suggest
* Some manual corrections
* Adding markdown bullets to readme credits section

Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
2024-06-08 16:31:30 +00:00
Jinank Jain
3414586995 arch: Change the default topology for x86 guests
Currently, by default each core is allocated its own socket. Basically
it is an n socket, 1 core, 1 thread/core kind of structure as witnessed
from within the guest.

CPU(s):                             8
On-line CPU(s) list:                0-7
Thread(s) per core:                 1
Core(s) per socket:                 1
Socket(s):                          8
NUMA node(s):                       1

This is not a good default topology because resources are distributed
across multiple sockets. For example, a Linux guest with multi socket
configuration will have to calibrate TSC per socket due to which it
might observe a higher amount of boot time than usual.

A better idea for default topology would be 1 socket, n cores, 1
thread/core, which ensures better resource locality.

After this change topology would change to:

CPU(s):                             8
On-line CPU(s) list:                0-7
Thread(s) per core:                 1
Core(s) per socket:                 8
Socket(s):                          1
NUMA node(s):                       1

Fixes: #6497

Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
2024-06-04 17:08:18 +00:00
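The two default layouts described in 3414586995 can be compared with a small sketch. The `Topology` struct and both function names are illustrative assumptions, not types from the arch crate; the numbers mirror the `lscpu` output quoted in the commit message.

```rust
/// Illustrative CPU topology as seen from within the guest.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Topology {
    sockets: u32,
    cores_per_socket: u32,
    threads_per_core: u32,
}

impl Topology {
    /// Total number of vCPUs described by this topology.
    fn total_cpus(&self) -> u32 {
        self.sockets * self.cores_per_socket * self.threads_per_core
    }
}

/// Previous default: every vCPU sits in its own socket.
fn old_default_topology(vcpus: u32) -> Topology {
    Topology { sockets: vcpus, cores_per_socket: 1, threads_per_core: 1 }
}

/// New default: one socket holding all cores.
fn new_default_topology(vcpus: u32) -> Topology {
    Topology { sockets: 1, cores_per_socket: vcpus, threads_per_core: 1 }
}
```

Both layouts expose the same number of vCPUs; only the grouping into sockets differs.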
dependabot[bot]
d82846c954 build: Bump range_map_vec from 0.1.0 to 0.2.0
Bumps [range_map_vec](https://github.com/microsoft/range_map_vec) from 0.1.0 to 0.2.0.
- [Commits](https://github.com/microsoft/range_map_vec/compare/v0.1.0...v0.2.0)

---
updated-dependencies:
- dependency-name: range_map_vec
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-06-01 00:53:03 +00:00
Wei Liu
6bb3ad1b96 build: update IGVM crates
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2024-05-31 20:16:37 +00:00
Wei Liu
400837ff99 vmm: wrap a new fd in UnixListener in serial manager
The original code gave an owned fd to UnixListener. That made the same
fd wrapped into two owned files.

When the files were dropped, the same fd would be closed more than once.
A newly introduced check in Rust's stdlib caught that error.

A newly cloned fd should be given to UnixListener.

Fixes: #6485

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2024-05-27 19:24:28 +00:00
Wei Liu
a9e41c417a vmm: add a check to avoid wrapping -1 into an owned file
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2024-05-27 19:24:28 +00:00
Nuno Das Neves
30b6e412af hypervisor: mshv: Pin mshv crates to release tag v0.2.0
And bump vfio commit in Cargo.lock to align, since it should also point
to mshv v0.2.0.

Signed-off-by: Nuno Das Neves <nudasnev@microsoft.com>
2024-05-23 17:37:49 +00:00
Omer Faruk Bayram
036e7e3797 vmm: ch-remote: replace deprecated zbus macros with new equivalents
Fixes deprecation related warnings introduced in #6400.

Signed-off-by: Omer Faruk Bayram <omer.faruk@sartura.hr>
2024-05-23 12:20:06 +00:00
dependabot[bot]
eb8f959361 build: Bump zbus from 3.15.2 to 4.1.2
Bumps [zbus](https://github.com/dbus2/zbus) from 3.15.2 to 4.1.2.
- [Release notes](https://github.com/dbus2/zbus/releases)
- [Commits](https://github.com/dbus2/zbus/compare/zbus-3.15.2...zbus-4.1.2)

---
updated-dependencies:
- dependency-name: zbus
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-05-22 15:51:56 +00:00
Muminul Islam
860939d677 vmm: pause/resume VM during the VM events
For MSHV we always create a frozen partition, so we
resume the VM during boot. Also, during pause and resume
VM events we call the hypervisor-specific API.

Signed-off-by: Muminul Islam <muislam@microsoft.com>
2024-05-16 14:17:07 +00:00
Purna Pavan Chandra
b82f25572b vmm: http_endpoint: Change PutHandler for VmRestore
Consume FDs passed via SCM_RIGHTS to VmRestore API and assign them
appropriately to RestoredNetConfig's fds field.

Signed-off-by: Purna Pavan Chandra <paekkaladevi@linux.microsoft.com>
2024-05-14 10:52:46 +00:00
Purna Pavan Chandra
584784a0f8 vmm: Support passing Net FDs to Restore
'NetConfig' FDs, when explicitly passed via SCM_RIGHTS during VM
creation, are marked as invalid during snapshot. See: #6332.
So, Restore should support input for the new net FDs. This patch adds
new field 'net_fds' to 'RestoreConfig'. The FDs passed using this new
field are replaced into the 'fds' field of NetConfig appropriately.

The 'validate()' function ensures all net devices from 'VmConfig' backed
by FDs have a corresponding 'RestoredNetConfig' with a matched 'id' and
expected number of FDs.

The unit tests provide different inputs to parse and validate functions
to make sure parsing and error handling is as per expectation.

Fixes #6286

Signed-off-by: Purna Pavan Chandra <paekkaladevi@linux.microsoft.com>
Co-authored-by: Bo Chen <chen.bo@intel.com>
2024-05-14 10:52:46 +00:00
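The validation rule described in 584784a0f8 (every FD-backed net device needs a restore entry with the same id and the expected number of FDs) can be sketched as follows. All type, field, and function names here are hypothetical stand-ins, not the definitions used in the vmm crate.

```rust
use std::collections::HashMap;

/// Hypothetical view of a net device from the saved VM configuration.
struct SavedNetDevice {
    id: String,
    /// Number of FDs the device was created with (0 if not FD-backed).
    num_fds: usize,
}

/// Hypothetical restore-time entry carrying the replacement FDs.
struct RestoredNet {
    id: String,
    fds: Vec<i32>,
}

#[derive(Debug, PartialEq, Eq)]
enum RestoreError {
    MissingNetConfig(String),
    FdCountMismatch { id: String, expected: usize, found: usize },
}

/// Check that each FD-backed device has a matching restore entry.
fn validate_net_fds(
    devices: &[SavedNetDevice],
    restored: &[RestoredNet],
) -> Result<(), RestoreError> {
    let by_id: HashMap<&str, &RestoredNet> =
        restored.iter().map(|r| (r.id.as_str(), r)).collect();
    for dev in devices.iter().filter(|d| d.num_fds > 0) {
        let entry = by_id
            .get(dev.id.as_str())
            .ok_or_else(|| RestoreError::MissingNetConfig(dev.id.clone()))?;
        if entry.fds.len() != dev.num_fds {
            return Err(RestoreError::FdCountMismatch {
                id: dev.id.clone(),
                expected: dev.num_fds,
                found: entry.fds.len(),
            });
        }
    }
    Ok(())
}
```

Devices that were not created from FDs are skipped, so a restore request without any net FDs remains valid for them.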
Rob Bradford
3f8cd52ffd build: Format Cargo.toml files using taplo
Run the taplo formatter with the newly added configuration file

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-05-08 21:46:13 +00:00
Rob Bradford
7e25cc2aa0 build: Add "fuzzing" as a valid cfg(..) attribute
The compiler is now able to warn when a cfg attribute (e.g. a
feature) is not one of the expected values.

See https://blog.rust-lang.org/2024/05/06/check-cfg.html for more
details.

Add build.rs files in the crates that use #[cfg(fuzzing)] to add fuzzing
to the list of valid cfg attributes.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-05-08 08:10:28 +00:00
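As a rough sketch of what such a build script emits, the helper below builds the Cargo directive that declares a custom cfg name as expected. The function name `check_cfg_directive` is an assumption made for illustration; the `cargo::rustc-check-cfg` instruction itself is the one described in the linked Rust blog post (older toolchains may need the single-colon `cargo:` form).

```rust
/// Build the Cargo build-script directive that registers `name` as an
/// expected custom cfg, so `#[cfg(name)]` no longer triggers the
/// `unexpected_cfgs` lint. The helper itself is hypothetical.
fn check_cfg_directive(name: &str) -> String {
    format!("cargo::rustc-check-cfg=cfg({name})")
}
```

In a `build.rs`, the `main` function would print the result, for example `println!("{}", check_cfg_directive("fuzzing"));`.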
Rob Bradford
ea23c16c5a build: Expose and use "sev_snp" feature on virtio-devices
Code in this crate is conditional on this feature, so it is necessary to
expose it as a new feature and use that feature as a dependency when the
feature is enabled on the vmm crate.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-05-08 08:10:28 +00:00
Rob Bradford
fd43b79f96 build: Correctly enable dhat support in vmm crate
The "dhat-heap" feature needs to be enabled inside the vmm crate as a
dependency from the top-level, as there is a build-time check for that
feature inside the vmm crate.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-05-08 08:10:28 +00:00
dependabot[bot]
a70808bae9 build: Bump thiserror from 1.0.58 to 1.0.60
Bumps [thiserror](https://github.com/dtolnay/thiserror) from 1.0.58 to 1.0.60.
- [Release notes](https://github.com/dtolnay/thiserror/releases)
- [Commits](https://github.com/dtolnay/thiserror/compare/1.0.58...1.0.60)

---
updated-dependencies:
- dependency-name: thiserror
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-05-08 00:08:24 +00:00
Rob Bradford
d10f20eb71 build: Bump vhost-user-backend, vhost, and virtio-queue
Update the vhost-user-backend crate version used along with related
crates (vhost and virtio-queue.) This requires minor changes to the
types used for the memory in the backends with the use of the
BitmapMmapRegion type for the Bitmap implementation.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-05-01 18:29:36 +00:00
Bo Chen
75e1dc2bce vmm: openapi: Do not provide default values for required fields
This is to resolve the inconsistencies from our openapi specification,
as default values do not make sense for required fields.

Reported-by: James O. D. Hunt <james.o.hunt@intel.com>
Signed-off-by: Bo Chen <chen.bo@intel.com>
2024-05-01 17:31:36 +00:00
Muminul Islam
030d84eb08 vmm: make clock data independent of hypervisor
As MSHV also implements set/get_clock data, this patch
removes the KVM feature guard and makes it x86_64-only,
for both KVM and MSHV.

Signed-off-by: Muminul Islam <muislam@microsoft.com>
2024-04-29 16:46:26 +00:00
Wei Liu
f6d99d9a9b build: use released version of the IGVM crates
No functional change.

While at it, consolidate some of the IGVM related import directives.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2024-04-29 11:13:59 +00:00
Rob Bradford
b89657ea22 hypervisor, vmm: Don't re-export the contents of mshv_bindings::*
The contents of this crate may change and cause conflicts - re-exporting
the contents is unnecessary.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-04-25 20:53:53 +00:00
Rob Bradford
c022063ae8 hypervisor: Remove unused VmExit enum members
The members for {Io, Mmio}{Read, Write} are unused as instead exits of
those types are handled through the VmOps interface. Removing these is
also a prerequisite due to changes in the mutability of the
VcpuFd::run() method.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-04-25 20:53:53 +00:00
Thomas Barrett
e7e856d8ac vmm: add pci_segment mmio aperture configs
When using multiple PCI segments, the 32-bit and 64-bit mmio
aperture is split equally between each segment. Add an option
to configure the 'weight'. For example, a PCI segment with a
`mmio32_aperture_weight` of 2 will be allocated twice as much
32-bit mmio space as a normal PCI segment.

Signed-off-by: Thomas Barrett <tbarrett@crusoeenergy.com>
2024-04-24 09:35:19 +00:00
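The weighting described in e7e856d8ac amounts to a proportional split. The sketch below is an assumption about the arithmetic only: `split_aperture` is a hypothetical name, and the real allocator may additionally apply alignment and rounding rules that are not modelled here.

```rust
/// Split `total` bytes of MMIO aperture between PCI segments in
/// proportion to their weights (hypothetical helper; no alignment).
fn split_aperture(total: u64, weights: &[u32]) -> Vec<u64> {
    let sum: u64 = weights.iter().map(|&w| u64::from(w)).sum();
    if sum == 0 {
        // No segment asked for any space.
        return vec![0; weights.len()];
    }
    // Divide first so the multiplication cannot overflow.
    let unit = total / sum;
    weights.iter().map(|&w| unit * u64::from(w)).collect()
}
```

With three segments weighted 1, 2 and 1, the middle segment receives twice the space of each of the others, which matches the `mmio32_aperture_weight` example in the commit message.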
Muminul Islam
a750e6ec15 vmm: Add filter entry for MSHV_GET_PARTITION_PROPERTY
Add seccomp rule for getting partition property on MSHV.

Signed-off-by: Muminul Islam <muislam@microsoft.com>
2024-04-23 08:31:10 +00:00
Rob Bradford
10ab87d6a3 misc: Migrate away from versionize
Replace with serde instead.

Fixes: #6370

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-04-22 17:10:55 +00:00
dependabot[bot]
a5552b87f4 build: Bump flume from 0.10.14 to 0.11.0
Bumps [flume](https://github.com/zesterer/flume) from 0.10.14 to 0.11.0.
- [Changelog](https://github.com/zesterer/flume/blob/master/CHANGELOG.md)
- [Commits](https://github.com/zesterer/flume/commits)

---
updated-dependencies:
- dependency-name: flume
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-04-12 23:42:07 +00:00
Lucas Jacques
108af5a293 openapi: add missing pvpanic property to VmConfig
Signed-off-by: Lucas Jacques <contact@lucasjacques.com>
2024-04-09 08:53:39 +00:00
Yi Wang
e1bb5e71bf vmm: Avoid kernel panic when unmasking guest IRQ on AMD
Assigning KVM_IRQFD (when unmasking a guest IRQ) after
KVM_SET_GSI_ROUTING can avoid kernel panic on the guest that is not
patched with commit a80ced6ea514 (KVM: SVM: fix panic on out-of-bounds
guest IRQ) on AMD systems.

Meanwhile, it is required to deassign KVM_IRQFD (when masking a guest
IRQ) before KVM_SET_GSI_ROUTING (see #3827).

Fixes: #6353

Signed-off-by: Yi Wang <foxywang@tencent.com>
Signed-off-by: Bo Chen <chen.bo@intel.com>
2024-04-07 08:58:03 +00:00
Rob Bradford
7966925c1c build: Bulk update dependencies
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-04-06 09:48:25 +00:00
Wei Liu
f3b0f59646 vmm: validate virtio-fs tag length
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2024-04-04 20:42:36 +00:00
dependabot[bot]
fa7a000dbe build: Bump vm-memory from 0.14.0 to 0.14.1
Bumps [vm-memory](https://github.com/rust-vmm/vm-memory) from 0.14.0 to 0.14.1.
- [Release notes](https://github.com/rust-vmm/vm-memory/releases)
- [Changelog](https://github.com/rust-vmm/vm-memory/blob/v0.14.1/CHANGELOG.md)
- [Commits](https://github.com/rust-vmm/vm-memory/compare/v0.14.0...v0.14.1)

---
updated-dependencies:
- dependency-name: vm-memory
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-04-03 07:19:10 +00:00
Andrew Carp
045964deee virtio-devices: Map mmio over virtio-iommu
Add infrastructure to look up the host address for mmio regions on
external dma mapping requests. This specifically resolves vfio
passthrough for virtio-iommu, allowing for nested virtualization to pass
external devices through.

Fixes #6110

Signed-off-by: Andrew Carp <acarp@crusoeenergy.com>
2024-04-01 09:16:30 +00:00
Andrew Carp
a5e2460d95 virtio-devices: Move VfioDmaMapping to be in the pci crate
VfioUserDmaMapping is already in the pci crate, this moves
VfioDmaMapping to match the behavior. This is a necessary change to
allow the VfioDmaMapping trait to have access to MmioRegion memory
without creating a circular dependency. The VfioDmaMapping trait
needs to have access to mmio regions to map external devices over
mmio (a follow-up commit).

Signed-off-by: Andrew Carp <acarp@crusoeenergy.com>
2024-04-01 09:16:30 +00:00
Alexandru Matei
fbe3e4d642 vmm: memory_manager: don't set backing_file for virtio_mem regions
The memory region that is associated with the hotpluggable part of
a virtio-mem zone isn't backed by the file specified in the
MemoryZoneConfig. The file is used only for the fixed part of the
zone. When you try to restore a snapshot with virtio-mem, the
backing file is used for all its regions. This results in the
following error:

  VmRestore(MemoryManager(GuestMemoryRegion(MappingPastEof)))

This patch sets backing_file only for the fixed part of a virtio-mem
zone.

Fixes: #6337

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2024-03-29 20:11:20 +00:00
Nuno Das Neves
639db35635 vmm: Update and add seccomp IOCTL numbers for mshv
Add IOCTL number for generic hypercall ioctl (MSHV_ROOT_HVCALL).
Update IOCTL numbers for set/get vp state.

Signed-off-by: Nuno Das Neves <nudasnev@microsoft.com>
2024-03-29 13:14:37 -07:00
Bo Chen
11fa24cdcb vmm: Explicitly set NetConfig FDs as invalid for (de)serialization
The 'NetConfig' may contain FDs which can't be serialized correctly, as
FDs can only be donated from another process via a Unix domain socket
with `SCM_RIGHTS`. To avoid false use of the serialized FDs, this patch
explicitly sets 'NetConfig' FDs as invalid for (de)serialization.

See: #6286

Signed-off-by: Bo Chen <chen.bo@intel.com>
2024-03-26 18:41:38 +00:00
Bo Chen
6922e25e78 vmm: Move VM shutdown event to Vmm::vm_shutdown
Signed-off-by: Bo Chen <chen.bo@intel.com>
2024-03-25 18:06:52 +00:00
Bo Chen
5997cfacbf vmm: Move VM boot events to Vmm::vm_boot
Signed-off-by: Bo Chen <chen.bo@intel.com>
2024-03-25 18:06:52 +00:00
Wei Liu
55678b23ba vmm: add events for VM reboot
Signed-off-by: Wei Liu <liuwe@microsoft.com>
Signed-off-by: Bo Chen <chen.bo@intel.com>
2024-03-25 18:06:52 +00:00
Rob Bradford
b4b5f16268 vmm: acpi: Use .contains_key()
--> vmm/src/acpi.rs:708:14
    |
708 |               .get(&(DeviceType::Serial, DeviceType::Serial.to_string()))
    |  ______________^
709 | |             .is_some();
    | |______________________^ help: replace it with: `contains_key(&(DeviceType::Serial, DeviceType::Serial.to_string()))`
    |
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#unnecessary_get_then_check
    = note: `-D clippy::unnecessary-get-then-check` implied by `-D warnings`
    = help: to override `-D warnings` add `#[allow(clippy::unnecessary_get_then_check)]`

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-03-19 18:36:22 +00:00
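The clippy suggestion quoted in b4b5f16268 can be reproduced in isolation. The `DeviceType` enum and the map's value type below are simplified stand-ins, not the definitions used in the vmm crate.

```rust
use std::collections::HashMap;
use std::fmt;

/// Simplified stand-in for the device type used as part of the map key.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
enum DeviceType {
    Serial,
    Rtc,
}

impl fmt::Display for DeviceType {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DeviceType::Serial => write!(f, "serial"),
            DeviceType::Rtc => write!(f, "rtc"),
        }
    }
}

/// Preferred form: `contains_key` instead of `.get(..).is_some()`.
fn has_serial(devices: &HashMap<(DeviceType, String), u32>) -> bool {
    devices.contains_key(&(DeviceType::Serial, DeviceType::Serial.to_string()))
}
```

Both forms return the same result; `contains_key` states the intent directly and avoids building an `Option` only to test it.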
Rob Bradford
b8e84f09be vmm: vm: Remove redundant import
error: the item `GuestMemoryMmap` is imported redundantly
Error:     --> vmm/src/vm.rs:3136:9
     |
3135 |     use super::*;
     |         -------- the item `GuestMemoryMmap` is already imported here
3136 |     use crate::GuestMemoryMmap;
     |         ^^^^^^^^^^^^^^^^^^^^^^
     |
     = note: `-D unused-imports` implied by `-D warnings`
     = help: to override `-D warnings` add `#[allow(unused_imports)]`

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-03-19 18:36:22 +00:00
Rob Bradford
521a0d1ade vmm: Fix clippy warnings for use of .clone()
warning: assigning the result of `Clone::clone()` may be inefficient
    --> vmm/src/device_manager.rs:4188:17
     |
4188 |                 id = child_id.clone();
     |                 ^^^^^^^^^^^^^^^^^^^^^ help: use `clone_from()`: `id.clone_from(child_id)`
     |
     = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#assigning_clones
     = note: `#[warn(clippy::assigning_clones)]` on by default

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-03-19 18:36:22 +00:00
Bo Chen
1363891df6 vmm: Avoid deadlock from waiting on paused device worker threads
A deadlock can happen from the destination VM of live upgrade or
migration due to waiting on paused device worker threads. For example,
when a serialization error happens after the `DeviceManager` struct is
restored (where all virtio device worker threads are spawned but in
paused/parked state), a deadlock will happen from
`DeviceManager::drop()`, as it blocks for waiting worker threads to
join.

This patch ensures that we wake up all device (mostly virtio) worker
threads before we block for them to join.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2024-03-14 02:07:52 +00:00
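The fix in 1363891df6 follows a general pattern: unpark every paused worker before joining it. The sketch below illustrates that pattern with plain standard-library threads; `Worker`, `spawn_paused_worker` and `wake_and_join` are hypothetical names, and the real device worker threads are considerably more involved.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};

/// A worker thread that starts out paused (parked).
struct Worker {
    paused: Arc<AtomicBool>,
    handle: JoinHandle<()>,
}

/// Spawn a worker that stays parked until it is resumed.
fn spawn_paused_worker() -> Worker {
    let paused = Arc::new(AtomicBool::new(true));
    let flag = Arc::clone(&paused);
    let handle = thread::spawn(move || {
        // Loop to tolerate spurious wake-ups from `park`.
        while flag.load(Ordering::Acquire) {
            thread::park();
        }
        // A real worker would run its event loop or exit here.
    });
    Worker { paused, handle }
}

/// Wake every worker, then join it. Joining without the wake-up would
/// block forever on a parked thread.
fn wake_and_join(workers: Vec<Worker>) -> usize {
    let mut joined = 0;
    for worker in workers {
        worker.paused.store(false, Ordering::Release);
        worker.handle.thread().unpark();
        if worker.handle.join().is_ok() {
            joined += 1;
        }
    }
    joined
}
```

Clearing the flag before calling `unpark` matters: a worker that has not parked yet sees the cleared flag and never parks, while a parked worker is woken and then observes it.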
Jinank Jain
cd116cb24f vmm: hypervisor: Add support for injecting NMI for MSHV guest
Currently, we only support injecting NMI for KVM guests but we can do
the same for MSHV guests as well to have feature parity.

Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
2024-03-06 00:12:06 +00:00
dependabot[bot]
d05b05b050 build: Bump zbus from 3.14.1 to 3.15.2
Bumps [zbus](https://github.com/dbus2/zbus) from 3.14.1 to 3.15.2.
- [Release notes](https://github.com/dbus2/zbus/releases)
- [Commits](https://github.com/dbus2/zbus/compare/zbus-3.14.1...zbus-3.15.2)

---
updated-dependencies:
- dependency-name: zbus
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-03-05 09:25:06 +00:00
Thomas Barrett
1811e24a4b vmm: fix openapi queue_affinity config
Signed-off-by: Thomas Barrett <tbarrett@crusoeenergy.com>
2024-03-05 09:24:55 +00:00
Yi Wang
c72bf0b32d vmm: support injecting NMI
Inject an NMI when needed, by calling the KVM_NMI ioctl.

Signed-off-by: Yi Wang <foxywang@tencent.com>
2024-03-04 10:02:38 +00:00
Yi Wang
f40dd4a993 vmm: add endpoint api for NMI support
Add an HTTP endpoint to trigger an NMI.

Signed-off-by: Yi Wang <foxywang@tencent.com>
2024-03-04 10:02:38 +00:00
dependabot[bot]
b072671e82 build: Bump serde_json from 1.0.109 to 1.0.114
Bumps [serde_json](https://github.com/serde-rs/json) from 1.0.109 to 1.0.114.
- [Release notes](https://github.com/serde-rs/json/releases)
- [Commits](https://github.com/serde-rs/json/compare/v1.0.109...v1.0.114)

---
updated-dependencies:
- dependency-name: serde_json
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-03-02 12:41:30 +00:00
dependabot[bot]
d3fade85a7 build: Bump clap from 4.4.7 to 4.5.1
Bumps [clap](https://github.com/clap-rs/clap) from 4.4.7 to 4.5.1.
- [Release notes](https://github.com/clap-rs/clap/releases)
- [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md)
- [Commits](https://github.com/clap-rs/clap/compare/clap_complete-v4.4.7...clap_complete-v4.5.1)

---
updated-dependencies:
- dependency-name: clap
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-03-01 07:25:13 +00:00
Rob Bradford
b8f5687707 vmm: seccomp: Add munmap() to the "event-monitor" thread
Needed for Rust 1.74 on aarch64 with musl

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-02-29 19:42:16 +00:00
Alexandru Matei
1091494320 vmm: http: graceful shutdown of the http api thread
This commit ensures that the HttpApi thread flushes all the responses
before the application shuts down. Without this step, in case of a
VmmShutdown request the application might terminate before the
thread sends a response.

Fixes: #6247

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2024-02-29 12:34:30 +00:00
Thomas Barrett
b750c332aa vmm: add NVIDIA GPUDirect P2P support
On platforms where PCIe P2P is supported, inject a PCI capability into
NVIDIA GPU to indicate support.

Signed-off-by: Thomas Barrett <tbarrett@crusoeenergy.com>
2024-02-29 09:26:29 +00:00
Muminul Islam
1a4c890f83 vmm: pass host data to SevSnp guest
Host data is passed to the hypervisor, and the firmware
then includes the data in the attestation report.
The data might include any key or secret that the SevSnp guest
might need later.

Signed-off-by: Muminul Islam <muislam@microsoft.com>
2024-02-23 13:32:56 -08:00
Muminul Islam
e51fb0ee36 vmm: validate host data for SevSnp guest
Host data for a SevSnp guest should either be empty
or a 64-character hex value.

Signed-off-by: Muminul Islam <muislam@microsoft.com>
2024-02-23 13:32:56 -08:00
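The rule stated in e51fb0ee36 (empty, or exactly 64 hexadecimal characters) is small enough to sketch directly. The function name `is_valid_host_data` is an assumption, and the actual validation in the vmm crate may report errors differently.

```rust
/// Return true if `host_data` is empty or a 64-character hex string
/// (hypothetical helper mirroring the rule in the commit message).
fn is_valid_host_data(host_data: &str) -> bool {
    host_data.is_empty()
        || (host_data.len() == 64 && host_data.bytes().all(|b| b.is_ascii_hexdigit()))
}
```

Sixty-four hex characters encode 32 bytes, so a valid non-empty value always decodes to a fixed-size buffer.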
Muminul Islam
aa6c486a6b vmm: add host-data as a command line argument
The host data is provided at launch and passed
to the hypervisor during the completion of the
isolated import.

Host data is provided by the hypervisor during guest launch.
The firmware includes this value in all attestation
reports for the guest.

Signed-off-by: Muminul Islam <muislam@microsoft.com>
2024-02-23 13:32:56 -08:00
Muminul Islam
b77f779c90 vmm: Add seccomp rules for MSHV SevSnp guest
There are new IOCTLs added for SevSnp guest support.
This patch adds the necessary seccomp rules.

Signed-off-by: Muminul Islam <muislam@microsoft.com>
2024-02-23 09:45:04 +00:00
dependabot[bot]
845bdfb1b2 build: Bump bitflags from 2.4.1 to 2.4.2
Bumps [bitflags](https://github.com/bitflags/bitflags) from 2.4.1 to 2.4.2.
- [Release notes](https://github.com/bitflags/bitflags/releases)
- [Changelog](https://github.com/bitflags/bitflags/blob/main/CHANGELOG.md)
- [Commits](https://github.com/bitflags/bitflags/compare/2.4.1...2.4.2)

---
updated-dependencies:
- dependency-name: bitflags
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-02-20 23:54:02 +00:00
Rob Bradford
adb318f4cd misc: Remove redundant "use" imports
With the nightly toolchain (2024-02-18) cargo check will flag up
redundant imports, either because they are pulled in by the prelude or
by an earlier match.

Remove those redundant imports.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-02-19 17:54:30 +00:00
Chris Webb
09f3658999 vmm: Avoid zombie sigwinch_listener processes
When a guest running on a terminal reboots, the sigwinch_listener
subprocess exits and a new one restarts. The parent never wait()s
for children, so the old subprocess remains as a zombie. With further
reboots, more and more zombies build up.

As there are no other children for which we want the exit status,
the easiest fix is to take advantage of the implicit reaping specified
by POSIX when we set the disposition of SIGCHLD to SIG_IGN.

For this to work, we also need to set the correct default exit signal
of SIGCHLD when using clone3() CLONE_CLEAR_SIGHAND. Unlike the fallback
fork() path, clone_args::default() initialises the exit signal to zero,
which results in a child with non-standard reaping behaviour.

Signed-off-by: Chris Webb <chris@arachsys.com>
2024-02-19 17:08:47 +00:00
Stefan Nuernberger
09cf8c3118 arch: x86_64: bring back bzImage support
Allow cloud-hypervisor to direct boot the bzImage kernel format using
the regular 32 bit entry point. This can share the memory and vcpu
setup with the regular PVH boot code, but requires the setup of the
'zero page'.

Signed-off-by: Stefan Nuernberger <stefan.nuernberger@cyberus-technology.de>
2024-02-19 17:07:50 +00:00
Jinank Jain
9f8aeacd3d vmm: Force enable IOMMU in case of SEV-SNP guest
In case of an SEV-SNP guest, devices use sw-iotlb to gain access to guest
memory for DMA. For that, F_IOMMU/F_ACCESS_PLATFORM must be exposed in
the feature set of virtio devices.

Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
Signed-off-by: Muminul Islam <muislam@microsoft.com>
2024-02-16 09:28:00 +00:00
Thomas Barrett
ce7db3f7c3 arch: x86_64: allow more than 2 E820_RAM ranges
The 'generate_ram_ranges' function currently hardcodes the assumption
that there are only 2 E820 RAM entries. This is not flexible enough to
handle vendor specific memory holes. Returning a Vec is also more
convenient for users of this function.

Signed-off-by: Thomas Barrett <tbarrett@crusoeenergy.com>
2024-02-15 08:49:06 +00:00
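To illustrate why returning a `Vec` helps in ce7db3f7c3, the sketch below derives any number of RAM ranges from a list of memory holes. It is an assumption-laden illustration: `ram_ranges` is a hypothetical function, holes are given as half-open `[start, end)` pairs that are sorted and non-overlapping, and the result is a list of `(start, size)` pairs.

```rust
/// Compute the RAM ranges inside `[ram_start, ram_end)` that remain once
/// the given holes are carved out (hypothetical helper).
/// `holes` must be sorted by start address and must not overlap.
fn ram_ranges(ram_start: u64, ram_end: u64, holes: &[(u64, u64)]) -> Vec<(u64, u64)> {
    let mut ranges = Vec::new();
    let mut cursor = ram_start;
    for &(hole_start, hole_end) in holes {
        if hole_start >= ram_end {
            break;
        }
        if hole_start > cursor {
            // RAM between the previous hole (or the start) and this hole.
            ranges.push((cursor, hole_start - cursor));
        }
        cursor = cursor.max(hole_end);
    }
    if cursor < ram_end {
        // RAM after the last hole.
        ranges.push((cursor, ram_end - cursor));
    }
    ranges
}
```

A single hole yields the familiar two ranges, while each additional hole adds one more entry without any change to the function signature.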
Jinank Jain
eee31f1d41 vmm: Don't set rsdp addr in case of SEV-SNP guest
The ACPI tables are generated inside the IGVM file in case of an
SEV-SNP guest, so we don't need to generate them inside
cloud-hypervisor.

Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
Signed-off-by: Muminul Islam <muislam@microsoft.com>
2024-02-13 11:29:46 -08:00
Peteris Rudzusiks
612a8dfb1b vmm: seccomp: Allow all threads to call sched_yield()
We occasionally saw cloud-hypervisor crash due to seccomp violations. The
coredumps showed the HTTP API thread crashing after it attempted to call
sched_yield(). The call came from the Rust stdlib's mpmc module, which calls
sched_yield() if several attempts to busy-wait for a condition to be fulfilled
fall short.

Since the system call is harmless and it comes from the stdlib, I opted to allow
all threads to call it.

Signed-off-by: Peteris Rudzusiks <rye@stripe.com>
2024-02-13 10:20:07 +00:00
acarp
035c4b20fb block: Set an option to pin virtio block threads to host cpus
Currently the only way to set the affinity for virtio block threads is
to boot the VM, search for the tid of each of the virtio block threads,
then set the affinity manually. This commit adds an option to pin virtio
block queues to specific host cpus (similar to pinning vcpus to host
cpus). A queue_affinity option has been added to the disk flag in
the cli to specify a mapping of queue indices to host cpus.

Signed-off-by: acarp <acarp@crusoeenergy.com>
2024-02-13 09:05:57 +00:00
dependabot[bot]
5b0de115f0 build: Bump serde from 1.0.193 to 1.0.196
Bumps [serde](https://github.com/serde-rs/serde) from 1.0.193 to 1.0.196.
- [Release notes](https://github.com/serde-rs/serde/releases)
- [Commits](https://github.com/serde-rs/serde/compare/v1.0.193...v1.0.196)

---
updated-dependencies:
- dependency-name: serde
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-02-09 23:45:54 +00:00
Jinank Jain
441b58437f vmm: Don't configure vcpu in case of SEV-SNP
The traditional way of configuring a vCPU doesn't work for SEV-SNP
guests. All the vCPU configuration for an SEV-SNP guest is provided via
the VMSA.

Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
Signed-off-by: Muminul Islam <muislam@microsoft.com>
2024-02-09 09:46:19 -08:00
dependabot[bot]
5641e3a283 build: Bump libc from 0.2.151 to 0.2.153
Bumps [libc](https://github.com/rust-lang/libc) from 0.2.151 to 0.2.153.
- [Release notes](https://github.com/rust-lang/libc/releases)
- [Commits](https://github.com/rust-lang/libc/compare/0.2.151...0.2.153)

---
updated-dependencies:
- dependency-name: libc
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-02-08 09:51:55 +00:00
Sean Banko
7633d47293 vmm: prefault memory in parallel to optimize boot time
On guests with large amounts of memory, using the `prefault` option can
lead to a very long boot time. This commit implements the strategy
taken by QEMU to prefault memory in parallel using multiple threads,
decreasing the time to allocate memory for large guests by
an order of magnitude or more.

For example, this commit reduces the time to allocate memory for a
guest configured with 704 GiB of memory on 1 NUMA node using 1 GiB
hugepages from 81.44134669s to just 6.865287881s.
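
The general technique can be sketched as below. This is not the code
from this commit: the function name, the fixed 4 KiB page size, and
touching pages with a volatile read and write are assumptions for
illustration. The region is split into page-aligned chunks and each
chunk is faulted in by its own thread.

```rust
use std::thread;

// Assumed page size for this sketch; hugepages would use a larger value.
const PAGE_SIZE: usize = 4096;

/// Touches every page of `region` using up to `num_threads` threads.
/// Returns the number of pages touched.
fn prefault_parallel(region: &mut [u8], num_threads: usize) -> usize {
    let num_threads = num_threads.max(1);
    let pages = (region.len() + PAGE_SIZE - 1) / PAGE_SIZE;
    if pages == 0 {
        return 0;
    }
    // Give each thread a contiguous, page-aligned share of the region.
    let pages_per_thread = (pages + num_threads - 1) / num_threads;
    let chunk_len = pages_per_thread * PAGE_SIZE;

    thread::scope(|scope| {
        let handles: Vec<_> = region
            .chunks_mut(chunk_len)
            .map(|chunk| {
                scope.spawn(move || {
                    let mut touched: usize = 0;
                    for page in chunk.chunks_mut(PAGE_SIZE) {
                        let first_byte = page.as_mut_ptr();
                        // Read and write back the first byte so the page is
                        // faulted in without changing its content.
                        unsafe { first_byte.write_volatile(first_byte.read_volatile()) };
                        touched += 1;
                    }
                    touched
                })
            })
            .collect();
        handles
            .into_iter()
            .map(|handle| handle.join().unwrap())
            .sum::<usize>()
    })
}
```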

Signed-off-by: Sean Banko <sbanko@crusoeenergy.com>
2024-02-07 08:59:03 -08:00
dependabot[bot]
939cc348ed build: Bump futures from 0.3.28 to 0.3.30
Bumps [futures](https://github.com/rust-lang/futures-rs) from 0.3.28 to 0.3.30.
- [Release notes](https://github.com/rust-lang/futures-rs/releases)
- [Changelog](https://github.com/rust-lang/futures-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/futures-rs/compare/0.3.28...0.3.30)

---
updated-dependencies:
- dependency-name: futures
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-02-07 10:10:40 +00:00
Rob Bradford
9dfc39d336 vmm: Make thread local initialiser constant
Beta clippy fix:

warning: initializer for `thread_local` value can be made `const`
  --> vmm/src/sigwinch_listener.rs:27:40
   |
27 |     static TX: RefCell<Option<File>> = RefCell::new(None);
   |                                        ^^^^^^^^^^^^^^^^^^ help: replace with: `const { RefCell::new(None) }`
   |
   = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#thread_local_initializer_can_be_made_const
   = note: `#[warn(clippy::thread_local_initializer_can_be_made_const)]` on by default
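
For reference, the form suggested by the lint looks like this in
isolation. The `tx_is_set` helper is hypothetical and only exists to
make the snippet usable on its own.

```rust
use std::cell::RefCell;
use std::fs::File;

thread_local! {
    // A `const` initializer lets the value be set up without lazy
    // initialization checks on each access.
    static TX: RefCell<Option<File>> = const { RefCell::new(None) };
}

/// Hypothetical helper: reports whether this thread has a file stored in TX.
fn tx_is_set() -> bool {
    TX.with(|tx| tx.borrow().is_some())
}
```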

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-02-07 09:25:40 +00:00
Rob Bradford
e70bf59809 vmm: Directly clone console resize pipe
Beta clippy fix:

warning: this call to `as_ref.map(...)` does nothing
    --> vmm/src/device_manager.rs:1234:9
     |
1234 |         self.console_resize_pipe.as_ref().map(Arc::clone)
     |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: try: `self.console_resize_pipe.clone()`
     |
     = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#useless_asref
     = note: `#[warn(clippy::useless_asref)]` on by default
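
A small sketch of why the two forms are equivalent. The struct below is
a stand-in with an assumed `Arc<String>` payload, not the real device
manager: cloning an `Option<Arc<T>>` already clones the inner `Arc`, so
the `as_ref().map(Arc::clone)` detour adds nothing.

```rust
use std::sync::Arc;

// Stand-in type for illustration; the payload type is an assumption.
struct DeviceManagerSketch {
    console_resize_pipe: Option<Arc<String>>,
}

impl DeviceManagerSketch {
    /// Returns another handle to the same pipe, if one is set.
    fn console_resize_pipe(&self) -> Option<Arc<String>> {
        // Option<Arc<T>>::clone bumps the reference count of the inner Arc.
        self.console_resize_pipe.clone()
    }
}
```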

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-02-07 09:25:40 +00:00
Muminul Islam
9b84c6c3f5 vmm: check correct buffer size during import
When we import a page, the page either contains some data or is empty.
Empty does not mean there is no data; it means the page is full of
zeros. In that case we can skip writing the data, as the guest memory
of the page is already zeroed.

A page can also be partially filled, with the rest of its content zero.
Our IGVM generation tool only fills in the non-zero data; the rest is
padding. We write only the data without padding and check that the
whole buffer content was written. It is still a full page, so we update
the variable with the length of the full page.

Signed-off-by: Muminul Islam <muislam@microsoft.com>
2024-02-06 14:28:42 -08:00
dependabot[bot]
3945253619 build: Bump zerocopy from 0.7.31 to 0.7.32
Bumps [zerocopy](https://github.com/google/zerocopy) from 0.7.31 to 0.7.32.
- [Release notes](https://github.com/google/zerocopy/releases)
- [Changelog](https://github.com/google/zerocopy/blob/main/CHANGELOG.md)
- [Commits](https://github.com/google/zerocopy/compare/v0.7.31...v0.7.32)

---
updated-dependencies:
- dependency-name: zerocopy
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-01-27 00:35:32 +00:00
dependabot[bot]
7acfff5da7 build: Bump gdbstub from 0.7.0 to 0.7.1
Bumps [gdbstub](https://github.com/daniel5151/gdbstub) from 0.7.0 to 0.7.1.
- [Release notes](https://github.com/daniel5151/gdbstub/releases)
- [Changelog](https://github.com/daniel5151/gdbstub/blob/master/CHANGELOG.md)
- [Commits](https://github.com/daniel5151/gdbstub/compare/0.7.0...0.7.1)

---
updated-dependencies:
- dependency-name: gdbstub
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-01-26 10:03:23 +00:00
Philipp Schuster
e50a641126 devices: add debug-console device
This commit adds the debug-console (or debugcon) device to CHV. It is a
very simple device on I/O port 0xe9 supported by QEMU and BOCHS. It is
meant for printing information as easily as possible, without any
necessary configuration from the guest at all.

It is primarily interesting to OS/kernel and firmware developers as they
can produce output as soon as the guest starts without any configuration
of a serial device or similar. Furthermore, a kernel hacker might use
this device for information of type B whereas information of type A is
printed to the serial device.

This device is not used by default by Linux, Windows, or any other
"real" OS, but only by toy kernels and during firmware development.

In the CLI, it can be configured similar to --console or --serial with
the --debug-console parameter.
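
The device side is simple enough to sketch in a few lines. This is an
illustration and not the code added by this commit: the `DebugConsole`
type and its `io_write` method are hypothetical, and any `Write` sink
stands in for the configured output.

```rust
use std::io::{self, Write};

// I/O port used by the debug console, as stated above.
const DEBUGCON_PORT: u16 = 0xe9;

/// Hypothetical device model: forwards bytes written to the port to `out`.
struct DebugConsole<W: Write> {
    out: W,
}

impl<W: Write> DebugConsole<W> {
    fn new(out: W) -> Self {
        Self { out }
    }

    /// Handles a guest port write. Returns Ok(true) if the write was for
    /// this device and the bytes were forwarded.
    fn io_write(&mut self, port: u16, data: &[u8]) -> io::Result<bool> {
        if port != DEBUGCON_PORT {
            return Ok(false);
        }
        // No guest-side setup is needed: every byte is forwarded as-is.
        self.out.write_all(data)?;
        self.out.flush()?;
        Ok(true)
    }
}
```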

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
2024-01-25 10:25:14 -08:00
dependabot[bot]
8f90fba250 build: Bump serde from 1.0.168 to 1.0.193
Bumps [serde](https://github.com/serde-rs/serde) from 1.0.168 to 1.0.193.
- [Release notes](https://github.com/serde-rs/serde/releases)
- [Commits](https://github.com/serde-rs/serde/compare/v1.0.168...v1.0.193)

---
updated-dependencies:
- dependency-name: serde
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-01-25 11:09:33 +00:00
Bo Chen
3ce0fef7fd build: Bump vmm-sys-util crate and its consumers
This patch bumps the following crates, including `kvm-bindings@0.7.0`*,
`kvm-ioctls@0.16.0`**, `linux-loader@0.11.0`, `versionize@0.2.0`,
`versionize_derive@0.1.6`***, `vhost@0.10.0`,
`vhost-user-backend@0.13.1`, `virtio-queue@0.11.0`, `vm-memory@0.14.0`,
`vmm-sys-util@0.12.1`, and the latest of `vfio-bindings`, `vfio-ioctls`,
`mshv-bindings`,`mshv-ioctls`, and `vfio-user`.

* A fork of the `kvm-bindings` crate is being used to support
serialization of various structs for migration [1]. Also, code changes
are made to accommodate the updated `struct xsave` from the Linux
kernel. Note: these changes related to `struct xsave` break
live-upgrade.

** The new `kvm-ioctls` crate introduced breaking changes for
the `get/set_one_reg` API on `aarch64` [2], so code changes are made to
the new APIs.

*** A fork of the `versionize_derive` crate is being used to support
versionize on packed structs [3].

[1] https://github.com/cloud-hypervisor/kvm-bindings/tree/ch-v0.7.0
[2] https://github.com/rust-vmm/kvm-ioctls/pull/223
[3] https://github.com/cloud-hypervisor/versionize_derive/tree/ch-0.1.6

Fixes: #6072

Signed-off-by: Bo Chen <chen.bo@intel.com>
2024-01-25 10:14:54 +00:00
Muminul Islam
51ebc3ac92 vmm: set SEV control register for SEV-Enabled guest
Set the SEV control register so we know where to
start running.  This register configures the
SEV feature control state on a virtual processor.

Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
Signed-off-by: Muminul Islam <muislam@microsoft.com>
2024-01-24 14:32:16 -08:00
Alyssa Ross
7674196113 vmm: remove Default impls for config
These Default implementations either don't produce valid configs, are
no longer used outside of tests, or both.

For the tests, we can define our own local "default" values that make
the most sense for the tests, without worrying about what's
a (somewhat) sensible "global" default value.

Signed-off-by: Alyssa Ross <hi@alyssa.is>
2024-01-23 12:44:44 +00:00
dependabot[bot]
c71cb00a5a build: Bump anyhow from 1.0.75 to 1.0.79
Bumps [anyhow](https://github.com/dtolnay/anyhow) from 1.0.75 to 1.0.79.
- [Release notes](https://github.com/dtolnay/anyhow/releases)
- [Commits](https://github.com/dtolnay/anyhow/compare/1.0.75...1.0.79)

---
updated-dependencies:
- dependency-name: anyhow
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-01-23 12:44:21 +00:00
Ravi kumar Veeramally
895dc12a74 vmm: Replace Debug with Display rendering in HTTP error message
Bumping anyhow crate from 1.0.75 to 1.0.79 will cause seccomp
failures through integration tests. Newly added backtrace support
relies on readlink and many other syscalls.

The issue was noticed with the test_api_http_pause_resume test, where a
second VM PAUSE or VM RESUME prints an error and causes a panic.
The panic message, printed in a thread which is not allowed to write
output, triggered the issue.

So implementing Display trait for HttpError and ApiError enums to avoid
adding many syscalls to seccomp filter section.
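
A minimal sketch of the approach, using a hypothetical
`SketchApiError` enum rather than the real HttpError and ApiError
types. The message is built from a hand-written Display impl, so
formatting it does not go through Debug rendering.

```rust
use std::fmt;

// Hypothetical error type; variant names are assumptions for illustration.
#[derive(Debug)]
enum SketchApiError {
    VmPause(String),
    VmResume(String),
}

impl fmt::Display for SketchApiError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::VmPause(reason) => write!(f, "cannot pause the VM: {reason}"),
            Self::VmResume(reason) => write!(f, "cannot resume the VM: {reason}"),
        }
    }
}

/// Builds the text of an HTTP error response with `{}` (Display), not `{:?}`.
fn http_error_body(err: &SketchApiError) -> String {
    format!("{err}")
}
```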

Signed-off-by: Ravi kumar Veeramally <ravikumar.veeramally@intel.com>
2024-01-23 12:44:21 +00:00