Commit Graph

334 Commits

Author SHA1 Message Date
Ruoqing He
297236a7c0 misc: Eliminate use of assert!((...).is_ok())
Asserting on .is_ok()/.is_err() leads to hard to debug failures (as if
the test fails, it will only say "assertion failed: false". We replace
these with `.unwrap()`, which also prints the exact error variant that
was unexpectedly encountered (we can to this these days thanks to
efforts to implement Display and Debug for our error types). If the
assert!((...).is_ok()) was followed by an .unwrap() anyway, we just drop
the assert.

Inspired by and quoted from @roypat.

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2024-10-03 12:03:49 +00:00
Songqian Li
33c15ca273 vmm: remove pub use vm_config in config
This patch removes pub import vm_config in config.rs to eliminate
the ambiguity of vm_comfig reference.

Signed-off-by: Songqian Li <sionli@tencent.com>
2024-09-30 08:18:02 +00:00
Ruoqing He
61e57e1cb1 misc: Further improve imports styling
By introducing `imports_granularity="Module"` format strategy,
effectively groups imports from the same module into one line or block,
improving maintainability and readability.

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2024-09-29 16:13:48 +00:00
Rob Bradford
88a9f79944 misc: Adapt consistent import style formatting
Historically the Cloud Hypervisor coding style has been to ensure that
all imports are ordered and placed in a single group. Unfortunately
cargo fmt has no support for ensuring that all imports are in a single
group so if whitespace lines were added as part of the import statements
then they would only be odered correctly in the group.

By adopting "group_imports="StdExternalCrate" we can enforce a style
where imports are placed in at most three groups for std, external
crates and the crate itself. Choosing a style enforceable by the tooling
reduces the reviewer burden.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-09-29 13:08:12 +01:00
Yuanchu Xie
5f18ac3bc0 devices: Add pvmemcontrol device
Pvmemcontrol provides a way for the guest to control its physical memory
properties, and enables optimizations and security features. For
example, the guest can provide information to the host where parts of a
hugepage may be unbacked, or sensitive data may not be swapped out, etc.

Pvmemcontrol allows guests to manipulate its gPTE entries in the SLAT,
and also some other properties of the memory map the back's host memory.
This is achieved by using the KVM_CAP_SYNC_MMU capability. When this
capability is available, the changes in the backing of the memory region
on the host are automatically reflected into the guest. For example, an
mmap() or madvise() that affects the region will be made visible
immediately.

There are two components of the implementation: the guest Linux driver
and Virtual Machine Monitor (VMM) device. A guest-allocated shared
buffer is negotiated per-cpu through a few PCI MMIO registers, the VMM
device assigns a unique command for each per-cpu buffer. The guest
writes its pvmemcontrol request in the per-cpu buffer, then writes the
corresponding command into the command register, calling into the VMM
device to perform the pvmemcontrol request.

The synchronous per-cpu shared buffer approach avoids the kick and busy
waiting that the guest would have to do with virtio virtqueue transport.

The Cloud Hypervisor component can be enabled with --pvmemcontrol.

Co-developed-by: Stanko Novakovic <stanko@google.com>
Co-developed-by: Pasha Tatashin <tatashin@google.com>
Signed-off-by: Yuanchu Xie <yuanchu@google.com>
2024-08-05 22:41:56 +00:00
Praveen K Paladugu
bd180bc3eb main: rename landlock_config to landlock_rules
To keep the naming consistent, rename all uses of landlock_config
to landlock_rules.

Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
2024-08-05 17:46:30 +00:00
Praveen K Paladugu
d2f0e8aebb Revert "vmm: make landlock configs VMM-level config"
This reverts commit 94929889ac.
This revert moves landlock config back to VMConfig.

Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
2024-08-05 17:46:30 +00:00
Wei Liu
94929889ac vmm: make landlock configs VMM-level config
This requires stashing the config values in `struct Vmm`. The configs
should be validated before before creating the VMM thread. Refactor the
code and update documentation where necessary.

The place where the rules are applied remain the same.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2024-07-06 04:42:58 +00:00
Praveen K Paladugu
af5a9677c8 vmm: Introduce Landlock module
This module introduces methods to apply Landlock LSM to cloud-hypervisor
threads.

Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
2024-07-06 04:42:58 +00:00
Praveen K Paladugu
1d89f98edf vmm: Introduce landlock-rules cmdline param
Users can use this parameter to pass extra paths that 'vmm' and its
child threads can use at runtime. Hotplug is the primary usecase for
this parameter.

In order to hotplug devices that use local files: disks, memory zones,
pmem devices etc, users can use this option to pass the path/s that will
be used during hotplug while starting cloud-hypervisor. Doing this will
allow landlock to add required rules to grant access to these paths when
cloud-hypervisor process starts.

Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2024-07-06 04:42:58 +00:00
Praveen K Paladugu
287dbd4fc9 vmm: Introduce landlock cmdline parameter
Users can use this cmdline option to enable/disable Landlock based
sandboxing while running cloud-hypervisor.

Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
2024-07-06 04:42:58 +00:00
Josh Soref
42e9632c53 misc: Fix spelling issues
Misspellings were identified by:
  https://github.com/marketplace/actions/check-spelling

* Initial corrections based on forbidden patterns from the action
* Additional corrections by Google Chrome auto-suggest
* Some manual corrections
* Adding markdown bullets to readme credits section

Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
2024-06-08 16:31:30 +00:00
Purna Pavan Chandra
584784a0f8 vmm: Support passing Net FDs to Restore
'NetConfig' FDs, when explicitly passed via SCM_RIGHTS during VM
creation, are marked as invalid during snapshot. See: #6332.
So, Restore should support input for the new net FDs. This patch adds
new field 'net_fds' to 'RestoreConfig'. The FDs passed using this new
field are replaced into the 'fds' field of NetConfig appropriately.

The 'validate()' function ensures all net devices from 'VmConfig' backed
by FDs have a corresponding 'RestoreNetConfig' with a matched 'id' and
expected number of FDs.

The unit tests provide different inputs to parse and validate functions
to make sure parsing and error handling is as per expectation.

Fixes #6286

Signed-off-by: Purna Pavan Chandra <paekkaladevi@linux.microsoft.com>
Co-authored-by: Bo Chen <chen.bo@intel.com>
2024-05-14 10:52:46 +00:00
Thomas Barrett
e7e856d8ac vmm: add pci_segment mmio aperture configs
When using multiple PCI segments, the 32-bit and 64-bit mmio
aperture is split equally between each segment. Add an option
to configure the 'weight'. For example, a PCI segment with a
`mmio32_aperture_weight` of 2 will be allocated twice as much
32-bit mmio space as a normal PCI segment.

Signed-off-by: Thomas Barrett <tbarrett@crusoeenergy.com>
2024-04-24 09:35:19 +00:00
Wei Liu
f3b0f59646 vmm: validate virtio-fs tag length
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2024-04-04 20:42:36 +00:00
Thomas Barrett
b750c332aa vmm: add NVIDIA GPUDirect P2P support
On platforms where PCIe P2P is supported, inject a PCI capability into
NVIDIA GPU to indicate support.

Signed-off-by: Thomas Barrett <tbarrett@crusoeenergy.com>
2024-02-29 09:26:29 +00:00
Muminul Islam
e51fb0ee36 vmm: validate host data for SevSnp guest
Host data for SevSnp guest should either be empty
or 64 character hex value.

Signed-off-by: Muminul Islam <muislam@microsoft.com>
2024-02-23 13:32:56 -08:00
Muminul Islam
aa6c486a6b vmm: add host-data as a command line argument
The host data provided at launch. Data is passed
to the hypervisor during the completion of the
isolated import.

Host Data provided by the hypervisor during guest launch.
The firmware includes this value in all attestation
reports for the guest.

Signed-off-by: Muminul Islam <muislam@microsoft.com>
2024-02-23 13:32:56 -08:00
Rob Bradford
adb318f4cd misc: Remove redundant "use" imports
With the nightly toolchain (2024-02-18) cargo check will flag up
redundant imports either because they are pulled in by the prelude on
earlier match.

Remove those redundant imports.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-02-19 17:54:30 +00:00
acarp
035c4b20fb block: Set an option to pin virtio block threads to host cpus
Currently the only way to set the affinity for virtio block threads is
to boot the VM, search for the tid of each of the virtio block threads,
then set the affinity manually. This commit adds an option to pin virtio
block queues to specific host cpus (similar to pinning vcpus to host
cpus). A queue_affinity option has been added to the disk flag in
the cli to specify a mapping of queue indices to host cpus.

Signed-off-by: acarp <acarp@crusoeenergy.com>
2024-02-13 09:05:57 +00:00
Philipp Schuster
e50a641126 devices: add debug-console device
This commit adds the debug-console (or debugcon) device to CHV. It is a
very simple device on I/O port 0xe9 supported by QEMU and BOCHS. It is
meant for printing information as easy as possible, without any
necessary configuration from the guest at all.

It is primarily interesting to OS/kernel and firmware developers as they
can produce output as soon as the guest starts without any configuration
of a serial device or similar. Furthermore, a kernel hacker might use
this device for information of type B whereas information of type A are
printed to the serial device.

This device is not used by default by Linux, Windows, or any other
"real" OS, but only by toy kernels and during firmware development.

In the CLI, it can be configured similar to --console or --serial with
the --debug-console parameter.

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
2024-01-25 10:25:14 -08:00
Alyssa Ross
7674196113 vmm: remove Default impls for config
These Default implementations either don't produce valid configs, are
no longer used outside of tests, or both.

For the tests, we can define our own local "default" values that make
the most sense for the tests, without worrying about what's
a (somewhat) sensible "global" default value.

Signed-off-by: Alyssa Ross <hi@alyssa.is>
2024-01-23 12:44:44 +00:00
Sean Banko
e33279471f vmm: support setting cpu affinity with host cpu indices >255
On hosts with >256 cpus, setting the cpu affinity to a host cpu index
>255 will return an error because type of `host_cpu` is `u8`.
This commit changes the type of `host_cpu` to `usize` to remove this
limitation.

Signed-off-by: Sean Banko <sbanko@crusoeenergy.com>
2024-01-19 09:30:16 +00:00
Alyssa Ross
4ae5503b58 vmm: remove redundant tests
These are all duplicates.

Signed-off-by: Alyssa Ross <hi@alyssa.is>
2024-01-18 15:03:24 +00:00
Alyssa Ross
451d3fb2f0 vmm: limit VSOCK CIDs to 32 bits
The VIRTIO specification[1] says:

> The upper 32 bits of the CID are reserved and zeroed.

We should therefore not allow the user to supply a VSOCK CID with
those bits set.  To accomplish this, limit the public API of the
virtio-vsock device to only accept 32-bit CIDs, while still using
64-bit CIDs internally since that's how virtio-vsock works.

[1]: https://docs.oasis-open.org/virtio/virtio/v1.2/csd01/virtio-v1.2-csd01.html#x1-4400004

Signed-off-by: Alyssa Ross <hi@alyssa.is>
2024-01-10 17:28:56 +00:00
Alyssa Ross
7d0b85d727 vmm: forbid using special VSOCK CIDs for guests
I accidentally ran a VM with CID 2 (VMADDR_CID_HOST), and very strange
and difficult to debug behavior ensued.  I don't think a virtio-vsock
device should be allowed to have any of the special CIDs
(VMADDR_CID_ANY, VMADDR_CID_HYPERVISOR, VMADDR_CID_LOCAL, VMADDR_CID_HOST).

Signed-off-by: Alyssa Ross <hi@alyssa.is>
2024-01-10 10:43:21 +00:00
Thomas Barrett
c297d8d796 vmm: use RateLimiterGroup for virtio-blk devices
Add a 'rate_limit_groups' field to VmConfig that defines a set of
named RateLimiterGroups.

When the 'rate_limit_group' field of DiskConfig is defined, all
virtio-blk queues will be rate-limited by a shared RateLimiterGroup.
The lifecycle of all RateLimiterGroups is tied to the Vm.
A RateLimiterGroup may exist even if no Disks are configured to use
the RateLimiterGroup. Disks may be hot-added or hot-removed from the
RateLimiterGroup.

When the 'rate_limiter' field of DiskConfig is defined, we construct
an anonymous RateLimiterGroup whose lifecycle is tied to the Disk.
This is primarily done for api backwards compatability. Importantly,
the behavior is not the same! This implementation rate_limits the
aggregate bandwidth / iops of an individual disk rather than the
bandwidth / iops of an individual queue of a disk.

When neither the 'rate_limit_group' or the 'rate_limiter' fields of
DiskConfig is defined, the Disk is not rate-limited.

Signed-off-by: Thomas Barrett <tbarrett@crusoeenergy.com>
2024-01-03 10:21:06 -08:00
Muminul Islam
13ef424bf1 vmm: Add IGVM to the config/commandline
This patch adds igvm to the Vm config and params as well as
the command line argument to pass igvm file to load into
guest memory. The file must maintain the IGVM format.
The CLI option is featured guarded by igvm feature gate.

The IGVM(Independent Guest Virtual Machine) file format
is designed to encapsulate all information required to
launch a virtual machine on any given virtualization stack,
with support for different isolation technologies such as
AMD SEV-SNP and Intel TDX.

At a conceptual level, this file format is a set of commands created
by the tool that generated the file, used by the loader to construct
the initial guest state. The file format also contains measurement
information that the underlying platform will use to confirm that
the file was loaded correctly and signed by the appropriate authorities.

The IGVM file is generated by the tool:
https://github.com/microsoft/igvm-tooling

The IGVM file is parsed by the following crates:
https://github.com/microsoft/igvm

Signed-off-by: Muminul Islam <muislam@microsoft.com>
2023-12-08 09:22:42 -08:00
Yong He
bb38e4e599 vmm: Allow simultaneously set serial and console as TTY mode
Cloud Hypovrisor supports legacy serial device and virito console device
for VMs. Using legacy serial device, CH can capture full VM console logs,
but its implementation is based on KVM PIO emulation and has poor
performance. Using the virtio console device, the VM console logs will
be sent to CH through the virtio ring, the performance is better, but CH
will only capture the VM console logs after the virtio console device is
initialized, the VM early startup logs will be discarded.

This patch provides a way to enable both the legacy serial device and the
virtio console device as a TTY mode by setting the leagcy serial port as
the VM's early printk device and setting the virtio console as the VM's
main console device.

Then CH can capture early boot logs from the legacy serial device and
capture later logs from the virito console device with better performance.

Signed-off-by: Yong He <alexyonghe@tencent.com>
2023-11-02 11:06:30 -07:00
Thomas Barrett
bae13c5c56 block: add aio disk backend
Signed-off-by: Thomas Barrett <tbarrett@crusoeenergy.com>
2023-10-25 10:19:23 -07:00
Bo Chen
43a6eda400 vmm: Add help information for "--numa pci_segments="
See: #5844

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-10-20 11:44:28 -07:00
Wei Liu
7bc3452139 main: switch command parsing to use clap
Partially revert 111225a2a5
and add the new dbus and pvpanic arguments.

As we are switching back to clap observe the following changes.

A few examples:

1. `-v -v -v` needs to be written as`-vvv`
2. `--disk D1 --disk D2` and others need to be written as `--disk D1 D2`.
3. `--option value` needs to be written as `--option=value.`

Change integration tests to adapt to the breaking changes.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
Signed-off-by: Ravi kumar Veeramally <ravikumar.veeramally@intel.com>
2023-10-20 11:44:28 -07:00
Thomas Barrett
3029fbeafd vmm: Allow assignment of PCI segments to NUMA node
Signed-off-by: Thomas Barrett <tbarrett@crusoeenergy.com>
2023-10-18 11:18:15 -07:00
Praveen K Paladugu
6d1077fc3c vmm: Unix socket backend for serial port
Cloud-Hypervisor takes a path for Unix socket, where it will listen
on. Users can connect to the other end of the socket and access serial
port on the guest.

    "--serial socket=/path/to/socket" is the cmdline option to pass to
cloud-hypervisor.

Users can use socat like below to access guest's serial port once the
guest starts to boot:

    socat -,crnl UNIX-CONNECT:/path/to/socket

Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
2023-10-05 15:26:29 +01:00
Thomas Barrett
c4e8e653ac block: Add support for user specified ID_SERIAL
Signed-off-by: Thomas Barrett <tbarrett@crusoeenergy.com>
2023-09-11 12:50:41 +01:00
Philipp Schuster
7bf0cc1ed5 misc: Fix various spelling errors using typos
This fixes all typos found by the typos utility with respect to the config file.

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
2023-09-09 10:46:21 +01:00
Jinank Jain
5fd79571b7 vmm: Add a feature flag for SEV-SNP support
This feature flag gates the development for SEV-SNP enabled guest.

Also add a helper function to identify if SNP should be enabled for the
guest.

Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
2023-09-07 12:52:27 +01:00
Alyssa Ross
69bd0036d9 vmm: support removing devices before VM is booted
If the VM has been configured but not yet booted, all we need to do to
support removing a device is to remove it from the config, so it will
never be created.

Signed-off-by: Alyssa Ross <hi@alyssa.is>
2023-07-14 10:01:23 -07:00
Yi Wang
d99c0c0d1d devices: pvpanic: add method for DeviceManager
Add method for DeviceManager to invoke.

Signed-off-by: Yi Wang <foxywang@tencent.com>
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2023-07-06 11:14:54 +01:00
Wei Liu
aa14fe214a pci: bump the number of supported PCI segments
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2023-05-02 09:34:05 +01:00
Wei Liu
45e3f49bba vmm: use MAX_NUM_PCI_SEGMENTS in test cases
No functional change.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2023-05-02 09:34:05 +01:00
Bo Chen
4baf85857a vmm: Add unit test for 'VmConfig::preserved_fds'
Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-04-17 16:33:29 +01:00
Bo Chen
e3d2917d5f vmm: Implement Clone and Drop for VmConfig
The custom 'clone' duplicates 'preserved_fds' so that the validation
logic can be safely carried out on the clone of the VmConfig.

The custom 'drop' ensures 'preserved_fds' are safely closed when the
holding VmConfig instance is destroyed.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-04-17 16:33:29 +01:00
Bo Chen
a84b540b65 vmm: config: Extend 'VmConfig' with 'preserved_fds'
Preserved FDs are the ones that share the same life-time as its holding
VmConfig instance, such as FDs for creating TAP devices.

Preserved FDs will stay open as long as the holding VmConfig instance is
valid, and will be closed when the holding VmConfig instance is destroyed.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-04-17 16:33:29 +01:00
Bo Chen
8eb162e3d7 Revert "vmm: config: Implement Clone for NetConfig"
This reverts commit ea4a95c4f6.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-04-17 16:33:29 +01:00
Bo Chen
2804608a1c Revert "vmm: config: Close FDs for TAP devices that are provided to VM"
This reverts commit b14427540b.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-04-17 16:33:29 +01:00
Bo Chen
e0125653b1 Revert "vmm: config: Don't close reserved FDs from NetConfig::drop()"
This reverts commit 0110fb4edc.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-04-17 16:33:29 +01:00
Bo Chen
e431a48201 Revert "vmm: config: Avoid closing invalid FDs from 'test_net_parsing()'"
This reverts commit 0567def931.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-04-17 16:33:29 +01:00
Bo Chen
c143cb3af0 Revert "vmm: config: Replace use of memfd_create with fd pointing to /dev/null"
This reverts commit 46066d6ae1.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-04-17 16:33:29 +01:00
Yu Li
74dcb37ec3 vmm: config: fix incorrect values of error
The PR #2333 added I/O rate limiter on block device, with some options
in `DiskConfig`.  And the PR #2401 added rate limiter on virtio-net
device with same options, but it still throws `Error::ParseDisk`.

This commit fixes it with correct values.

Fixes: #2401

Signed-off-by: Yu Li <liyu.yukiteru@bytedance.com>
2023-02-23 09:57:48 +00:00